Wize Sleep is an iOS app that reads sleep data from your wearable via Apple Health and computes a comprehensive sleep score from 10 health metrics: total sleep, REM sleep, deep sleep, sleep efficiency, sleep latency, resting heart rate, HRV, respiratory rate, SpO2, and restfulness.

What devices work with Wize Sleep?

Wize Sleep works with any wearable that writes sleep data to Apple Health, including Apple Watch, Oura Ring, WHOOP, Ultrahuman Ring, and NextSense SmartBuds.

What is the WIZE token?

WIZE is an SPL token on the Solana blockchain. You earn WIZE tokens by syncing your health data daily in the Wize Sleep app. Tokens are transferred directly to your Solana wallet — you own them outright, on-chain.

How do I earn WIZE tokens?

You earn 10 WIZE tokens every day you sync your health data. Open the app, let it sync from Apple Health, and receive your daily reward. Connect a Phantom wallet to receive tokens on-chain.

What is Phantom wallet?

Phantom is a popular Solana wallet app available on iOS and Android. To receive WIZE tokens on-chain, connect your Phantom wallet in the Wize Sleep app. Download Phantom from the App Store — it is free and takes 2 minutes to set up.

Is Wize Sleep free?

Yes, Wize Sleep is free during the beta period. It is available on iOS via Apple TestFlight and requires iOS 17 or later.

Where do the sleep benchmarks come from?

Benchmarks come from population health research databases including NHANES (National Health and Nutrition Examination Survey), the FRIEND Registry (Fitness Registry and the Importance of Exercise National Database), the Lifelines cohort study, the KORA-FF4 study, and the Apple Heart Study.

What does this study protocol propose to test?

Whether sleep stages can be predicted in real time from a consumer smartwatch’s heart-rate, accelerometer, and microphone signals, validated against simultaneously recorded ear-EEG as the reference standard, and how three sensor combinations compare. No data have been collected yet; execution is pending IRB approval.

What is the primary endpoint and success criterion?

Epoch-by-epoch agreement between watch-predicted and ear-EEG-reference stages, measured by Cohen’s kappa. A one-sided non-inferiority test evaluates whether kappa is non-inferior to a substantial-agreement criterion of 0.61, at alpha = 0.05 and 80% power, with up to 18 participants enrolled to yield at least 16 evaluable.

Real-Time Multimodal Sleep Staging from Consumer Wearable Sensors Validated Against Ear-EEG: A Study Protocol

1. Introduction

Sleep is increasingly recognized as central to cognitive, metabolic, and emotional health, yet the gold-standard tool for measuring it, polysomnography (PSG), remains confined to the laboratory. PSG is comprehensive but obtrusive, expensive, and typically limited to one or two nights, making it poorly suited to the longitudinal, ecological monitoring that both research and consumers increasingly want. Wearable sensors embedded in consumer devices have transformed access to sleep information, but two gaps persist. First, most consumer systems estimate sleep offline, after the night, whereas many of the most valuable applications (smart-alarm timing, closed-loop audio, just-in-time interventions) require staging in real time. Second, validation has overwhelmingly used wrist actigraphy against PSG, and non-EEG modalities alone struggle to resolve the full range of sleep stages.

Ear-EEG offers a practical reference standard that is itself wearable: in- and around-ear electrodes recover sleep-stage structure with agreement against PSG approaching expert inter-scorer reliability. This makes possible a study that would be impractical with PSG at scale: validating real-time, smartwatch-based staging against a comfortable neural reference across full nights at home or in a sleep-friendly setting. This protocol specifies such a study.

2. Objectives

Primary objective. To evaluate the feasibility of predicting sleep stages in real time from heart-rate, accelerometer, and microphone data recorded with a consumer smartwatch, by comparing watch-derived stage predictions against simultaneously recorded ear-EEG reference stages.

Secondary objective. To compare real-time staging performance across three sensor combinations: (i) heart rate + accelerometer; (ii) microphone; and (iii) heart rate + accelerometer + microphone.

3. Methods

3.1 Study design. Prospective, single-center, observational study. Each participant contributes one overnight recording session.

3.2 Participants. Adults aged ≥22 years with no known sleep disorders will be enrolled. Up to 18 participants will be enrolled to ensure a minimum of 16 evaluable participants, allowing for ~10% attrition or data loss. Inclusion/exclusion criteria, recruitment, and informed-consent procedures will be specified in the IRB-approved protocol.

3.3 Apparatus and data acquisition. Smartwatch signals will be acquired via standard mobile APIs: tri-axial accelerometer at 50–100 Hz and heart rate at ~1 Hz (CoreMotion), and stereo audio from the built-in microphones (AVFAudio). Simultaneously, an ear-EEG device will record overnight to provide the reference. All streams are timestamped to a common clock to permit realignment across their differing sampling rates. Four sleep stages will be derived: wake, light sleep, deep sleep, and REM.
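The realignment step described above can be illustrated with a minimal sketch, assuming each stream arrives as (timestamp, value) pairs on the shared clock. The function name `bin_into_epochs` and the synthetic streams are hypothetical and are not part of the protocol; the sketch only shows how streams with different sampling rates can be grouped into common 30-second epochs.

```python
from collections import defaultdict

EPOCH_SECONDS = 30.0  # standard sleep-scoring epoch length


def bin_into_epochs(samples, start_time, epoch_seconds=EPOCH_SECONDS):
    """Group (timestamp, value) samples into consecutive epochs.

    samples: iterable of (timestamp_in_seconds, value) on the shared clock.
    start_time: timestamp of the start of epoch 0 on the same clock.
    Returns a dict mapping epoch index -> list of values in that epoch.
    """
    epochs = defaultdict(list)
    for timestamp, value in samples:
        if timestamp < start_time:
            continue  # sample precedes the recording window
        index = int((timestamp - start_time) // epoch_seconds)
        epochs[index].append(value)
    return dict(epochs)


# Hypothetical 90-second streams: heart rate at 1 Hz, accelerometer at 50 Hz.
heart_rate = [(float(t), 60.0) for t in range(90)]
accel = [(i / 50.0, 1.0) for i in range(90 * 50)]

hr_epochs = bin_into_epochs(heart_rate, start_time=0.0)
acc_epochs = bin_into_epochs(accel, start_time=0.0)
print(len(hr_epochs[0]), len(acc_epochs[0]))  # prints: 30 1500
```

Because both streams are indexed by the same epoch number, features from each modality can be joined per epoch regardless of the original sampling rate.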

3.4 Real-time staging pipeline. A staging model will be trained offline on existing labeled data and then applied online to incoming smartwatch streams. Per-epoch features (band-limited and statistical features from accelerometer, heart rate, and audio) feed a classifier producing stage estimates at the standard 30-second epoch cadence. The same trained pipeline will be evaluated under each of the three sensor-combination arms to isolate the contribution of each modality. Real-time operation tolerates modest latency, so on-device or phone-side inference is feasible.
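As an illustration of the per-epoch statistical features mentioned above, the following sketch summarizes one epoch of heart-rate and accelerometer samples. The feature names and the choice of mean and standard deviation are assumptions made for illustration; the protocol does not specify the feature set, and band-limited and audio features are omitted here.

```python
import math
import statistics


def epoch_features(hr_values, accel_xyz):
    """Summarize one 30-second epoch as a small feature dictionary.

    hr_values: heart-rate samples (beats per minute) within the epoch.
    accel_xyz: (x, y, z) accelerometer samples (in g) within the epoch.
    Both inputs must be non-empty.
    """
    # Orientation-independent movement signal: magnitude of each sample.
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in accel_xyz]
    return {
        "hr_mean": statistics.fmean(hr_values),
        "hr_std": statistics.pstdev(hr_values),
        "acc_mag_mean": statistics.fmean(magnitudes),
        "acc_mag_std": statistics.pstdev(magnitudes),
    }


# Hypothetical values for a motionless epoch with a slowly varying heart rate.
features = epoch_features(
    hr_values=[58.0, 60.0, 62.0],
    accel_xyz=[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)],
)
print(features)
```

A feature dictionary of this kind, computed once per epoch, would be the input to whichever classifier is trained offline; restricting the keys to one modality is one way the sensor-combination arms could be evaluated.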

3.5 Reference standard. Ear-EEG recordings will be scored into the four stages above to serve as the per-epoch reference against which watch predictions are compared. Scoring procedure, scorer training, and any consensus rules will be pre-specified in the IRB-approved protocol; current AASM-aligned guidance will be followed where applicable.

3.6 Endpoints.

Primary endpoint. Epoch-by-epoch agreement between watch-predicted stages and ear-EEG reference stages, quantified by Cohen's κ. A κ greater than 0.61 is interpreted as substantial agreement.

Secondary endpoints. Per-arm κ (the three sensor combinations); per-stage sensitivity and specificity and confusion matrices; and agreement on derived sleep parameters (total sleep time, sleep-onset latency, efficiency, wake after sleep onset) assessed with Bland–Altman analysis.
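For readers unfamiliar with the primary metric, Cohen's κ corrects the observed epoch-by-epoch agreement for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below computes it for two short stage sequences; the stage labels and the example sequences are hypothetical and are not study data.

```python
from collections import Counter


def cohens_kappa(predicted, reference):
    """Cohen's kappa for two equal-length sequences of stage labels."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(predicted)
    # Observed agreement: fraction of epochs with identical labels.
    observed = sum(p == r for p, r in zip(predicted, reference)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    pred_counts = Counter(predicted)
    ref_counts = Counter(reference)
    expected = sum(pred_counts[k] * ref_counts[k] for k in pred_counts) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one label occurs
    return (observed - expected) / (1.0 - expected)


# Hypothetical 8-epoch example: W = wake, L = light, D = deep, R = REM.
reference = ["W", "L", "L", "D", "R", "L", "W", "R"]
predicted = ["W", "L", "D", "D", "R", "L", "L", "R"]
kappa = cohens_kappa(predicted, reference)
print(round(kappa, 4))  # prints: 0.6596
```

In this toy example 6 of 8 epochs agree (p_o = 0.75) and chance agreement is 17/64, giving κ = 31/47 ≈ 0.66.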

4. Statistical analysis plan

The primary analysis is a one-sided non-inferiority test evaluating whether the Cohen's κ for watch-based staging is non-inferior to an acceptance criterion of 0.61, at a significance level of α = 0.05 and statistical power of 0.80. Secondary analyses compare κ across the three sensor arms and summarize per-stage performance; derived sleep parameters are compared with Bland–Altman limits of agreement. Missing or unscorable epochs will be handled by a pre-specified rule and reported transparently.
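The Bland–Altman comparison of derived sleep parameters can be sketched as follows, assuming the conventional 95% limits of agreement (bias ± 1.96 × the standard deviation of the paired differences). The function name and the total-sleep-time values are hypothetical and are used only to show the calculation.

```python
import statistics


def bland_altman(device_values, reference_values):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(device_values) != len(reference_values) or len(device_values) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [d - r for d, r in zip(device_values, reference_values)]
    bias = statistics.fmean(diffs)  # mean difference (device minus reference)
    sd = statistics.stdev(diffs)    # sample standard deviation of differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical total sleep time (minutes) for four participants.
watch_tst = [420.0, 400.0, 455.0, 390.0]
ear_eeg_tst = [410.0, 405.0, 445.0, 400.0]
bias, lower, upper = bland_altman(watch_tst, ear_eeg_tst)
print(bias, round(lower, 2), round(upper, 2))
```

The bias indicates systematic over- or under-estimation by the watch, while the width of the limits indicates how far an individual night's estimate may deviate from the reference.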

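Sample-size justification. The sample size derives from a power analysis for a one-sided non-inferiority test, N = (Zα + Zβ)² · σ² / d², with Zα = 1.96, Zβ = 0.84 (power = 0.80), and criterion κ = 0.61. Note that 1.96 is the critical value for a two-sided α = 0.05 (one-sided α = 0.025); a one-sided test at α = 0.05 would use Zα = 1.645, so the values below are conservative. Required N across plausible standard deviations (σ) and effect sizes (d), where d is the margin by which the expected κ exceeds the criterion: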

σ	Effect size d (expected κ)	Required N
0.05	0.10 (0.71)	2
0.05	0.05 (0.66)	8
0.05	0.02 (0.63)	49
0.07	0.10 (0.71)	4
0.07	0.05 (0.66)	16
0.07	0.02 (0.63)	96

Assuming σ = 0.07 and d = 0.05 with a 10% dropout allowance, 18 participants will be enrolled to achieve a minimum of 16 evaluable subjects. The final number may be refined from pilot data.
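The sample-size arithmetic can be reproduced with a short sketch that applies the stated formula, N = (Zα + Zβ)² · σ² / d², with the z-values given in the text and rounding up to a whole participant. The function names are hypothetical. With strict rounding up, the σ = 0.07, d = 0.02 case comes out to 97 rather than the tabulated 96, because the unrounded value is 96.04.

```python
import math


def required_n(sigma, d, z_alpha=1.96, z_beta=0.84):
    """Sample size from N = (z_alpha + z_beta)^2 * sigma^2 / d^2, rounded up."""
    raw = (z_alpha + z_beta) ** 2 * sigma ** 2 / d ** 2
    # Round first so floating-point noise (e.g. 49.00000000000001) does not
    # push an exact result up to the next integer.
    return math.ceil(round(raw, 9))


def enrollment_target(evaluable, dropout):
    """Participants to enroll so that `evaluable` remain after dropout."""
    return math.ceil(round(evaluable / (1.0 - dropout), 9))


for sigma in (0.05, 0.07):
    for d in (0.10, 0.05, 0.02):
        print(sigma, d, required_n(sigma, d))

n_evaluable = required_n(0.07, 0.05)
n_enrolled = enrollment_target(n_evaluable, dropout=0.10)
print(n_evaluable, n_enrolled)  # prints: 16 18
```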

5. Data management

All signals are timestamped to a common clock and stored with provenance. Smartwatch and ear-EEG streams, derived features, model predictions, and reference scores will be retained to permit re-analysis. Data handling, de-identification, retention, and security will follow the IRB-approved data-management plan; given the sensitivity of physiological data, on-device processing and data minimization are preferred where feasible.

6. Ethics, status, and timeline

This study requires Institutional Review Board (IRB) approval prior to any data collection; informed consent will be obtained from all participants. The indicative timeline once IRB approval is secured: device/app readiness and pilot (1–2 participants), enrollment of up to 18 participants (one overnight session each), scoring and analysis, and manuscript preparation. This protocol may be deposited as a preprint and/or pre-registered prior to data collection.
