On the morning of March 31st, my Apple Watch Ultra notified me that I may have sleep apnea and should speak with a doctor. It wasn't the first time. The notification is generated by watchOS's FDA-cleared sleep apnea detection feature, which uses the Watch's accelerometer to identify breathing irregularities during sleep. Apple earned that clearance. The feature works.
What the notification didn't tell me was what the preceding three nights looked like in the data the Watch had already collected. My blood oxygen averaged 91.4%, 92.0%, and 92.1% across March 29, 30, and 31, the worst three-night cluster in 95 days of continuous tracking. My respiratory rate hit 28.2 breaths per minute on March 30, the second-highest reading in that entire dataset. My heart rate variability collapsed to 22 milliseconds, roughly half what it should be for someone my age. The Watch saw all of it. It connected enough dots to file a clinical-grade alarm. And then, almost certainly, it gave me a Sleep Score that didn't reflect any of it.
I have a formal diagnosis of Central Sleep Apnea. I wear both an Apple Watch Ultra and an Oura Ring. I've spent more than two years collecting data from both, and what that data shows isn't just a gap between two consumer devices. It's a contradiction sitting inside a single piece of hardware, between two systems that have reached opposite conclusions about the same person's health on the same night.
Apple built a Sleep Score optimized to make you feel good about your sleep. That's a design choice. For most people, it's probably fine. For anyone with a diagnosed sleep disorder, it can quietly work against the clinical system Apple built right alongside it.
Two Systems, One Device
Apple's sleep apnea detection feature arrived with watchOS 11, available on Apple Watch Series 9, Series 10, Ultra 2, and Ultra 3. It uses the accelerometer to detect wrist movements associated with breathing disturbances during sleep, a method Apple validated through clinical trials and submitted to the FDA for de novo clearance as a medical device feature. When the algorithm crosses its confidence threshold over a 30-day observation window, it surfaces a notification telling you to see a doctor.
That notification is serious. Apple didn't build it to engage you with the Health app. It built it because sleep apnea, left undetected, carries real cardiovascular risk. The FDA clearance exists precisely because the stakes justify regulatory oversight.
The Sleep Score is a different creature entirely. Introduced alongside the sleep apnea feature in watchOS 11, it distills a night of sleep into a single number between 0 and 100. Apple weights total sleep duration, efficiency, time in each stage, heart rate, and respiratory rate. The goal is clarity. Sleep is complicated, and a single score is easier to act on than a wall of metrics.
The problem is what that simplification costs when the sleeper has a diagnosed breathing disorder.
What Two Years of Data Actually Shows
Over 643 nights tracked by the Oura Ring between April 2024 and May 2026, and 95 consecutive nights tracked by Apple Watch from February through May 2026, a picture emerges that no press release walkthrough of either product's features would prepare you for.
The Oura data alone contains a story worth sitting with. My average sleep score across those 643 nights was 72.2. On 33% of all nights, the score fell below 70. On 69 nights it fell below 60. The lowest single score was 28, recorded on October 8, 2025, the kind of number that in any other context would prompt a follow-up conversation with a clinician.
But the number that tells the real story isn't the sleep score. It's the Breathing Disturbance Index: the per-hour count of respiratory irregularities Oura tracks throughout each night. Oura's own documentation flags a BDI above 20 as a potential indicator of sleep-disordered breathing. My average across the full two-year dataset was 20.2. That average is itself sitting on the threshold. The distribution underneath it is what matters.
The Escalation Nobody Scored
From April 2024 through April 2025, my BDI averaged 16.5 disturbances per hour. Elevated for a healthy adult, unsurprising for someone with CSA, but relatively stable. Then something shifted.
From May through November 2025, my monthly average BDI never dropped below 28.8. For seven straight months, breathing disturbances averaged roughly 30 per hour. In September 2025, a single night hit a BDI of 68. Twenty-one nights across that stretch exceeded 40. On 67% of the nights in that seven-month window, the BDI crossed Oura's own warning threshold of 20.
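For anyone auditing their own exports, the monthly roll-up described above is a few lines of Python. The readings below are synthetic stand-ins for illustration, not the actual dataset:

```python
from statistics import mean

# Illustrative nightly BDI readings keyed by month (synthetic values,
# not the real export described in this article).
nights = {
    "2025-05": [22, 31, 35, 41, 28],
    "2025-06": [30, 29, 44, 33, 26],
}

WARNING_THRESHOLD = 20  # Oura's documented BDI warning level

for month, readings in sorted(nights.items()):
    monthly_avg = mean(readings)
    over = sum(1 for r in readings if r > WARNING_THRESHOLD)
    print(f"{month}: avg BDI {monthly_avg:.1f}, "
          f"{over}/{len(readings)} nights over {WARNING_THRESHOLD}")
```

The same loop over a real two-year export is what surfaces the escalation: the monthly averages climb, and the over-threshold count saturates.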
This was a documented clinical deterioration. Not a bad week. Seven months of worsening sleep-disordered breathing, visible in the data, accumulating night after night.
Then in December 2025, the BDI collapsed. From 31.2 in November to 13.2 in December to 8.7 in January 2026. No deliberate intervention that I can identify. Central apnea fluctuates with stress load, cardiovascular changes, sleep position, and factors that often don't announce themselves. Whatever drove the escalation apparently resolved on its own.
What Oura's sleep score did during all of this is instructive. During the escalation period, the average score was 72.6. During the recovery period, it was 74.0. A difference of 1.4 points across a clinical arc that saw BDI drop by more than 20. The score did not track the deterioration. It did not track the recovery. It produced essentially the same number throughout a two-year period in which my breathing during sleep went from manageable to severely disrupted and back again.
To be clear: Oura isn't completely blind to the problem. My worst nights during the escalation did tend to score lower. The correlation exists. It's just weak: a correlation coefficient of -0.033 between sleep score and BDI across the full dataset, meaning BDI barely moves the needle. On 30% of the 91 nights when my BDI exceeded 30, Oura still scored my sleep above 75. On 14 of those nights, above 80. Neither platform is giving a fully honest accounting. The difference is that Oura doesn't also have an FDA-cleared clinical alarm sitting in the same app, and it doesn't advertise that alarm while simultaneously smoothing over the signals that drive it.
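The weakness of that relationship is easy to check against your own export. A minimal Pearson correlation, with invented score/BDI pairs standing in for real nights:

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    # Pearson correlation coefficient over paired samples.
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

# Synthetic score/BDI pairs for illustration; the real dataset's r was -0.033.
scores = [72, 80, 68, 75, 71, 77, 74, 69]
bdi    = [30, 35, 12, 28, 40, 15, 33, 22]
print(f"r = {pearson(scores, bdi):+.3f}")
```

An r near zero means a scatter plot of score against BDI is a cloud: nights with severe breathing disturbance land across nearly the full range of scores.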
The Night the Watch Finally Said Something
The March 31st notification didn't arrive in a vacuum. It arrived during one of the worst physiological stretches in 95 days of Apple Watch data. Three consecutive nights with SpO2 averages between 91.4% and 92.1%. Respiratory rate peaking at 28.2 breaths per minute, a number more consistent with moderate physical exertion than sleep. HRV floored at 22 milliseconds across all three nights. Every metric the Watch tracks pointed in the same direction for 72 consecutive hours.
The morning after the notification, my Oura ring scored that night a 47. Readiness: 56. Oura was unambiguous: something was wrong, the body hadn't recovered, the day should be adjusted. The Watch had fired its clinical alarm the morning before. And yet neither system has a mechanism to connect those events in a way visible to the user. The notification happened. The low score happened. The Sleep Score, whatever it showed, sat beside both of them, doing its own calculation.
A score of 47 from Oura isn't a yellow flag. It's a system telling you plainly that last night was bad. Apple's Sleep Score value for the same window is something I can't confirm, because Apple doesn't write Sleep Score values back to HealthKit as queryable data. The score lives inside the Sleep app and doesn't persist in a format that allows historical analysis. That's a design choice worth naming: the metric Apple puts most prominently in front of users is the one it makes hardest to audit over time.
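What you can audit is whatever Apple does export. A minimal sketch, assuming the Record layout of a standard Apple Health export.xml (worth verifying against your own export): respiratory rate and SpO2 come out as queryable records, while no Sleep Score record type exists to query at all.

```python
import xml.etree.ElementTree as ET

# Imitation of the Record elements in an Apple Health export.xml; the type
# strings follow HealthKit's public identifiers, but treat the exact export
# layout as an assumption and check it against a real export.
sample = """<HealthData>
  <Record type="HKQuantityTypeIdentifierRespiratoryRate"
          unit="count/min" value="28.2" startDate="2026-03-30 02:14:00 -0500"/>
  <Record type="HKQuantityTypeIdentifierOxygenSaturation"
          unit="%" value="92" startDate="2026-03-30 02:14:00 -0500"/>
</HealthData>"""

root = ET.fromstring(sample)
resp = [float(r.get("value"))
        for r in root.iter("Record")
        if r.get("type") == "HKQuantityTypeIdentifierRespiratoryRate"]
print(resp)  # → [28.2]  (no equivalent record type exists for Sleep Score)
```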
What the Sensor Gap Adds
Comparing Oura and Apple Watch SpO2 on the 34 nights where both devices recorded blood oxygen produces a consistent gap. Oura averaged 95.4% on those nights. Apple averaged 93.9%. A systematic 1.5-percentage-point difference, consistently in the same direction.
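The paired comparison is itself simple arithmetic. With synthetic values standing in for the 34 overlapping nights:

```python
from statistics import mean

# Illustrative paired nightly SpO2 averages (synthetic, not the real readings).
oura  = [95.1, 95.8, 95.2, 95.6, 95.3]
watch = [93.6, 94.4, 93.5, 94.2, 93.8]

diffs = [o - w for o, w in zip(oura, watch)]
print(f"mean paired gap: {mean(diffs):.2f} points")  # → mean paired gap: 1.50 points
print(all(d > 0 for d in diffs))                     # → True (same sign every night)
```

The sign consistency is the telling part: a gap that flips direction night to night would look like noise, while a gap that always points the same way looks like sensor bias.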
Finger-based optical sensors like Oura's are generally considered more accurate than wrist-based PPG for blood oxygen measurement. The wrist has lower capillary density, and the Watch's sensor has to contend with movement artifacts and skin contact variability in ways a ring does not. Apple has acknowledged wrist PPG limitations in its own device documentation.
If Apple's sensor reads 1.5 points lower than a more accurate reference device, the SpO2 data feeding into the Sleep Score algorithm is already starting from a depressed baseline. The Watch isn't scoring against what your oxygen saturation actually was. It's scoring against what its wrist sensor estimated, and that estimate runs consistently low. For a healthy sleeper, a 1.5-point gap at the 96-97% range doesn't change much. At 93-94%, where I sit chronically, it matters.
The Design Philosophy Problem
None of this means Apple's health engineering is careless. The sleep apnea detection feature is genuinely impressive work, and FDA de novo clearance for a consumer wearable is not a trivial achievement. Apple has invested seriously in turning the Watch into a clinical instrument for specific, high-stakes conditions.
The problem is the layer sitting on top of that work. The Sleep Score isn't a clinical instrument. It's a consumer engagement feature, designed with consumer psychology in mind. Scores that consistently land in the 60s cause users to disengage. Scores in the 80s keep them opening the app. Apple has every structural incentive to weight the algorithm toward the high end, and the input that would most reliably drag it down for someone with CSA, the breathing disturbance data the Watch is already collecting, is precisely the input that doesn't appear to move the needle.
That's the contradiction Apple hasn't resolved. They built a feature that says you may be seriously ill. They built it on the same hardware as a feature that says you slept great. Both outputs exist. Only one is designed to keep you engaged with the product. For someone without a prior diagnosis, that dynamic isn't neutral. They receive the notification, feel the appropriate alarm, schedule a sleep study. And then every morning while they wait for that appointment, their Watch hands them a reassuring number that quietly tells them most nights are actually fine.
The score doesn't intend to undermine the notification. It just does.
What Better Would Look Like
Across 643 nights of Oura data, the BDI moved dramatically. My condition visibly worsened for seven months, then recovered. The sleep score barely registered either event. A scoring system that genuinely incorporated breathing disturbance data would have tracked that arc. It would have been lower during the escalation. It would have recovered when the BDI recovered. It would have given me information instead of reassurance.
Apple already has the data. The Watch tracks respiratory rate every night. The sleep apnea detection feature processes movement patterns associated with breathing irregularities. The raw ingredients for a more honest score exist inside the device I already wear.
Several changes would close the gap meaningfully. Surfacing the breathing disturbance count as a visible nightly metric the way both platforms surface time in each sleep stage would be a start. Allowing Sleep Score to be queried historically through HealthKit would let users actually audit the relationship between their physiology and their scores over time. And when a user has received a sleep apnea notification within the previous 30 days, the Sleep Score algorithm should weight respiratory metrics differently. The Watch already knows the clinical context. It is choosing not to use it in the output users see most.
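That last change can be sketched concretely. Everything below is invented for illustration, not Apple's or Oura's actual scoring logic: a context-aware adjustment that penalizes a nightly score for breathing disturbances, more aggressively when a sleep apnea notification has fired within the last 30 days.

```python
def adjusted_sleep_score(base_score: float,
                         bdi: float,
                         apnea_alert_recent: bool) -> float:
    """Hypothetical adjustment: subtract a penalty proportional to breathing
    disturbances above a warning threshold, weighted up when the clinical
    apnea alert is recent. Thresholds and weights are illustrative only."""
    THRESHOLD = 20  # Oura's documented BDI warning level
    weight = 1.5 if apnea_alert_recent else 0.5
    penalty = max(0.0, bdi - THRESHOLD) * weight
    return max(0.0, base_score - penalty)

print(adjusted_sleep_score(82, 45, apnea_alert_recent=True))   # → 44.5
print(adjusted_sleep_score(82, 45, apnea_alert_recent=False))  # → 69.5
```

Under a scheme like this, a night with a BDI of 45 could no longer coexist with a score in the 80s, and the score would have tracked both the seven-month escalation and the recovery that followed it.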
The Apple Watch Ultra I wear has sensors precise enough to detect a respiratory event, run it through an FDA-cleared algorithm, and decide whether to file a medical-grade alert. That capability is real. The score that appears alongside it should reflect the same seriousness. Until it does, the Sleep Score isn't just incomplete for someone with a diagnosed sleep disorder. It's working in direct opposition to the feature Apple is most proud of building.