This document describes the principles, processes, and best practices used in the development of the Hyfe cough detection algorithm.
The Hyfe cough detection algorithm is an on-device machine learning system that runs on a wrist-worn smartwatch. Its purpose is to detect cough events from a stream of real-time, continuous audio, and generate corresponding timestamps. The algorithm is optimized for:
The system continuously acquires audio from an onboard microphone and processes it through a two-step algorithm. First, a lightweight feature-extraction pipeline identifies onset/explosive events; second, an ML classifier categorizes each event as cough or non-cough. When a cough signature is detected, the system stores a timestamp locally.
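The two-step flow described above can be sketched as follows. This is a minimal illustration, not Hyfe's implementation: the RMS-energy onset rule, the `ratio` and `floor` parameters, the frame representation, and the `classify` callback are all assumptions introduced for the example.

```python
import math
from typing import Callable, Iterable, List, Sequence

Frame = Sequence[float]  # one short block of audio samples (hypothetical format)


def rms_energy(frame: Frame) -> float:
    """Root-mean-square energy of a frame; 0.0 for an empty frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def is_onset(prev_energy: float, energy: float,
             ratio: float = 4.0, floor: float = 0.01) -> bool:
    """Step 1 (assumed rule): flag a sudden jump in energy as a candidate event."""
    return energy > floor and energy > ratio * max(prev_energy, 1e-9)


def detect_coughs(frames: Iterable[Frame], frame_seconds: float,
                  classify: Callable[[Frame], bool]) -> List[float]:
    """Return timestamps (seconds) of frames that pass both steps."""
    timestamps: List[float] = []
    prev_energy = 0.0
    for index, frame in enumerate(frames):
        energy = rms_energy(frame)
        # Step 2 runs only on candidates, so the classifier stays idle
        # for most of the audio stream.
        if is_onset(prev_energy, energy) and classify(frame):
            timestamps.append(index * frame_seconds)
        prev_energy = energy
    return timestamps
```

Gating the classifier behind a cheap onset check is one common way to keep the average compute cost low on a constrained device; the actual features and classifier used are not specified here.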
Although the algorithm is not a medical device, its development process followed well-established machine learning and medical device software practices:
Algorithm development requires a large collection of raw acoustic data representing both cough and non-cough events across a variety of environmental and physiological contexts. The training/development dataset consists of continuous audio recordings collected from individuals wearing or using an audio capture device (the ID206 device itself, phones, or third-party wrist-worn audio recorders) in both real-world and controlled settings. Data sources included:
Strict controls governed data provenance:
High-quality labels are critical, as the algorithm’s performance depends directly on the accuracy and consistency of ground truth. Accordingly, the generation of labels is governed by a rigorous multi-step process:
The dataset used for algorithm training was partitioned following best practices intended to limit overfitting and support generalization:
The classifier training dataset consisted of over 500,000 snippets derived from the labeling process. Ambiguous or low-confidence labels (i.e., those labeled as “far” or “not sure”) were excluded entirely from training, reducing the risk of label noise affecting model performance. They were retained in the test data, however, and accuracy statistics were computed both with and without them.
The “holdout” dataset consists of data with the following characteristics and provenance:
The model architecture was optimized for:
Hyperparameters were tuned through systematic experimentation, always with reference to the validation set rather than the test set.
The algorithm was built specifically for deployment on a constrained wearable device. This meant that strict conditions had to be met in the following areas:
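The discipline of tuning against the validation set only can be sketched as below. A single decision threshold stands in for the hyperparameters, which are not enumerated in this document; the function names and the `(score, is_cough)` pair format are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Scored = Tuple[float, bool]  # (classifier score, reference is_cough label)


def accuracy_at(threshold: float, scored: Sequence[Scored]) -> float:
    """Accuracy when scores at or above the threshold are called coughs."""
    correct = sum(1 for score, is_cough in scored
                  if (score >= threshold) == is_cough)
    return correct / len(scored)


def select_threshold(candidates: List[float],
                     validation: Sequence[Scored]) -> float:
    """Pick the candidate with the best validation accuracy.

    The test set is deliberately not a parameter of this function, so it
    cannot influence the choice.
    """
    return max(candidates, key=lambda t: accuracy_at(t, validation))


def final_test_accuracy(threshold: float, test: Sequence[Scored]) -> float:
    """Evaluate once on the test set, after the threshold is frozen."""
    return accuracy_at(threshold, test)
```

Keeping the test data out of the selection function's signature is a simple structural guard: any reported test figure then reflects a setting that was fixed beforehand.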
The cough detection algorithm was developed using a strict privacy-by-design approach in which the device never stores, transmits, or makes accessible any raw audio. All sound captured by the microphone is processed immediately through a lightweight feature-extraction pipeline, and only the resulting cough/non-cough decision and timestamp are retained. No audio files, waveforms, or spectral representations are written to persistent storage, and no audio leaves the device at any stage. This eliminates the possibility of reconstructing speech, background conversations, or other sensitive sounds, protecting users from inadvertent capture of personally identifiable information. By architecting the system so that the algorithm’s inputs exist only ephemerally and the outputs contain no acoustic content, Hyfe ensures that continuous monitoring can occur safely in intimate, everyday environments without compromising user privacy.
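One way to picture the ephemeral handling of audio is the sketch below. It is an assumption-laden illustration rather than the device firmware: the `CoughEvent` record, the `process_and_discard` function, and the `is_cough` callback are hypothetical names, and the in-place zeroing of the buffer is just one possible way to ensure samples do not outlive the decision.

```python
from dataclasses import dataclass, fields
from typing import Callable, List, Optional


@dataclass(frozen=True)
class CoughEvent:
    """The only record retained: a timestamp, with no acoustic content."""
    timestamp: float  # seconds


def process_and_discard(buffer: List[float], timestamp: float,
                        is_cough: Callable[[List[float]], bool]
                        ) -> Optional[CoughEvent]:
    """Classify the buffered audio, then wipe the samples in place."""
    try:
        detected = is_cough(buffer)
    finally:
        # Overwrite the samples whether or not classification succeeded,
        # so no audio remains in memory after this call.
        for i in range(len(buffer)):
            buffer[i] = 0.0
    return CoughEvent(timestamp) if detected else None
```

Because the output type has no field capable of holding samples or spectra, anything written to storage downstream of this function is limited to timestamps by construction.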
In the algorithm’s output:
Real-world audio contains substantial variability. To address this, the algorithm incorporates:
Training data was intentionally curated to include both challenging and typical environments, enabling stronger generalization.
The development of the Hyfe cough detection algorithm followed widely recognized industry best practices for machine learning systems specifically, and software development more generally. These practices emphasize reproducibility, transparency of development processes, data governance, and robustness in real-world deployment. Hyfe adhered to these principles in the following ways:
Collectively, these elements demonstrate alignment with modern development frameworks such as human-centered design principles, edge-AI safety guidelines, and emerging regulatory expectations for trustworthy and transparent machine-learning systems.