Contactless Breath Monitoring and Exercise (COPD)

This research group focuses on advancing contactless breath monitoring and exercise assessment for chronic respiratory conditions such as chronic obstructive pulmonary disease (COPD). Their work leverages ubiquitous sensing technologies, including smartphone microphones, smart speakers, earphones, and depth cameras, to create accessible, non-intrusive health tools. These innovations support a range of functions, from detecting acute exacerbations by analyzing cough and breathing sounds to guiding airway clearance therapy by localizing sputum. The goal is to move critical pulmonary rehabilitation and diagnostic capabilities out of the clinic and into low-resource community and home settings, thereby improving health equity and patient self-management.
Specific systems developed include EarSpiro and EasySpiro, which use earphones to perform accurate lung function tests without requiring maximal patient effort. Another system, DeepBreath, employs a depth camera to assess breathing exercises by simultaneously estimating lung volume and breathing mode, even with body movement. Further innovations include BreathMentor, which uses a smart speaker to monitor diaphragmatic breathing, and a novel method that uses audible music instead of ultrasound for respiration tracking. Collectively, these AI-driven solutions demonstrate high accuracy and robustness, offering a comprehensive, cost-effective framework for modern respiratory care.
Related publications:
[npj Digital Medicine, 2025] Y. Gong, C. Xu, C. Mo, J. Wu, H. Lin, H. Su, Q. Zhang, Q. Zhang, S. Yang. “AI-Driven Smartphone Screening for Acute COPD Exacerbations: A Non-Self-Report Approach to Improve Health Equity in Developing Regions”
Introduction: This study developed an AI-driven smartphone system for detecting acute exacerbations of chronic obstructive pulmonary disease (AECOPD), particularly targeting resource-limited regions. Using standard smartphone microphones to analyze breathing and coughing sounds, the system operates without requiring subjective patient-reported symptoms, achieving an area under the curve (AUC) of 0.955. Additionally, a health-economic model projected per capita savings of approximately 456.9 CNY, supporting the cost-effectiveness of implementing this technology in underserved areas. The research highlights the potential of AI solutions to enhance COPD management and promote health equity.
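The reported AUC of 0.955 summarizes how well the classifier's scores separate exacerbation recordings from stable ones. As a minimal, self-contained sketch (synthetic scores, not the paper's model or data), AUC can be computed directly from scores and labels via the Mann-Whitney rank formulation:

```python
import numpy as np

def auc_score(labels, scores):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive outranks a randomly chosen negative."""
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    # Compare every positive score against every negative score.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Synthetic example: exacerbation recordings score higher on average.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(0.8, 0.1, 50),   # AECOPD
                         rng.normal(0.3, 0.1, 50)])  # stable
labels = np.array([1] * 50 + [0] * 50)
print(round(auc_score(labels, scores), 3))
```

An AUC of 1.0 means the scores rank every positive above every negative; 0.5 is chance level.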
[ACM IMWUT 2025] Y. Gong, W. Xie, C. Xu, Q. Zhang, S. Yang. “SputumLocator: Enhancing Airway Clearance with Auscultation-based Sputum Localization”
Introduction: This paper introduces SputumLocator, a novel system that uses a digital stethoscope to accurately localize accumulated sputum in patients with Muco-Obstructive Lung Diseases (MOLDs). The system streamlines airway clearance by targeting percussion therapy only to affected areas, overcoming the limitations of costly imaging or expert-dependent auscultation. It employs a two-component model: SputumEmbedder extracts key sound features, and SputumClassifier maps them to specific back quadrants. Evaluated on 43 diverse patients, SputumLocator demonstrated high robustness and accuracy, achieving a sensitivity of 0.97, specificity of 0.82, and an F1-Score of 0.83, making it highly suitable for community and home care settings.
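The reported figures follow from standard confusion-matrix definitions over per-quadrant decisions. The sketch below uses illustrative counts (chosen to roughly reproduce the reported numbers, not taken from the paper) to show how sensitivity, specificity, and F1 relate:

```python
def detection_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and F1 from binary confusion counts."""
    sensitivity = tp / (tp + fn)   # recall on sputum-positive quadrants
    specificity = tn / (tn + fp)   # recall on sputum-free quadrants
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f1

# Hypothetical quadrant-level counts (illustrative only).
sens, spec, f1 = detection_metrics(tp=29, fp=11, fn=1, tn=50)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} F1={f1:.2f}")
```

Note that with imbalanced classes (sputum-positive quadrants are the minority), F1 can sit well below sensitivity even when both sensitivity and specificity look strong.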
[ACM MobiCOM 2025] C. Xu, W. Xie, B. Yang, Y. Zhang, Y. Gong, J. Zhang, W. Li, S. Yang, Q. Zhang. “EasySpiro: Assessing Lung Function via Arbitrary Exhalations on Commodity Earphones”
Introduction: This study introduces EasySpiro, a novel system that enables pulmonary function testing (PFT) using non-maximal exhalations, overcoming the challenge of requiring maximal efforts in unsupervised, at-home settings. The key innovation is a reconstruction model that predicts ideal maximal breathing patterns from submaximal ones, guided by body dynamics encoded via self-supervised learning. Implemented on instrumented earphones, the system was evaluated on 50 hospital patients and accurately predicted standard PFT indicators with a 7% error rate. The collected dataset has been open-sourced to advance further research in cost-effective, home-based respiratory monitoring.
[IEEE BSN, 2024] Y. Gong, W. Xie, Q. Zhang, and S. Yang, “Hypergradient Descent Based Multi-Task Learning on Auscultation Point Guided Respiratory Sound Classification”
Introduction: This study addresses limitations in automated respiratory sound classification by proposing a multi-task learning (MTL) model. The approach simultaneously learns to identify respiratory sound types and their corresponding auscultation points, overcoming the challenge where normal sounds vary by location and can be confused with pathologies. To enhance robustness, hyper-gradient descent is used to balance the tasks’ weights. The method achieves a state-of-the-art score of 62.98% on the ICBHI dataset, surpassing the baseline by 3.43%. This work demonstrates significant potential to improve diagnostic accuracy and can be effectively integrated with data augmentation techniques for broader clinical application.
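Hypergradient descent treats the task weights themselves as parameters, nudging them along the gradient of a loss evaluated after one simulated update step. A toy numpy sketch under strong simplifying assumptions (two quadratic "tasks" in place of the paper's networks, and an unweighted sum as the validation loss):

```python
import numpy as np

def grad_task1(theta):
    """Gradient of task-1 loss (theta0 - 1)^2."""
    g = np.zeros(2)
    g[0] = 2.0 * (theta[0] - 1.0)
    return g

def grad_task2(theta):
    """Gradient of the harder task-2 loss 5*(theta1 - 2)^2."""
    g = np.zeros(2)
    g[1] = 10.0 * (theta[1] - 2.0)
    return g

theta = np.zeros(2)            # shared parameters
w = np.array([0.5, 0.5])       # task weights, kept on the simplex
lr, hlr = 0.05, 0.01           # parameter / hypergradient step sizes

for _ in range(500):
    g1, g2 = grad_task1(theta), grad_task2(theta)
    theta_next = theta - lr * (w[0] * g1 + w[1] * g2)
    # Hypergradient of the unweighted "validation" loss after one step:
    # d(theta')/d(w_i) = -lr * g_i, so dL/d(w_i) = grad(theta') . (-lr * g_i).
    gv = grad_task1(theta_next) + grad_task2(theta_next)
    hg = np.array([-lr * (gv @ g1), -lr * (gv @ g2)])
    w = np.clip(w - hlr * hg, 0.05, None)
    w /= w.sum()               # renormalize to a convex combination
    theta = theta_next

print(theta.round(3), w.round(3))
```

The weights shift toward whichever task currently contributes the larger hypergradient, rather than being hand-tuned, which is the balancing idea the paper exploits across its two classification tasks.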
[ACM IMWUT 2024] W. Xie, C. Xu, Y. Gong, Y. Wang, Y. Liu, J. Zhang, Q. Zhang, Z. Zheng, and S. Yang, “DeepBreath: Breathing Exercise Assessment with a Depth Camera”
Introduction: DeepBreath is a novel depth camera-based system that overcomes key limitations in existing contactless breathing exercise assessments for COPD patients. It simultaneously estimates breathing mode (chest/belly) and lung volume using a correlated multitask learning framework, which enhances classification accuracy. The system is calibration-free, employing a data-driven, lightweight UNet-based model for universal lung volume estimation. It is also resilient to involuntary body motions through a temporal-aware compensation algorithm. Validated with 36 subjects (healthy and COPD patients), DeepBreath achieves high accuracy in realistic conditions, surpassing previous methods that relied on restrictive assumptions like distinguishable patterns and motionless setups.
[ACM IMWUT 2022] W. Xie, Q. Hu, J. Zhang, Q. Zhang, “EarSpiro: Earphone-based Spirometry for Lung Function Assessment”, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 6 (4), 2022.
Introduction: This paper introduces EarSpiro, an innovative earphone-based system designed to perform spirometry by interpreting airflow sounds into a comprehensive flow-volume (F-V) curve. Unlike previous mobile solutions, it captures both expiratory and inspiratory measurements, which are crucial for diagnosing certain lung diseases. The system employs a CNN-RNN model to correlate sound with airflow speed and a clustering algorithm to detect weak inspiratory signals. Using transfer learning, it can adapt to common objects like funnels. Testing with 60 subjects demonstrated high accuracy, with a mean correlation of 0.94 to true F-V curves and a 7.3% average error for key lung function indices.
[ACM IMWUT 2022] Y. Gong, and Q. Zhang, “BreathMentor: Acoustic-based Diaphragmatic Breathing Monitor System”
Introduction: BreathMentor is an unobtrusive, privacy-preserving system that uses a smart speaker as an active sonar to monitor diaphragmatic breathing for COPD patients at home. It formulates breathing monitoring as a Temporal Action Localization task, leveraging breathing periodicity and a hybrid signal processing-deep learning architecture. The system detects respiration rate, derives breathing phases, and classifies breathing type. Evaluated on healthy subjects, it demonstrated robust performance with a median respiration rate error of 0.2 BPM, high accuracy for phase detection, and over 95% precision in identifying diaphragmatic breathing, enabling effective tracking of patient adherence and recovery progress.
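Since the system leverages breathing periodicity, its most basic building block — estimating respiration rate as the dominant spectral peak of a chest-motion signal — can be sketched in a few lines (synthetic signal and hypothetical frame rate, not the paper's sonar pipeline):

```python
import numpy as np

def respiration_rate_bpm(signal, fs, band=(0.1, 0.7)):
    """Estimate respiration rate as the dominant spectral peak, restricted
    to the physiological band (here 6-42 breaths per minute)."""
    sig = signal - signal.mean()
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / fs)
    mag = np.abs(np.fft.rfft(sig))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    peak = freqs[mask][np.argmax(mag[mask])]
    return 60.0 * peak

fs, duration = 20.0, 60.0                    # assumed 20 Hz frames, 60 s window
t = np.arange(0, duration, 1.0 / fs)
chest = 0.5 * np.sin(2 * np.pi * 0.25 * t)   # 0.25 Hz -> 15 breaths/min
chest += 0.05 * np.random.default_rng(1).standard_normal(t.size)
print(respiration_rate_bpm(chest, fs))       # ~= 15 BPM
```

A 60 s window gives 1 BPM spectral resolution, consistent with sub-BPM errors once the peak is interpolated or tracked over time.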
[IoTJ 2020] W. Xie, R. Tian, J. Zhang, Q. Zhang, “Noncontact Respiration Detection Leveraging Music and Broadcast Signals”, IEEE Internet of Things Journal, 8 (4), 2020.
Introduction: This article proposes a novel acoustic-based respiration monitoring system that uses everyday audible sounds, such as music or radio broadcasts, instead of ultrasonic signals. This approach avoids potential audibility issues for children, pets, and plants associated with ultrasound. The system estimates the channel impulse response (CIR) from the audio to derive respiration rate, while specifically addressing challenges like intersymbol interference from random audio, multipath effects, and sampling frequency offsets. Extensive testing demonstrated the system’s high accuracy, achieving a mean error of less than 0.5 breaths per minute (BPM) across different types of audio signals.
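At the core of such a system is estimating the channel from the known played audio to what the microphone records. A minimal sketch (hypothetical parameters, white noise standing in for music) locates the dominant propagation delay — a single-tap view of the CIR — by cross-correlation; breathing would then appear as small periodic variation in the reflected path's delay and phase:

```python
import numpy as np

def estimate_delay(tx, rx):
    """Estimate the dominant propagation delay (in samples) by locating
    the peak of the cross-correlation between sent and received audio."""
    corr = np.correlate(rx, tx, mode="full")
    return int(np.argmax(corr)) - (len(tx) - 1)

rng = np.random.default_rng(42)
fs = 48_000                           # assumed audio sample rate
tx = rng.standard_normal(4096)        # one known "music-like" frame

# Simulate propagation: attenuated, delayed copy plus microphone noise.
true_delay = 120                      # samples (~2.5 ms at 48 kHz)
rx = 0.01 * rng.standard_normal(len(tx) + 200)
rx[true_delay:true_delay + len(tx)] += 0.5 * tx

print(estimate_delay(tx, rx))
```

Real audio is not white, which is exactly the intersymbol-interference challenge the paper addresses; this sketch sidesteps it by using a noise-like probe frame.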
Demos:
- EarSpiro
- BreathMentor
- DeepBreath
- EasySpiro
- AECOPD