Generative Reconstruction of Multimodal Cardiac Waveforms from a Single Vibrational Cardiography Sensor

Background

Cardiovascular disease (CVD) is one of the leading causes of morbidity and mortality worldwide, affecting hundreds of millions of patients each year and imposing a tremendous burden on global healthcare systems. Estimates in the literature put annual medical spending on CVD in the billions of dollars, on top of its cost to patient productivity and quality of life. The widespread prevalence of risk factors such as hypertension, diabetes, obesity, and sedentary lifestyles further increases the challenge of CVD prevention and treatment.

Early detection and intervention are key to improving the efficiency of CVD management and reducing severe cardiovascular events, and wearable health monitoring technologies have become increasingly prominent in this context. By wearing devices that monitor cardiac physiological parameters in daily life, patients can proactively understand their health status and collaborate with physicians for precise, individualized management. The current mainstream cardiovascular monitoring technologies include:

  • Electrocardiography (ECG): By recording the electrical activity of the heart, ECG provides information on heart rate, rhythms, and disease diagnosis, and is considered the “gold standard” of cardiac monitoring.
  • Photoplethysmography (PPG): Utilizing optical sensors to detect blood volume changes, PPG is widely used in smartwatches and fitness bands for heart rate and blood oxygen saturation monitoring.
  • Impedance Cardiography (ICG): By measuring changes in thoracic impedance to assess aortic blood flow, ICG provides important hemodynamic parameters like cardiac output.
  • Non-Invasive Blood Pressure Monitoring (NIBP): Employs devices such as finger clips or wristbands to achieve continuous, non-invasive blood pressure monitoring.

Although each technique offers unique advantages, acquiring them all in real time is hampered by the number of sensors required, the inconvenience of wearing them, and the difficulty of synchronizing their data. Because all of these signals originate from the physiological activity of the heart, researchers have asked whether the information in a single sensor could be used to infer or reconstruct the signals of the other modalities, simplifying hardware deployment while still enabling efficient multimodal cardiovascular monitoring. This question underlies the recent rise of modality transfer in cardiology.

With the help of deep learning and generative models, mappings between a variety of cardiac signals have been explored, such as using PPG to generate ECG, estimating NIBP, or converting between single-lead and 12-lead ECG. Generative adversarial networks (GANs), U-Net architectures, and long short-term memory (LSTM) networks have greatly expanded the technological frontier of this field, but work has mainly focused on signal translation between pairs of modalities, and schemes for simultaneously reconstructing multiple signals are lacking.

This study focuses on Vibrational Cardiography (VCG), which encompasses seismocardiography (SCG) and gyrocardiography (GCG). Using an inertial measurement unit (IMU) sensor in single-point contact at the xiphoid process of the sternum, VCG records the mechanical activity of the heart—including cardiac contraction, valvular movement, and blood flow. Previous studies have validated VCG’s ability to capture cardiac, respiratory, and hemodynamic information, as well as measure important indicators like heart rate, cardiac time intervals, cardiovascular disease parameters, and ejection volume.

The innovation of this research lies in the hypothesis that VCG signals, rich in information across multiple physiological domains, could enable the generation of ECG, PPG, ICG, and NIBP multimodal signals from only a single VCG sensor via generative machine learning models. If successful, this would greatly simplify the hardware requirements for wearable monitoring devices, enhancing the practicality and scalability of daily, continuous cardiovascular monitoring.

Paper Source and Author Information

This study was conducted by scholars James Skoric, Yannick D’Mello, and David V. Plant (IEEE Fellow) from the Department of Electrical and Computer Engineering at McGill University in Canada. The paper was published in September 2025 in the IEEE Journal of Biomedical and Health Informatics (Vol. 29, No. 9), a top professional journal in biomedical and health informatics. The research data and experiments were performed at McGill University and approved by the institutional ethics review board (Approval No.: 21-06-035).

Research Process

1. Experimental System and Equipment Design

To ensure high-quality VCG data acquisition, the authors built a custom cardiac vibration measurement system. The core sensor was a commercial IMU (model MPU9250, InvenSense) capable of recording three-axis acceleration and three-axis gyroscope data (6 channels in total), installed at the xiphoid process and fixed with double-sided tape. IMU signals were sampled at approximately 300 Hz, collected by a Raspberry Pi Zero mini-computer, and saved as text files for later Wi-Fi transmission to an analysis computer. At the same time, all target reference signals (ECG, PPG, ICG, NIBP) were recorded with a Biopac MP160 professional acquisition system. Synchronization with the VCG data was ensured by a hardware clock signal routed from the MP160 to the Raspberry Pi, guaranteeing the temporal consistency of the multimodal signals.

2. Subjects and Experimental Participants

The experiment included 20 healthy volunteers (16 male, 4 female), with an average age of 23 years (SD 3.5 years), average height 178 cm, and average weight 76 kg. All subjects had no known cardiovascular, hemodynamic, or respiratory diseases. Some subjects returned for a second session after an interval of about 43 days, resulting in a total of 34 recording sessions and 2,686 minutes of cumulative data.

3. Experimental Procedure and Intervention Scheme

The experimental design incorporated a variety of interventions to cover different cardiac, hemodynamic, and respiratory states, as follows:

  1. Rest Recording: 7 minutes.
  2. High Lung Volume Breath Hold: Inhale maximally then hold breath, 5 times per subject, each for 2 minutes.
  3. Low Lung Volume Breath Hold: Exhale maximally then hold breath, 5 times per subject, each for 1 minute.
  4. Timed Deep Breathing: Alternate between deep inhalation (5 seconds) and deep exhalation (5 seconds) for 5 minutes.
  5. Free Paced Deep Breathing: Unrestricted deep breathing for 5 minutes.
  6. Rest Recording: Another 7 minutes.
  7. Cold Pressor Test: Right hand immersed in 3°C ice water for 1 minute, with a total of 5 minutes of data collection (1 minute rest + 1 minute cold stimulus + 3 minutes recovery).
  8. Extended Rest: 30 minutes, as an interval between cold pressor tests.
  9. Second Cold Pressor Test.

NIBP devices were calibrated before and after interventions; each step included collection of target signals, and subjects remained supine and stationary to reduce motion artifacts.

4. Data Preprocessing

All acquired signals were resampled to 200 Hz (above the main frequency bands of the signals) to suit model training, then band-pass filtered with a third-order Butterworth filter (0.8–50 Hz; 0.8–8 Hz for PPG) to remove baseline drift and high-frequency noise. A sliding-window scheme segmented the signals into windows of 512 samples (2.56 seconds) with 50% overlap between windows. NIBP segments recorded during the device's automatic recalibration, as well as outliers with abnormally high or low systolic pressure, were discarded. All segments were then z-score normalized, yielding 118,772 valid segments for deep learning model training and evaluation.
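
As a concrete illustration, the preprocessing pipeline above can be sketched in Python with SciPy. The function name and the use of polyphase resampling are assumptions for this sketch, not details taken from the paper:

```python
import numpy as np
from scipy.signal import butter, filtfilt, resample_poly

def preprocess(signal, fs_in, fs_out=200, band=(0.8, 50.0),
               win=512, overlap=0.5):
    """Resample, band-pass filter, window, and z-score one channel.

    A minimal sketch of the preprocessing described in the paper;
    exact implementation details are assumed, not quoted.
    """
    # Resample to the common 200 Hz rate (polyphase resampling).
    x = resample_poly(signal, fs_out, fs_in)

    # Third-order Butterworth band-pass (0.8-50 Hz; 0.8-8 Hz for PPG).
    b, a = butter(3, band, btype="bandpass", fs=fs_out)
    x = filtfilt(b, a, x)

    # Sliding windows of 512 samples (2.56 s) with 50% overlap.
    step = int(win * (1 - overlap))
    segments = [x[i:i + win] for i in range(0, len(x) - win + 1, step)]

    # Z-score normalize each segment independently.
    return [(s - s.mean()) / s.std() for s in segments]
```

The same routine would be applied per channel, with the narrower 0.8–8 Hz band passed in for PPG.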

5. Generative Model Architecture and Training

The authors used a conditional generative adversarial network (cGAN), specifically a one-dimensional adaptation of the Pix2Pix framework, for multimodal signal generation, with details as follows:

  • Generator: Utilized a U-Net structure (encoder–decoder with skip connections), inputting the 6-channel VCG signal (512×6) and outputting the 4-channel target waveforms (ECG, ICG, NIBP, PPG; 512×4). The encoder comprised 8 convolutional blocks with filter counts of 64-128-256-512-512-512-512-512, and the decoder used transposed convolutions with skip connections to combine high- and low-level features.
  • Discriminator: Employed a PatchGAN structure, judging the authenticity of blocks within the target waveform and consisting of 4 convolutional modules.
  • Loss Function: The generator minimized a combination of L1 loss (mean absolute error, reflecting amplitude and structural differences) and adversarial loss (binary cross entropy); the discriminator likewise used binary cross entropy. Training used the Adam optimizer with a learning rate of 2e-4, a batch size of 32, and a total of 5 epochs.
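
The loss terms above can be sketched in plain NumPy. The λ=100 weight on the L1 term is the common Pix2Pix default and is an assumption here, since the paper's exact weighting is not restated in this summary:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross entropy on sigmoid-activated discriminator outputs."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def generator_loss(d_fake, fake, real, lam=100.0):
    """cGAN generator objective: adversarial BCE (fool the discriminator)
    plus lambda-weighted L1 reconstruction loss. lam=100 is the usual
    Pix2Pix default, assumed rather than quoted from the paper."""
    adv = bce(d_fake, np.ones_like(d_fake))  # want D(fake) -> 1
    l1 = np.mean(np.abs(fake - real))        # amplitude/shape fidelity
    return adv + lam * l1

def discriminator_loss(d_real, d_fake):
    """Discriminator objective: real patches -> 1, generated patches -> 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))
```

In the PatchGAN setting, `d_real` and `d_fake` are arrays of per-patch probabilities rather than a single scalar.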

The model simultaneously produced all target signals in a single pass, and was trained and evaluated with leave-one-out cross-validation: 19 subjects’ data were used for training in each round, with the remaining one for independent testing, ensuring that the model never saw the test subject’s data and required no individual calibration—thus reflecting its true generalization capability in real-world scenarios.
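
The leave-one-subject-out scheme can be sketched as a small fold generator (the function name is illustrative):

```python
def loso_splits(subject_ids):
    """Leave-one-subject-out folds: for each subject, train on all
    other subjects' segments and test on the held-out subject's,
    so the model never sees any test-subject data."""
    subjects = sorted(set(subject_ids))
    for held_out in subjects:
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test
```

With 20 subjects this yields 20 folds, each training on 19 subjects and testing on the remaining one.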

6. Signal Evaluation and Analysis

Model output signals were evaluated for structural and amplitude similarity using Pearson's correlation coefficient (r) and mean absolute error (MAE). All generated signals for each subject were segmented and assessed, with each subject's median result summarized as the overall performance. For key cardiac fiducial points (the P, Q, R, S, and T peaks of ECG; the B, C, and X points of ICG; the onset and peak of PPG; and the systolic and diastolic peaks of NIBP), automated annotation algorithms (e.g., NeuroKit2) were used. The accuracy of fiducial-point detection on the reconstructed signals was then analyzed (within a 250-millisecond tolerance window; mismatches exceeding it were excluded), and the distribution and magnitude of the errors were statistically assessed.
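
A minimal sketch of these evaluation metrics in NumPy follows. The greedy nearest-neighbor pairing in `match_fiducials` is an assumption about how matching within the 250 ms tolerance window might be done, not the paper's exact procedure:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between a generated and a reference segment."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2)))

def mae(x, y):
    """Mean absolute error between two equal-length segments."""
    return float(np.mean(np.abs(x - y)))

def match_fiducials(ref_ms, gen_ms, tol_ms=250.0):
    """Pair each reference fiducial time with the nearest generated one
    within the tolerance window; unmatched points are excluded.
    Returns signed timing errors (generated minus reference) in ms."""
    gen = list(gen_ms)
    errors = []
    for t in ref_ms:
        if not gen:
            break
        nearest = min(gen, key=lambda g: abs(g - t))
        if abs(nearest - t) <= tol_ms:
            errors.append(nearest - t)
            gen.remove(nearest)
    return errors
```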

Additionally, the study examined the impact of different signal window lengths on model performance, with window sizes set at 2.56 seconds (512 points), 5.12 seconds, 10.24 seconds, and 20.48 seconds (4096 points) for model training and testing, to assess the model’s capability and robustness for generating long continuous signals and its practical adaptability for wearable usage scenarios.

Main Results

1. Multimodal Signal Reconstruction Quality

The model performed excellently in reconstructing all target signals:

  • Median correlation coefficients (r): ECG 0.808, NIBP 0.907, ICG 0.833, PPG 0.929
  • Median mean absolute errors (MAE): ECG 0.309, NIBP 0.275, ICG 0.401, PPG 0.255

Correlations and errors varied little across interventions, indicating that the model stably handles morphological variation in the signals under diverse physiological states. Correlation was highest at rest (when vitals are most stable) and decreased slightly during breath holds and cold pressor tests, presumably because physiological disturbances degrade signal shape, but overall performance remained high. Across every intervention, the model accurately tracked the resulting changes in the target signals, including the timing of key fiducial points, reflecting its responsiveness to dynamic physiological processes.

In individual signal examples, the generated ECG, NIBP, PPG, and ICG waveforms closely replicated the original morphology; when the raw signals contained noise, the generated signals remained relatively clean, which can aid downstream denoising.

2. Fiducial Point Detection Ability

Detection performance of key events in each signal was as follows:

  • ECG R-wave MAE as low as 6.66 ms, which at 200 Hz is only 1–2 samples from the real peak, indicating extremely high temporal precision.
  • Other ECG peaks (P/Q/S/T): MAE ranged from 12 to 28 ms.
  • ICG C-point MAE was 15.11 ms; B and X points had slightly higher errors due to their variability and detection difficulty.
  • PPG and NIBP peak delays: MAE between 17–39 ms, mainly due to greater noise in distal pulse sampling.
  • No significant systematic bias was observed in fiducial point detection—generated outputs did not show directional errors. This accuracy provides a necessary foundation for clinical analysis of cardiac time intervals (e.g., LVET, PEP, PTT).
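
For context, the cardiac time intervals mentioned above are differences between pairs of fiducial times. The sketch below uses common textbook definitions; the paper's exact fiducial choices (e.g., Q onset versus R peak for PEP, PPG foot versus peak for PTT) are not specified in this summary:

```python
def cardiac_intervals(q_ms, r_ms, b_ms, x_ms, ppg_foot_ms):
    """Cardiac time intervals (ms) from per-beat fiducial times.
    Definitions follow common convention and are assumptions here:
    PEP from ECG Q to ICG B, LVET from ICG B to X, PTT from
    ECG R peak to the PPG foot."""
    return {
        "PEP": b_ms - q_ms,         # pre-ejection period
        "LVET": x_ms - b_ms,        # left-ventricular ejection time
        "PTT": ppg_foot_ms - r_ms,  # pulse transit time
    }
```

Because these intervals are simple differences, the millisecond-level fiducial accuracy reported above translates directly into millisecond-level interval accuracy.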

3. Long-Window Signal Reconstruction Performance

Model performance on segments up to 20 seconds long showed only slight reduction compared to short windows, with correlations still at high levels (0.789 for ECG, 0.891 for NIBP, 0.810 for ICG, 0.898 for PPG) and amplitude errors also stable. This demonstrates the model’s capacity for generating long-duration continuous signals, meeting the requirements for real-time wearable health monitoring.

Research Conclusions and Significance

This study innovatively proposes a method for simultaneous reconstruction of multiple critical cardiovascular signals from a single VCG sensor using a deep generative adversarial network. The high-dimensional structure of VCG signals is confirmed to contain sufficient physiological information to support generation of multimodal cardiac waveforms. The main significance is as follows:

  1. Hardware simplification and enhanced practicality: Traditional multi-sensor systems can be replaced with a single point sensor, greatly reducing device size, wearing burden, and synchronization difficulty, thereby improving the portability of daily health management.
  2. Enabling synchronized multimodal monitoring: Users need only wear a single VCG sensor to simultaneously obtain all four cardiovascular waveforms (ECG, ICG, NIBP, PPG) in clinically familiar signal formats.
  3. High accuracy in fiducial point detection: Extremely low errors in key events of multimodal signals meet the clinical requirements for analyzing various cardiac time intervals and functional indices, paving the way for non-invasive, continuous, and high-precision health monitoring.
  4. Strong model generalization: The model maintained high reconstruction accuracy for all subjects even without seeing any test subject’s data; individual factors such as age, gender, and body type had minor impacts, supporting broad population applicability.
  5. Excellent long-term monitoring capability: Robustness and stability in long-window signal generation satisfy the needs of continuous data in wearable applications.
  6. Technological foundation for innovative cardiovascular health management: It is suggested that future work combine this generative model with clinical intelligent diagnosis to assess the application value of the generated signals for disease detection and functional assessment.

Research Highlights and Technological Innovations

  • First demonstration that a single VCG signal can simultaneously reconstruct four types of cardiovascular waveforms, breaking the previous bottleneck of pairwise modal conversions.
  • Innovative use of a one-dimensional, multi-output Pix2Pix structure, enabling the generative model to process multiple channels efficiently and synchronously, enhancing reusability and generalization.
  • Detailed experimental and intervention design, ensuring model robustness to a variety of physiological conditions and real vital fluctuations, superior to models trained on single resting-state data only.
  • Highly automated signal annotation and evaluation system, integrating automatic fiducial point detection algorithms, elevating generative model performance to clinical-level accuracy.
  • A technological foundation for the development of noninvasive, continuous, high-dimensional medical monitoring devices.

Other Valuable Information

The study also notes that normalizing the output signals improved training accuracy but limited recovery of the original amplitudes; applications that require absolute values, such as precise determination of blood oxygenation, will need further improvements in the models or data. The small decrease in accuracy for distally sampled signals (PPG, NIBP) offers practical guidance for sensor selection and data fusion, and some annotation algorithms (e.g., B-point detection) remain contested in the literature and should be combined with other strategies in practical applications.

The dataset is not publicly available, but the code has been released on GitHub (https://github.com/jamesskoric/vcg-generative-reconstruction), providing a basis for reproduction and follow-up research in academia and industry.

Summary and Outlook

This study is a milestone in single-sensor wearable health monitoring and deep generative signal reconstruction, revealing the multimodal information latent in VCG signals and the capacity of AI models to exploit it. Future work may test the model's generalization on larger samples, more complex pathologies, and real-world clinical settings; assess the practical utility of the generated signals for disease detection; and further improve amplitude accuracy and noise robustness. The innovation lies not only in cutting-edge algorithms but also in the hardware simplification, improved user experience, and shifts in clinical diagnostic practice it enables, heralding a new chapter for cardiovascular health management.