Machine learning reveals how metabolite profiles predict aging and health

Metabolite data and AI combine to redefine how we measure aging and predict health spans.

Study: Metabolomic age (MileAge) predicts health and life span: A comparison of multiple machine learning algorithms. Image Credit: Sergey Tarasov / Shutterstock

In a recent study published in the journal Science Advances, researchers at King’s College London explored metabolomic aging clocks using machine learning models trained on plasma metabolite data from the United Kingdom (U.K.) Biobank. The study aimed to assess the potential of metabolomic aging clocks in predicting health outcomes and life span by benchmarking their accuracy, robustness, and relevance to biological aging indicators beyond chronological age.

Background

Biological aging, distinct from chronological age, reflects molecular and cellular damage influencing health and disease susceptibility. Chronological age alone cannot capture the variability in aging-related physiological states among individuals. However, recent advances in omics technologies, particularly metabolomics, have offered insights into biological aging through molecular profiling.

Metabolites, or small molecules from metabolic pathways, can provide assessments of physiological health and are linked to aging-related outcomes, such as chronic diseases and mortality. Earlier studies have correlated metabolomic data with aging but have been constrained by limited sample sizes and markers.

Recent efforts to derive “aging clocks” using machine learning from omics data have demonstrated significant predictive power for health outcomes. However, there continue to be challenges in optimizing these models for accuracy and interpretability, especially using metabolomics.

The current study

The present study utilized nuclear magnetic resonance (NMR) spectroscopy to analyze plasma metabolite data from the U.K. Biobank, involving 225,212 participants between the ages of 37 and 73 years. The exclusion criteria included pregnancy, data inconsistencies, and extreme metabolite values. The dataset encompassed 168 metabolites representing lipid profiles, amino acids, and glycolysis products.

The researchers applied 17 machine learning algorithms, including linear regression, tree-based models, and ensemble techniques, to the dataset to develop metabolomic aging clocks. They also used a rigorous nested cross-validation approach to ensure robust model evaluation.
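Nested cross-validation can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and not taken from the study: a one-feature ridge regression stands in for the 17 algorithms, a single synthetic "metabolite" stands in for the 168 measured ones, and the fold counts and penalty grid are arbitrary. The point it shows is structural: the inner loop chooses a hyperparameter using training data only, and the outer loop scores the chosen model on data it has never seen.

```python
import random
from statistics import mean

def k_folds(n, k, seed=0):
    """Split indices 0..n-1 into k shuffled, roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_ridge_1d(xs, ys, lam):
    """One-feature ridge regression (intercept unpenalized); returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    return slope, my - slope * mx

def mae(model, xs, ys):
    """Mean absolute error of a fitted (slope, intercept) model."""
    slope, intercept = model
    return mean(abs(slope * x + intercept - y) for x, y in zip(xs, ys))

def nested_cv(xs, ys, lambdas, outer_k=5, inner_k=3):
    """Return one held-out MAE per outer fold."""
    outer_scores = []
    for test_idx in k_folds(len(xs), outer_k, seed=1):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]

        def inner_score(lam):
            # Inner loop: score this penalty using the outer training data only.
            scores = []
            for val_pos in k_folds(len(train_idx), inner_k, seed=2):
                val = set(val_pos)
                fit_i = [train_idx[p] for p in range(len(train_idx)) if p not in val]
                val_i = [train_idx[p] for p in val_pos]
                model = fit_ridge_1d([xs[i] for i in fit_i], [ys[i] for i in fit_i], lam)
                scores.append(mae(model, [xs[i] for i in val_i], [ys[i] for i in val_i]))
            return mean(scores)

        best_lam = min(lambdas, key=inner_score)
        # Refit on the full outer training set, then score on the untouched test fold.
        model = fit_ridge_1d([xs[i] for i in train_idx], [ys[i] for i in train_idx], best_lam)
        outer_scores.append(mae(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx]))
    return outer_scores

# Synthetic data (hypothetical): "age" depends linearly on one standardized metabolite.
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [55 + 5 * x + rng.gauss(0, 3) for x in xs]
scores = nested_cv(xs, ys, lambdas=[0.0, 1.0, 10.0])
```

Because the test fold plays no part in choosing the penalty, the outer scores are not flattered by the tuning step, which is the main reason to nest the two loops.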

Some of the main preprocessing steps included handling outlier metabolite values and correcting age-prediction biases inherent to the models. The predictive models aimed to estimate chronological age using metabolite profiles, and the differences between predicted and actual ages were defined as the “MileAge delta.” Statistical corrections were extensively applied to remove systematic biases and enhance prediction accuracy, particularly for younger and older age ranges.
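The delta and its bias correction can be sketched in a few lines of Python. The correction shown, regressing the raw delta on chronological age and keeping the residual, is one common approach to this kind of age bias and is an assumption here; the article does not specify which correction the authors applied. The ages below are made up for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def raw_delta(predicted, chrono):
    """MileAge delta as described in the article: predicted age minus chronological age."""
    return [p - c for p, c in zip(predicted, chrono)]

def corrected_delta(predicted, chrono):
    """Remove the linear dependence of the delta on chronological age (assumed method)."""
    deltas = raw_delta(predicted, chrono)
    slope, intercept = fit_line(chrono, deltas)
    return [d - (slope * c + intercept) for d, c in zip(deltas, chrono)]

# Hypothetical predictions pulled toward the sample mean:
# the youngest person is overestimated, the oldest underestimated.
chrono = [40, 50, 60, 70]
predicted = [48, 54, 59, 65]

raw = raw_delta(predicted, chrono)              # [8, 4, -1, -5]
corrected = corrected_delta(predicted, chrono)  # residuals, uncorrelated with age
```

In this toy example the raw deltas fall steadily with age, which would make every young participant look "accelerated". After correction the deltas no longer trend with age, so what remains reflects differences between people of the same age.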

The models were evaluated for predictive accuracy using mean absolute error (MAE), root mean square error (RMSE), and correlation coefficients. For example, the Cubist regression model achieved an MAE of 5.31 years, compared with 6.36 years for multivariate adaptive regression splines. The predictions were then adjusted to remove systematic biases and improve their alignment with chronological age.
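For readers unfamiliar with these metrics, the short Python sketch below computes all three from a handful of made-up ages. The numbers are hypothetical and chosen only to keep the arithmetic easy to follow; they are not drawn from the study.

```python
import math
from statistics import mean

def mean_absolute_error(actual, predicted):
    """Average size of the prediction error, in years."""
    return mean(abs(p - a) for a, p in zip(actual, predicted))

def root_mean_square_error(actual, predicted):
    """Like MAE, but large errors count for more because they are squared first."""
    return math.sqrt(mean((p - a) ** 2 for a, p in zip(actual, predicted)))

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

actual = [40, 50, 60, 70]      # chronological ages (hypothetical)
predicted = [45, 52, 58, 63]   # model outputs (hypothetical)

mae_value = mean_absolute_error(actual, predicted)      # errors 5, 2, 2, 7 -> 4.0
rmse_value = root_mean_square_error(actual, predicted)  # sqrt(20.5), about 4.53
r_value = pearson_r(actual, predicted)
```

RMSE is never smaller than MAE on the same data, and the gap between them widens when a few predictions miss badly, so reporting both gives a sense of how the errors are distributed.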

Study design and overview. (A) Overview of the nested cross-validation approach. MAE, mean absolute error; RMSE, root mean square error. (B) Histogram of the chronological age distribution of the analytical sample. The statistical mode (age, 61 years) is shown in red. (C) Distribution of metabolite levels by chronological age, showing scatter plots of all observations and smooth curves (note the difference in the y-axis scale). The smooth curves were estimated using generalized additive models, with shaded areas corresponding to 95% confidence intervals (CIs). GlycA, glycoprotein acetyls. (D) Scatter plot showing the hazard ratio (HR) for all-cause mortality and the beta for chronological age associated with a one SD difference in metabolite levels. Metabolites that had statistically significant associations with both chronological age and all-cause mortality are shown in purple.

Results

The findings indicated that metabolomic aging clocks developed from plasma metabolite profiles could effectively differentiate biological aging from chronological aging. Of the various models tested in the study, the Cubist rule-based regression model provided the strongest predictive associations with health markers and mortality and outperformed the other algorithms in accuracy and robustness.

Additionally, positive MileAge delta values, which indicated accelerated aging, were linked to frailty, shorter telomeres, higher morbidity, and increased mortality risk. Specifically, a 1-year increase in MileAge delta corresponded to a 4% rise in all-cause mortality risk, with hazard ratios (HR) exceeding 1.5 in extreme cases.
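To see how a 4% per-year figure relates to hazard ratios of this size, the Python sketch below compounds the per-year effect. It assumes the effect is log-linear, as in a standard proportional hazards model, so that each additional year of delta multiplies the hazard by the same factor. That is an assumption made for illustration; the article does not say how the hazard ratios above 1.5 were derived.

```python
import math

HR_PER_YEAR = 1.04  # from the article: 4% higher all-cause mortality risk per 1-year delta

def hazard_ratio(delta_years, hr_per_year=HR_PER_YEAR):
    """Hazard ratio for a given MileAge delta, assuming a log-linear effect."""
    return hr_per_year ** delta_years

def years_to_reach(target_hr, hr_per_year=HR_PER_YEAR):
    """Delta (in years) at which the compounded hazard ratio reaches target_hr."""
    return math.log(target_hr) / math.log(hr_per_year)

hr_10 = hazard_ratio(10)        # about 1.48
years_15 = years_to_reach(1.5)  # a little over 10 years
```

Under this assumption, a person whose metabolite-predicted age is roughly 10 years above their chronological age would carry close to 1.5 times the mortality hazard of someone with a delta of zero.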

Moreover, the study showed that individuals with accelerated aging were more likely to report poorer self-rated health and experience chronic illnesses. Associations with frailty and telomere attrition were particularly pronounced, with some differences being equivalent to an 18-year disparity in frailty index scores. Interestingly, women exhibited slightly higher MileAge deltas than men across most models.

The study also confirmed the non-linear nature of metabolite-age relationships and emphasized the utility of statistical corrections in enhancing prediction accuracy. Additionally, a comparison with existing aging markers showed that metabolomic aging clocks captured unique health-relevant signals and often outperformed simpler predictors. However, decelerated aging (negative MileAge deltas) did not consistently translate into better health outcomes, underscoring the complexity of biological aging metrics.

Conclusions

Overall, the study demonstrated the utility of metabolomic aging clocks in predicting biological aging and associated health outcomes. By benchmarking multiple machine learning algorithms, the findings also showed the superior performance of the Cubist rule-based model in linking metabolite-derived ages to health markers and mortality.

The results suggested that metabolomic aging clocks hold potential for proactive health management and risk stratification and emphasized the need for further validation across diverse populations and longitudinal data for broader clinical application. This study sets a new benchmark for algorithm development, illustrating how metabolomic profiles can offer actionable insights into aging and health.

Story first appeared on News Medical
