An integrative network approach for longitudinal stratification in Parkinson’s disease

Barry Ryan, Riccardo E. Marioni, T. Ian Simpson

Abstract

Parkinson’s disease (PD) is a neurodegenerative disorder characterised by motor symptoms resulting from the loss of dopamine-producing neurons in the brain. There is currently no cure, in part because of the heterogeneity in patient symptoms, trajectories and manifestations. PD has a known genetic component, and genomic datasets have helped to uncover some aspects of the disease. Understanding the longitudinal variability of PD is essential, as it has been theorised that different triggers and underlying disease mechanisms act at different points during disease progression.

Introduction

Parkinson’s disease (PD) is a heterogeneous, progressive, multisystem neurological disorder. It is most commonly characterised by a range of motor symptoms, primarily involving difficulties with movement; however, a wide variety of non-motor symptoms also exist. PD has a complex pathophysiology, but its disease pathways culminate in the gradual death of neuronal cells, causing a deficit in dopamine [1].

Materials and methods

Multi-Omic Graph Diagnosis (MOGDx)

MOGDx, shown in S1 Fig, is a flexible tool for integrating multiple omic measures and performing classification tasks. It uses a patient similarity measure to identify patients with similar molecular, epigenetic, and demographic disease characteristics, and performs node classification using a graph convolutional network (GCN). MOGDx was benchmarked on cancer data, where it achieved state-of-the-art performance compared to similar research [11].
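The two ideas in this description, a patient similarity network and node classification by graph convolution, can be illustrated with a minimal sketch. This is not the MOGDx implementation: the choice of Pearson correlation as the similarity measure, the k-nearest-neighbour rule for drawing edges, and every function name below are assumptions made for illustration only, and the integration of multiple modalities is not shown. The propagation step uses the standard symmetric normalisation of the adjacency matrix with self-loops.

```python
import math


def pearson(x, y):
    """Pearson correlation between two patients' feature vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def knn_adjacency(features, k):
    """Hypothetical similarity network: link each patient to its k most
    correlated patients, then symmetrise the edges."""
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sims = sorted(
            ((pearson(features[i], features[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        for _, j in sims[:k]:
            adj[i][j] = adj[j][i] = 1.0
    return adj


def normalise(adj):
    """Symmetric normalisation D^-1/2 (A + I) D^-1/2 with self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(a_norm, h, w):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(a_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Toy data: four patients, three features each (values are made up).
features = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.5], [3.0, 2.0, 1.0], [6.0, 4.0, 1.5]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

adj = knn_adjacency(features, k=1)
hidden = gcn_layer(normalise(adj), features, identity)
print(adj)
print(hidden)
```

In this toy example each patient's new representation is an average of its own features and those of its neighbour in the network, which is the sense in which a GCN lets similar patients inform each other's classification. In practice the weight matrix would be learned rather than fixed to the identity.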

Results

Performance and evaluation

Three metrics were used to evaluate the classification performance of MOGDx: accuracy, F1 score and improvement in accuracy. The F1 score was calculated as the mean of the per-class F1 scores, weighted by the size of each class. Improvement in accuracy measures how much the accuracy improved over a baseline model that always predicts the most common class. Stratified k-fold cross-validation was performed with 5 randomly generated splits to obtain the reported means and standard deviations. Within each split, the training set was further randomly divided into training and validation sets, giving an overall train/validation/test split of 68%/12%/20%.
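The metrics and the split arithmetic described here can be made concrete with a short sketch. Two points are assumptions rather than statements from the text: improvement in accuracy is taken to be the simple difference between model accuracy and the accuracy of a majority-class baseline, and the validation share of 15% of each training fold is inferred from the reported 12% out of the 80% that remains after holding out a 20% test fold. All function names are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Mean of the per-class F1 scores, weighted by each class's true count."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


def improvement_in_accuracy(y_true, y_pred):
    """Assumed form: model accuracy minus the accuracy of a baseline that
    always predicts the most common class."""
    baseline = max(Counter(y_true).values()) / len(y_true)
    return accuracy(y_true, y_pred) - baseline


def split_fractions(n_folds=5, val_share_of_train=0.15):
    """Overall train/validation/test fractions for one cross-validation fold.
    The 0.15 default is inferred from the reported 12% / 80%."""
    test = 1 / n_folds
    val = (1 - test) * val_share_of_train
    train = 1 - test - val
    return train, val, test


# Toy labels (made up): three PD participants and two healthy controls.
y_true = ["PD", "PD", "PD", "HC", "HC"]
y_pred = ["PD", "PD", "PD", "HC", "PD"]

print(round(accuracy(y_true, y_pred), 3))
print(round(weighted_f1(y_true, y_pred), 3))
print(round(improvement_in_accuracy(y_true, y_pred), 3))
print([round(f, 2) for f in split_fractions()])
```

With 5 folds, each test fold holds 20% of the data; taking 15% of the remaining 80% for validation gives 12%, leaving 68% for training, which matches the reported split.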

Discussion

In this paper, we applied an integrative network framework and artificial intelligence to the Parkinson’s Progression Markers Initiative (PPMI) dataset. PPMI is an observational, international study comprising multiple data modalities, with the goal of identifying markers of PD to accelerate disease-modifying clinical trials [17]. We used clinical, genomic, and proteomic data to include numerous patient samples, and conducted cross-sectional and longitudinal stratification of participants who had PD, had an early indication of developing PD (PL), or were healthy controls (HC).

Conclusion

This study highlights the importance of flexible integrative approaches to the analysis of PD. We have shown that there is a signal for PD present in genomic and proteomic data obtained from whole-blood samples. We have shown this both in a homogeneous group with a clear genetic driver for the disease and in a more heterogeneous idiopathic group. We have achieved non-zero improvements in accuracy that are comparable to the MDS-UPDRS assessment baseline in the idiopathic group and that significantly improve on this baseline in the genetic group.

Citation: Ryan B, Marioni R, Simpson TI (2025) An integrative network approach for longitudinal stratification in Parkinson’s disease. PLoS Comput Biol 21(3): e1012857. https://doi.org/10.1371/journal.pcbi.1012857

Editor: Sushmita Roy

Received: May 10, 2024; Accepted: February 6, 2025; Published: March 28, 2025

Copyright: © 2025 Ryan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data used in the preparation of this article were obtained on April 5, 2022 from the Parkinson’s Progression Markers Initiative (PPMI) database (www.ppmi-info.org/access-data-specimens/download-data), RRID:SCR_006431. The following datasets were used: Project 133 RNA Sequencing Methods; Project 133 Small RNA Transcriptome Sequencing Read Counts; Project 140: Comprehensive Methylation Profiling of the PPMI Cohort; Project 107: NeuroX SNP Data; Project 151 Identification of proteins & protein networks & pQTL analysis in CSF. All code is available for download from a dedicated GitHub repository: https://github.com/biomedicalinformaticsgroup/MOGDx-PPMI.

Funding: This work was supported by the UKRI Centre for Doctoral Training in Biomedical Artificial Intelligence (EP/S02431X/1, TIS & BR). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: I have read the journal's policy and the authors of this manuscript have the following competing interests: REM is a scientific advisor to Optima Partners and the Epigenetic Clock Development Foundation.