A connectivity gradient-based framework for reproducible biomarker discovery

Hong SJ†, Xu T†, Nikolaidis A, Smallwood J, Margulies DS, Bernhardt BC, Vogelstein J, Milham MP. “Toward a connectivity gradient-based framework for reproducible biomarker discovery”, bioRxiv († co-first authors; preprint; under review) 2020. 10.1101/2020.04.15.043315

ABSTRACT. Despite myriad demonstrations of feasibility, the high dimensionality of fMRI data remains a critical barrier to its utility for reproducible biomarker discovery. Recent studies applying dimensionality reduction techniques to resting-state fMRI (R-fMRI) have unveiled neurocognitively meaningful connectivity gradients that are present in both human and primate brains, and appear to differ meaningfully among individuals and clinical populations. Here, we provide a critical assessment of the suitability of connectivity gradients for biomarker discovery. Using the Human Connectome Project (discovery subsample: n=209; two replication subsamples: n=209 each) and the Midnight Scan Club (n=9), we tested the following key biomarker traits of functional gradients: reliability, reproducibility and predictive validity. In doing so, we systematically assessed the effects of three analytical settings: i) dimensionality reduction algorithm (i.e., linear vs. non-linear methods), ii) input data type (i.e., raw time series, (un-)thresholded functional connectivity), and iii) amount of data (R-fMRI time-series length). We found that the reproducibility of functional gradients across algorithms and subsamples is generally higher for gradients that explain more variance in whole-brain connectivity data, as well as for those with higher reliability. Notably, among the analytical settings tested, linear dimensionality reduction (principal component analysis in our study), more conservatively thresholded functional connectivity (e.g., 95-97%) and longer time series (≥20 min) were found to be preferable conditions for obtaining higher reliability. Gradients with higher reliability predicted unseen phenotypic scores with higher accuracy, highlighting reliability as a critical prerequisite for validity.
Importantly, prediction accuracy with connectivity gradients exceeded that observed with more traditional edge-based connectivity measures, suggesting the added value of a low-dimensional gradient approach. Finally, the present work highlights the importance and benefits of systematically exploring the parameter space for new imaging methods before widespread deployment.
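
As a rough illustration of the setting the abstract reports as preferable (principal component analysis applied to conservatively thresholded functional connectivity), the sketch below derives gradients from a regional time-series matrix. It is a minimal sketch, not the authors' pipeline: the function name `connectivity_gradients`, the row-wise percentile thresholding, running PCA directly on the thresholded matrix, and the synthetic input are all assumptions made for illustration.

```python
import numpy as np


def connectivity_gradients(timeseries, threshold_pct=95.0, n_components=3):
    """Sketch: PCA gradients from row-wise thresholded functional connectivity.

    timeseries: array of shape (n_timepoints, n_regions).
    threshold_pct: per-row percentile below which connectivity is zeroed
        (assumed thresholding scheme, not taken from the paper).
    Returns (gradients, explained_variance_ratio); gradients has shape
    (n_regions, n_components).
    """
    # Functional connectivity: Pearson correlation between regional time series.
    fc = np.corrcoef(timeseries, rowvar=False)

    # Row-wise thresholding: keep only each region's strongest connections.
    cutoffs = np.percentile(fc, threshold_pct, axis=1, keepdims=True)
    fc_thr = np.where(fc >= cutoffs, fc, 0.0)

    # PCA via SVD of the column-centered matrix; rows (regions) are samples.
    centered = fc_thr - fc_thr.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    gradients = u[:, :n_components] * s[:n_components]

    # Fraction of variance explained by each component.
    explained = (s ** 2) / np.sum(s ** 2)
    return gradients, explained[:n_components]


# Usage on synthetic data (random numbers standing in for R-fMRI time series).
rng = np.random.default_rng(0)
ts = rng.standard_normal((1200, 50))
grads, var_ratio = connectivity_gradients(ts, threshold_pct=95.0, n_components=3)
print(grads.shape)
```

Each column of `grads` assigns one value per region, and `var_ratio` gives the share of connectivity variance each gradient explains, the quantity the abstract links to higher reproducibility.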


  • There is a growing need to identify benchmark parameters for advancing functional connectivity gradients toward reliable biomarkers.
  • Here, we explored a multidimensional parameter space for calculating functional gradients to improve their reproducibility, reliability and predictive validity.
  • We demonstrated that more reproducible and reliable gradient markers tend to have higher predictive power for unseen phenotypic scores across various cognitive domains.
  • We showed that the low-dimensional connectivity gradient approach could outperform raw edge-based analyses in terms of predicting phenotypic scores.
  • We highlight the necessity of optimizing parameters for new imaging methods before their widespread deployment.
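
Reliability, as used in the highlights above, is often quantified with an intraclass correlation coefficient (ICC) computed over repeated sessions. The text here does not state which reliability metric was used, so the following is only a hedged sketch of one common choice, the one-way random-effects ICC(1,1); the function name `icc_oneway` and the simulated scores are hypothetical.

```python
import numpy as np


def icc_oneway(scores):
    """One-way random-effects ICC(1,1) for an (n_subjects, n_sessions) array."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    subject_means = scores.mean(axis=1)
    grand_mean = scores.mean()

    # Between-subject and within-subject mean squares from a one-way ANOVA.
    ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((scores - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Simulated gradient scores: a stable per-subject trait plus session noise.
rng = np.random.default_rng(1)
trait = rng.standard_normal(200)
low_noise = trait[:, None] + 0.2 * rng.standard_normal((200, 2))
high_noise = trait[:, None] + 2.0 * rng.standard_normal((200, 2))
print(icc_oneway(low_noise) > icc_oneway(high_noise))
```

In this toy setup the only difference between the two cases is the amount of session-to-session noise, which is the kind of effect that longer time series would be expected to reduce.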

© 2020. All rights reserved by COMBINE Lab