The Manrai Lab is a team of machine learning scientists, clinicians, and biomedical data scientists working to improve medical decision making by developing computational approaches that incorporate rich and deep representations of clinical state and an individual's identity into care. Active projects include:

  1. Improving genetic variant classification and quantifying risk ("penetrance") in clinical genomics, with a focus on inherited heart disease (e.g. Manrai et al. NEJM 2016)
  2. Measuring "normal" variation for blood laboratory biomarkers across populations with a focus on creatinine and kidney disease (e.g. Manrai et al. JAMA 2018)
  3. Developing semi-supervised learning approaches with applications including medical imaging and text (e.g. Melas-Kyriazi & Manrai 2020)
  4. Modeling reproducibility in integrative biomedical studies using meta-science ("science of science") approaches (e.g. Manrai et al. AJE 2019)

The group's research has been published in the New England Journal of Medicine and JAMA, presented at the National Academy of Sciences, and featured in the New York Times, Wall Street Journal, and NPR.


Arjun (Raj) Manrai is an Assistant Professor at Harvard Medical School and Faculty Member in the Computational Health Informatics Program (CHIP) at Boston Children’s Hospital. Manrai received an A.B. in Physics with Highest Honors from Harvard and earned his Ph.D. in Bioinformatics and Integrative Genomics from the Harvard-MIT Division of Health Sciences and Technology.


Publications

  1. Harmonizing the Collection of Clinical Data on Genetic Testing Requisition Forms to Enhance Variant Interpretation in Hypertrophic Cardiomyopathy (HCM): A Study from the ClinGen Cardiomyopathy Variant Curation Expert Panel. J Mol Diagn. 2021 Feb 22.
  2. Comparisons of Polyexposure, Polygenic, and Clinical Risk Scores in Risk Prediction of Type 2 Diabetes. Diabetes Care. 2021 Feb 09.
  3. Clinical Implications of Removing Race From Estimates of Kidney Function. JAMA. 2021 Jan 12; 325(2):184-186.
  4. In Search of a Better Equation - Performance and Equity in Estimates of Kidney Function. N Engl J Med. 2021 Feb 04; 384(5):396-399.
  5. Scalability and cost-effectiveness analysis of whole genome-wide association studies on Google Cloud Platform and Amazon Web Services. J Am Med Inform Assoc. 2020 Jul 27.
  6. Prediction of chronological and biological age from laboratory data. Aging (Albany NY). 2020 May 05; 12(9):7626-7638.
  7. Challenges to the Reproducibility of Machine Learning Models in Health Care. JAMA. 2020 Jan 28; 323(4):305-306.
  8. Signals Among Signals: Prioritizing Nongenetic Associations in Massive Data Sets. Am J Epidemiol. 2019 May 01; 188(5):846-850.
  9. Author Correction: Repurposing large health insurance claims data to estimate genetic and environmental contributions in 560 phenotypes. Nat Genet. 2019 Apr; 51(4):764-765.
  10. Potential Excessive Testing at Scale: Biomarkers, Genomics, and Machine Learning. JAMA. 2019 Feb 26; 321(8):739-740.
  11. Repurposing large health insurance claims data to estimate genetic and environmental contributions in 560 phenotypes. Nat Genet. 2019 Feb; 51(2):327-334.
  12. Using Big Data to Determine Reference Values for Laboratory Tests-Reply. JAMA. 2018 Oct 09; 320(14):1496.
  13. In the Era of Precision Medicine and Big Data, Who Is Normal? JAMA. 2018 May 15; 319(19):1981-1982.
  14. Biomedical informatics and machine learning for clinical genomics. Hum Mol Genet. 2018 May 01; 27(R1):R29-R34.
  15. Adaptation and validation of the ACMG/AMP variant classification framework for MYH7-associated inherited cardiomyopathies: recommendations by ClinGen's Inherited Cardiomyopathy Expert Panel. Genet Med. 2018 Mar; 20(3):351-359.
  16. Association of Sex With Recurrence of Autism Spectrum Disorder Among Siblings. JAMA Pediatr. 2017 Nov 01; 171(11):1107-1112.
  17. Fostering reproducibility in industry-academia research. Science. 2017 Aug 25; 357(6353):759-761.
  18. Systematic correlation of environmental exposure and physiological and self-reported behaviour factors with leukocyte telomere length. Int J Epidemiol. 2017 Feb 01; 46(1):44-56.
  19. METHODS TO ENSURE THE REPRODUCIBILITY OF BIOMEDICAL RESEARCH. Pac Symp Biocomput. 2017; 22:117-119.
  20. Informatics and Data Analytics to Support Exposome-Based Discovery for Public Health. Annu Rev Public Health. 2017 Mar 20; 38:279-294.
  21. Genetic Misdiagnoses and the Potential for Health Disparities. N Engl J Med. 2016 Aug 18; 375(7):655-65.
  22. Clinical Genomics: From Pathogenicity Claims to Quantitative Risk Estimates. JAMA. 2016 Mar 22-29; 315(12):1233-4.
  23. METHODS TO ENHANCE THE REPRODUCIBILITY OF PRECISION MEDICINE. Pac Symp Biocomput. 2016; 21:180-2.
  24. REPRODUCIBLE AND SHAREABLE QUANTIFICATIONS OF PATHOGENICITY. Pac Symp Biocomput. 2016; 21:231-42.
  25. Development of exposome correlation globes to map out environment-wide associations. Pac Symp Biocomput. 2015; 231-42.
  26. Medicine's uncomfortable relationship with math: calculating positive predictive value. JAMA Intern Med. 2014 Jun; 174(6):991-3.
  27. Enriched protein screening of human bone marrow mesenchymal stromal cell secretions reveals MFAP5 and PENK as novel IL-10 modulators. Mol Ther. 2014 May; 22(5):999-1007.
  28. Urinary-cell mRNA and acute kidney-transplant rejection. N Engl J Med. 2013 Nov 07; 369(19):1859.
  29. CEAS: cis-regulatory element annotation system. Bioinformatics. 2009 Oct 01; 25(19):2605-6.
  30. Androgen receptor regulates a distinct transcription program in androgen-independent prostate cancer. Cell. 2009 Jul 23; 138(2):245-56.
  31. The geometry of multisite phosphorylation. Biophys J. 2008 Dec 15; 95(12):5533-43.
  32. Model-based analysis of two-color arrays (MA2C). Genome Biol. 2007; 8(8):R178.