Data Standards and Data Reproducibility

Evidence that some high-impact biomedical results are not reproducible has stimulated great interest in practices that generate findable, accessible, interoperable, and reusable (FAIR) data. Multiple reviews and perspectives have identified examples of irreproducibility, but practical ways to address the problem remain relatively rare. LSP investigators have studied the irreproducibility of preclinical drug response and pharmacodynamic data in detail (Niepel, 2019) and developed multiple methods to address the problem (Hafner, 2016; Mills, 2021), commented on the importance of public data release for reproducibility (AlQuraishi, 2016), developed methods to liberate clinical trial survival data from pictorial representations (Plana, 2021), created specialized data processing pipelines to increase the reliability of complex data analysis (Schapiro, 2021a), and developed a metadata scheme for standardizing the description of tissue images (Schapiro, 2021b). These efforts underline the lab’s strong commitment to FAIR data practices and data reproducibility in general.

AlQuraishi M, Sorger PK. Reproducibility will only come with data liberation. Sci Transl Med. 2016 May 18;8(339):339ed7. PMCID: PMC5084089.

Hafner M, Niepel M, Chung M, Sorger PK. Growth rate inhibition metrics correct for confounders in measuring sensitivity to cancer drugs. Nat Methods. 2016 Jun 1;13(6):521–527. PMCID: PMC4887336.

Mills CE, Subramanian K, Hafner M, Niepel M, Gerosa L, Chung M, Victor C, Gaudio B, Yapp C, Sorger PK. Multiplexed and reproducible high content screening of live and fixed cells using the Dye Drop method. 2021 Aug 28; Available from:

Niepel M, Hafner M, Mills CE, Subramanian K, Williams EH, Chung M, Gaudio B, Barrette AM, Stern AD, Hu B, Korkola JE, LINCS Consortium, Gray JW, Birtwistle MR, Heiser LM, Sorger PK. A Multi-center Study on the Reproducibility of Drug-Response Assays in Mammalian Cell Lines. Cell Syst. 2019 Jul 5;9(1):35-48.e5. PMCID: PMC6700527.

Plana D, Fell G, Alexander BM, Palmer AC, Sorger PK. Cancer patient survival can be accurately parameterized, revealing time-dependent therapeutic effects and doubling the precision of small trials. 2021 May 17; Available from:

Schapiro D, Sokolov A, Yapp C, Chen Y-A, Muhlich JL, Hess J, Creason AL, Nirmal AJ, Baker GJ, Nariya MK, Lin J-R, Maliga Z, Jacobson CA, Hodgman MW, Ruokonen J, Farhi SL, Abbondanza D, McKinley ET, Persson D, Betts C, Sivagnanam S, Regev A, Goecks J, Coffey RJ, Coussens LM, Santagata S, Sorger PK. MCMICRO: a scalable, modular image-processing pipeline for multiplexed tissue imaging. Nat Methods. 2021 Nov 25; PMID: 34824477.

Schapiro D, Yapp C, Sokolov A, Reynolds SM, Chen Y-A, Sudar D, Xie Y, Muhlich J, Arias-Camison R, Nikolov M, Tyler M, Lin J-R, Burlingame EA, Arena S, Human Tumor Atlas Network, Chang YH, Farhi SL, Thorsson V, Venkatamohan N, Drewes JL, Pe’er D, Gutman DA, Herrmann MD, Gehlenborg N, Bankhead P, Roland JT, Herndon JM, Snyder MP, Angelo M, Nolan G, Swedlow J, Schultz N, Merrick DT, Mazzilli SA, Cerami E, Rodig SJ, Santagata S, Sorger PK. MITI Minimum Information guidelines for highly multiplexed tissue images. arXiv:2108.09499. 2021 Aug 21; Available from: