Data Standards and Data Reproducibility

The LSP is strongly committed to FAIR (Findable, Accessible, Interoperable and Reusable) research. 

We have studied and published on factors that influence the reproducibility of laboratory-based research findings, and run a seminar series that features speakers at the cutting edge of data and knowledge management. Whenever feasible, the lab’s results and methods are made available as open-source software, and data are made available under public-domain or other open licenses (e.g., Creative Commons).

LSP investigators have studied the irreproducibility of preclinical drug-response and pharmacodynamic data in detail (Niepel, 2019) and developed multiple methods to address the problem (Hafner, 2016; Mills, 2021). They have also commented on the importance of public data release for reproducibility (AlQuraishi, 2016), developed methods to liberate clinical-trial survival data from pictorial representations (Plana, 2021), created specialized data processing pipelines to increase the reliability of complex data analysis (Schapiro, 2021a), and developed a metadata scheme for standardizing the description of tissue images (Schapiro, 2021b). These efforts underline the lab’s strong commitment to FAIR data practices and to data reproducibility in general.


Relevant publications:

AlQuraishi M, Sorger PK. Reproducibility will only come with data liberation. Sci Transl Med. 2016 May 18;8(339):339ed7. PMCID: PMC5084089.

Hafner M, Niepel M, Chung M, Sorger PK. Growth rate inhibition metrics correct for confounders in measuring sensitivity to cancer drugs. Nat Methods. 2016 Jun 1;13(6):521–527. PMCID: PMC4887336.

Mills CE, Subramanian K, Hafner M, Niepel M, Gerosa L, Chung M, Victor C, Gaudio B, Yapp C, Sorger PK. Multiplexed and reproducible high content screening of live and fixed cells using the Dye Drop method. bioRxiv. 2021 Aug 28; Available from: http://biorxiv.org/lookup/doi/10.1101/2021.08.27.457854

Niepel M, Hafner M, Mills CE, Subramanian K, Williams EH, Chung M, Gaudio B, Barrette AM, Stern AD, Hu B, Korkola JE, LINCS Consortium, Gray JW, Birtwistle MR, Heiser LM, Sorger PK. A Multi-center Study on the Reproducibility of Drug-Response Assays in Mammalian Cell Lines. Cell Syst. 2019 Jul 5;9(1):35-48.e5. PMCID: PMC6700527.

Plana D, Fell G, Alexander BM, Palmer AC, Sorger PK. Cancer patient survival can be accurately parameterized, revealing time-dependent therapeutic effects and doubling the precision of small trials. bioRxiv. 2021 May 17; Available from: http://biorxiv.org/lookup/doi/10.1101/2021.05.14.442837

Schapiro D, Sokolov A, Yapp C, Chen Y-A, Muhlich JL, Hess J, Creason AL, Nirmal AJ, Baker GJ, Nariya MK, Lin J-R, Maliga Z, Jacobson CA, Hodgman MW, Ruokonen J, Farhi SL, Abbondanza D, McKinley ET, Persson D, Betts C, Sivagnanam S, Regev A, Goecks J, Coffey RJ, Coussens LM, Santagata S, Sorger PK. MCMICRO: a scalable, modular image-processing pipeline for multiplexed tissue imaging. Nat Methods. 2021 Nov 25; PMID: 34824477.

Schapiro D, Yapp C, Sokolov A, Reynolds SM, Chen Y-A, Sudar D, Xie Y, Muhlich J, Arias-Camison R, Nikolov M, Tyler M, Lin J-R, Burlingame EA, Arena S, Human Tumor Atlas Network, Chang YH, Farhi SL, Thorsson V, Venkatamohan N, Drewes JL, Pe’er D, Gutman DA, Herrmann MD, Gehlenborg N, Bankhead P, Roland JT, Herndon JM, Snyder MP, Angelo M, Nolan G, Swedlow J, Schultz N, Merrick DT, Mazzilli SA, Cerami E, Rodig SJ, Santagata S, Sorger PK. MITI Minimum Information guidelines for highly multiplexed tissue images. arXiv:2108.09499. 2021 Aug 21; Available from: http://arxiv.org/abs/2108.09499