01_Baseline

Michael Bornholdt edited this page Sep 8, 2021 · 5 revisions

The aim of this first part is to decide on useful evaluation metrics and thereby establish a baseline for how well the classical pipeline performs. Classical feature extraction is done with CellProfiler, followed by the postprocessing steps found in the LINCS repository.

Content

  • LINCS data
  • Images & location files
  • Pycytominer pipeline
  • Evaluation metrics

LINCS dataset

Details on the LINCS dataset can be found in the repository; the key facts are:

  • A549 cell line
  • 1,571 compounds/perturbations
  • 6 doses per compound
  • 5 replicates per compound and dose

Size:

  • 5 batches
  • 136 plates
  • 384 wells per plate
  • 9 sites per well

With ~2,000 cells per well and ~52,000 wells in total, the LINCS dataset holds roughly 100 million cells.
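The size estimate can be checked by multiplying out the figures listed above. The following is a rough sketch only: it assumes the standard 384-well plate format and treats ~2,000 cells per well as a flat average, and the function and constant names are made up for this example.

```python
# Back-of-the-envelope size estimate from the figures listed above.
PLATES = 136
WELLS_PER_PLATE = 384       # assumption: standard 384-well plate format
SITES_PER_WELL = 9
CELLS_PER_WELL = 2_000      # approximate average, taken from the text

COMPOUNDS = 1_571
DOSES_PER_COMPOUND = 6
REPLICATES = 5


def dataset_size():
    """Return rough counts of wells, imaged sites, cells and treatment wells."""
    total_wells = PLATES * WELLS_PER_PLATE
    return {
        "wells": total_wells,
        "sites": total_wells * SITES_PER_WELL,
        "cells": total_wells * CELLS_PER_WELL,
        # Wells needed for every compound at every dose and replicate.
        "treatment_wells": COMPOUNDS * DOSES_PER_COMPOUND * REPLICATES,
    }


if __name__ == "__main__":
    for name, value in dataset_size().items():
        print(f"{name}: {value:,}")
```

The number of treatment wells comes out somewhat below the total well count; the remaining wells are presumably controls or unused, which this page does not specify.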

The different stages of processing, from images of each site to the aggregated profile (feature space) of a compound, are described in the profiles section of the LINCS repository.
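To make the idea of an aggregated profile concrete, here is a minimal sketch of collapsing single-cell measurements into one profile per well. It is not the code used in the LINCS repository: the median as aggregation operation, the function name `aggregate_by_well`, and the example feature names and values are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical single-cell measurements; names and values are illustrative only.
cells = [
    {"well": "A01", "Cells_AreaShape_Area": 300.0, "Nuclei_Intensity_Mean": 0.25},
    {"well": "A01", "Cells_AreaShape_Area": 320.0, "Nuclei_Intensity_Mean": 0.5},
    {"well": "A01", "Cells_AreaShape_Area": 340.0, "Nuclei_Intensity_Mean": 0.75},
    {"well": "B02", "Cells_AreaShape_Area": 100.0, "Nuclei_Intensity_Mean": 1.0},
    {"well": "B02", "Cells_AreaShape_Area": 200.0, "Nuclei_Intensity_Mean": 3.0},
]


def aggregate_by_well(rows, key="well"):
    """Collapse per-cell rows into one median feature profile per well."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for feature, value in row.items():
            if feature != key:
                grouped[row[key]][feature].append(value)
    return {
        well: {feature: median(values) for feature, values in features.items()}
        for well, features in grouped.items()
    }


if __name__ == "__main__":
    for well, profile in aggregate_by_well(cells).items():
        print(well, profile)
```

The same grouping step could be repeated one level up, for example over replicate wells of the same compound and dose, to arrive at a per-compound profile.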

The following chapter goes into more detail on the pipeline and the decisions made in postprocessing the profiles.
