# 1. Before You Screen

## Resources

We highly recommend reviewing the following materials before screening:

- Feldman, D., Funk, L., Le, A. et al. Pooled genetic perturbation screens with image‑based phenotypes. *Nat Protoc* 17, 476–512 (2022). [doi.org/10.1038/s41596-021-00653-8](https://doi.org/10.1038/s41596-021-00653-8)
- Walton, R. T., Singh, A., Blainey, P. C. et al. Pooled genetic screens with image‑based profiling. *Mol Syst Biol* 18, e10768 (2022). [doi.org/10.15252/msb.202110768](https://doi.org/10.15252/msb.202110768)
- Feldman, D., Singh, A., Schmid‑Burgk, J. L. et al. Optical pooled screens in human cells. *Cell* 179, 787–799.e17 (2019). [doi.org/10.1016/j.cell.2019.09.016](https://doi.org/10.1016/j.cell.2019.09.016)
- Blainey Lab [OPS Protocols](https://blainey.mit.edu/protocols/). **Note:** Ignore the GitHub repository, as it is outdated.

## OPS Data Collection

### Data Structure

During a brieflow run, microscope files are loaded from DataFrames (the SBS and phenotype sample DataFrames) that specify each file's path and metadata (plate, tile, well, etc.). The file paths and metadata are parsed from a data directory using a regular expression. See the [0.configure_preprocess_params.ipynb](https://github.com/cheeseman-lab/brieflow-analysis/blob/main/analysis/0.configure_preprocess_params.ipynb) notebook for more information on how this works.

With this setup, data can be saved anywhere and in any format, but we recommend:

- saving in a safe (protected) location outside of the brieflow-analysis directory, so that inputs can be distinguished from intermediate outputs
- using the following hierarchical structure and naming conventions for simplicity

#### Hierarchical Structure

```
screen_name/
├── plate_1/
│   ├── input_sbs/
│   │   ├── c1/
│   │   ├── c2/
│   │   └── ... (cycles of sequencing data)
│   └── input_ph/
│       ├── round_1/
│       ├── round_2/
│       └── ... (rounds of phenotyping data)
├── plate_2/
└── ... (plates of pooled screening data)
```

#### SBS (Sequencing) Files

- Files should be located within cycle directories (`c1`, `c2`, etc.).
- The naming format should include the plate number, well ID, and tile number.
- Channel information must be included if each ND2 image holds a single channel. Otherwise, it is optional.
- When generating SBS data, one can include additional cycles or channels that assist with segmenting cells. Brieflow's preprocessing and SBS modules have methods for using these additional cycles and channels.

**Example SBS filename:** `screen_name/plate_1/input_sbs/c2/P001_Wells-A3_Points-214.nd2`

#### Phenotype Files

- Files should be located within round directories (`round_1`, `round_2`, etc.).
- The naming format should include the plate number, well ID, and tile number.
- Channel information must be included if each ND2 image holds a single channel. Otherwise, it is optional.

**Example Phenotype filename:** `screen_name/plate_1/input_ph/round_1/P001_Wells-B1_Points-585.nd2`

### Common Pitfalls

Here are some common pitfalls with OPS data collection and how to avoid them.

#### Alignment Verification

**Critical Quality Control Step:** Check alignment between:

- Rounds of phenotyping data (`round_1` vs. `round_2` vs. …)
- Cycles of sequencing data (`c1` vs. `c2` vs. …)

Verify alignment before proceeding with downstream analysis. Large spatial shifts between rounds or cycles make downstream analysis impossible.
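
Before comparing cycles for alignment, it can help to confirm that the files parse as expected and that every tile is present in every cycle. The sketch below shows one way to do this for filenames shaped like `P001_Wells-A3_Points-214.nd2` inside `c1`, `c2`, … directories. The pattern and the helper names (`SBS_PATTERN`, `parse_sbs_path`, `find_incomplete_tiles`) are hypothetical and written only for illustration. They are not brieflow's own regex or API, which are configured in the notebook linked under Data Structure.

```python
import re
from collections import defaultdict

# Hypothetical pattern for SBS paths such as
# "screen_name/plate_1/input_sbs/c2/P001_Wells-A3_Points-214.nd2".
# This is an illustration, not the regex that brieflow uses.
SBS_PATTERN = re.compile(
    r"c(?P<cycle>\d+)/"
    r"P(?P<plate>\d+)_Wells-(?P<well>[A-Z]\d+)_Points-(?P<tile>\d+)\.nd2$"
)


def parse_sbs_path(path):
    """Return plate, well, tile, and cycle parsed from a path, or None."""
    match = SBS_PATTERN.search(path)
    if match is None:
        return None
    return {
        "plate": int(match["plate"]),
        "well": match["well"],
        "tile": int(match["tile"]),
        "cycle": int(match["cycle"]),
    }


def find_incomplete_tiles(paths):
    """Map each (plate, well, tile) to the sorted cycles it is missing from."""
    cycles_seen = defaultdict(set)
    all_cycles = set()
    for path in paths:
        meta = parse_sbs_path(path)
        if meta is None:
            continue  # skip files that do not follow the naming convention
        key = (meta["plate"], meta["well"], meta["tile"])
        cycles_seen[key].add(meta["cycle"])
        all_cycles.add(meta["cycle"])
    # Keep only tiles that were not imaged in every observed cycle.
    return {
        key: sorted(all_cycles - seen)
        for key, seen in cycles_seen.items()
        if seen != all_cycles
    }
```

Passing a list of paths to `find_incomplete_tiles` returns an empty dictionary when every tile appears in every cycle. Any entry it does return names a tile that cannot be compared across all cycles. A phenotype variant would swap the `c<N>` directory for `round_<N>` in the pattern.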