{ "cells": [ { "cell_type": "markdown", "id": "05d573cf", "metadata": {}, "source": [ "# How to run `arraylib` on the command line\n", "\n", "To run `arraylib` on a library deconvolution experiment with default parameters, run:\n", "\n", "```\n", "arraylib-run -c -gb -br -t -bu -bd \n", "```\n", "\n", "## Input parameters\n", "\n", "Required parameters:\n", "\n", "* input_dir: path to the directory holding the input fastq files\n", "* exp_design: path to a csv file describing the experimental design (values should be comma-separated). The experimental design file \n", " should have the columns Filename, Poolname and Pooldimension (see the example in tests/test_data/full_exp_design.csv).\n", " * Filename should contain all the unique input fastq filenames.\n", " * Poolname should indicate which pool a given file belongs to. Multiple files per poolname are allowed.\n", " * Pooldimension indicates the pooling dimension a pool belongs to. All pools sharing the same pooling dimension should have the same string in the Pooldimension column.\n", " \n", "\n", "An example of what an exp_design file could look like:\n", "\n", "| Filename | Poolname | Pooldimension |\n", "| :---------------: | :-------------: | :------------: |\n", "| column1.fastq | column1 | columns |\n", "| column2.fastq | column2 | columns |\n", "| row1.fastq | row1 | rows |\n", "| row2.fastq | row2 | rows |\n", "| platerow1.fastq | platerow1 | platerows |\n", "| platerow2.fastq | platerow2 | platerows |\n", "| platecol1.fastq | platecol1 | platecols |\n", "| platecol2.fastq | platecol2 | platecols |\n", "\n", "* -gb path to the GenBank reference file\n", "* -br path to the bowtie2 [[1]](#1) index files, ending with the basename of your index (e.g. if the basename of your index is UTI89 and you store your bowtie2 references in bowtie_ref, the path should be bowtie_ref/UTI89). 
Please visit https://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#the-bowtie2-build-indexer for instructions on how to build bowtie2 indices.\n", "* -t transposon sequence (e.g. AGATGTGTATAAGAGACAG)\n", "* -bu sequence upstream of the barcode (e.g. CGAGGTCTCT)\n", "* -bd sequence downstream of the barcode (e.g. CGTACGCTGC)\n", "\n", "Optional parameters:\n", "\n", "* -mq minimum bowtie2 alignment quality score for a read to be included\n", "* -sq minimum Phred score required at each base for a read to be included\n", "* -tm number of transposon mismatches allowed\n", "* -thr threshold for the local filter (e.g. a threshold of 0.05 filters out all read counts below 0.05 times the maximum read count for a given mutant)\n", "* -g\\_thr threshold for the global filter (all read counts below g_thr are set to 0) \n", "\n", "## Run only on barcodes\n", "If you want to run arraylib-solve only on barcodes, without alignment to the reference genome, use the following command:\n", "\n", "```\n", "arraylib-run_on_barcodes -c -bu -bd \n", "```\n", "\n", "Optional parameters:\n", "\n", "* -thr threshold for the local filter (e.g. a threshold of 0.05 filters out all read counts below 0.05 times the maximum read count for a given mutant)\n", "* -g\\_thr threshold for the global filter (all read counts below g_thr are set to 0) \n", "\n", "## Output\n", "\n", "`arraylib-solve` outputs 4 files: \n", "* count_matrix.csv: Read counts per pool for each mutant, normalized and filtered.\n", "* mutant_location_summary.csv: A summary of the mutants found in the well plate grid, where each row corresponds to a different mutant.\n", "* well_location_summary.csv: A summary of the deconvolved well plate grid, where each row corresponds to a different well.\n", "\n", "\n", "\n", "## References\n", "[1] \n", "Langmead, B. and Salzberg, S.L., 2012. Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), pp.357-359." 
] } ], "metadata": { "kernelspec": { "display_name": "spyder-env", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }