spatula draw-sge
Summary
spatula draw-sge is a tool to visualize spatial gene expression (SGE) based on a specified combination of colors and gene sets.
IMPORTANT: ImageMagick must be installed to use this tool.
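Before running the tool, it can help to confirm that ImageMagick is on the PATH. The sketch below only looks for the two common ImageMagick entry points (magick in version 7, convert in version 6). Which of them spatula invokes is not stated here, so treat this check as an assumption rather than a guarantee.

```shell
# Look for either ImageMagick entry point; this does not run spatula itself.
if command -v magick >/dev/null 2>&1 || command -v convert >/dev/null 2>&1; then
    imagemagick_status="found"
else
    imagemagick_status="missing"
fi
echo "ImageMagick: ${imagemagick_status}"
```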
A typical use case is as follows:
- Input: Takes an SGE matrix from dge2sge and the list of genes (with designated colors) to visualize.
- Output: Produces a 2D image that plots the spatial distribution of expression of the selected genes.
A typical example is as follows:
spatula draw-sge --manifest /path/to/combine/sbcds/output/dir/manifest.tsv \
--sge /path/to/dge2sge/output/dir \
--color-gene 320000:_all_:1 \
--color-gene 003200:Glul,Cyp2e1:1 \
--color-list 000064:/path/to/custom/gene_list.tsv \
--out /path/to/output/image.png \
--auto-adjust
See below for a more detailed usage description.
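The color codes in the example are RRGGBB hex values. As a small illustration of the arithmetic (decode_rgb is a hypothetical helper, not part of spatula), each pair of hex digits can be converted to the decimal amount it represents for that channel:

```shell
# decode_rgb is a hypothetical helper, not a spatula command.
# It splits an RRGGBB code into its red, green, and blue fields
# and prints each field as a decimal number.
decode_rgb() {
    r=$(printf '%s' "$1" | cut -c1-2)
    g=$(printf '%s' "$1" | cut -c3-4)
    b=$(printf '%s' "$1" | cut -c5-6)
    printf '%d %d %d\n' "0x$r" "0x$g" "0x$b"
}

decode_rgb 320000   # prints: 50 0 0
decode_rgb 003200   # prints: 0 50 0
decode_rgb 000064   # prints: 0 0 100
```

So 003200 corresponds to 50 on the green channel and 000064 to 100 on the blue channel, which matches the per-observation intensities quoted in the option descriptions.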
Required options
- --sge: The directory containing the SGE matrix, typically from the dge2sge command.
- --out: The output filename of the image. Currently, .png is supported.
Options to Specify Genes to Visualize
- --color-gene: (Allows multiple uses) A string formatted as [RGB_Hex_Code]:[gene1],[gene2],...:[idx].
  - The [RGB_Hex_Code] is the RGB hex color code (RRGGBB format) for each observation of the specified genes.
  - The [gene1],[gene2],... is the list of genes to color with the specified color. Either a gene ID (e.g. ENSMUSG00000026473) or a gene symbol (e.g. Glul) may be used. _all_ can be used to color all genes with the specified color.
  - The [idx] is an optional field specifying the index of the gene expression in the SGE matrix (e.g. 1: Gene, 2: GeneFull, 3: Spliced, 4: Unspliced, 5: Velocyto). The default is 1.
  - For example, --color-gene 003200:Glul,Cyp2e1:1 will color Glul and Cyp2e1 genes with green (increasing intensity by 50 for each observation).
- --color-regex: (Allows multiple uses) A string formatted as [RGB_Hex_Code]:[regex_pattern]:[idx].
  - The [RGB_Hex_Code] is the RGB hex color code (RRGGBB format) for each observation of the specified genes.
  - The [regex_pattern] is the regular expression pattern that selects the genes. Here are some examples:
    - You can specify multiple genes like ^(Glul|Cyp2e1)$
    - You can specify genes with a common prefix like ^mt-.*$ (for mitochondrial genes)
  - The [idx] is an optional field specifying the index of the gene expression in the SGE matrix (e.g. 1: Gene, 2: GeneFull, 3: Spliced, 4: Unspliced, 5: Velocyto). The default is 1.
  - For example, --color-regex '003200:^(Glul|Cyp2e1)$:1' will color Glul and Cyp2e1 genes with green (increasing intensity by 50 for each observation). The quotes keep the shell from interpreting the parentheses, pipe, and dollar sign.
- --color-list: (Allows multiple uses) A string formatted as [RGB_Hex_Code]:[path_to_gene_list]:[default_idx].
  - The [RGB_Hex_Code] is the RGB hex color code (RRGGBB format) for each observation of the specified genes.
  - The [path_to_gene_list] is the path to the gene list file. The file should contain a list of gene names or gene IDs (without headers).
  - The [default_idx] is an optional field specifying the index of the gene expression in the SGE matrix (e.g. 1: Gene, 2: GeneFull, 3: Spliced, 4: Unspliced, 5: Velocyto). It takes effect only when the idx column is not present in the [path_to_gene_list] file.
  - If [default_idx] is absent, it defaults to 1 (Gene).
  - For example, --color-list 000064:/path/to/custom/gene_list.tsv will color genes in the /path/to/custom/gene_list.tsv file with blue, with default_idx=1, increasing intensity by 100 for each observation.
  - Here is an example content of the gene list file mt-genes.tsv, which contains gene symbols for mitochondrial genes in mouse:
      mt-Nd1
      mt-Nd2
      mt-Co1
      mt-Co2
      mt-Atp8
      mt-Atp6
      mt-Co3
      mt-Nd3
      mt-Nd4l
      mt-Nd4
      mt-Nd5
      mt-Nd6
      mt-Cytb
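To preview which genes a pattern would select, or to build a gene list file, standard shell tools are enough. The sketch below uses a hypothetical handful of gene symbols (one per line) and grep -E. Its POSIX extended regex syntax may differ from the regex engine inside spatula, so treat the result as a preview rather than a guarantee.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# Hypothetical gene symbols, one per line, standing in for real feature names.
printf '%s\n' Glul Cyp2e1 Cyp2e10 mt-Nd1 mt-Co1 Alb > "$workdir/genes.txt"

# Anchored alternation: matches Glul and Cyp2e1 exactly, but not Cyp2e10.
exact=$(grep -E '^(Glul|Cyp2e1)$' "$workdir/genes.txt")

# Prefix match: write the mitochondrial genes to a list file without a header.
grep -E '^mt-.*$' "$workdir/genes.txt" > "$workdir/mt-genes.tsv"
n_mt=$(grep -c . "$workdir/mt-genes.tsv")

echo "exact matches:"
echo "$exact"
echo "mitochondrial genes in list: ${n_mt}"
```

A file produced this way could then be passed as --color-list 000064:/path/to/mt-genes.tsv. The single quotes around each pattern keep the shell from interpreting the parentheses, pipe, and dollar sign.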
Additional Options
- --manifest: The manifest.tsv file from the combine-sbcds command that summarizes the spatial coordinates of a Seq-Scope Chip. It must contain xmin, xmax, ymin, and ymax columns. This option is REQUIRED unless --auto-adjust is used.
- --auto-adjust: Automatically adjust the intensity of the color based on the maximum count. The default is OFF.
- --adjust-quantile: The pixel quantile, among non-zero pixels, to use for auto-adjustment. The default is 0.99.
- --coord-per-pixel: The number of coordinate units collapsed into one pixel, that is, the factor by which the input coordinates are divided. The default is 1000.0.
- --bcd: The barcode file name in the SGE directory. The default is barcodes.tsv.gz.
- --ftr: The feature file name in the SGE directory. The default is features.tsv.gz.
- --mtx: The matrix file name in the SGE directory. The default is matrix.mtx.gz.
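As a rough illustration of --coord-per-pixel, the sketch below estimates image dimensions by dividing the bounding-box extent by the factor. The bounding-box values are hypothetical, and the exact rounding or padding that spatula applies is an assumption, so the numbers are approximate.

```shell
# Hypothetical bounding box, in the same units as the xmin/xmax/ymin/ymax columns.
xmin=0; xmax=7000000
ymin=0; ymax=5000000
coord_per_pixel=1000   # the documented default

# Approximate size: extent divided by the factor, truncated to an integer.
width=$(awk -v lo="$xmin" -v hi="$xmax" -v c="$coord_per_pixel" \
    'BEGIN { printf "%d", (hi - lo) / c }')
height=$(awk -v lo="$ymin" -v hi="$ymax" -v c="$coord_per_pixel" \
    'BEGIN { printf "%d", (hi - lo) / c }')

echo "approximate image size: ${width} x ${height} pixels"
```

Under this estimate, doubling --coord-per-pixel to 2000 would halve each dimension, which is a quick way to keep very large chips within a manageable image size.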
Expected Output
The output [out] will be created as a PNG file containing the image of the input points.
Full Usage
The full usage of spatula draw-sge can be viewed with the --help option:
$ ./spatula draw-sge --help
[./spatula draw-sge] -- Draw the image of spatial gene expression (SGE) data
Copyright (c) 2022-2024 by Hyun Min Kang
Licensed under the Apache License v2.0 http://www.apache.org/licenses/
Detailed instructions of parameters are available. Ones with "[]" are in effect:
Available Options:
== Input files ==
--minmax [STR: ] : Bounding box information. Expects xmin/xmax/ymin/ymax (tall or wide format)
--sge [STR: ] : SGE directory
--bcd [STR: barcodes.tsv.gz] : Barcode file name
--ftr [STR: features.tsv.gz] : Feature file name
--mtx [STR: matrix.mtx.gz] : Matrix file name
== Genes to visualize ==
--color-gene [V_STR: ] : [color_code]:[gene1],[gene2],... as a visualization unit. Adding :[idx] at the end is optional
--color-regex [V_STR: ] : [color_code]:[regex](:[idx]) as a visualization unit. [regex] is a regulat expression. Adding :[idx] at the end is optional
--color-list [V_STR: ] : [color_code]:[list_file](:[idx]) as a visualization unit
== Output options ==
--coord-per-pixel [FLT: 1000.00] : Number of coordinate units per pixel
--auto-adjust [FLG: OFF] : Automatically adjust the intensity of the color based on the maximum count
--adjust-quantile [FLT: 0.99] : Quantile of pixel to use for auto-adjustment among non-zero pixels
== Output Options ==
--out [STR: ] : Output file name
NOTES:
When --help was included in the argument. The program prints the help message but do not actually run