Rule sdgeAR_featurefilter:
Purpose
The sdgeAR_featurefilter rule filters the spatial digital gene expression (SGE) matrix by gene type, gene name, or the number of UMIs per gene.
Input Files
- A Tab-delimited Feature File: the feature file produced by Rule sdgeAR_reformat (required).
Output Files
The rule generates the following output in the specified directory path:
(1) A Tab-delimited Clean Feature File
Description: A clean feature file (*.feature.clean.tsv.gz) that records the UMI counts for the features remaining after gene filtering.
File Naming Convention: *.feature.clean.tsv.gz
File Format: The clean feature file has the same format as the input feature file, with the following columns:
- gene_id: Gene Ensembl ID
- gene: Gene symbol
- gn: the count per gene per barcode for Gene
- gt: the count per gene per barcode for GeneFull
- spl: the count per gene per barcode for Spliced
- unspl: the count per gene per barcode for Unspliced
- ambig: the count per gene per barcode for Ambiguous
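As an illustration of the column layout above, the following minimal sketch parses a feature table into Python dictionaries. It assumes the file is a gzip-compressed, tab-delimited table whose header row uses exactly the column names listed above. The helper names `read_feature_rows` and `read_feature_file` are hypothetical and not part of the pipeline, and the count values in the example row are invented.

```python
import csv
import gzip
import io

# Column names as listed in the file format description.
FEATURE_COLUMNS = ["gene_id", "gene", "gn", "gt", "spl", "unspl", "ambig"]
COUNT_COLUMNS = FEATURE_COLUMNS[2:]


def read_feature_rows(handle):
    """Parse a tab-delimited feature table from an open text handle.

    Assumes a header row with the column names in FEATURE_COLUMNS.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        # Convert the five count columns from text to integers.
        for col in COUNT_COLUMNS:
            row[col] = int(row[col])
        rows.append(row)
    return rows


def read_feature_file(path):
    """Open a gzip-compressed feature file (e.g. *.feature.clean.tsv.gz)."""
    with gzip.open(path, "rt", newline="") as handle:
        return read_feature_rows(handle)


# In-memory example with one invented row, so no file is needed.
example = (
    "gene_id\tgene\tgn\tgt\tspl\tunspl\tambig\n"
    "ENSMUSG00000000001\tGnai3\t120\t150\t100\t15\t5\n"
)
rows = read_feature_rows(io.StringIO(example))
print(rows[0]["gene"], rows[0]["gn"])
```

To read a real output file, `read_feature_file` would be called with the path to the `*.feature.clean.tsv.gz` file.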
Output Guidelines
No action is required.
Parameters
- The keep_gene_type Parameter: Specifies the types of genes to retain during gene filtering.
- The rm_gene_regex Parameter: Defines the types of genes to exclude during gene filtering.
- The min_ct_per_feature Parameter: Defines the minimum UMI count per gene. Genes with fewer UMIs than this cutoff are removed.
Info
Both the keep_gene_type and rm_gene_regex parameters use regular expressions.
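How the three parameters might combine can be sketched as follows. This is a minimal illustration, not the rule's actual implementation. It assumes that each feature is a dictionary with hypothetical `gene`, `gene_type`, and `count` keys, that `keep_gene_type` is matched against the gene type and `rm_gene_regex` against the gene name, and that both patterns are applied with `re.search`. The function name `filter_features`, the parameter values, and the counts are invented for the example.

```python
import re


def filter_features(features, keep_gene_type, rm_gene_regex, min_ct_per_feature):
    """Return the features that pass all three filters (illustrative only)."""
    keep_pat = re.compile(keep_gene_type)
    rm_pat = re.compile(rm_gene_regex)
    kept = []
    for feat in features:
        # Retain only gene types matching keep_gene_type.
        if not keep_pat.search(feat["gene_type"]):
            continue
        # Exclude genes whose name matches rm_gene_regex (an assumption).
        if rm_pat.search(feat["gene"]):
            continue
        # Remove genes with fewer UMIs than the cutoff.
        if feat["count"] < min_ct_per_feature:
            continue
        kept.append(feat)
    return kept


# Invented example records; "GeneX" is a placeholder gene name.
features = [
    {"gene": "Actb", "gene_type": "protein_coding", "count": 500},
    {"gene": "mt-Co1", "gene_type": "protein_coding", "count": 900},
    {"gene": "GeneX", "gene_type": "pseudogene", "count": 40},
    {"gene": "Xist", "gene_type": "lncRNA", "count": 3},
]

kept = filter_features(
    features,
    keep_gene_type="protein_coding|lncRNA",  # hypothetical value
    rm_gene_regex=r"^mt-",                   # hypothetical value
    min_ct_per_feature=50,                   # hypothetical value
)
print([feat["gene"] for feat in kept])
```

Because `re.search` matches anywhere in the string, a pattern such as `protein_coding` would also match longer type labels that contain it. Anchoring the pattern with `^` and `$` avoids this when an exact match is intended.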
Dependencies
Because sdgeAR_featurefilter requires input from Rule sdgeAR_reformat, it can only execute after sdgeAR_reformat and its prerequisite rules have completed successfully. See an overview of the rule dependencies in the Workflow Structure.
Code Snippet
The code for this rule is provided in c03_sdgeAR_featurefilter.smk.