Rule sdgeAR_featurefilter¶
Purpose¶
The sdgeAR_featurefilter rule filters the spatial digital gene expression (SGE) matrix by gene type, gene name, or the number of UMIs per gene.
Input Files¶
- A Tab-delimited Feature File
  Required: the feature file generated by Rule sdgeAR_reformat.
Output Files¶
The rule generates the following output in the specified directory path:
(1) A Tab-delimited Clean Feature File¶
Description: A clean feature file (*.feature.clean.tsv.gz) that records the UMI counts for the features remaining after gene filtering.
File Naming Convention: the file name ends with .feature.clean.tsv.gz.
File Format: The feature files share the same format, with the following columns:
- gene_id: Gene Ensembl ID
- gene: Gene symbol
- gn: the count per gene per barcode for Gene
- gt: the count per gene per barcode for GeneFull
- spl: the count per gene per barcode for Spliced
- unspl: the count per gene per barcode for Unspliced
- ambig: the count per gene per barcode for Ambiguous
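To make the column layout concrete, the following is a minimal sketch of reading such a file in Python. It assumes the file has a header row that uses the column names listed above. The helper names `read_feature_rows` and `read_feature_file` and the example counts are hypothetical and are not part of the rule.

```python
import csv
import gzip
import io

# Count columns as named in the format description above.
COUNT_COLUMNS = ["gn", "gt", "spl", "unspl", "ambig"]

def read_feature_rows(handle):
    """Parse a tab-delimited feature table and convert count columns to int."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for col in COUNT_COLUMNS:
            row[col] = int(row[col])
        rows.append(row)
    return rows

def read_feature_file(path):
    """Open a gzip-compressed feature file (e.g. *.feature.clean.tsv.gz)."""
    with gzip.open(path, "rt", newline="") as handle:
        return read_feature_rows(handle)

# Made-up example content (assumed header row, illustrative counts).
example = io.StringIO(
    "gene_id\tgene\tgn\tgt\tspl\tunspl\tambig\n"
    "ENSMUSG00000000001\tGnai3\t120\t150\t100\t30\t20\n"
)
rows = read_feature_rows(example)
print(rows[0]["gene"], rows[0]["gn"])  # Gnai3 120
```

If the actual file has no header row, the column names would need to be passed to `csv.DictReader` through its `fieldnames` argument instead.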
Output Guidelines¶
No action is required.
Parameters¶
- The keep_gene_type parameter specifies the types of genes to retain during gene filtering.
- The rm_gene_regex parameter defines the pattern for genes to exclude during gene filtering.
- The min_ct_per_feature parameter defines the minimum UMI count per gene. Genes whose UMI count is below this cutoff are removed.
Info
Both the keep_gene_type and rm_gene_regex parameters use regular expressions.
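The three parameters can be read as three successive filters. The sketch below illustrates that reading in Python and is not the rule's actual implementation. It assumes that each gene has a gene type available from some annotation, that rm_gene_regex is matched against the gene name, that patterns are applied with `re.search`, and that the cutoff is compared against a single count value. The function `keep_feature`, the parameter values, and the records are all hypothetical.

```python
import re

def keep_feature(gene, gene_type, count,
                 keep_gene_type, rm_gene_regex, min_ct_per_feature):
    """Return True if a gene passes all three filters (illustrative logic only)."""
    # Retain only genes whose type matches keep_gene_type.
    if not re.search(keep_gene_type, gene_type):
        return False
    # Exclude genes whose name matches rm_gene_regex (assumed to target names).
    if re.search(rm_gene_regex, gene):
        return False
    # Exclude genes whose UMI count is smaller than the cutoff.
    return count >= min_ct_per_feature

# Hypothetical parameter values, chosen only for illustration.
params = {
    "keep_gene_type": r"protein_coding|lncRNA",
    "rm_gene_regex": r"^Gm\d+|^mt-",
    "min_ct_per_feature": 50,
}

# Made-up records: (gene name, gene type, UMI count).
features = [
    ("Gnai3", "protein_coding", 120),
    ("Gm1992", "protein_coding", 300),  # name matches rm_gene_regex
    ("mt-Nd1", "protein_coding", 900),  # name matches rm_gene_regex
    ("Xist", "lncRNA", 10),             # below the UMI cutoff
    ("Rn7sk", "misc_RNA", 500),         # gene type not in keep_gene_type
]

kept = [g for g, t, c in features if keep_feature(g, t, c, **params)]
print(kept)  # ['Gnai3']
```

Because both patterns are regular expressions, alternatives are joined with `|` and anchors such as `^` restrict where a match may start. A plain string with no special characters matches anywhere in the value under `re.search`.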
Dependencies¶
Because sdgeAR_featurefilter requires input from Rule sdgeAR_reformat, it can only run after sdgeAR_reformat and its prerequisite rules have completed successfully. See an overview of the rule dependencies in the Workflow Structure.
Code Snippet¶
The code for this rule is provided in c03_sdgeAR_featurefilter.smk.