NovaScope Workflows
This page describes the main and plus workflows of NovaScope using rule graphs.
For further details on the rules, their execution, and the workflow's structure, please refer to the NovaScope Full Documentation.
What is a 'workflow' and a 'rule'?
In Snakemake, a workflow is a series of steps for processing data, defined by rules. Each rule specifies a task that transforms input files into output files, such as aligning sequences or filtering data. A rule file details the required inputs, resulting outputs, and necessary actions.
What are 'rule dependencies' and a 'rule graph'?
Rule dependencies, determined by input and output files, ensure tasks are executed in the correct sequence within a workflow. Below, we use rule graphs to illustrate these dependencies in the main and plus workflows, showing how tasks are interconnected and the order in which they execute.
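As a rough illustration of how dependencies can be derived from input and output files, the sketch below builds a rule graph from a small set of rules and computes one valid execution order. The rule names and file names are hypothetical and are not NovaScope's actual rules or files; Snakemake performs this resolution internally, and this is not its implementation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical rules and file names, for illustration only.
RULES = {
    "make_map": {"inputs": ["seq1.fastq"], "outputs": ["map.tsv"]},
    "match": {"inputs": ["map.tsv", "seq2.fastq"], "outputs": ["matched.tsv"]},
    "align": {"inputs": ["seq2.fastq", "matched.tsv"], "outputs": ["aligned.bam"]},
    "build_matrix": {"inputs": ["aligned.bam", "map.tsv"], "outputs": ["matrix.tsv"]},
}


def rule_dependencies(rules):
    """Return {rule: set of prerequisite rules}, inferred from shared files."""
    # Map each output file to the rule that produces it.
    producer = {
        out: name for name, rule in rules.items() for out in rule["outputs"]
    }
    # A rule depends on every rule that produces one of its inputs.
    # Inputs with no producer (e.g. raw FASTQ files) create no dependency.
    return {
        name: {producer[f] for f in rule["inputs"] if f in producer}
        for name, rule in rules.items()
    }


def execution_order(rules):
    """Return one order in which prerequisites always precede dependents."""
    return list(TopologicalSorter(rule_dependencies(rules)).static_order())


if __name__ == "__main__":
    for rule, prereqs in rule_dependencies(RULES).items():
        print(f"{rule} <- {sorted(prereqs)}")
    print(" -> ".join(execution_order(RULES)))
```

Each key in the dictionary returned by `rule_dependencies` corresponds to a node in a rule graph, and each prerequisite corresponds to an arrow pointing into that node. For a real workflow, Snakemake can print the rule graph itself in DOT format with its `--rulegraph` option.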
Main Workflow
For the main function of NovaScope, the mapping of each step to its specific rule is as follows:
- Generate spatial maps from the 1st sequencing data (Rules a01_fastq2sbcd, a02_sbcd2chip)
- Map the 2nd sequencing data with the spatial map (Rule a03_smatch)
- Align the 2nd sequencing reads to the reference genome (Rule a04_align)
- Generate a spatial digital gene expression (SGE) matrix, indexed by transcripts, at submicron resolution (Rule a05_dge2sdge)
- (Optional) Visualize the spatial expression of specific genes (Rule b01_sdge_visual)
The rule graph illustrates the relationships between rules:
Figure 1: Main workflow rule graph. Each node represents a specific rule in the Snakemake workflow, and arrows indicate dependencies, pointing from prerequisite to dependent rules. Prerequisite rules must be completed before the dependent rule can commence.
Plus Workflow
Beyond the main functions, NovaScope offers the following additional capabilities:
- Histology alignment (Rule b02_historef)
- Spatial map layout examination (Rule b03_sbcd_layout)
- SGE matrix filtering by gene type, gene name, UMI count, or UMI density (Rules c03_sdgeAR_featurefilter and c03_sdge_polygonfilter)
- SGE matrix reformatting from 10x Genomics format to a TSV format compatible with FICTURE (Rule c02_sdgeAR_reformat)
- SGE matrix segmentation from transcript-indexed to hexagon-indexed in 10x Genomics or FICTURE-compatible TSV format (Rules c04_sdgeAR_segment_10x and c04_sdgeAR_segment_ficture)
Figure 2: Plus workflow rule graph. The prerequisite rules for sdgeAR_segment_10x and sdgeAR_segment_ficture vary based on the need for SGE matrix filtering. This example workflow shows a job requesting a filtered hexagon-indexed SGE in FICTURE-compatible format, but a raw hexagon-indexed SGE in 10x Genomics format. See more details in the Execution Flow by Request.