NovaScope Workflows

This page describes the main and plus workflows of NovaScope using rule graphs.

For further details on the rules, their execution, and the workflow's structure, please refer to the NovaScope Full Documentation.

What is a 'workflow' and a 'rule'?

In Snakemake, a workflow is a series of steps for processing data, defined by rules. Each rule specifies a task that transforms input files into output files, such as aligning sequences or filtering data. A rule file details the required inputs, resulting outputs, and necessary actions.

What are 'rule dependencies' and a 'rule graph'?

Rule dependencies, determined by input and output files, ensure tasks are executed in the correct sequence within a workflow. Below, we use rule graphs to illustrate these dependencies in the main and plus workflows, showing how tasks are interconnected and the order in which they execute.
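To make the idea of file-based dependencies concrete, here is a minimal Python sketch of how a rule graph could be derived from each rule's inputs and outputs. It is a simplified illustration, not Snakemake's implementation: the rule names (`make_map`, `match`, `align`) and file names are hypothetical and are not NovaScope's actual rules or files.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical rules and file names, for illustration only.
RULES = {
    "make_map": {"inputs": ["reads_1st.fastq"], "outputs": ["spatial_map.tsv"]},
    "match": {"inputs": ["spatial_map.tsv", "reads_2nd.fastq"], "outputs": ["matched.tsv"]},
    "align": {"inputs": ["matched.tsv", "reads_2nd.fastq"], "outputs": ["aligned.bam"]},
}


def infer_dependencies(rules):
    """Map each rule to the set of rules that produce one of its inputs."""
    # Index every output file by the rule that creates it.
    producer = {out: name for name, spec in rules.items() for out in spec["outputs"]}
    # A rule depends on whichever rule produces each of its inputs;
    # inputs with no producer (raw data) add no dependency.
    return {
        name: {producer[f] for f in spec["inputs"] if f in producer}
        for name, spec in rules.items()
    }


deps = infer_dependencies(RULES)

# A topological order runs every prerequisite before its dependents.
order = list(TopologicalSorter(deps).static_order())
print(deps)
print(order)
```

Each entry in `deps` corresponds to the incoming arrows of one node in a rule graph, and `order` is one valid execution sequence for that graph.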

Main Workflow

The main workflow of NovaScope consists of the following steps, each mapped to its rule or rules:

Generate spatial maps from the 1st sequencing data (Rules a01_fastq2sbcd, a02_sbcd2chip)
Map the 2nd sequencing data with the spatial map (Rule a03_smatch)
Align the 2nd sequencing reads to the reference genome (Rule a04_align)
Generate a spatial digital gene expression (SGE) matrix, indexed by transcripts, at submicron resolution (Rule a05_dge2sdge)
(Optional) Visualize the spatial expression of specific genes (Rule b01_sdge_visual)

The rule graph illustrates the relationships between rules:

Figure 1: Main workflow rule graph. Each node represents a specific rule in the Snakemake workflow, and arrows indicate dependencies, pointing from prerequisite to dependent rules. Prerequisite rules must be completed before the dependent rule can commence.

Plus Workflow

Beyond the main workflow, NovaScope offers the following additional capabilities:

Histology alignment (Rule b02_historef)
Spatial map layout examination (Rule b03_sbcd_layout)
SGE matrix filtering by gene type, gene name, UMI count, or UMI density (Rules c03_sdgeAR_featurefilter and c03_sdge_polygonfilter)
SGE matrix reformatting from 10x Genomics format to a TSV format compatible with FICTURE (Rule c02_sdgeAR_reformat)
SGE matrix segmentation from transcript-indexed to hexagon-indexed in 10x Genomics or FICTURE-compatible TSV format (Rules c04_sdgeAR_segment_10x and c04_sdgeAR_segment_ficture)

Figure 2: Plus workflow rule graph. The prerequisite rules for sdgeAR_segment_10x and sdgeAR_segment_ficture vary depending on whether SGE matrix filtering is requested. This example workflow shows a job that requests a filtered hexagon-indexed SGE in FICTURE-compatible format and a raw (unfiltered) hexagon-indexed SGE in 10x Genomics format. For more details, see Execution Flow by Request.