Seq-Scope Starter Tutorial
Input Data
This tutorial uses an example spatial gene expression (SGE) dataset from mouse hippocampus, extracted by spatial masking from a Seq-Scope coronal brain slice.
File Format
Actual input formats are platform-dependent. Please refer to the Vignettes for the detailed input specifications of each platform.
Seq-Scope provides SGE as three files:
barcodes.tsv.gz: spatial barcode metadata
- Column 1: Sorted spatial barcodes
- Column 2: 1-based integer index of spatial barcodes, used in matrix.mtx.gz
- Column 3: 1-based integer index of the barcode in the full barcode list from the STARsolo output
- Column 4: Lane ID (fixed as 1)
- Column 5: Tile ID (fixed as 1)
- Column 6: X-coordinates
- Column 7: Y-coordinates
- Column 8: Five comma-separated numbers giving the count per spatial barcode for "Gene", "GeneFull", "Spliced", "Unspliced", and "Ambiguous"
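As a minimal sketch of how the eight columns above might be read, the Python snippet below parses one tab-separated row into a named record. The class name, field names, and sample row are illustrative assumptions (they are not part of cartloader or NovaScope), and coordinates are parsed as floats only to stay lenient about their exact type.

```python
from dataclasses import dataclass

# The five count categories, in the order listed for Column 8.
COUNT_KEYS = ("Gene", "GeneFull", "Spliced", "Unspliced", "Ambiguous")


@dataclass
class BarcodeRow:  # hypothetical record type, for illustration only
    barcode: str
    mtx_index: int   # Column 2: 1-based index used in matrix.mtx.gz
    full_index: int  # Column 3: 1-based index in the full STARsolo barcode list
    lane: int
    tile: int
    x: float
    y: float
    counts: dict


def parse_barcode_line(line: str) -> BarcodeRow:
    """Split one tab-separated row into its eight documented columns."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 8:
        raise ValueError(f"expected 8 columns, got {len(f)}")
    counts = dict(zip(COUNT_KEYS, (int(v) for v in f[7].split(","))))
    return BarcodeRow(f[0], int(f[1]), int(f[2]), int(f[3]), int(f[4]),
                      float(f[5]), float(f[6]), counts)


# Made-up example row (not taken from the tutorial dataset).
row = parse_barcode_line("AAACGT\t1\t42\t1\t1\t2500000\t1800000\t3,4,2,1,0")
print(row.x, row.y, row.counts["GeneFull"])
```

For a real file, the same function could be applied to each line of `gzip.open(path, "rt")`.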
features.tsv.gz: feature metadata
- Column 1: Feature ID
- Column 2: Feature symbol
- Column 3: 1-based integer index of genes, used in matrix.mtx.gz
- Column 4: Five comma-separated numbers giving the count per gene for "Gene", "GeneFull", "Spliced", "Unspliced", and "Ambiguous"
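The sketch below shows one way the feature index in Column 3 could be used to look up a gene when reading matrix.mtx.gz. The function name and the count values in the sample rows are made up for illustration; only the four-column layout comes from the description above.

```python
def parse_feature_line(line):
    """Return (index, record) for one tab-separated features.tsv.gz row."""
    feature_id, symbol, index, counts = line.rstrip("\n").split("\t")
    record = {
        "id": feature_id,
        "symbol": symbol,
        # Gene, GeneFull, Spliced, Unspliced, Ambiguous (in that order)
        "counts": [int(v) for v in counts.split(",")],
    }
    return int(index), record


# Sample rows with made-up counts (not taken from the tutorial dataset).
lines = [
    "ENSMUSG00000000001\tGnai3\t1\t10,12,8,3,1\n",
    "ENSMUSG00000000028\tCdc45\t2\t4,5,3,1,0\n",
]

# Map the 1-based feature index to its metadata.
index_to_feature = dict(parse_feature_line(line) for line in lines)
print(index_to_feature[2]["symbol"])
```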
matrix.mtx.gz: expression count matrix
- Header: The initial lines form the header, which declares that the matrix follows the Matrix Market (MTX) format and describes its properties. The header may include comment lines (beginning with %) that carry extra metadata.
- Dimensions: The first line after the header gives the matrix dimensions: the number of rows (features), the number of columns (barcodes), and the number of non-zero entries.
- Data Entries: The remaining lines list the non-zero entries in seven columns: row index (feature index), column index (barcode index), and five values (expression levels) corresponding to "Gene", "GeneFull", "Spliced", "Unspliced", and "Ambiguous".
  - "Gene": the count of unique, confidently mapped transcripts ("gene name"-based);
  - "GeneFull": the total transcript count assigned to the gene (includes ambiguities).
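Because each data entry carries five values rather than one, a generic Matrix Market reader may not accept this layout. The following is a minimal sketch of a hand-written reader for the structure described above; the function name and the miniature matrix are illustrative assumptions, not part of any Seq-Scope tool.

```python
import gzip
import io

COUNT_KEYS = ("Gene", "GeneFull", "Spliced", "Unspliced", "Ambiguous")


def read_seven_column_mtx(handle):
    """Return (n_features, n_barcodes, n_entries, entries) from a text handle."""
    dims = None
    entries = []
    for line in handle:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip header and comment lines
        fields = line.split()
        if dims is None:
            # First non-comment line: rows, columns, non-zero entries.
            dims = tuple(int(v) for v in fields[:3])
            continue
        feature_idx, barcode_idx = int(fields[0]), int(fields[1])
        counts = dict(zip(COUNT_KEYS, map(int, fields[2:7])))
        entries.append((feature_idx, barcode_idx, counts))
    if dims is None:
        raise ValueError("no dimension line found")
    return dims[0], dims[1], dims[2], entries


# Made-up miniature matrix, gzip-compressed in memory to mimic matrix.mtx.gz.
text = (
    "%%MatrixMarket matrix coordinate integer general\n"
    "% illustrative comment line\n"
    "2 3 2\n"
    "1 1 5 6 4 1 0\n"
    "2 3 1 1 1 0 0\n"
)
raw = gzip.compress(text.encode())
with gzip.open(io.BytesIO(raw), "rt") as fh:
    n_feat, n_bc, n_nz, entries = read_seven_column_mtx(fh)

print(n_feat, n_bc, n_nz, entries[0][2]["Gene"])
```

Keeping the five values in a dictionary keyed by category makes it explicit which count is being used downstream.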
Data Access
The example data is hosted on Zenodo (DOI: 10.5281/zenodo.15701394).
Download the example data from that record before proceeding.
Set Up the Environment
Define the paths to all required binaries and resources, and the target AWS S3 bucket. Optionally, specify a fixed color map for consistent rendering.
Define data ID and analysis parameters:
How to Define Scaling Factors for Seq-Scope
The latest Seq-Scope protocol, which uses an Illumina NovaSeq 6000, processes sequencing data with the NovaScope pipeline. By default, NovaScope generates SGE at nanometer (nm) resolution, meaning each pixel corresponds to 1 nm.
Use 1000 as the scaling factor to convert coordinates to micrometers, since 1000 nm = 1 µm.
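The arithmetic behind this scaling factor can be sketched in a few lines of Python. The function name and the sample coordinate are illustrative assumptions; only the factor of 1000 comes from the text above.

```python
UNITS_PER_UM = 1000.0  # 1000 coordinate units (nm) per micrometer


def to_micrometers(coord, units_per_um=UNITS_PER_UM):
    """Convert a raw coordinate to micrometers by dividing by the scale."""
    return coord / units_per_um


# A made-up X-coordinate of 2,500,000 nm corresponds to 2,500 µm.
x_um = to_micrometers(2_500_000)
print(x_um)  # 2500.0
```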
SGE Format Conversion
Convert the raw input to the unified SGE format. See more details in SGE Format Conversion.
Parameter | Required | Type | Description |
---|---|---|---|
--platform | required | string | Platform (options: "10x_visium_hd", "seqscope", "10x_xenium", "bgi_stereoseq", "cosmx_smi", "vizgen_merscope", "pixel_seq", "generic") |
--in-mex | required | string | Path to the input MEX directory containing gene × barcode matrix |
--units-per-um | required | float | Scale to convert coordinates to microns (default: 1.0) |
--out-dir | required | string | Output directory for the converted SGE files |
--makefn | | string | File name for the generated Makefile (default: sge_convert.mk) |
--exclude-feature-regex | | regex | Pattern to exclude control features |
--sge-visual | | flag | Enable SGE visualization step (generates diagnostic image) (default: FALSE) |
--spatula | | string | Path to the spatula binary (default: spatula) |
--n-jobs | | int | Number of parallel jobs for processing (default: 1) |
FICTURE analysis
Compute spatial factors using punkst (FICTURE2 mode). See the Reference page for more details.
Parameter | Required | Type | Description |
---|---|---|---|
--main | required [1] | flag | Enable cartloader to run all five steps |
--in-transcript | required | string | Path to input transcript-level SGE file |
--out-dir | required | string | Path to output directory |
--width | required | int or comma-separated list | LDA training hexagon width(s) |
--n-factor | required | int or comma-separated list | Number of LDA factors |
--makefn | | string | File name for the generated Makefile (default: run_ficture2.mk) |
--in-feature | | string | Path to input feature file |
--in-minmax | | string | Path to input coordinate min/max file |
--cmap-file | | string | Path to color map file |
--exclude-feature-regex | | regex | Pattern to exclude features |
--spatula | | string | Path to the spatula binary (default: spatula) |
--ficture2 | | string | Path to the punkst directory (defaults to the punkst repository within the submodules directory of cartloader) |
--n-jobs | | int | Number of parallel jobs (default: 1) |
--threads | | int | Number of threads per job (default: 1) |

[1] cartloader requires the user to specify at least one action. Available actions include: --tile to run the tiling step; --segment to run the segmentation step; --init-lda to run the LDA training step; --decode to run the decoding step; --summary to run the summarization step; --main to run all five of the above actions.
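The action rule described in the footnote can be illustrated with a small Python sketch: at least one action flag must be given, and --main stands for all five steps. This is an illustration of the rule only, not cartloader's actual implementation; the function name is hypothetical, and the step order simply follows the order in which the footnote lists the flags.

```python
# The five step flags, in the order listed in the footnote.
STEP_FLAGS = ("--tile", "--segment", "--init-lda", "--decode", "--summary")


def resolve_actions(flags):
    """Return the ordered list of steps implied by the given action flags."""
    flags = set(flags)
    if "--main" in flags:
        return list(STEP_FLAGS)  # --main runs all five steps
    chosen = [flag for flag in STEP_FLAGS if flag in flags]
    if not chosen:
        raise ValueError("specify at least one action, for example --main")
    return chosen


print(resolve_actions(["--main"]))
print(resolve_actions(["--decode", "--tile"]))
```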
cartloader Compilation
Generate PMTiles and web-compatible tile directories. See the Reference page for more details.
Parameter | Required | Type | Description |
---|---|---|---|
--fic-dir | required | string | Path to the input directory containing FICTURE2 output |
--out-dir | required | string | Path to the output directory for PMTiles and web tiles |
--id | required | string | Dataset ID used for naming outputs and metadata |
--makefn | | string | File name for the generated Makefile (default: run_cartload2.mk) |
--spatula | | string | Path to the spatula binary (default: spatula) |
--pmtiles | | string | Path to the pmtiles binary (default: pmtiles) |
--tippecanoe | | string | Path to the tippecanoe binary (default: tippecanoe) |
--n-jobs | | int | Number of parallel jobs (default: 1) |
--threads | | int | Number of threads per job (default: 1) |
Upload to Data Repository
AWS Uploads
Copy the generated cartloader outputs to your designated AWS S3 catalog path:
Parameter | Required | Type | Description |
---|---|---|---|
--in-dir | required | string | Path to the input directory containing the cartloader compilation output |
--s3-dir | required | string | Path to the target S3 directory for uploading |
--aws | | string | Path to the AWS CLI binary |
--n-jobs | | int | Number of parallel jobs |