
Visium HD End-to-End Pipeline

Overview

This tutorial walks through end‑to‑end processing of 10x Visium HD data with CartLoader: converting inputs, running FICTURE, importing cell results and histology, packaging assets, and uploading for sharing.

Input Data


Data Structure and Format

See Space Ranger output details in the official documentation: Space Ranger Outputs

├── binned_outputs
│   ├── square_002um
│   │   ├── filtered_feature_bc_matrix
│   │   │   ├── barcodes.tsv.gz
│   │   │   ├── features.tsv.gz
│   │   │   └── matrix.mtx.gz
│   │   ├── ...
│   │   └── spatial
│   │       ├── aligned_fiducials.jpg
│   │       ├── aligned_tissue_image.jpg
│   │       ├── cytassist_image.tiff
│   │       ├── detected_tissue_image.jpg
│   │       ├── scalefactors_json.json
│   │       ├── tissue_hires_image.png
│   │       ├── tissue_lowres_image.png
│   │       └── tissue_positions.parquet
│   ├── square_008um
│   │   ├── analysis
│   │   │   ├── clustering
│   │   │   │   ├── gene_expression_graphclust
│   │   │   │   │   └── clusters.csv
│   │   │   │   └── ...
│   │   │   ├── diffexp
│   │   │   │   ├── gene_expression_graphclust
│   │   │   │   │   └── differential_expression.csv
│   │   │   │   └── ...
│   │   │   ├── pca
│   │   │   │   └── gene_expression_10_components
│   │   │   │       ├── projection.csv
│   │   │   │       ├── variance.csv
│   │   │   │       └── ...
│   │   │   └── umap
│   │   │       └── gene_expression_2_components
│   │   │           └── projection.csv
│   │   ├── filtered_feature_bc_matrix
│   │   │   └── ...     # Mirrors binned_outputs/square_002um/filtered_feature_bc_matrix
│   │   ├── ...
│   │   └── spatial
│   │       └── ...     # Mirrors binned_outputs/square_002um/spatial
│   └── square_016um
│       └── ...         # Mirrors square_008um (analysis, spatial, filtered_feature_bc_matrix)
├── segmented_outputs
│   ├── cell_segmentations.geojson
│   └── ...             # Mirrors binned_outputs/square_008um (analysis, spatial, filtered_feature_bc_matrix)
└── ...

Visium HD slides use a grid of 2×2 µm barcoded squares (square_002um) for high-resolution spatial gene mapping; square_008um and square_016um aggregate these squares into 8 µm and 16 µm bins. Use filtered_feature_bc_matrix to process only tissue-associated barcodes.

ATTENTION

The file‑format examples below use a sample Visium HD dataset. Paths, IDs, and values are illustrative and may not match the dataset used in this tutorial.

The spatial gene expression (SGE) data comprise the following key files:

filtered_feature_bc_matrix/barcodes.tsv.gz – spatial barcode for tissue locations
s_002um_00639_00600-1
s_002um_00923_01639-1
s_002um_01050_01530-1
  • Column 1: Spatial barcodes corresponding to specific locations on the tissue section.
filtered_feature_bc_matrix/features.tsv.gz – feature metadata
ENSMUSG00000051951  Xkr4    Gene Expression
ENSMUSG00000025900  Rp1     Gene Expression
ENSMUSG00000025902  Sox17   Gene Expression
  • Column 1: Feature ID
  • Column 2: Feature symbol
  • Column 3: Feature type
filtered_feature_bc_matrix/matrix.mtx.gz – expression count matrix
%%MatrixMarket matrix coordinate integer general
%
19059 869411 11376563
3606 1 1
8957 1 1
9733 1 1
  • Header: The first line declares the Matrix Market (MTX) format and the matrix traits (here, coordinate layout with integer values). Comment lines beginning with % may follow and carry extra metadata.
  • Dimensions: The first non-comment line gives the matrix dimensions: the number of rows (features), columns (barcodes), and non-zero entries.
  • Data Entries: Each subsequent line lists one non-zero entry in three columns: row index (1-based feature index), column index (1-based barcode index), and the expression count of that feature in that barcode.
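The header, dimension line, and entries described above can be read with a few lines of Python. This is a minimal sketch using only the standard library; the function name `read_mtx_shape` is hypothetical, and the sample content is the example shown above, compressed in memory so the snippet runs without a real Space Ranger directory.

```python
import gzip
import io


def read_mtx_shape(handle):
    """Return (n_features, n_barcodes, n_nonzero) from a Matrix Market text stream."""
    for line in handle:
        if line.startswith("%"):
            continue  # skip the %%MatrixMarket banner and % comment lines
        n_features, n_barcodes, n_nonzero = (int(tok) for tok in line.split())
        return n_features, n_barcodes, n_nonzero
    raise ValueError("no dimension line found")


# Sample content copied from the example above, gzip-compressed in memory.
sample = (
    "%%MatrixMarket matrix coordinate integer general\n"
    "%\n"
    "19059 869411 11376563\n"
    "3606 1 1\n"
    "8957 1 1\n"
    "9733 1 1\n"
)
compressed = io.BytesIO(gzip.compress(sample.encode("utf-8")))

with gzip.open(compressed, "rt") as handle:
    shape = read_mtx_shape(handle)
    # The remaining lines are entries: feature index, barcode index, count.
    first_entry = tuple(int(tok) for tok in next(handle).split())

print(shape)        # (19059, 869411, 11376563)
print(first_entry)  # (3606, 1, 1)
```

For a real dataset, pass the path to matrix.mtx.gz to `gzip.open` instead of the in-memory buffer.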
tissue_positions.parquet – spatial barcode metadata
barcode                 in_tissue   array_row   array_col   pxl_row_in_fullres  pxl_col_in_fullres
s_002um_00434_01637-1   1           434         1637        3396.371014         9125.919898
  • barcode: Unique spatial barcode associated with each capture spot.
  • in_tissue: Binary flag (1 = in tissue, 0 = background) indicating whether the spot falls within the tissue boundary.
  • array_row, array_col: Integer indices representing the position of the spot on the capture array grid.
  • pxl_row_in_fullres, pxl_col_in_fullres: Floating point coordinates locating the spot in full-resolution tissue image pixels.
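In the example row above, the barcode s_002um_00434_01637-1 appears to encode the bin size (002um) and the array_row/array_col values (434, 1637). The sketch below parses a barcode under that assumption; the pattern is inferred from the examples in this tutorial rather than taken from a specification, and `parse_barcode` is a hypothetical helper name.

```python
import re

# Assumed layout, inferred from the examples: s_<bin>um_<row>_<col>-<suffix>
BARCODE_RE = re.compile(r"^s_(\d+)um_(\d+)_(\d+)-(\d+)$")


def parse_barcode(barcode):
    """Split a Visium HD-style barcode into bin size and array indices."""
    match = BARCODE_RE.match(barcode)
    if match is None:
        raise ValueError(f"unexpected barcode format: {barcode!r}")
    bin_um, row, col, _suffix = (int(group) for group in match.groups())
    return {"bin_um": bin_um, "array_row": row, "array_col": col}


parsed = parse_barcode("s_002um_00434_01637-1")
print(parsed)  # {'bin_um': 2, 'array_row': 434, 'array_col': 1637}
```

A check like this can serve as a quick sanity test that a barcode list and tissue_positions.parquet come from the same bin size.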
scalefactors_json.json – pixel-to-micrometer scaling factors
{
    "spot_diameter_fullres": 7.303953797779634,
    "bin_size_um": 2.0,
    "microns_per_pixel": 0.2738242950835738,
    "regist_target_img_scalef": 0.2505533,
    "tissue_lowres_scalef": 0.02505533,
    "fiducial_diameter_fullres": 1205.1523766336395,
    "tissue_hires_scalef": 0.2505533
}
  • spot_diameter_fullres: Estimated diameter of a barcoded spot in full-resolution pixels.
  • bin_size_um: Physical size (in micrometers) of the smallest bin, typically 2.0 µm for Visium HD.
  • microns_per_pixel: Resolution of the full-res image, used to convert pixel distances to micrometers.
  • regist_target_img_scalef: Scaling factor applied during image registration to the target image.
  • tissue_lowres_scalef: Downscaling factor from full-res to low-resolution tissue image.
  • fiducial_diameter_fullres: Diameter of fiducial markers in full-resolution pixels, useful for alignment.
  • tissue_hires_scalef: Downscaling factor from full-res to high-resolution tissue image.
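To illustrate how these factors relate, the following sketch converts the full-resolution pixel coordinates from the tissue_positions.parquet example into micrometers using microns_per_pixel, and checks that bin_size_um divided by microns_per_pixel reproduces spot_diameter_fullres. Only the three keys needed are copied from the example JSON; `pixels_to_microns` is a hypothetical helper, not part of CartLoader.

```python
import json

# Subset of the example scalefactors_json.json shown above.
scalefactors = json.loads("""
{
    "spot_diameter_fullres": 7.303953797779634,
    "bin_size_um": 2.0,
    "microns_per_pixel": 0.2738242950835738
}
""")

microns_per_pixel = scalefactors["microns_per_pixel"]


def pixels_to_microns(pixels, microns_per_pixel):
    """Convert a distance or coordinate in full-res pixels to micrometers."""
    return pixels * microns_per_pixel


# Coordinates from the tissue_positions.parquet example row.
x_um = pixels_to_microns(9125.919898, microns_per_pixel)  # pxl_col_in_fullres
y_um = pixels_to_microns(3396.371014, microns_per_pixel)  # pxl_row_in_fullres

# A 2 µm bin expressed in full-res pixels should match spot_diameter_fullres.
bin_size_px = scalefactors["bin_size_um"] / microns_per_pixel

print(f"x = {x_um:.2f} um, y = {y_um:.2f} um")
print(f"bin size = {bin_size_px:.4f} px")
```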

Cell Segmentation Mask

Location: segmented_outputs/cell_segmentations.geojson

"cell_id","x_centroid","y_centroid","transcript_counts","control_probe_counts","genomic_control_counts","control_codeword_counts","unassigned_codeword_counts","deprecated_codeword_counts","total_counts","cell_area","nucleus_area","nucleus_count","segmentation_method"
"aaaagkdm-1",170.85508728027344,2017.2412109375,1,0,0,0,0,0,1,46.285157930105925,NaN,0,"Segmented by boundary stain (ATP1A1+CD45+E-Cadherin)"
"aaaamcnn-1",141.60569763183594,2481.442138671875,341,0,0,0,0,1,342,111.5359415486455,50.484689332544804,1,"Segmented by boundary stain (ATP1A1+CD45+E-Cadherin)"

Cell Feature Matrix

Location: segmented_outputs/filtered_feature_cell_matrix

This directory contains barcodes.tsv.gz, features.tsv.gz, and matrix.mtx.gz, which provide the cell IDs, feature information, and expression count matrix, respectively.

Clusters

Location: segmented_outputs/analysis/clustering/gene_expression_graphclust/clusters.csv

Barcode,Cluster
cellid_000000001-1,22
cellid_000000002-1,22
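
As a quick check of a clusters.csv file, the number of cells per cluster can be tallied with the standard library. This is a minimal sketch on the two example rows above; `count_cells_per_cluster` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter


def count_cells_per_cluster(handle):
    """Count rows per cluster label in a Barcode,Cluster CSV stream."""
    reader = csv.DictReader(handle)
    return Counter(int(row["Cluster"]) for row in reader)


# The two example rows shown above.
sample = "Barcode,Cluster\ncellid_000000001-1,22\ncellid_000000002-1,22\n"
counts = count_cells_per_cluster(io.StringIO(sample))
print(dict(counts))  # {22: 2}
```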

Differentially Expressed Genes

Location: segmented_outputs/analysis/diffexp/gene_expression_graphclust/differential_expression.csv

Feature ID,Feature Name,Cluster 1 Mean Counts,Cluster 1 Log2 fold change,Cluster 1 Adjusted p value,Cluster 2 Mean Counts,Cluster 2 Log2 fold change,Cluster 2 Adjusted p value,Cluster 3 Mean Counts,Cluster 3 Log2 fold change,Cluster 3 Adjusted p value,Cluster 4 Mean Counts,Cluster 4 Log2 fold change,Cluster 4 Adjusted p value,Cluster 5 Mean Counts,Cluster 5 Log2 fold change,Cluster 5 Adjusted p value,Cluster 6 Mean Counts,Cluster 6 Log2 fold change,Cluster 6 Adjusted p value,Cluster 7 Mean Counts,Cluster 7 Log2 fold change,Cluster 7 Adjusted p value,Cluster 8 Mean Counts,Cluster 8 Log2 fold change,Cluster 8 Adjusted p value,Cluster 9 Mean Counts,Cluster 9 Log2 fold change,Cluster 9 Adjusted p value,Cluster 10 Mean Counts,Cluster 10 Log2 fold change,Cluster 10 Adjusted p value,Cluster 11 Mean Counts,Cluster 11 Log2 fold change,Cluster 11 Adjusted p value,Cluster 12 Mean Counts,Cluster 12 Log2 fold change,Cluster 12 Adjusted p value,Cluster 13 Mean Counts,Cluster 13 Log2 fold change,Cluster 13 Adjusted p value,Cluster 14 Mean Counts,Cluster 14 Log2 fold change,Cluster 14 Adjusted p value,Cluster 15 Mean Counts,Cluster 15 Log2 fold change,Cluster 15 Adjusted p value,Cluster 16 Mean Counts,Cluster 16 Log2 fold change,Cluster 16 Adjusted p value,Cluster 17 Mean Counts,Cluster 17 Log2 fold change,Cluster 17 Adjusted p value,Cluster 18 Mean Counts,Cluster 18 Log2 fold change,Cluster 18 Adjusted p value,Cluster 19 Mean Counts,Cluster 19 Log2 fold change,Cluster 19 Adjusted p value,Cluster 20 Mean Counts,Cluster 20 Log2 fold change,Cluster 20 Adjusted p value,Cluster 21 Mean Counts,Cluster 21 Log2 fold change,Cluster 21 Adjusted p value,Cluster 22 Mean Counts,Cluster 22 Log2 fold change,Cluster 22 Adjusted p value,Cluster 23 Mean Counts,Cluster 23 Log2 fold change,Cluster 23 Adjusted p value,Cluster 24 Mean Counts,Cluster 24 Log2 fold change,Cluster 24 Adjusted p value
ENSMUSG00000051951,Xkr4,0.0020893375742615837,0.542453177216764,0.16305286911324085,0.0019306072327678393,0.34251290321116556,0.628624629681027,0.0009020306879791494,-0.677062452566048,0.6057595443651351,0.0008575468486789855,-0.53825692400323,1,0.0018835945742411177,0.3347722133550999,1,0.0017232930683496192,0.30174489068131116,1,0.0008765004215231041,-0.6062657172996246,1,0.0006958304617047569,-0.669599363115374,1,0.0021901445815864866,0.5424639235287998,0.6460045298567026,0,-0.513593267969263,1,0.000450562773540615,-0.8809942923847434,1,0.0006856395158652269,-0.2651228613954135,1,0.0008800111512863451,-0.49971125727530463,1,0.0026227703011305675,1.0167399161570962,1,0.002277387540951688,0.7536928538206205,1,0,-1.333742893536682,1,0.0007018464992057343,-0.23097850703201495,1,0.0005602354699616488,-0.560849410646286,1,0.0015894120085537071,0.37398251412889394,1,0,1.526558986825357,1,0,-0.5313075119163866,1,0,4.9659921506021,1,0,1.1981124565354708,1,0,1.1280527529013042,1
ENSMUSG00000089699,Gm1992,0.00003369899313325135,2.372528175774452,0.49812531917443914,0,1.0054779159335947,1,0,2.907900048155108,1,0,3.8795955908826674,1,0,2.9741824980986316,1,0,3.8867073914024672,1,0,3.4811971239507127,1,0,4.171702890865568,1,0,2.5799386289474633,1,0,5.929350227879466,1,0,4.553633935251982,1,0,5.16950536624131,1,0,3.9181412576105927,1,0,5.104202757407435,1,0,4.56960978938165,1,0,5.109200602312047,1,0,5.203649720604709,1,0,4.873778816990438,1,0,4.791835029014791,1,0,7.969502482674086,1,0,5.911635983932342,1,0,11.408935646450828,1,0,7.641055952384198,1,0,7.570996248750033,1
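
The differential expression table is wide: each cluster contributes three columns (Mean Counts, Log2 fold change, Adjusted p value). The sketch below regroups one row by cluster, which is often easier to filter. The sample is the Xkr4 row truncated to the first two clusters for brevity, and `melt_de_row` is a hypothetical helper name.

```python
import csv
import io
import re

# Matches headers such as "Cluster 12 Log2 fold change".
COLUMN_RE = re.compile(r"^Cluster (\d+) (Mean Counts|Log2 fold change|Adjusted p value)$")


def melt_de_row(row):
    """Regroup one wide CSV row into {cluster: {statistic: value}}."""
    per_cluster = {}
    for column, value in row.items():
        match = COLUMN_RE.match(column)
        if match is None:
            continue  # Feature ID / Feature Name columns
        cluster = int(match.group(1))
        per_cluster.setdefault(cluster, {})[match.group(2)] = float(value)
    return per_cluster


# Xkr4 row from the example above, truncated to clusters 1 and 2.
sample = (
    "Feature ID,Feature Name,"
    "Cluster 1 Mean Counts,Cluster 1 Log2 fold change,Cluster 1 Adjusted p value,"
    "Cluster 2 Mean Counts,Cluster 2 Log2 fold change,Cluster 2 Adjusted p value\n"
    "ENSMUSG00000051951,Xkr4,"
    "0.0020893375742615837,0.542453177216764,0.16305286911324085,"
    "0.0019306072327678393,0.34251290321116556,0.628624629681027\n"
)

row = next(csv.DictReader(io.StringIO(sample)))
stats = melt_de_row(row)
for cluster in sorted(stats):
    print(row["Feature Name"], cluster, stats[cluster]["Log2 fold change"])
```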


Data Access

Download the spatial transcriptomics (ST) data from the 10x Genomics Datasets portal.


Set Up the Environment

Pre-installed tools

Make sure all required tools are installed (see Installation).

Define paths to all required binaries and resources. Optionally, specify a fixed color map for consistent rendering.

# ====
# Replace each placeholder with the actual path on your system.  
# ====

work_dir=/path/to/work/directory        # path to work directory that contains the downloaded input data
cd "$work_dir"

# Define paths to required binaries and resources
spatula=/path/to/spatula/binary         # path to spatula executable
punkst=/path/to/punkst/binary           # path to FICTURE2 (punkst) executable
tippecanoe=/path/to/tippecanoe/binary   # path to tippecanoe executable
pmtiles=/path/to/pmtiles/binary         # path to pmtiles executable
aws=/path/to/aws/cli/binary             # path to AWS CLI binary

# (Optional) Define path to color map. 
cmap=/path/to/color/map                 # Path to fixed color map. `CartLoader` includes one at cartloader/assets/fixed_color_map_256.tsv.

# Number of jobs
n_jobs=10                               # If not specified, the number of jobs defaults to 1.

# Activate the bioconda environment
conda activate ENV_NAME                 # replace ENV_NAME with your conda environment name

Define data ID and analysis parameters:

# Unique identifier for your dataset
DATA_ID="visiumhd_3prime_mouse_brain"    # change this to reflect your dataset name

# LDA parameters
train_width=18                           # define LDA training hexagon width (comma-separated if multiple widths are applied)
n_factor=6,12                            # define number of factors in LDA training (comma-separated if multiple n-factor are applied)

# Path to AWS S3 directory
S3_DIR=/s3/path/to/s3/dir                # Recommended: use DATA_ID as the directory name, e.g., s3://bucket-name/visiumhd-3prime-mouse-brain

How to define scaling for Visium HD?

10x Visium HD provides scalefactors_json.json (pixel‑to‑µm). CartLoader accepts it via --scale-json and computes the scaling automatically, so you don’t need to manually specify --units-per-um.

Alternatively, provide the scale directly with --units-per-um.
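
If you prefer to pass --units-per-um yourself, one plausible derivation is the reciprocal of microns_per_pixel, assuming the input coordinates are in full-resolution pixels. The sketch below rests on that assumption; CartLoader's internal computation from --scale-json may differ, and `units_per_um_from_scale_json` is a hypothetical helper name.

```python
import json
import os
import tempfile


def units_per_um_from_scale_json(path):
    """Return coordinate units (pixels) per micrometer from a scalefactors JSON file."""
    with open(path) as handle:
        scale = json.load(handle)
    # Assumption: coordinates are full-res pixels, so units/µm = 1 / (µm/pixel).
    return 1.0 / scale["microns_per_pixel"]


# Write a stand-in scalefactors_json.json with the example value so the sketch runs.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "scalefactors_json.json")
    with open(path, "w") as handle:
        json.dump({"microns_per_pixel": 0.2738242950835738}, handle)
    units_per_um = units_per_um_from_scale_json(path)

print(f"--units-per-um {units_per_um:.6f}")
```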

Run Pipelines

The example below runs all modules together. Customize actions with flags.

cartloader run_visiumhd \
  --load-space-ranger \
  --sge-convert \
  --run-ficture2 \
  --import-cells \
  --import-images \
  --run-cartload2 \
  --upload-aws \
  --space-ranger-dir /path/to/space/ranger/output \
  --out-dir /path/to/out/dir \
  --s3-dir ${S3_DIR} \
  --width ${train_width} \
  --n-factor ${n_factor} \
  --id ${DATA_ID} \
  --spatula ${spatula} \
  --ficture2 ${punkst} \
  --pmtiles ${pmtiles} \
  --tippecanoe ${tippecanoe} \
  --aws ${aws} \
  --n-jobs ${n_jobs} \
  --threads ${n_jobs}

Action Flags to Enable Modules

Actions

run_visiumhd runs multiple CartLoader modules together; enable and combine actions with flags.

| CartLoader Modules | Flags in run_visiumhd | Actions | Prerequisites |
|---|---|---|---|
| load_space_ranger | --load-space-ranger | Summarize Space Ranger outputs into JSON | Space Ranger Output files |
| sge_convert | --sge-convert | Convert SGE to CartLoader format; optional density filter and visuals | Space Ranger assets JSON (from load_space_ranger) or transcript CSV/Parquet |
| run_ficture2 | --run-ficture2 | FICTURE analysis | SGE (from sge_convert); FICTURE parameters (--width, --n-factor) |
| import_space_cell | --import-cells | Import cell points, boundaries, cluster, de | Space Ranger assets JSON or manual CSVs; also --cell-id |
| import_image | --import-images | Import background images (BTF-TIFF) → PNG/PMTiles | Space Ranger assets JSON or --tifs; also --image-ids/--all-images |
| run_cartload2 | --run-cartload2 | Package SGE, optional FICTURE/cells/images into PMTiles; write catalog.yaml | SGE, optional FICTURE assets or any imported cell/image assets; --id |
| upload_aws | --upload-aws | Upload catalog and PMTiles to S3 | catalog.yaml (from run_cartload2); --s3-bucket, --id |
| upload_zenodo | --upload-zenodo | Upload catalog and PMTiles to Zenodo | catalog.yaml (from run_cartload2); --zenodo-token |

Parameter Requirements by Action Flag

Below are explanations of the parameters used in the example. For the full list, see the run_visiumhd reference page.

| Parameter | Required when flags | Description |
|---|---|---|
| --space-ranger-dir | --load-space-ranger | Space Ranger output directory to scan. |
| --space-ranger-assets | optional (--load-space-ranger) | Output JSON manifest path (defaults under --out-dir). |
| --width | --run-ficture2 | Hexagon width(s) in µm for training/projection. |
| --n-factor | --run-ficture2 | Factor count(s) for FICTURE training. |
| --cell-id | --import-cells | Asset ID/prefix for cell outputs. |
| --image-ids or --all-images | --import-images | Choose specific image IDs or import all detected. |
| --id | --run-cartload2 | Catalog ID (used in catalog.yaml and outputs). |
| --s3-dir | --upload-aws | Destination S3 path (e.g., s3://bucket/prefix). |
| --out-dir | any action | Output root directory for generated artifacts. |
| --dry-run, --restart | optional (any) | Control execution (preview or rerun ignoring existing outputs). |
| --n-jobs, --threads | optional (any) | Parallelism for Make/GDAL/tippecanoe steps. |

Outputs

For details on the outputs, see the reference pages for run_ficture2 and run_cartload2.