CosMx SMI Starter Tutorial

Input Data

The input data is from an adult mouse hippocampus, extracted by masking a coronal brain section. The original full-section

File Format

The CosMx SMI by NanoString generates spatial transcriptomics data at single-molecule resolution, delivered as a comma-separated values (CSV) table.

CSV File Format

"fov","cell_ID","x_global_px","y_global_px","x_local_px","y_local_px","z","target","CellComp"
64,0,-473043,7954.533,4015.3,4246.2,1,"Gfap","None"
64,0,-473022.9,7902.723,4035.48,4194.39,1,"Fth1","None"
64,0,-473132,7836.476,3926.34,4128.143,1,"Ptn","None"
  • fov: The field of view (FOV) number.
  • cell_ID: Unique identifier for a single cell within a given FOV. 0 if background or unassigned molecules.
  • x_global_px, y_global_px: Global pixel coordinates relative to the tissue.
  • x_local_px, y_local_px: The x or y position (in pixels) relative to the given FOV.
  • z: Z-plane index representing the depth (optical section) where the transcript was detected.
  • target: Name of the target.
  • CellComp: Subcellular location of target.
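
As a quick sanity check on these columns, the following sketch recreates the example rows above in a temporary file, then counts molecules per target and molecules not assigned to any cell. It assumes a plain comma split is sufficient (no commas embedded inside quoted fields); the variable names are illustrative and not part of cartloader.

```shell
# Recreate the example rows shown above in a temporary file (illustration only).
sample_csv=$(mktemp)
cat > "$sample_csv" <<'EOF'
"fov","cell_ID","x_global_px","y_global_px","x_local_px","y_local_px","z","target","CellComp"
64,0,-473043,7954.533,4015.3,4246.2,1,"Gfap","None"
64,0,-473022.9,7902.723,4035.48,4194.39,1,"Fth1","None"
64,0,-473132,7836.476,3926.34,4128.143,1,"Ptn","None"
EOF

# Molecules per target: column 8, with the surrounding double quotes removed.
target_counts=$(awk -F',' 'NR > 1 { gsub(/"/, "", $8); n[$8]++ }
  END { for (t in n) print t "\t" n[t] }' "$sample_csv" | sort)
echo "$target_counts"

# Molecules not assigned to any cell (cell_ID, column 2, equals 0).
n_unassigned=$(awk -F',' 'NR > 1 && $2 == 0 { c++ } END { print c + 0 }' "$sample_csv")
echo "unassigned molecules: $n_unassigned"

rm -f "$sample_csv"
```

On a real dataset, the same two awk commands could be pointed at the decompressed input file instead of the temporary sample.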

Data Access

The example data is hosted on Zenodo.

Follow the commands below to download the example data.

work_dir=/path/to/work/directory
cd $work_dir
wget  https://zenodo.org/records/15786632/files/cosmxsmi_starter.raw.tar.gz
tar --strip-components=1 -zxvf cosmxsmi_starter.raw.tar.gz
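
The `--strip-components=1` option in the tar command above drops the leading directory of every archive member, so files land directly in the current directory. The sketch below illustrates that behavior on a throwaway archive; the directory and file names (`top_level`, `input.txt`) are made up for the demonstration and say nothing about the actual layout of the tutorial archive.

```shell
# Build a tiny archive whose single file sits inside a top-level directory.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src/top_level" "$demo_dir/dest"
echo "example" > "$demo_dir/src/top_level/input.txt"
tar -czf "$demo_dir/demo.tar.gz" -C "$demo_dir/src" top_level

# Extract while removing the first path component of each member.
tar --strip-components=1 -zxf "$demo_dir/demo.tar.gz" -C "$demo_dir/dest"

# The file is now directly under dest/, without the top_level/ directory.
ls "$demo_dir/dest"
```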

Set Up the Environment

Define paths to all required binaries and resources. Optionally, specify a fixed color map for consistent rendering.

# ====
# Replace each placeholder with the actual path on your system.  
# ====

work_dir=/path/to/work/directory        # path to work directory that contains the downloaded input data
cd $work_dir

# Define paths to required binaries and resources
spatula=/path/to/spatula/binary         # path to spatula executable
punkst=/path/to/punkst/binary           # path to FICTURE2/punkst executable
tippecanoe=/path/to/tippecanoe/binary   # path to tippecanoe executable
pmtiles=/path/to/pmtiles/binary         # path to pmtiles executable
aws=/path/to/aws/cli/binary             # path to AWS CLI binary

# (Optional) Define path to color map. 
cmap=/path/to/color/map                 # Path to the fixed color map for rendering. cartloader provides a fixed color map at cartloader/assets/fixed_color_map_256.tsv.

# Number of jobs
n_jobs=10                               # If not specified, the number of jobs defaults to 1.

# Activate the bioconda environment
conda activate ENV_NAME                 # replace ENV_NAME with your bioconda environment name
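
Before running the pipeline, it can help to confirm that each tool variable points at something executable. The sketch below is one possible pre-flight check; `check_tool` is a hypothetical helper, not a cartloader command, and it assumes each variable holds either an executable file path or a command name on PATH. Unset variables fall back to the bare command name.

```shell
# Hypothetical helper (not part of cartloader): succeed if the given value is
# an executable file or a command that can be resolved on PATH.
check_tool() {
  # $1 = label for the message, $2 = path or command name
  if { [ -f "$2" ] && [ -x "$2" ]; } || command -v "$2" >/dev/null 2>&1; then
    echo "ok: $1 -> $2"
  else
    echo "missing: $1 -> $2" >&2
    return 1
  fi
}

# Count how many of the tools defined above cannot be found.
n_missing=0
check_tool spatula    "${spatula:-spatula}"       || n_missing=$((n_missing + 1))
check_tool punkst     "${punkst:-punkst}"         || n_missing=$((n_missing + 1))
check_tool tippecanoe "${tippecanoe:-tippecanoe}" || n_missing=$((n_missing + 1))
check_tool pmtiles    "${pmtiles:-pmtiles}"       || n_missing=$((n_missing + 1))
check_tool aws        "${aws:-aws}"               || n_missing=$((n_missing + 1))
echo "tools missing: $n_missing"
```

The AWS CLI is only needed for the AWS upload step, so a missing `aws` entry is harmless when uploading to Zenodo instead.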

Define data ID and analysis parameters:

# Unique identifier for your dataset
DATA_ID="cosmxsmi_hippo"                # change this to reflect your dataset name
PLATFORM="cosmx_smi"                    # platform information
SCALE=$(echo 1000/120|bc -l)              # scale from coordinate to micrometer

# LDA parameters
train_width=12                           # define LDA training hexagon width (comma-separated if multiple widths are applied)
n_factor=6,12                            # define number of factors in LDA training (comma-separated if multiple n-factor are applied)

How to Define Scaling Factors for CosMx SMI?

According to the README.html provided with the CosMx SMI dataset, each pixel has an edge length of 120 nm. Since 1 µm = 1000 nm, the number of pixels per micrometer is scale = 1000 / 120 ≈ 8.33.
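
The arithmetic can be checked with a short sketch. It uses awk instead of bc purely for convenience, and the variable names `pixel_nm`, `scale`, and `x_um` are illustrative. The pixel coordinate 4015.3 is taken from the example rows in the CSV File Format section.

```shell
pixel_nm=120   # pixel edge length in nanometers, as stated above

# Pixels per micrometer: 1000 nm per micrometer divided by nm per pixel.
scale=$(awk -v nm="$pixel_nm" 'BEGIN { printf "%.4f", 1000 / nm }')
echo "units per um: $scale"          # prints: units per um: 8.3333

# Sanity check: convert one pixel coordinate to micrometers.
x_um=$(awk -v px=4015.3 -v nm="$pixel_nm" 'BEGIN { printf "%.2f", px * nm / 1000 }')
echo "4015.3 px = $x_um um"          # prints: 4015.3 px = 481.84 um
```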

SGE Format Conversion

Convert the raw input to the unified SGE format. For more details, see the Reference page.

cartloader sge_convert \
  --makefn sge_convert.mk \
  --platform ${PLATFORM} \
  --in-csv ./input.tsv.gz \
  --units-per-um ${SCALE} \
  --out-dir ./sge \
  --exclude-feature-regex '^(BLANK|NegCon|NegPrb)' \
  --sge-visual \
  --spatula ${spatula} \
  --n-jobs ${n_jobs}

| Parameter | Required | Type | Description |
|---|---|---|---|
| --platform | required | string | Platform (options: "10x_visium_hd", "seqscope", "10x_xenium", "bgi_stereoseq", "cosmx_smi", "vizgen_merscope", "pixel_seq", "generic") |
| --in-csv | required | string | Path to the input TSV/CSV file |
| --units-per-um | required | float | Scale to convert coordinates to microns (default: 1.0) |
| --out-dir | required | string | Output directory for the converted SGE files |
| --makefn | | string | File name for the generated Makefile (default: sge_convert.mk) |
| --exclude-feature-regex | | regex | Pattern to exclude control features |
| --sge-visual | | flag | Enable SGE visualization step (generates diagnostic image) (default: FALSE) |
| --spatula | | string | Path to the spatula binary (default: spatula) |
| --n-jobs | | int | Number of parallel jobs for processing (default: 1) |
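
To see which feature names a pattern such as '^(BLANK|NegCon|NegPrb)' would exclude, it can be tried against a list of names with grep. This is only a sketch: it assumes the pattern behaves as a standard extended regular expression matched against the feature name, which may differ in detail from how cartloader applies it. The control probe names below are made up for illustration.

```shell
pattern='^(BLANK|NegCon|NegPrb)'

# Hypothetical feature names: three genes from the example rows plus made-up control probes.
features=$(printf '%s\n' Gfap Fth1 NegPrb1 Ptn BLANK_0001 NegCon3)

# grep -E applies the pattern as an extended regex; -v keeps the lines that do NOT match.
kept=$(printf '%s\n' "$features" | grep -Ev "$pattern")
echo "$kept"
```

With this input, only Gfap, Fth1, and Ptn remain, because the other three names start with one of the excluded prefixes.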

FICTURE Analysis

Compute spatial factors using punkst (FICTURE2 mode). For more details, see the Reference page.

cartloader run_ficture2 \
  --makefn run_ficture2.mk \
  --main \
  --in-transcript ./sge/transcripts.unsorted.tsv.gz \
  --in-feature ./sge/feature.clean.tsv.gz \
  --in-minmax ./sge/coordinate_minmax.tsv \
  --cmap-file ${cmap} \
  --exclude-feature-regex '^(mt-.*$|Gm\d+$)' \
  --out-dir ./ficture2 \
  --width ${train_width} \
  --n-factor ${n_factor} \
  --spatula ${spatula} \
  --ficture2 ${punkst} \
  --n-jobs ${n_jobs} \
  --threads ${n_jobs}

| Parameter | Required | Type | Description |
|---|---|---|---|
| --main | required (see note 1) | flag | Enable cartloader to run all five steps |
| --in-transcript | required | string | Path to input transcript-level SGE file |
| --out-dir | required | string | Path to output directory |
| --width | required | int or comma-separated list | LDA training hexagon width(s) |
| --n-factor | required | int or comma-separated list | Number of LDA factors |
| --makefn | | string | File name for the generated Makefile (default: run_ficture2.mk) |
| --in-feature | | string | Path to input feature file |
| --in-minmax | | string | Path to input coordinate min/max file |
| --cmap-file | | string | Path to color map file |
| --exclude-feature-regex | | regex | Pattern to exclude features |
| --spatula | | string | Path to the spatula binary (default: spatula) |
| --ficture2 | | string | Path to the punkst directory (defaults to punkst repository within submodules directory of cartloader) |
| --n-jobs | | int | Number of parallel jobs (default: 1) |
| --threads | | int | Number of threads per job (default: 1) |

1: cartloader requires the user to specify at least one action. Available actions include: --tile to run the tiling step; --segment to run the segmentation step; --init-lda to run the LDA training step; --decode to run the decoding step; --summary to run the summarization step; and --main to run all five of these actions.
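
The --exclude-feature-regex value in the run_ficture2 command, '^(mt-.*$|Gm\d+$)', targets mitochondrial genes (prefix mt-) and names made of Gm followed only by digits. The sketch below previews the effect with grep. Because `\d` is not part of POSIX extended regular expressions, it is rewritten here as `[0-9]`; that substitution, and the assumption that cartloader matches the pattern against the whole feature name, are mine. The gene list is illustrative.

```shell
# POSIX-style equivalent of '^(mt-.*$|Gm\d+$)': \d is replaced by [0-9].
pattern='^(mt-.*$|Gm[0-9]+$)'

# Illustrative feature names: two mitochondrial genes, one Gm-numbered name,
# and two names that should survive the filter.
features=$(printf '%s\n' mt-Nd1 Gm12345 Gfap Gmfb mt-Co1)

excluded=$(printf '%s\n' "$features" | grep -E "$pattern")
kept=$(printf '%s\n' "$features" | grep -Ev "$pattern")

echo "excluded:"; echo "$excluded"
echo "kept:";     echo "$kept"
```

Note that Gmfb is kept: the `Gm[0-9]+$` branch requires digits all the way to the end of the name, so genes that merely start with Gm are not removed.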

cartloader Compilation

Generate PMTiles and web-compatible tile directories. For more details, see the Reference page.

cartloader run_cartload2 \
  --makefn run_cartload2.mk \
  --fic-dir ./ficture2 \
  --out-dir ./cartload2 \
  --id ${DATA_ID} \
  --spatula ${spatula} \
  --pmtiles ${pmtiles} \
  --tippecanoe ${tippecanoe} \
  --n-jobs ${n_jobs} \
  --threads ${n_jobs}

| Parameter | Required | Type | Description |
|---|---|---|---|
| --fic-dir | required | string | Path to the input directory containing FICTURE2 output |
| --out-dir | required | string | Path to the output directory for PMTiles and web tiles |
| --id | required | string | Dataset ID used for naming outputs and metadata |
| --makefn | | string | File name for the generated Makefile (default: run_cartload2.mk) |
| --spatula | | string | Path to the spatula binary (default: spatula) |
| --pmtiles | | string | Path to the pmtiles binary (default: pmtiles) |
| --tippecanoe | | string | Path to the tippecanoe binary (default: tippecanoe) |
| --n-jobs | | int | Number of parallel jobs (default: 1) |
| --threads | | int | Number of threads per job (default: 1) |

Upload to Data Repository

Choose a data repository to host/share your output

cartloader supports two upload options (AWS and Zenodo) for storing PMTiles of SGE and spatial factors in a data repository.

Choose the one that best suits your needs.

AWS Uploads

Upload the generated cartloader outputs to your designated AWS S3 directory:

# AWS S3 target location for cartostore
AWS_BUCKET="EXAMPLE_AWS_BUCKET"         # replace EXAMPLE_AWS_BUCKET with your actual S3 bucket name

cartloader upload_aws \
  --in-dir ./cartload2 \
  --s3-dir "s3://${AWS_BUCKET}/${DATA_ID}" \
  --aws ${aws} \
  --n-jobs ${n_jobs}

| Parameter | Required | Type | Description |
|---|---|---|---|
| --in-dir | required | string | Path to the input directory containing the cartloader compilation output |
| --s3-dir | required | string | Path to the target S3 directory for uploading |
| --aws | | string | Path to the AWS CLI binary |
| --n-jobs | | int | Number of parallel jobs |

Zenodo Uploads

Upload the generated cartloader outputs to your designated Zenodo deposition or a new deposition.

zenodo_token=/path/to/zenodo/token/file    # replace /path/to/zenodo/token/file with the path to your Zenodo token file

cartloader upload_zenodo \
  --in-dir ./cartload2 \
  --upload-method catalog \
  --zenodo-token $zenodo_token \
  --create-new-deposition \
  --title "Your Title" \
  --creators "Lastname, Firstname" \
  --description "This is an example description"

| Parameter | Required | Type | Description |
|---|---|---|---|
| --in-dir | required | string | Path to the input directory containing the cartloader compilation output |
| --upload-method | required | string | Method to determine which files to upload. Options: all to upload all files in --in-dir; catalog to upload files listed in a catalog YAML file; user_list to upload files explicitly listed via --in-list |
| --catalog-yaml | | string | Used when --upload-method is catalog. Path to the catalog.yaml file generated by run_cartload2. If omitted, the catalog.yaml in the input directory specified by --in-dir is used. |
| --zenodo-token | required | string | Path to your Zenodo access token file |
| --create-new-deposition | | flag | Create a new Zenodo deposition. |
| --title | required | string | Required if --create-new-deposition is set. Title for the new Zenodo deposition. |
| --creators | required | list of str | List of creators in "Lastname, Firstname" format. |

Output Data

See more details of output at the Reference pages for run_ficture2 and run_cartload2.

Spatial Factor Inference from FICTURE

Below is an example of spatial factor inference results produced by FICTURE using a training width of 12, 12 factors, a fit width of 12, and an anchor resolution of 6.

FICTURE cmap

Factor RGB Weight PostUMI TopGene_pval TopGene_fc TopGene_weight
1 237,238,0 0.17207 548469625 Tmsb4x,Calm2,Snap25,Calm1,Ywhaz,Slc17a7,Cck,Dnm1,Ywhag,Aldoa,Hsp90ab1,Calm3,Atp2b1,Nell2,Rtn3,Snca,Gpm6a,Prkcb,Prkca,Cyfip2 Cck,Camk4,Prkca,Camk1d,Slc17a7,Ywhag,Epha6,Atp2b1,Calm2,Sv2b,Snca,Syn2,Prkcb,Lpl,Nell2,Snap25,Stmn2,Pde1a,Kcnq5,Cnksr2 Tmsb4x,Snap25,Calm2,Calm1,Ywhaz,Atp1b1,Aldoa,Hsp90ab1,Dnm1,Slc17a7,Rtn3,Calm3,Camk2a,Ywhag,Gpm6a,Cck,Ppp3ca,Rtn1,Nrgn,Cpe
0 255,101,101 0.16 510000383 Camk2b,Adcy1,Ppp3ca,Kalrn,Nrgn,Olfm1,Ppfia2,Ctxn1,Gria1,Gria2,Prkce,Ryr2,Camk2a,Grin2a,Gabrb3,Zbtb20,Wasf1,Tiam1,Ntrk3,Mapk1 Ppfia2,Adcy1,Sema5a,Camk2b,Kalrn,Plxna4,Calb1,Tiam1,Jun,Rasgrf2,Nedd4l,Dapk1,Grin2a,Npy1r,Itga8,Ryr2,Auts2,Ppp3ca,Dgkg,Grm7 Malat1,Camk2b,Camk2a,Calm1,Ppp3ca,Tmsb4x,Adcy1,Nrgn,Calm2,Olfm1,Atp1b1,Ndrg4,Gria2,Ywhaz,Kalrn,Gria1,Ctxn1,Prkce,Aldoa,Snap25
3 101,254,255 0.11601 369772892 Malat1,Hnrnpa2b1,Tardbp,Tnik,Grk3,Mid1,Ptprs,Nf1,Prkn,Vegfa,Kidins220,Tnrc6a,Dmd,Camk2g,Uggt2,Pou2f1,Meg3,Sema6d,Spag9,Ntsr2 Gal,Malat1,Avp,Cd109,Grk3,Lilra5,C2,Prkn,Crhr1,Ngfr,Gcgr,Dmd,Vegfa,C3,Dapk2,Pcsk6,Nrf1,Uggt2,P2rx4,Slc18a2 Malat1,Hnrnpa2b1,Hmgb1,Meg3,Gria2,Ptprs,Tardbp,Gnao1,Tnik,Apoe,Kidins220,Slc1a2,Ahcyl1,Nf1,Bcan,Mid1,Spag9,Camk2g,Fus,Tuba1a/b/c
5 255,101,254 0.09996 318610467 Nap1l5,Atp1b1,Gad1,Ndrg4,Meg3,Cnr1,Sncb,Gad2,Sst,Snap25,Syt1,Zwint,Tmsb10,Snrpn,Gnas,Npy,Sv2a,Cacnb4,Ldhb,Mdh1 Sst,Gad2,Npy,Cnr1,Nap1l5,Gad1,Pvalb,Slc32a1,Vip,Gap43,Adgrl2,Cntn4,Sncb,Atp2b4,Cckbr,Grik1,Grm8,Rims2,Cacnb4,Tmsb10 Atp1b1,Malat1,Snap25,Ndrg4,Meg3,Nap1l5,Ckb,Rtn3,Snrpn,Gnas,Calm1,Syt1,Zwint,Mdh1,Dnm1,Calm2,Aldoa,Sncb,Gad1,Dynll2
2 101,255,101 0.09064 288906127 Plp1,Ptgds,Scd2,Apod,Mag,Cryab,Ndrg1,Mog,Ugt8a,Bin1,Gpr37,Aspa,Mobp,Fa2h,Gsn,Mbp,Abca2,Cntn2,Slc44a1,Jam3 Plp1,Mag,Mog,Ugt8a,Aspa,Ptgds,Cryab,Apod,Fa2h,Myrf,Ndrg1,Gpr37,Gjb1,Pde8a,Cntn2,Gsn,Fgfr2,Scd2,Jam3,Abca2 Plp1,Malat1,Ptgds,Scd2,Mbp,Fth1,Glul,Apod,Mobp,Mag,Cryab,Bin1,Ndrg1,Gpm6b,Mog,Ywhaq,Cd81,Gpr37,Tmsb4x,Hmgb1
4 101,101,255 0.08492 270686250 Apoe,Clu,Slc1a2,Glul,Aldoc,Atp1a2,Slc1a3,Cst3,Gja1,Plpp3,Ckb,Ndrg2,Sparcl1,Gpr37l1,Gfap,Gstm1,Aqp4,Mt1,Mt3,Mfge8 Gja1,Slc1a3,Aldoc,Apoe,Atp1a2,Clu,Gpr37l1,Aqp4,Plpp3,Slc1a2,Gfap,Mt1,Mfge8,Gstm1,Slc6a11,Ndrg2,Glul,Ednrb,Ntsr2,Agt Apoe,Clu,Glul,Slc1a2,Cst3,Aldoc,Malat1,Ckb,Atp1a2,Sparcl1,Slc1a3,Mt3,Cpe,Ndrg2,Plpp3,Scd2,Tspan7,Gja1,Ntrk2,Dbi
6 255,178,101 0.08061 256947664 Pcp4,Itm2c,Calb2,Nnat,Pcsk1n,Rtn1,Map1b,Cacna1e,Thy1,Clstn1,Cbln2,Gabbr2,Tcf7l2,Bex1/2,Rit2,Kcnd2,Slc2a13,Apba1,Tac1,Nsg1 Calb2,Cbln2,Tac1,Nnat,Oprm1,Tcf7l2,Pcp4,Apba1,Slc5a7,Rit2,Slc2a13,Sstr2,Rgs6,Itm2c,Cartpt,Synpr,Gabbr2,Pcsk1n,Gria4,Fgf1 Itm2c,Rtn1,Pcp4,Malat1,Map1b,Pcsk1n,Cpe,Atp1b1,Aldoa,Tuba1a/b/c,Hsp90ab1,Thy1,Psap,Rtn3,Eif4a2,Clstn1,Gnao1,Meg3,Cacna1e,Slc25a4
7 178,255,101 0.04347 138561767 Bsg,Slc2a1,Cldn5,Flt1,Itm2a,Rgs5,Pltp,Igfbp7,Serinc3,Esam,Vim,Slc7a5,Acta2,Igf1r,Id1,Fn1,Sptbn1,Srgn,Pecam1,St3gal6 Flt1,Cldn5,Itm2a,Rgs5,Pltp,Acta2,Fn1,Pecam1,Esam,Slc2a1,Igfbp7,Myl9,Lsr,Emcn,Slc7a5,Kdr,Tagln,Srgn,St3gal6,Vim Bsg,Malat1,Slc2a1,Tmsb4x,Serinc3,Cldn5,Sptbn1,H3f3b,Flt1,Myl6,Aplp2,Itm2a,Gnb1,Pltp,Rgs5,Cpe,Itm2b,Hmgb1,Calm1,Igfbp7
11 255,153,204 0.0429 136738469 Ptprz1,Cspg5,Pdgfra,Olig1,Gpr17,Epn2,Tnr,Pllp,Vcan,Fyn,Serpine2,Cd9,Tuba1a/b/c,Olig2,Pcdh15,S100a16,S100b,Sulf2,Serinc5,Cntn1 Vcan,Pdgfra,Gpr17,Pcdh15,Olig2,Tnr,S100b,Olig1,Ptprz1,Cspg5,Fyn,Epn2,Itpr2,Serpine2,Cd9,S100a16,Sox6,Megf11,Pllp,Sulf2 Malat1,Tuba1a/b/c,Ptprz1,Cspg5,Olig1,Epn2,Pllp,Camk2a,Cd9,Serpine2,Calm1,Ncam1,Fyn,Bcan,Ckb,Tnr,Pdgfra,Cntn1,Hnrnpa2b1,Hmgb1
10 178,101,255 0.03812 121510243 Ttr,Psap,Cab39l,Chchd10,Lamp2,Cox8a,Htr2c,Slc12a2,Ppp1r1b,Timp2,Ctsd,Mdh1,Gpi1,Bsg,Ndufa4,Itpr1,Ftl1,App,Sem1,Itgb8 Ttr,Htr2c,Cab39l,Ppp1r1b,Lepr,Slc12a2,Col4a5,Foxj1,Lamp2,Msr1,Dlk1,Timp2,Ctnna1,Chchd10,Maob,Sil1,Itpr1,Tspo,Cd55,Dcn Ttr,Psap,Malat1,Atp1b1,Cox8a,Mdh1,Bsg,Gpi1,Ctsd,Cab39l,Chchd10,App,Ndufa4,Clu,Ckb,Hspa8,Lamp2,Ptgds,Cst3,Slc12a2
9 101,178,255 0.03663 116755577 Cst3,Camk2a,C1qc,Hexb,Ctsd,C1qa,C1qb,Ctss,Csf1r,Vtn,P2ry12,Sparc,Selplg,Cx3cr1,Hmgb1,Tyrobp,Itm2b,Rps9,Tmem119,Tgfbr1 C1qc,Csf1r,Ctss,C1qb,C1qa,Hexb,Selplg,P2ry12,Tmem119,Cx3cr1,Tyrobp,Vtn,Trem2,Ptprc,Csf3r,Tgfbr1,Lyz1/2,Cd84,Higd1b,Epb41l2 Cst3,Camk2a,Tmsb4x,Hmgb1,Ctsd,Itm2b,Malat1,Rps9,Sparc,Hexb,Glul,C1qc,C1qa,Fau,C1qb,Ctsb,Ctss,Csf1r,Vtn,Fth1
8 0,223,95 0.03466 110484904 Mbp,Fth1,Mobp,Cpe,Bcas1,Hipk2,Kif5c,Pink1,Dync1li2,Gfap,Rims1,Cd6,Frs2,Lpar1,Ptpn11,Gpm6b,Map4k4,Ndrg1,S100b,Drd1 Mbp,Mobp,Bcas1,Fth1,Hipk2,Pink1,Cd6,Cpe,Dync1li2,Kif5c,Gfap,Rims1,Frs2,Lpar1,Ptpn11,S100b,Map4k4,Drd1,Gpm6b,Ndrg1 Mbp,Fth1,Mobp,Cpe,Malat1,Kif5c,Clu,Apoe,Glul,Plp1,Bcas1,Tmsb4x,Dync1li2,Hipk2,Pink1,Gfap,Hsp90ab1,Slc1a2,Gpm6b,Calm1
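
The RGB column stores each factor color as three comma-separated 0-255 values. For use in plotting tools that expect hex colors, the values can be converted with awk, as in the sketch below. It copies the Factor, RGB, and Weight fields of the first three rows above and assumes a tab-separated layout; the name and exact format of the file that run_ficture2 writes are not assumed here.

```shell
# Three rows copied from the table above: factor, RGB, weight (tab-separated).
rows=$(printf '%s\t%s\t%s\n' 1 237,238,0 0.17207 0 255,101,101 0.16 3 101,254,255 0.11601)

# Split the RGB field on commas and print each channel as two uppercase hex digits.
hex_rows=$(printf '%s\n' "$rows" | awk -F'\t' '{
  split($2, c, ",")
  printf "%s\t#%02X%02X%02X\t%s\n", $1, c[1], c[2], c[3], $3
}')
echo "$hex_rows"
```

For these rows the resulting colors are #EDEE00, #FF6565, and #65FEFF for factors 1, 0, and 3 respectively.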

Packed SGE and Spatial Factor Outputs from run_cartload2

The packed SGE data and spatial factor inferences generated by FICTURE are available in PMTiles format on Zenodo: DOI:10.5281/zenodo.15824926.

These datasets can also be loaded directly using the following catalog YAML file:
https://zenodo.org/records/15824927/catalog.yaml