Quick Start: Run Locally¶

This tutorial walks you through running the CartLoader workflow using a minimal example dataset from the mouse hippocampus.

Use Cases

This tutorial is ideal for users who want to:

Take full control over the environment
Customize the workflow
Stay up-to-date with the latest development versions

Requirements

Users will need to:

Set up CartLoader and its dependencies locally (see Installation guide).

Question

I am not sure whether to run with Docker or run locally. Which one should I choose?

Set Up the Environment¶

# ====
# Replace each placeholder with the actual path on your system.  
# ====

work_dir=/path/to/work/directory        # path to work directory that contains the downloaded input data
cd $work_dir

# Define paths to required binaries and resources
spatula=/path/to/spatula/binary         # path to spatula executable
punkst=/path/to/punkst/binary           # path to FICTURE2 (punkst) executable
tippecanoe=/path/to/tippecanoe/binary   # path to tippecanoe executable
pmtiles=/path/to/pmtiles/binary         # path to pmtiles executable
aws=/path/to/aws/cli/binary             # path to AWS CLI binary

# (Optional) Define path to color map. 
cmap=/path/to/color/map                 # Path to fixed color map. `CartLoader` includes one at cartloader/assets/fixed_color_map_256.tsv.

# Number of jobs
n_jobs=10                               # If not specified, the number of jobs defaults to 1.

# Activate the bioconda environment
conda activate ENV_NAME                 # replace ENV_NAME with your conda environment name
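
Before moving on, it can help to confirm that each tool path defined above actually resolves to something runnable. The following is a minimal sketch, not part of CartLoader: `check_tool` is a hypothetical helper that accepts either a path to an executable file or a command name found on `PATH`.

```shell
# Hypothetical helper (not part of CartLoader): report whether a tool resolves
# to an executable file or to a command found on PATH.
check_tool() {
  tool_name=$1
  tool_path=$2
  if [ -f "$tool_path" ] && [ -x "$tool_path" ]; then
    echo "ok       $tool_name -> $tool_path"
  elif command -v "$tool_path" >/dev/null 2>&1; then
    echo "ok       $tool_name -> $(command -v "$tool_path")"
  else
    echo "MISSING  $tool_name ($tool_path)" >&2
    return 1
  fi
}

# Example usage with the variables defined above (uncomment after editing the paths):
# check_tool spatula    "${spatula}"
# check_tool punkst     "${punkst}"
# check_tool tippecanoe "${tippecanoe}"
# check_tool pmtiles    "${pmtiles}"
# check_tool aws        "${aws}"
```

A non-zero return status for any tool suggests the corresponding placeholder path still needs to be replaced.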

Prepare Input¶

Data Access¶

The example input data is hosted on Zenodo (DOI: 10.5281/zenodo.15701393).

Download the example data:

mkdir -p ${work_dir}/sge && cd ${work_dir}/sge

wget https://zenodo.org/records/17953582/files/seqscope_starter.std.tar.gz
tar -zxvf seqscope_starter.std.tar.gz
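
After extraction, a quick check that the three input files used in this tutorial are present can catch an incomplete download early. This is a sketch under the assumption that the archive unpacks the files directly into `${work_dir}/sge` (the location the later `run_ficture2` command reads from); `check_sge_inputs` is a hypothetical helper name.

```shell
# Hypothetical helper: verify that the three tutorial input files exist and are
# non-empty in the given directory.
check_sge_inputs() {
  sge_dir=$1
  status=0
  for f in transcripts.unsorted.tsv.gz feature.clean.tsv.gz coordinate_minmax.tsv; do
    if [ -s "$sge_dir/$f" ]; then
      echo "found    $f"
    else
      echo "missing  $f" >&2
      status=1
    fi
  done
  return $status
}

# Example usage (uncomment once work_dir is set):
# check_sge_inputs "${work_dir}/sge"
```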

File Format¶

The input is a mouse hippocampus spatial gene expression (SGE) dataset that has already been converted to a FICTURE-compatible format using sge_convert in CartLoader.

transcripts.unsorted.tsv.gz: transcript-indexed SGE in TSV

X    Y        gene   count
29   1422.35  Myo3a  0
54   1110.72  Med14  1
54   1110.72  Ntpcr  1

X: X coordinates in um
Y: Y coordinates in um
gene: gene symbols
count: expression count per pixel per gene
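
To confirm that a transcript file follows this layout, one option is to peek at its first few lines and check the header. The sketch below assumes the header uses exactly the column names shown above (`X`, `Y`, `gene`, `count`) and that the file is tab-separated; `check_transcript_header` is a hypothetical helper, not a CartLoader command.

```shell
# Hypothetical helper: print the header plus the first three records of a
# gzipped TSV, and fail if the header lacks any of the four expected columns.
check_transcript_header() {
  gzip -cd "$1" | awk -F'\t' '
    NR == 1 {
      for (i = 1; i <= NF; i++) seen[$i] = 1
      if (!(("X" in seen) && ("Y" in seen) && ("gene" in seen) && ("count" in seen))) {
        print "unexpected header: " $0 > "/dev/stderr"
        bad = 1
        exit
      }
    }
    NR <= 4 { print }
    NR > 4  { exit }
    END     { exit bad }
  '
}

# Example usage:
# check_transcript_header ./sge/transcripts.unsorted.tsv.gz
```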

feature.clean.tsv.gz: UMI counts on a per-gene basis in TSV

gene           gene_id             count
Gm29155        ENSMUSG00000100764  1
Pcmtd1         ENSMUSG00000051285  431
Gm26901        ENSMUSG00000097797  1

gene: gene symbols
gene_id: gene IDs
count: expression count per gene

coordinate_minmax.tsv: X Y min/max coordinates

xmin    0.14
xmax    2359.90
ymin    0.23
ymax    1439.95

xmin xmax: min and max X coordinates in um
ymin ymax: min and max Y coordinates in um
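
As a small worked example, the extent of the tissue area can be derived from this file by subtracting each minimum from its maximum. The sketch below assumes the two-column key/value layout shown above; `minmax_extent` is a hypothetical helper name.

```shell
# Hypothetical helper: compute the X and Y extent (in um) from a
# coordinate_minmax.tsv file with "key<whitespace>value" rows.
minmax_extent() {
  awk '
    { v[$1] = $2 }
    END {
      printf "width_um\t%.2f\n",  v["xmax"] - v["xmin"]
      printf "height_um\t%.2f\n", v["ymax"] - v["ymin"]
    }
  ' "$1"
}

# Example usage:
# minmax_extent ./sge/coordinate_minmax.tsv
```

With the values listed above, this gives a width of 2359.76 um (2359.90 - 0.14) and a height of 1439.72 um (1439.95 - 0.23).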

Define ID and Parameters¶

cd ${work_dir}

# Unique identifier for your dataset
DATA_ID="seqscope_hippo"                # change this to reflect your dataset name
PLATFORM="seqscope"                     # platform information

# LDA parameters
train_width=18                            # define LDA training hexagon width (comma-separated if multiple widths are applied)
n_factor=6,12                             # define number of factors in LDA training (comma-separated if multiple n-factor values are provided)
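
To see which settings a pair of comma-separated values describes, the lists can be expanded into individual combinations. This sketch assumes that every width is paired with every factor count, which is an assumption about how the lists are interpreted rather than something stated in this tutorial; `list_lda_settings` is a hypothetical helper.

```shell
# Hypothetical helper: expand comma-separated widths and factor counts into
# one line per (width, n_factor) combination.
list_lda_settings() {
  for w in $(echo "$1" | tr ',' ' '); do
    for k in $(echo "$2" | tr ',' ' '); do
      echo "width=${w} n_factor=${k}"
    done
  done
}

train_width=18
n_factor=6,12
list_lda_settings "$train_width" "$n_factor"
# prints:
# width=18 n_factor=6
# width=18 n_factor=12
```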

SGE Format Conversion¶

The example dataset is already provided in FICTURE-compatible SGE format, so this conversion step is not required in this quickstart.

`FICTURE` Analysis¶

Compute spatial factors using punkst (FICTURE2). See more details on the Reference page.

cartloader run_ficture2 \
  --makefn run_ficture2.mk \
  --main \
  --in-transcript ./sge/transcripts.unsorted.tsv.gz \
  --in-feature ./sge/feature.clean.tsv.gz \
  --in-minmax ./sge/coordinate_minmax.tsv \
  --cmap-file ${cmap} \
  --exclude-feature-regex '^(mt-.*$|Gm\d+$)' \
  --out-dir ./ficture2 \
  --width ${train_width} \
  --n-factor ${n_factor} \
  --spatula ${spatula} \
  --ficture2 ${punkst} \
  --n-jobs ${n_jobs} \
  --threads ${n_jobs}

Parameter	Required	Type	Description
`--main`	required ¹	flag	Enable `CartLoader` to run all five steps
`--in-transcript`	required	string	Path to input transcript-level SGE file
`--out-dir`	required	string	Path to output directory
`--width`	required	int or comma-separated list	LDA training hexagon width(s)
`--n-factor`	required	int or comma-separated list	Number of LDA factors
`--makefn`		string	File name for the generated Makefile (default: `run_ficture2.mk`)
`--in-feature`		string	Path to input feature file
`--in-minmax`		string	Path to input coordinate min/max file
`--cmap-file`		string	Path to color map file
`--exclude-feature-regex`		regex	Pattern to exclude features
`--spatula`		string	Path to the `spatula` binary (default: `spatula`)
`--ficture2`		string	Path to the `punkst` directory (defaults to `punkst` repository within `submodules` directory of `CartLoader`)
`--n-jobs`		int	Number of parallel jobs (default: `1`)
`--threads`		int	Number of threads per job (default: `1`)

¹ CartLoader requires the user to specify at least one action. Available actions: --tile (tiling step), --segment (segmentation step), --init-lda (LDA training step), --decode (decoding step), --summary (summarization step), and --main (all five actions above).
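
To preview which genes a pattern like the one passed to `--exclude-feature-regex` would remove, it can be tried against a few gene symbols first. The sketch below uses `grep -E`, which does not understand `\d`, so the pattern is rewritten with `[0-9]`; treating the two forms as equivalent is an assumption about how CartLoader interprets the regex. `preview_excluded` is a hypothetical helper, and `mt-Co1` is added only as an illustrative mitochondrial gene symbol.

```shell
# POSIX ERE equivalent of '^(mt-.*$|Gm\d+$)' (assumed equivalent; see note above)
exclude_regex='^(mt-.*$|Gm[0-9]+$)'

# Hypothetical helper: read gene symbols on stdin and print those that the
# pattern would exclude. "|| true" keeps the status zero when nothing matches.
preview_excluded() {
  grep -E "$exclude_regex" || true
}

printf '%s\n' Gm29155 Pcmtd1 Gm26901 Myo3a mt-Co1 | preview_excluded
# prints:
# Gm29155
# Gm26901
# mt-Co1
```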

`CartLoader` Asset Packaging¶

Generate PMTiles and web-compatible tile directories. See more details on the Reference page.

Choose one of two options: run_cartload2 with FICTURE output (Example A) or run_cartload2 with SGE only (Example B).

# Example A: With FICTURE outputs (integrates factors + joins)
cartloader run_cartload2 \
  --makefn run_cartload2.mk \
  --fic-dir ./ficture2 \
  --out-dir ./cartload2 \
  --id ${DATA_ID} \
  --spatula ${spatula} \
  --pmtiles ${pmtiles} \
  --tippecanoe ${tippecanoe} \
  --n-jobs ${n_jobs} \
  --threads ${n_jobs}

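# Example B: SGE-only (package molecules without FICTURE)
# --sge-dir must point to the directory that holds the SGE files and the SGE assets file.
# Adjust ./sge_convert if your SGE data lives elsewhere (this quickstart extracts the example data to ./sge).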
cartloader run_cartload2 \
  --makefn run_cartload2.mk \
  --sge-dir ./sge_convert \
  --out-dir ./cartload2 \
  --id ${DATA_ID} \
  --spatula ${spatula} \
  --pmtiles ${pmtiles} \
  --tippecanoe ${tippecanoe} \
  --n-jobs ${n_jobs} \
  --threads ${n_jobs}

Parameter	Required	Type	Description
`--out-dir`	required	string	Path to the output directory for PMTiles and web tiles
`--id`	required	string	Dataset ID used for naming outputs and metadata
`--fic-dir`		string	Path to FICTURE outputs (enables factor layers + molecule–factor joins)
`--sge-dir`		string	Path to SGE outputs from `sge_convert` (enables SGE-only packaging)
`--in-sge-assets`		string	File name of SGE assets JSON/YAML in `--sge-dir` (default: `sge_assets.json`)
`--in-fic-params`		string	File name of FICTURE params JSON/YAML in `--fic-dir` (default: `ficture.params.json`)
`--makefn`		string	File name for the generated Makefile (default: `run_cartload2.mk`)
`--spatula`		string	Path to the `spatula` binary (default: `spatula`)
`--pmtiles`		string	Path to the `pmtiles` binary (default: `pmtiles`)
`--tippecanoe`		string	Path to the `tippecanoe` binary (default: `tippecanoe`)
`--n-jobs`		int	Number of parallel jobs (default: `1`)
`--threads`		int	Number of threads per job (default: `4`)
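
Since `--fic-dir` and `--sge-dir` select between the two packaging modes, a wrapper script might pick the flag based on which directory exists. The following is only a sketch of that idea: `choose_cartload_input` is a hypothetical helper, and preferring FICTURE output when both directories exist is an arbitrary choice made for illustration.

```shell
# Hypothetical helper: print the input flag for run_cartload2, preferring
# FICTURE output when available and falling back to SGE-only packaging.
choose_cartload_input() {
  fic_dir=$1
  sge_dir=$2
  if [ -d "$fic_dir" ]; then
    echo "--fic-dir $fic_dir"
  elif [ -d "$sge_dir" ]; then
    echo "--sge-dir $sge_dir"
  else
    echo "neither $fic_dir nor $sge_dir exists" >&2
    return 1
  fi
}

# Example usage (paths must not contain spaces, because the result is word-split):
# input_args=$(choose_cartload_input ./ficture2 ./sge_convert) &&
#   cartloader run_cartload2 ${input_args} --out-dir ./cartload2 --id "${DATA_ID}"
```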

Upload to Data Repository¶

Choose a data repository to host/share your output

CartLoader supports two upload options (AWS and Zenodo) for storing PMTiles of SGE and spatial factors in a data repository.

Choose the one that best suits your needs.

AWS Uploads

Upload the generated CartLoader outputs to your designated AWS S3 directory:

# AWS S3 target location
S3_DIR=/s3/path/to/s3/dir              # We recommend using DATA_ID as the directory name, e.g., s3://bucket_name/test-data

cartloader upload_aws \
  --in-dir ./cartload2 \
  --s3-dir "${S3_DIR}" \
  --aws ${aws} \
  --n-jobs ${n_jobs}

Parameter	Required	Type	Description
`--in-dir`	required	string	Path to the input directory containing the `CartLoader` asset packaging output
`--s3-dir`	required	string	Path to the target S3 directory for uploading
`--aws`		string	Path to the AWS CLI binary
`--n-jobs`		int	Number of parallel jobs
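
A simple guard before uploading is to check that the target looks like an S3 URI. This sketch assumes `--s3-dir` expects the `s3://bucket/prefix` form used in the example comment above; `is_s3_uri` is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if the argument starts with "s3://"
# followed by at least one character.
is_s3_uri() {
  case $1 in
    s3://?*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Example usage:
# is_s3_uri "${S3_DIR}" || echo "S3_DIR does not look like an S3 URI: ${S3_DIR}" >&2
```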

Zenodo Uploads

Upload the generated CartLoader outputs to an existing Zenodo deposition or to a new one.

zenodo_token=/path/to/zenodo/token/file    # replace /path/to/zenodo/token/file with the path to your Zenodo token file

cartloader upload_zenodo \
  --in-dir ./cartload2 \
  --upload-method catalog \
  --zenodo-token $zenodo_token \
  --title  "Your Title" \
  --creators "Lastname, Firstname" \
  --description "This is an example description"

Parameter	Required	Type	Description
`--in-dir`	required	string	Path to the input directory containing the `CartLoader` asset packaging output
`--upload-method`	required	string	Method to determine which files to upload. Options: `all` to upload all files in `--in-dir`; `catalog` to upload files listed in a catalog YAML file; `user_list` to upload files explicitly listed via `--in-list`
`--catalog-yaml`		string	Used when `--upload-method catalog`. Path to the `catalog.yaml` generated by `run_cartload2`. If omitted, the catalog in the input directory specified by `--in-dir` is used.
`--zenodo-token`	required	string	Path to your Zenodo access token file
`--title`	required	string	Required when creating a new deposition (i.e., if `--zenodo-deposition-id` is omitted). Title for the new Zenodo deposition.
`--creators`	required	list of str	List of creators in "Lastname, Firstname" format.
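
Because `--zenodo-token` takes a path to a file holding a secret, it may be worth checking that the file exists, is non-empty, and is not readable by other users before uploading. This is a general shell sketch rather than a CartLoader requirement; `check_token_file` is a hypothetical helper.

```shell
# Hypothetical helper: fail if the token file is missing or empty, and warn if
# group or other users have any permission on it.
check_token_file() {
  if [ ! -s "$1" ]; then
    echo "token file missing or empty: $1" >&2
    return 1
  fi
  # Characters 5-10 of the ls -l mode string are the group and other bits.
  perms=$(ls -l "$1" | cut -c5-10)
  if [ "$perms" != "------" ]; then
    echo "warning: $1 is accessible to other users; consider chmod 600" >&2
  fi
  return 0
}

# Example usage:
# check_token_file "${zenodo_token}"
```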

Output Data¶

View/Explore¶

The outputs are available in both CartoScope and Zenodo.

Explore in CartoScope

Download from Zenodo

See output details in the reference pages for run_ficture2 and run_cartload2.