Data Repository Upload

Overview

cartloader offers two modules, upload_aws and upload_zenodo, for uploading all generated outputs (rasterized SGE tiles, decoded spatial factor maps, and molecule-factor joins) to a target data repository for sharing or deployment. Both AWS and Zenodo are supported as upload backends, so users can choose their preferred platform.

Requirements

  • A completed run of run_cartload2, which produces:
    • Rasterized SGE tiles
    • Decoded spatial factor maps
    • Joined molecule-factor outputs
    • A catalog file (catalog.yaml) summarizing the output structure and metadata
  • For AWS uploads:
    • AWS CLI installed and configured (e.g., via aws configure)
  • For Zenodo uploads:

    • A personal access token saved in a file, required for authentication with the Zenodo API
    • (Optional) A Zenodo deposition ID if the user prefers to upload files to an existing deposition.
    What are the Zenodo token and deposition ID, why do you need them, and how do you get them?

    Zenodo Token File

    cartloader uses the Zenodo API to upload files. Zenodo requires an access token to authenticate uploads made through the API.

    To obtain a token for use with cartloader:

    1. Log in to Zenodo.
    2. Go to your applications page.
    3. Click "New Token" and select appropriate scopes (e.g., deposit:write, deposit:actions).
    4. Copy the generated token.
    5. Save the token in a plain text file, and pass the file path to the --zenodo-token option when running cartloader.
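Step 5 can be sketched in shell as follows. This is a minimal sketch, not part of cartloader: `save_zenodo_token` is a hypothetical helper, and it assumes a single line of text with a trailing newline is an acceptable token file, which this page does not specify.

```shell
# save_zenodo_token is a hypothetical helper, not a cartloader command.
# Assumption: the token file holds just the token on one line.
save_zenodo_token() {
  token="$1"        # token string copied from Zenodo
  token_file="$2"   # where to store it
  # Create the file with owner-only permissions so the token stays private.
  ( umask 077 && printf '%s\n' "$token" > "$token_file" ) || return 1
  # Tighten permissions as well in case the file already existed.
  chmod 600 "$token_file"
}

# Example call (note that a literal token typed on the command line
# may be recorded in your shell history):
# save_zenodo_token "YOUR_ZENODO_TOKEN" "$HOME/.zenodo_token"
```

The resulting path is what you would pass to --zenodo-token.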

    Zenodo Deposition ID

    A deposition ID is a unique numeric identifier assigned to a deposition (i.e., a dataset record) you create on Zenodo. This ID specifies where your uploaded files will be stored. If you’ve already created a deposition, you can find its ID at the end of the URL. For example:

    https://zenodo.org/deposit/1234567
                                    ↑
                            This is the deposition ID
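If you prefer to pull the ID out of the URL in a script, plain shell parameter expansion is enough. The sketch below assumes the URL has the form shown above, with the numeric ID as the last path component; the variable names are illustrative.

```shell
deposit_url="https://zenodo.org/deposit/1234567"

# Drop a trailing slash, if any, then keep everything after the last "/".
deposit_url="${deposit_url%/}"
zenodo_deposition_id="${deposit_url##*/}"

# Reject anything that is empty or not purely numeric.
case "$zenodo_deposition_id" in
  ''|*[!0-9]*) echo "could not find a numeric deposition ID in the URL" >&2 ;;
  *)           echo "$zenodo_deposition_id" ;;
esac
```

For the example URL this prints 1234567.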
    


AWS Uploads

We recommend creating a directory within your AWS S3 bucket, using a dataset ID as the directory name.

AWS_BUCKET="EXAMPLE_AWS_BUCKET"         # replace EXAMPLE_AWS_BUCKET with your actual S3 bucket name
DATA_ID="EXAMPLE_ID"                    # change EXAMPLE_ID to reflect your dataset name

cartloader upload_aws \
  --in-dir /path/to/run_cartload2/output/directory \
  --s3-dir "s3://${AWS_BUCKET}/${DATA_ID}" \
  --aws /path/to/your/aws/binary \
  --n-jobs 10
  • --in-dir (str): Path to the input directory containing the cartloader compilation output.
  • --s3-dir (str): Path to the target S3 directory for uploading.
  • --aws (str): Path to the AWS CLI binary.
  • --n-jobs (int): Number of parallel jobs.
  • --catalog-yaml (str): Path to the catalog.yaml file generated by run_cartload2. If omitted, the catalog.yaml in the input directory specified by --in-dir is used.
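Before starting a long upload, it can help to confirm that the input directory really is a finished run_cartload2 output. The sketch below checks only for the directory and its catalog.yaml; `check_cartload_output` is a hypothetical helper written for this page, not a cartloader command, and the bucket and ID values are the same placeholders used above.

```shell
# check_cartload_output is a hypothetical helper, not part of cartloader.
# It returns 0 only if the directory exists and contains catalog.yaml.
check_cartload_output() {
  in_dir="$1"
  if [ ! -d "$in_dir" ]; then
    echo "input directory not found: $in_dir" >&2
    return 1
  fi
  if [ ! -f "$in_dir/catalog.yaml" ]; then
    echo "catalog.yaml not found in: $in_dir" >&2
    return 1
  fi
  return 0
}

# Compose the S3 target from the bucket name and dataset ID.
AWS_BUCKET="EXAMPLE_AWS_BUCKET"
DATA_ID="EXAMPLE_ID"
s3_dir="s3://${AWS_BUCKET}/${DATA_ID}"
echo "upload target: ${s3_dir}"

# Warn instead of aborting so the snippet can be pasted into an interactive shell.
check_cartload_output "/path/to/run_cartload2/output/directory" \
  || echo "fix the input directory before running cartloader upload_aws"
```

If the check passes, pass the same directory to --in-dir and the composed value to --s3-dir.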

Zenodo Uploads

Define Zenodo deposition

You must choose one of the following modes:

  • Use an existing deposition via --zenodo-deposition-id
  • Create a new deposition via --create-new-deposition

(1) To upload files to an existing deposition:

zenodo_deposition_ID=DEPOSITION_ID              # Replace DEPOSITION_ID with your deposition ID

cartloader upload_zenodo \
    --in-dir /path/to/run_cartload2/output/directory \
    --upload-method catalog \
    --zenodo-token /path/to/zenodo_token.txt \
    --zenodo-deposition-id ${zenodo_deposition_ID}

(2) To upload files to a new deposition:

cartloader upload_zenodo \
  --upload-method catalog \
  --in-dir /path/to/run_cartload2/output/directory \
  --zenodo-token /path/to/zenodo_token.txt \
  --create-new-deposition \
  --title "Title Info" \
  --creators "Creator Name" \
  --description "Description Info"

Input Parameters

  • --in-dir (str): Path to the input directory containing the run_cartload2 output files.
  • --upload-method (str, default: all): Method to determine which files to upload. Options:
    • all: Upload all files in --in-dir
    • catalog: Upload files listed in a catalog YAML file
    • user_list: Upload files explicitly listed via --in-list
  • --in-list (list of str): Required if using --upload-method user_list. Accepts multiple filenames.
  • --catalog-yaml (str): Used with --upload-method catalog. Path to the catalog.yaml file generated by run_cartload2. If omitted, the catalog.yaml in the input directory specified by --in-dir is used.

Zenodo Configuration

Must specify exactly one of --zenodo-deposition-id or --create-new-deposition.

  • --zenodo-token (str): Path to your Zenodo access token file.
  • --zenodo-deposition-id (str): A Zenodo deposition ID to upload files to.
  • --create-new-deposition (flag): If set, a new Zenodo deposition will be created.
  • --create-new-version (flag): If set, a new version will be created for the provided deposition ID. This is useful for updating the files of a published deposition. When enabled, --zenodo-deposition-id must also be provided.
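In a wrapper script, the "exactly one of" rule can be checked before cartloader is invoked. The following is a minimal sketch under that reading of the rule; `zenodo_mode_args` is a hypothetical helper, and only the two flags it prints come from this page.

```shell
# zenodo_mode_args is a hypothetical helper, not part of cartloader.
# It prints the flag for the chosen mode, or fails if the choice is ambiguous.
zenodo_mode_args() {
  deposition_id="$1"   # empty string when no existing deposition is targeted
  create_new="$2"      # "yes" to create a new deposition
  if [ -n "$deposition_id" ] && [ "$create_new" = "yes" ]; then
    echo "choose an existing deposition ID or a new deposition, not both" >&2
    return 1
  elif [ -n "$deposition_id" ]; then
    printf '%s\n' "--zenodo-deposition-id $deposition_id"
  elif [ "$create_new" = "yes" ]; then
    printf '%s\n' "--create-new-deposition"
  else
    echo "specify a deposition ID or request a new deposition" >&2
    return 1
  fi
}

# Example: target an existing deposition.
mode_args="$(zenodo_mode_args "1234567" "no")" && echo "$mode_args"
# The unquoted $mode_args could then be appended to a cartloader upload_zenodo call.
```

The example call prints `--zenodo-deposition-id 1234567`.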

Deposition Metadata

Required only if creating a new deposition.

  • --title (str): Title for the new Zenodo deposition.
  • --upload-type (str, default: dataset): Type of deposition. Options: dataset, software, publication, poster, presentation, image, video, lesson, other
  • --creators (list of str): List of creators in "Lastname, Firstname" format.

Behavior Flags

  • --overwrite (flag): If set, overwrite existing files in the Zenodo deposition.
  • --dry-run (flag): If set, simulate the upload without modifying the Zenodo deposition.
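One way to use --dry-run is as a rehearsal before the real upload. The sketch below runs the same arguments twice, first with --dry-run and then without. It assumes that a failed dry run exits with a non-zero status, which this page does not confirm. `upload_with_rehearsal` and the `CARTLOADER` override are hypothetical names introduced for illustration.

```shell
# upload_with_rehearsal is a hypothetical wrapper, not part of cartloader.
# CARTLOADER is an illustrative override: set it to "echo" to preview the
# commands without running them.
upload_with_rehearsal() {
  # Rehearse first; stop here if the dry run reports a failure.
  "${CARTLOADER:-cartloader}" upload_zenodo "$@" --dry-run || return 1
  # Same arguments again, this time performing the upload.
  "${CARTLOADER:-cartloader}" upload_zenodo "$@"
}

# Preview only (prints both command lines instead of executing cartloader):
# CARTLOADER=echo upload_with_rehearsal \
#   --in-dir /path/to/run_cartload2/output/directory \
#   --upload-method catalog \
#   --zenodo-token /path/to/zenodo_token.txt \
#   --zenodo-deposition-id 1234567
```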