Accessing Example Datasets¶

We have made available three datasets associated with the NovaScope protocol to be used as input for the NovaScope Exemplary Downstream Analysis (NEDA).

Data Overview¶

Created using NovaScope, these datasets originate from FASTQ files derived from the same liver tissue section of an 8-week-old C57BL/6 wild-type male mouse.

Minimal Test Run Dataset¶

The minimal test run dataset was created from a subset of a liver section with a relatively shallow 2nd-Seq sequencing depth. It is intended for preliminary testing of NEDA, primarily to validate that the NEDA scripts run correctly. It is not designed to yield biological insights.

Shallow Liver Section Dataset¶

This dataset was generated from a Seq-Scope dataset of a tissue section with a relatively shallow 2nd-Seq library sequencing depth (approximately 163 million paired-end reads). FASTQ files of this depth should be sufficient to investigate major cell types and the marker genes that reflect liver cell diversity, and to perform basic pixel-level decoding of the spatial transcriptome. The dataset also includes a set of aligned Hematoxylin and Eosin (H&E) stained histology images.

Deep Liver Section Dataset¶

The initial examination of the shallow dataset was encouraging, prompting a more extensive sequencing of the sample tissue to fully saturate the library (i.e., approximately 2.61 billion paired-end reads). The deep dataset was produced using all pairs of 2nd-seq FASTQ files. While datasets with shallower sequencing depths offer valuable insights, deep sequencing allows for a more thorough exploration of the data. This dataset also includes a set of aligned H&E stained histology images.

Download Datasets¶

All datasets are provided under a single DOI, accessible via this URL: https://doi.org/10.5281/zenodo.10841777. The most recent version of the dataset is version 3.

Since Pixel-level Analysis and Cell Segmentation-based Analysis require different input files, we provide these files in separate tarball archives. Only the shallow liver dataset and the deep liver dataset include input files for the Cell Segmentation-based Analysis, because only these two datasets come with histology files.

Input for Spatial Transcriptomic analysis¶

Tarball files with this naming convention are the input files for Pixel-level Analysis:

`<prefix>_pixel_<release_date>.tar.gz`
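As a rough illustration of the naming convention, the sketch below builds a tarball name and its download URL from a prefix, an analysis type, and a release date. The function names `tarball_name` and `tarball_url` are hypothetical helpers and not part of NEDA; the record ID and URL pattern are copied from the download commands in this guide.

```shell
# Hypothetical helpers (not part of NEDA): build tarball names and URLs.
RECORD_ID=12773392   # Zenodo record ID used by the download commands in this guide

tarball_name() {
    # $1 = prefix (minimal, shallow, deep)
    # $2 = analysis type (pixel or cellseg)
    # $3 = release date (for example 20240718)
    printf '%s_%s_%s.tar.gz\n' "$1" "$2" "$3"
}

tarball_url() {
    # Same arguments as tarball_name; prints the matching Zenodo file URL.
    printf 'https://zenodo.org/records/%s/files/%s?download=1\n' \
        "$RECORD_ID" "$(tarball_name "$@")"
}

tarball_name minimal pixel 20240718
tarball_url minimal pixel 20240718
```

When passing such a URL to `curl`, quote it so that the shell does not try to interpret the `?` character.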

Minimal Test Run Dataset:

## To download the tarball from Zenodo, you can use the following command.
curl -o minimal_pixel_20240718.tar.gz "https://zenodo.org/records/12773392/files/minimal_pixel_20240718.tar.gz?download=1"

## (Optional) Verify the integrity of the tarball file.
curl -o minimal_pixel_20240718.tar.gz.md5 "https://zenodo.org/records/12773392/files/minimal_pixel_20240718.tar.gz.md5?download=1"
md5sum -c minimal_pixel_20240718.tar.gz.md5

## Uncompress the tarball using the following command.
tar -zxvf minimal_pixel_20240718.tar.gz

Shallow Liver Section Dataset:

## To download the tarball from Zenodo, you can use the following command.
curl -o shallow_pixel_20240718.tar.gz "https://zenodo.org/records/12773392/files/shallow_pixel_20240718.tar.gz?download=1"

## (Optional) Verify the integrity of the tarball file.
curl -o shallow_pixel_20240718.tar.gz.md5 "https://zenodo.org/records/12773392/files/shallow_pixel_20240718.tar.gz.md5?download=1"
md5sum -c shallow_pixel_20240718.tar.gz.md5

## Uncompress the tarball using the following command.
tar -zxvf shallow_pixel_20240718.tar.gz

Deep Liver Section Dataset:

## To download the tarball from Zenodo, you can use the following command.
curl -o deep_pixel_20240718.tar.gz "https://zenodo.org/records/12773392/files/deep_pixel_20240718.tar.gz?download=1"

## (Optional) Verify the integrity of the tarball file.
curl -o deep_pixel_20240718.tar.gz.md5 "https://zenodo.org/records/12773392/files/deep_pixel_20240718.tar.gz.md5?download=1"
md5sum -c deep_pixel_20240718.tar.gz.md5

## Uncompress the tarball using the following command.
tar -zxvf deep_pixel_20240718.tar.gz
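The three download, verify, and extract sequences above differ only in the dataset prefix, so they can be wrapped in a loop. The following is a minimal sketch under that assumption; `run`, `fetch_pixel`, and the `DRY_RUN` switch are hypothetical names introduced here for illustration. With `DRY_RUN=1` (the default in this sketch) the commands are only printed, without shell quoting, so nothing is downloaded until you set `DRY_RUN=0`.

```shell
# Sketch only: fetch, verify, and unpack each Pixel-level Analysis tarball.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}
BASE_URL="https://zenodo.org/records/12773392/files"   # from the commands above
RELEASE_DATE=20240718

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

fetch_pixel() {
    # $1 = dataset prefix: minimal, shallow, or deep
    tarball="${1}_pixel_${RELEASE_DATE}.tar.gz"
    run curl -o "$tarball" "${BASE_URL}/${tarball}?download=1" &&
    run curl -o "${tarball}.md5" "${BASE_URL}/${tarball}.md5?download=1" &&
    run md5sum -c "${tarball}.md5" &&
    run tar -zxvf "$tarball"
}

for prefix in minimal shallow deep; do
    # Stop at the first dataset whose download, check, or extraction fails.
    fetch_pixel "$prefix" || { echo "Failed on ${prefix}" >&2; break; }
done
```

Chaining the steps with `&&` means a tarball is extracted only if its checksum verification succeeded, which makes the optional integrity check mandatory in this sketch.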

Input for Cell Segmentation-based Analysis¶

Tarball files with this naming convention are the input files for Cell Segmentation-based Analysis:

`<prefix>_cellseg_<release_date>.tar.gz`

Shallow Liver Section Dataset:

## To download the tarball from Zenodo, you can use the following command.
curl -o shallow_cellseg_20240718.tar.gz "https://zenodo.org/records/12773392/files/shallow_cellseg_20240718.tar.gz?download=1"

## (Optional) Verify the integrity of the tarball file.
curl -o shallow_cellseg_20240718.tar.gz.md5 "https://zenodo.org/records/12773392/files/shallow_cellseg_20240718.tar.gz.md5?download=1"
md5sum -c shallow_cellseg_20240718.tar.gz.md5

## Uncompress the tarball using the following command.
tar -zxvf shallow_cellseg_20240718.tar.gz

Deep Liver Section Dataset:

## To download the tarball from Zenodo, you can use the following command.
curl -o deep_cellseg_20240718.tar.gz "https://zenodo.org/records/12773392/files/deep_cellseg_20240718.tar.gz?download=1"

## (Optional) Verify the integrity of the tarball file.
curl -o deep_cellseg_20240718.tar.gz.md5 "https://zenodo.org/records/12773392/files/deep_cellseg_20240718.tar.gz.md5?download=1"
md5sum -c deep_cellseg_20240718.tar.gz.md5

## Uncompress the tarball using the following command.
tar -zxvf deep_cellseg_20240718.tar.gz
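The optional integrity checks above rely on `md5sum`, which may not be installed on every system (for example, some BSD or macOS setups provide `md5` instead). The sketch below shows one possible fallback. The helpers `md5_of` and `verify_md5` are hypothetical and not part of NEDA, and the sketch assumes that each `.md5` file begins with the hexadecimal digest, as in the format that `md5sum -c` reads.

```shell
# Sketch: verify a file against its .md5 file when md5sum may be missing.
# Assumes the .md5 file starts with the hex digest (the format md5sum -c reads).
md5_of() {
    # Print the MD5 digest of file $1 using whichever tool is available.
    if command -v md5sum >/dev/null 2>&1; then
        md5sum "$1" | awk '{print $1}'
    elif command -v md5 >/dev/null 2>&1; then
        md5 -q "$1"                          # BSD/macOS md5 utility
    else
        openssl md5 "$1" | awk '{print $NF}' # digest is the last field
    fi
}

verify_md5() {
    # $1 = data file, $2 = checksum file (defaults to "$1.md5")
    expected=$(awk 'NR==1 {print $1}' "${2:-$1.md5}")
    actual=$(md5_of "$1")
    if [ "$expected" = "$actual" ]; then
        echo "$1: OK"
    else
        echo "$1: FAILED" >&2
        return 1
    fi
}
```

For example, after downloading both files, `verify_md5 deep_cellseg_20240718.tar.gz` would compare the tarball against `deep_cellseg_20240718.tar.gz.md5` in the current directory.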