2 Quick Access

The data from the recount3 project is accessible through R/Bioconductor packages as well as raw files. You can access the data through:

  • recount3: this R/Bioconductor package retrieves the data from the IDIES servers and builds RangedSummarizedExperiment objects by study that can be used with many Bioconductor analysis packages for downstream analyses (DOI: 10.1038/nmeth.3252).
  • snapcount: this R/Bioconductor package also provides access to data from recount2 and recount3 through a query-based interface.
  • raw files: these are text files that are hosted by IDIES and can be used from programming languages and tools other than R.

As noted in the Bioconductor chapter, longer tutorials (called vignettes in R/Bioconductor) that show how to install and use our R/Bioconductor packages are available from each package's Bioconductor landing page (that is, its page on the Bioconductor website), such as the recount3 quick start guide.

2.1 Quick recount3

If you want to access the data through recount3, here’s some R code that you can use. See the Bioconductor chapter for the full output of these commands and more details.

## Install the recount3 R/Bioconductor package
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("recount3")

## Load recount3 R package
library("recount3")

## Find all available human projects
human_projects <- available_projects()

## Find the project you are interested in,
## here we use SRP009615 as an example
proj_info <- subset(
    human_projects,
    project == "SRP009615" & project_type == "data_sources"
)

## Create a RangedSummarizedExperiment (RSE) object at the gene level
rse_gene_SRP009615 <- create_rse(proj_info)

## Explore that RSE object
rse_gene_SRP009615
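
The project-selection step can be tried offline on a small stand-in table. The sketch below is only an illustration: it assumes that the table returned by available_projects() has `project` and `project_type` columns (as used in the code above) plus an `organism` column, and every row in it is made up, including the hypothetical second SRP009615 row that shows why the `project_type` condition is useful.

```r
## Toy stand-in for the table returned by available_projects().
## All rows are illustrative; the "collections" row is hypothetical.
toy_projects <- data.frame(
    project = c("SRP009615", "SRP009615", "TOY000001"),
    organism = c("human", "human", "human"),
    project_type = c("data_sources", "collections", "data_sources"),
    stringsAsFactors = FALSE
)

## Same filter as in the recount3 code: match on the project ID and
## keep only the row that describes a data source
toy_info <- subset(
    toy_projects,
    project == "SRP009615" & project_type == "data_sources"
)

## The filtered table is what gets passed on to create_rse(), so checking
## that exactly one row matched is a cheap sanity check
stopifnot(nrow(toy_info) == 1)
toy_info
```

Without the `project_type` condition, the filter on this toy table would return two rows instead of one.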

2.2 Quick snapcount

If you want to access the data through queries, such as a set of regions or annotation features, instead of downloading the data at the project level, we recommend using snapcount. It is an R/Bioconductor package for interfacing with Snaptron’s REST API. Here’s some example code you can use to get started with snapcount.

## Install the snapcount R/Bioconductor package
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("snapcount")

## Load snapcount R package
library("snapcount")

## snapcount can be used with either a procedural interface
query_jx(compilation = "gtex", regions = "CD99")
query_jx(compilation = "gtex", regions = "CD99", range_filters = samples_count == 10)

## or using the query-builder class
sb <- SnaptronQueryBuilder$new()

2.3 Quick raw files

The raw recount3 files are hosted by IDIES and are publicly available. We separated every piece of information into its own file. These files can be accessed without R, using whichever programming language or tool you prefer. For example, the files for human study SRP009615 annotated with GENCODE v26 are:

Metadata files:

  1. http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.sra.SRP009615.MD.gz
  2. http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_project.SRP009615.MD.gz
  3. http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_qc.SRP009615.MD.gz
  4. http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_seq_qc.SRP009615.MD.gz
  5. http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_pred.SRP009615.MD.gz

Gene files:

  1. http://duffel.rail.bio/recount3/human/data_sources/sra/gene_sums/15/SRP009615/sra.gene_sums.SRP009615.G026.gz

Annotation files:

  1. http://duffel.rail.bio/recount3/human/annotations/gene_sums/human.gene_sums.G026.gtf.gz
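
The URLs listed above follow a regular pattern, which the base R sketch below tries to reproduce. The helper `recount3_raw_urls()` and its arguments are hypothetical and not part of any package. It assumes that the two-character directory ("15" for SRP009615) is the last two characters of the project ID, and that other projects and data sources use the same layout as this one example; check the Raw Files chapter before relying on it.

```r
## Hypothetical helper (not part of recount3): rebuild the raw file URLs
## listed above from a project ID.
recount3_raw_urls <- function(project,
                              organism = "human",
                              data_source = "sra",
                              annotation = "G026",
                              base_url = "http://duffel.rail.bio/recount3") {
    ## Assumption: the two-character directory is the last two
    ## characters of the project ID ("15" for SRP009615)
    suffix <- substr(project, nchar(project) - 1, nchar(project))
    root <- paste(base_url, organism, "data_sources", data_source, sep = "/")

    ## Metadata tables, in the same order as the list above
    md_tables <- c(
        data_source, "recount_project", "recount_qc",
        "recount_seq_qc", "recount_pred"
    )
    metadata <- sprintf(
        "%s/metadata/%s/%s/%s.%s.%s.MD.gz",
        root, suffix, project, data_source, md_tables, project
    )

    ## Gene-level file for the chosen annotation
    gene <- sprintf(
        "%s/gene_sums/%s/%s/%s.gene_sums.%s.%s.gz",
        root, suffix, project, data_source, project, annotation
    )

    ## Annotation file, shared by all projects of the organism
    annotation_gtf <- sprintf(
        "%s/%s/annotations/gene_sums/%s.gene_sums.%s.gtf.gz",
        base_url, organism, organism, annotation
    )

    list(metadata = metadata, gene = gene, annotation = annotation_gtf)
}

urls <- recount3_raw_urls("SRP009615")
urls$metadata
urls$gene
urls$annotation
```

For SRP009615 with the default arguments, the helper returns the same seven URLs as the lists above.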

To explore which projects are available in recount3, you can start from the project-level metadata file of each data source:

  1. http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/sra.recount_project.MD.gz
  2. http://duffel.rail.bio/recount3/human/data_sources/gtex/metadata/gtex.recount_project.MD.gz
  3. http://duffel.rail.bio/recount3/human/data_sources/tcga/metadata/tcga.recount_project.MD.gz
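
As a rough illustration of working with such a file using only base R, the sketch below reads a gzip-compressed text table. It assumes the `.MD.gz` files are tab-delimited with a header row; the helper `read_md_gz()` is hypothetical, and the demonstration file contains made-up columns and values so that the code runs without a network connection.

```r
## Hypothetical helper: read a gzip-compressed, tab-delimited text file.
## Assumption: the *.MD.gz files are tab-delimited with a header row.
read_md_gz <- function(path) {
    read.delim(
        gzfile(path),
        sep = "\t",
        quote = "",
        comment.char = "",
        check.names = FALSE,
        stringsAsFactors = FALSE
    )
}

## Offline demonstration on a toy file (made-up columns and values)
toy_file <- tempfile(fileext = ".MD.gz")
con <- gzfile(toy_file, "w")
writeLines(c("project\tn_samples", "TOY000001\t3", "TOY000002\t12"), con)
close(con)

toy_md <- read_md_gz(toy_file)
toy_md

## For a real file, download it first and read it the same way, e.g.:
## url <- "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/sra.recount_project.MD.gz"
## local_file <- tempfile(fileext = ".MD.gz")
## download.file(url, local_file, mode = "wb")
## sra_projects <- read_md_gz(local_file)
```

Downloading to a local file first, rather than reading straight from the URL, keeps the decompression step simple and avoids repeated downloads while exploring.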

For more details about the structure of these files, check the Raw Files chapter.