2 Quick Access
The data from the recount3
project is accessible through R/Bioconductor packages as well as raw files. You can access the data through:
- recount3: this R/Bioconductor package retrieves the data from the IDIES servers and builds
RangedSummarizedExperiment
objects by study that can be used with many Bioconductor analysis packages for downstream analyses (DOI: 10.1038/nmeth.3252). - snapcount: this R/Bioconductor package also provides access to data from
recount2
andrecount3
through a query-based interface. - raw files: these are text files that are hosted by IDIES and can be used with other programming software outside of R.
As noted in the Bioconductor chapter, longer tutorials (called vignette documents in R/Bioconductor) showing how to install and use our R/Bioconductor packages are available from the Bioconductor landing pages (that is, the Bioconductor website for each package), such as the one called recount3
quick start guide.
2.1 Quick recount3
If you want to access the data recount3, here’s some R code that you can use. See the Bioconductor chapter for the full output of these commands and more details.
## Install the recount3 R/Bioconductor package
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("recount3")
## Load recount3 R package
library("recount3")
## Find all available human projects
human_projects <- available_projects()
## Find the project you are interested in,
## here we use SRP009615 as an example
proj_info <- subset(
human_projects,
project == "SRP009615" & project_type == "data_sources"
)
## Create a RangedSummarizedExperiment (RSE) object at the gene level
rse_gene_SRP009615 <- create_rse(proj_info)
## Explore that RSE object
rse_gene_SRP009615
2.2 Quick snapcount
If you want to access the data through queries such as a set of regions or annotation features instead of downloading the data at a project-level, we recommend using snapcount. It is an R/Bioconductor package for interfacing with Snaptron’s REST API. Here’s some example code you can use to get started with snapcount
.
## Install the snapcount R/Bioconductor package
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("snapcount")
## Load snapcount R package
library("snapcount")
## snapcount can be used with either a procedural interface
query_jx(compilation = "gtex", regions = "CD99")
query_jx(compilation = "gtex", regions = "CD99", range_filters = samples_count == 10)
## or using the query-builder class
sb <- SnaptronQueryBuilder$new()
sb$compilation("gtex")$regions("CD99")$query_jx()
2.3 Quick raw files
The raw recount3
files are hosted by IDIES and are publicly available. We separated every piece of information into its own file. These files can be accessed without using R through your own favorite programming solution. For example, the files for human study SRP009615
annotated with GENCODE v26 are:
Metadata files:
- http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.sra.SRP009615.MD.gz
- http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_project.SRP009615.MD.gz
- http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_qc.SRP009615.MD.gz
- http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_seq_qc.SRP009615.MD.gz
- http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_pred.SRP009615.MD.gz
Gene files:
Annotation files:
If you are interested in exploring what are the available projects in recount3
, you might be interested in accessing:
- http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/sra.recount_project.MD.gz
- http://duffel.rail.bio/recount3/human/data_sources/gtex/metadata/gtex.recount_project.MD.gz
- http://duffel.rail.bio/recount3/human/data_sources/tcga/metadata/tcga.recount_project.MD.gz
For more details about the structure of these files, check the Raw Files chapter.