6 Raw Files

This chapter describes the raw file formats that recount3 provides and how they are organized.

6.1 Data source vs collection

A data source specifies where the data is hosted, most commonly the Sequence Read Archive (SRA). A collection is a manually curated set of samples from one or more studies. Each collection has a custom metadata file in which the curator(s) can specify metadata variables for the collection. In other words:

  • data_source: samples from the original data origin
  • collection: manually selected samples with curated collection-specific sample metadata

6.2 Annotation files

Here are direct links to the annotation files, in case you want to download them yourself.

Table: Annotation files
organism type annotation file_extension URL
human gene gencode_v26 G026 http://duffel.rail.bio/recount3/human/annotations/gene_sums/human.gene_sums.G026.gtf.gz
human gene gencode_v29 G029 http://duffel.rail.bio/recount3/human/annotations/gene_sums/human.gene_sums.G029.gtf.gz
human gene fantom6_cat F006 http://duffel.rail.bio/recount3/human/annotations/gene_sums/human.gene_sums.F006.gtf.gz
human gene refseq R109 http://duffel.rail.bio/recount3/human/annotations/gene_sums/human.gene_sums.R109.gtf.gz
human gene ercc ERCC http://duffel.rail.bio/recount3/human/annotations/gene_sums/human.gene_sums.ERCC.gtf.gz
human gene sirv SIRV http://duffel.rail.bio/recount3/human/annotations/gene_sums/human.gene_sums.SIRV.gtf.gz
human exon gencode_v26 G026 http://duffel.rail.bio/recount3/human/annotations/exon_sums/human.exon_sums.G026.gtf.gz
human exon gencode_v29 G029 http://duffel.rail.bio/recount3/human/annotations/exon_sums/human.exon_sums.G029.gtf.gz
human exon fantom6_cat F006 http://duffel.rail.bio/recount3/human/annotations/exon_sums/human.exon_sums.F006.gtf.gz
human exon refseq R109 http://duffel.rail.bio/recount3/human/annotations/exon_sums/human.exon_sums.R109.gtf.gz
human exon ercc ERCC http://duffel.rail.bio/recount3/human/annotations/exon_sums/human.exon_sums.ERCC.gtf.gz
human exon sirv SIRV http://duffel.rail.bio/recount3/human/annotations/exon_sums/human.exon_sums.SIRV.gtf.gz
mouse gene gencode_v23 M023 http://duffel.rail.bio/recount3/mouse/annotations/gene_sums/mouse.gene_sums.M023.gtf.gz
mouse exon gencode_v23 M023 http://duffel.rail.bio/recount3/mouse/annotations/exon_sums/mouse.exon_sums.M023.gtf.gz

In the R package, you can use recount3::locate_url_ann() to obtain these URLs.

The URL structure is:

<recount3_url>/<organism>/annotations/<gene|exon>_sums/<organism>.<gene|exon>_sums.<annotation file extension>.gtf.gz

These are the annotation file extensions. Human:

  • Gencode v26: G026
  • Gencode v29: G029
  • RefSeq: R109
  • FANTOM6_cat: F006
  • ERCC: ERCC
  • SIRV: SIRV

Mouse:

  • Gencode v23: M023
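
The URL pattern and the extensions from the annotation table can be combined in a few lines of base R. This is only a sketch: annotation_url() and human_ext are hypothetical names introduced here for illustration, not part of recount3, and in practice recount3::locate_url_ann() is the supported way to get these URLs.

```r
## Hypothetical helper (not part of recount3): assemble an annotation URL
## from the pattern <recount3_url>/<organism>/annotations/<type>_sums/...
annotation_url <- function(organism, type, ext,
                           recount3_url = "http://duffel.rail.bio/recount3") {
    sums <- paste0(type, "_sums") # "gene_sums" or "exon_sums"
    paste0(
        recount3_url, "/", organism, "/annotations/", sums, "/",
        organism, ".", sums, ".", ext, ".gtf.gz"
    )
}

## Human annotation file extensions, as listed in the annotation table
human_ext <- c(
    gencode_v26 = "G026", gencode_v29 = "G029", fantom6_cat = "F006",
    refseq = "R109", ercc = "ERCC", sirv = "SIRV"
)

## Gene-level Gencode v26 annotation for human
annotation_url("human", "gene", human_ext[["gencode_v26"]])

## Exon-level Gencode v23 annotation for mouse
annotation_url("mouse", "exon", "M023")
```

Both calls should reproduce the corresponding URLs in the annotation table.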

6.3 Project-level count files

For every project, we have files at the gene, exon, and exon-exon junction expression levels. For genes and exons, we provide a file for each of the annotations. That is, for every project we provide:

  • gene files: one count matrix per annotation
  • exon files: one count matrix per annotation
  • 3 exon-exon junction files: the sparse count matrix in Matrix Market format (MM file), the short list of sample identifiers (ID file), and the exon-exon junction coordinate information (RR file)

All these files can be located with recount3::locate_url(). The following R code creates a table with links to the files for the default annotation of each organism. Note that you can replace the annotation file extension (such as G026) with the one for any other annotation shown in the previous section (or use recount3::annotation_options() and recount3::annotation_ext() to see the available options).

## Load recount3 so that locate_url() and annotation_options() are available
library("recount3")

## Obtain all available projects
projects <- rbind(
    recount3::available_projects("human"),
    recount3::available_projects("mouse")
)

## Locate the project raw files at the gene level using the default annotation
projects$gene <- apply(projects, 1, function(x)
    locate_url(
        project = x["project"],
        project_home = x["project_home"],
        type = "gene",
        organism = x["organism"],
        annotation = annotation_options(x["organism"])[1] # Use default annotation
    ))

## Locate the project raw files at the exon level using the default annotation
projects$exon <- apply(projects, 1, function(x)
    locate_url(
        project = x["project"],
        project_home = x["project_home"],
        type = "exon",
        organism = x["organism"],
        annotation = annotation_options(x["organism"])[1] # Use default annotation
    ))

## Locate the project raw exon-exon junction files
projects <-
    cbind(projects, do.call(rbind, apply(projects, 1, function(x) {
        x <-
            locate_url(
                project = x["project"],
                project_home = x["project_home"],
                type = "jxn",
                organism = x["organism"]
            )
        res <- data.frame(t(x))
        colnames(res) <-
            paste0("jxn_", gsub("^.*\\.", "", gsub("\\.gz", "", colnames(res))))
        return(res)
    })))
rownames(projects) <- NULL

## Dimensions of the table
dim(projects)
# [1] 18830    11

## Export
write.csv(projects, file = "recount3_raw_project_files_with_default_annotation.csv", row.names = FALSE)

As a teaser, here are the first 20 rows of this long table. You can also download the CSV file to your computer from GitHub.

Table: First 20 raw project files
project organism file_source project_home project_type gene exon jxn_MM jxn_RR jxn_ID
SRP107565 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP107565/sra.gene_sums.SRP107565.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP107565/sra.exon_sums.SRP107565.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP107565/sra.junctions.SRP107565.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP107565/sra.junctions.SRP107565.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP107565/sra.junctions.SRP107565.ALL.ID.gz
SRP149665 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP149665/sra.gene_sums.SRP149665.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP149665/sra.exon_sums.SRP149665.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP149665/sra.junctions.SRP149665.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP149665/sra.junctions.SRP149665.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP149665/sra.junctions.SRP149665.ALL.ID.gz
SRP017465 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP017465/sra.gene_sums.SRP017465.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP017465/sra.exon_sums.SRP017465.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP017465/sra.junctions.SRP017465.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP017465/sra.junctions.SRP017465.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP017465/sra.junctions.SRP017465.ALL.ID.gz
SRP119165 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP119165/sra.gene_sums.SRP119165.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP119165/sra.exon_sums.SRP119165.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP119165/sra.junctions.SRP119165.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP119165/sra.junctions.SRP119165.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP119165/sra.junctions.SRP119165.ALL.ID.gz
SRP133965 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP133965/sra.gene_sums.SRP133965.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP133965/sra.exon_sums.SRP133965.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP133965/sra.junctions.SRP133965.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP133965/sra.junctions.SRP133965.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP133965/sra.junctions.SRP133965.ALL.ID.gz
SRP096765 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP096765/sra.gene_sums.SRP096765.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP096765/sra.exon_sums.SRP096765.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP096765/sra.junctions.SRP096765.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP096765/sra.junctions.SRP096765.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP096765/sra.junctions.SRP096765.ALL.ID.gz
SRP124965 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP124965/sra.gene_sums.SRP124965.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP124965/sra.exon_sums.SRP124965.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP124965/sra.junctions.SRP124965.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP124965/sra.junctions.SRP124965.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP124965/sra.junctions.SRP124965.ALL.ID.gz
SRP189165 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP189165/sra.gene_sums.SRP189165.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP189165/sra.exon_sums.SRP189165.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP189165/sra.junctions.SRP189165.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP189165/sra.junctions.SRP189165.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP189165/sra.junctions.SRP189165.ALL.ID.gz
SRP050365 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP050365/sra.gene_sums.SRP050365.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP050365/sra.exon_sums.SRP050365.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP050365/sra.junctions.SRP050365.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP050365/sra.junctions.SRP050365.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP050365/sra.junctions.SRP050365.ALL.ID.gz
SRP123065 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP123065/sra.gene_sums.SRP123065.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP123065/sra.exon_sums.SRP123065.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP123065/sra.junctions.SRP123065.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP123065/sra.junctions.SRP123065.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP123065/sra.junctions.SRP123065.ALL.ID.gz
SRP162465 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP162465/sra.gene_sums.SRP162465.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP162465/sra.exon_sums.SRP162465.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP162465/sra.junctions.SRP162465.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP162465/sra.junctions.SRP162465.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP162465/sra.junctions.SRP162465.ALL.ID.gz
SRP178865 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP178865/sra.gene_sums.SRP178865.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP178865/sra.exon_sums.SRP178865.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP178865/sra.junctions.SRP178865.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP178865/sra.junctions.SRP178865.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP178865/sra.junctions.SRP178865.ALL.ID.gz
SRP032165 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP032165/sra.gene_sums.SRP032165.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP032165/sra.exon_sums.SRP032165.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP032165/sra.junctions.SRP032165.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP032165/sra.junctions.SRP032165.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP032165/sra.junctions.SRP032165.ALL.ID.gz
SRP125965 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP125965/sra.gene_sums.SRP125965.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP125965/sra.exon_sums.SRP125965.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP125965/sra.junctions.SRP125965.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP125965/sra.junctions.SRP125965.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP125965/sra.junctions.SRP125965.ALL.ID.gz
SRP120165 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP120165/sra.gene_sums.SRP120165.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP120165/sra.exon_sums.SRP120165.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP120165/sra.junctions.SRP120165.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP120165/sra.junctions.SRP120165.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP120165/sra.junctions.SRP120165.ALL.ID.gz
SRP044265 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP044265/sra.gene_sums.SRP044265.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP044265/sra.exon_sums.SRP044265.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP044265/sra.junctions.SRP044265.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP044265/sra.junctions.SRP044265.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP044265/sra.junctions.SRP044265.ALL.ID.gz
SRP014565 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP014565/sra.gene_sums.SRP014565.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP014565/sra.exon_sums.SRP014565.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP014565/sra.junctions.SRP014565.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP014565/sra.junctions.SRP014565.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP014565/sra.junctions.SRP014565.ALL.ID.gz
SRP057065 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP057065/sra.gene_sums.SRP057065.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP057065/sra.exon_sums.SRP057065.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP057065/sra.junctions.SRP057065.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP057065/sra.junctions.SRP057065.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP057065/sra.junctions.SRP057065.ALL.ID.gz
SRP117665 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP117665/sra.gene_sums.SRP117665.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP117665/sra.exon_sums.SRP117665.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP117665/sra.junctions.SRP117665.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP117665/sra.junctions.SRP117665.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP117665/sra.junctions.SRP117665.ALL.ID.gz
SRP049465 human sra data_sources/sra data_sources http://idies.jhu.edu/recount3/data/human/data_sources/sra/gene_sums/65/SRP049465/sra.gene_sums.SRP049465.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/exon_sums/65/SRP049465/sra.exon_sums.SRP049465.G026.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP049465/sra.junctions.SRP049465.ALL.MM.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP049465/sra.junctions.SRP049465.ALL.RR.gz http://idies.jhu.edu/recount3/data/human/data_sources/sra/junctions/65/SRP049465/sra.junctions.SRP049465.ALL.ID.gz

The URL structure is:

  • gene: <recount3_url>/<organism>/data_sources/<data_source>/gene_sums/<last 2 project letters or digits>/<project>/<data_source>.gene_sums.<project>.<annotation file extension>.gz
  • exon: <recount3_url>/<organism>/data_sources/<data_source>/exon_sums/<last 2 project letters or digits>/<project>/<data_source>.exon_sums.<project>.<annotation file extension>.gz
  • junctions: <recount3_url>/<organism>/data_sources/<data_source>/junctions/<last 2 project letters or digits>/<project>/<data_source>.junctions.<project>.<junction type: typically ALL>.<junction file extension: RR, MM or ID>.gz (see footnote 1)
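
To make the "last 2 project letters or digits" component concrete, here is a minimal base R sketch of the three patterns above. The helpers last2(), project_count_url() and project_jxn_url() are hypothetical names used only for illustration; recount3::locate_url() remains the supported interface.

```r
## Hypothetical helpers (not part of recount3)

## Last two characters of an identifier, e.g. "SRP107565" -> "65"
last2 <- function(id) substr(id, nchar(id) - 1, nchar(id))

## Gene or exon count file for one project and one annotation extension
project_count_url <- function(project, type = c("gene", "exon"), ext,
                              organism = "human", data_source = "sra",
                              recount3_url = "http://idies.jhu.edu/recount3/data") {
    type <- match.arg(type)
    sums <- paste0(type, "_sums")
    paste0(
        recount3_url, "/", organism, "/data_sources/", data_source, "/",
        sums, "/", last2(project), "/", project, "/",
        data_source, ".", sums, ".", project, ".", ext, ".gz"
    )
}

## The three junction files (MM, RR, ID) for one project
project_jxn_url <- function(project, jxn_type = "ALL",
                            jxn_ext = c("MM", "RR", "ID"),
                            organism = "human", data_source = "sra",
                            recount3_url = "http://idies.jhu.edu/recount3/data") {
    urls <- paste0(
        recount3_url, "/", organism, "/data_sources/", data_source,
        "/junctions/", last2(project), "/", project, "/",
        data_source, ".junctions.", project, ".", jxn_type, ".", jxn_ext, ".gz"
    )
    names(urls) <- paste0("jxn_", jxn_ext)
    urls
}

project_count_url("SRP107565", "gene", "G026")
project_jxn_url("SRP107565")
```

For SRP107565 these calls should match the gene and junction URLs in the first row of the project files table.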

6.4 Project-level metadata files

Every project from an original data source has 5 different sample metadata tables. These are:

  • project_meta (the file itself is named after the data source: sra, gtex, or tcga): information mostly used by the R interface for locating files
  • recount_project: information downloaded from the original data source, such as the SRA Run Table selector
  • recount_qc: quality check fields using the QC annotation
  • recount_seq_qc: sequence quality check fields
  • recount_pred: curated and predicted sample information described in the recount3 manuscript

The following R code uses recount3::locate_url() to obtain the links to all of these raw metadata files.

## Obtain all the metadata files
## (reuses the `projects` table built in the previous section)
metadata_files <- do.call(rbind, apply(projects, 1, function(x) {
    x <-
        locate_url(
            project = x[["project"]],
            project_home = x[["project_home"]],
            type = "metadata",
            organism = x[["organism"]]
        )
    res <- data.frame(t(x))
    colnames(res) <-
        gsub("\\..*", "", gsub("^[a-z]+\\.", "", colnames(res)))
    
    colnames(res)[colnames(res) %in% unique(projects$file_source)] <-
        "project_meta"
    return(res)
}))
dim(metadata_files)
# [1] 18830     5

## Export
write.csv(metadata_files, file = "recount3_metadata_files.csv", row.names = FALSE)

As a teaser, here you can see the first 6 rows of this long table. Or you can download the CSV file to your computer from GitHub. If you want to, you can combine it with the project raw files table from the previous section.

##                                                                                              project_meta
## 1 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP107565/sra.sra.SRP107565.MD.gz
## 2 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP149665/sra.sra.SRP149665.MD.gz
## 3 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP017465/sra.sra.SRP017465.MD.gz
## 4 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP119165/sra.sra.SRP119165.MD.gz
## 5 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP133965/sra.sra.SRP133965.MD.gz
## 6 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP096765/sra.sra.SRP096765.MD.gz
##                                                                                                       recount_project
## 1 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP107565/sra.recount_project.SRP107565.MD.gz
## 2 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP149665/sra.recount_project.SRP149665.MD.gz
## 3 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP017465/sra.recount_project.SRP017465.MD.gz
## 4 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP119165/sra.recount_project.SRP119165.MD.gz
## 5 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP133965/sra.recount_project.SRP133965.MD.gz
## 6 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP096765/sra.recount_project.SRP096765.MD.gz
##                                                                                                       recount_qc
## 1 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP107565/sra.recount_qc.SRP107565.MD.gz
## 2 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP149665/sra.recount_qc.SRP149665.MD.gz
## 3 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP017465/sra.recount_qc.SRP017465.MD.gz
## 4 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP119165/sra.recount_qc.SRP119165.MD.gz
## 5 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP133965/sra.recount_qc.SRP133965.MD.gz
## 6 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP096765/sra.recount_qc.SRP096765.MD.gz
##                                                                                                       recount_seq_qc
## 1 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP107565/sra.recount_seq_qc.SRP107565.MD.gz
## 2 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP149665/sra.recount_seq_qc.SRP149665.MD.gz
## 3 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP017465/sra.recount_seq_qc.SRP017465.MD.gz
## 4 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP119165/sra.recount_seq_qc.SRP119165.MD.gz
## 5 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP133965/sra.recount_seq_qc.SRP133965.MD.gz
## 6 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP096765/sra.recount_seq_qc.SRP096765.MD.gz
##                                                                                                       recount_pred
## 1 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP107565/sra.recount_pred.SRP107565.MD.gz
## 2 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP149665/sra.recount_pred.SRP149665.MD.gz
## 3 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP017465/sra.recount_pred.SRP017465.MD.gz
## 4 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP119165/sra.recount_pred.SRP119165.MD.gz
## 5 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP133965/sra.recount_pred.SRP133965.MD.gz
## 6 http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP096765/sra.recount_pred.SRP096765.MD.gz
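
Combining the two tables can be sketched with cbind(), since both are built row by row from the same list of projects. The example below uses small stand-in data frames (projects_toy and metadata_toy are hypothetical names, holding two rows and one metadata column taken from the output above) and assumes that neither table has been reordered or filtered.

```r
## Toy stand-ins for the first two rows of `projects` and `metadata_files`
projects_toy <- data.frame(
    project = c("SRP107565", "SRP149665"),
    organism = "human",
    stringsAsFactors = FALSE
)
metadata_toy <- data.frame(
    recount_qc = c(
        "http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP107565/sra.recount_qc.SRP107565.MD.gz",
        "http://idies.jhu.edu/recount3/data/human/data_sources/sra/metadata/65/SRP149665/sra.recount_qc.SRP149665.MD.gz"
    ),
    stringsAsFactors = FALSE
)

## Column-binding is only safe when the rows line up one to one
stopifnot(nrow(projects_toy) == nrow(metadata_toy))
combined <- cbind(projects_toy, metadata_toy)

## Sanity check: every metadata URL mentions the project in its own row
rows_match <- mapply(
    grepl, combined$project, combined$recount_qc,
    MoreArgs = list(fixed = TRUE)
)
all(rows_match)
```

The same check can be applied to the full tables before exporting a combined CSV file.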

The URL structure is:

<recount3_url>/<organism>/data_sources/<data_source>/metadata/<last 2 project letters or digits>/<project>/<data_source>.<table name>.<project>.MD.gz
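
As an illustration of this pattern, the following base R sketch builds the five metadata URLs for one project. metadata_urls() is a hypothetical helper, not a recount3 function, and it assumes (as the example output suggests) that the project_meta table is stored under the name of the data source.

```r
## Hypothetical helper (not part of recount3): metadata URLs for one project
metadata_urls <- function(project, organism = "human", data_source = "sra",
                          recount3_url = "http://idies.jhu.edu/recount3/data") {
    ## Assumption: the project_meta file is named after the data source
    tables <- c(
        data_source, "recount_project", "recount_qc",
        "recount_seq_qc", "recount_pred"
    )
    ## Directory named after the last two characters of the project ID
    dir2 <- substr(project, nchar(project) - 1, nchar(project))
    urls <- paste0(
        recount3_url, "/", organism, "/data_sources/", data_source,
        "/metadata/", dir2, "/", project, "/",
        data_source, ".", tables, ".", project, ".MD.gz"
    )
    names(urls) <- c("project_meta", tables[-1])
    urls
}

metadata_urls("SRP107565")
```

For SRP107565 the result should agree with row 1 of each column in the output shown earlier.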

6.5 Sample-level BigWig files

Each sample in recount3 has a publicly available bigWig file whose URL can be obtained using recount3::locate_url(). Below we show the URL for one such sample.

recount3::locate_url(
    "SRP009615",
    "data_sources/sra",
    type = "bw",
    sample = "SRR387777"
)
##                                                                                    sra.base_sums.SRP009615_SRR387777.ALL.bw 
## "http://duffel.rail.bio/recount3/human/data_sources/sra/base_sums/15/SRP009615/77/sra.base_sums.SRP009615_SRR387777.ALL.bw"

The URL structure is:

<recount3_url>/<organism>/data_sources/<data_source>/base_sums/<last 2 project letters or digits>/<project>/<last 2 sample letters or digits>/<data_source>.base_sums.<project>_<sample>.ALL.bw

Valid recount3_url options we support are http://duffel.rail.bio/recount3 and http://idies.jhu.edu/recount3/data.
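
The bigWig pattern adds one more directory level, named after the last two characters of the sample ID. A minimal base R sketch follows; bigwig_url() is a hypothetical helper written for illustration only, and the second call simply assumes that the same relative path applies under the other recount3_url.

```r
## Hypothetical helper (not part of recount3): bigWig URL for one sample
bigwig_url <- function(project, sample, organism = "human",
                       data_source = "sra",
                       recount3_url = "http://duffel.rail.bio/recount3") {
    ## Last two characters of an identifier, used as a directory name
    last2 <- function(id) substr(id, nchar(id) - 1, nchar(id))
    paste0(
        recount3_url, "/", organism, "/data_sources/", data_source,
        "/base_sums/", last2(project), "/", project, "/", last2(sample), "/",
        data_source, ".base_sums.", project, "_", sample, ".ALL.bw"
    )
}

## Same sample as in the locate_url() example
bigwig_url("SRP009615", "SRR387777")

## Same relative path under the other supported recount3_url
bigwig_url(
    "SRP009615", "SRR387777",
    recount3_url = "http://idies.jhu.edu/recount3/data"
)
```

The first call should reproduce the URL returned by locate_url() for this sample.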


  1. Only GTEx and TCGA have junction type UNIQUE available in addition to ALL.