Data Browser query: multiple dataset query

Overview

Build a query across multiple datasets at once using a harmonized metadata ontology. Metadata consists of properties, which describe each dataset’s entities, and their values. Entities are particular resources with UUIDs, such as files, cases, samples, and cell lines.

This page walks you through building a query across several datasets. Learn more about the Data Browser features and the parts of a Data Browser query.

Objective

The query is performed across several datasets and selects Cases that:

  • are female, and
  • have been analyzed with RNA-Seq, and
  • are diagnosed with Acute Myeloid Leukemia, Medulloblastoma, or Atypical Teratoid Rhabdoid Tumor
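
The selection logic can be sketched in a few lines of Python: the three criteria are combined with AND, while the disease criterion accepts any one of the listed values (OR). This is only an illustration of the logic, not how the Data Browser is implemented; the record layout and field names (`gender`, `experimental_strategies`, `disease_type`) are assumptions made for the example.

```python
# Hypothetical, simplified case records; field names are illustrative only
# and do not reflect the Data Browser's actual metadata schema.
TARGET_DISEASES = {
    "Acute Myeloid Leukemia",
    "Medulloblastoma",
    "Atypical Teratoid Rhabdoid Tumor",
}


def matches_query(case):
    """Return True when a case meets all three criteria (AND).

    The disease criterion is satisfied by any one of the listed values (OR).
    """
    return (
        case["gender"] == "female"
        and "RNA-Seq" in case["experimental_strategies"]
        and case["disease_type"] in TARGET_DISEASES
    )


cases = [
    {"id": "case-1", "gender": "female",
     "experimental_strategies": ["RNA-Seq", "WXS"],
     "disease_type": "Medulloblastoma"},
    {"id": "case-2", "gender": "male",
     "experimental_strategies": ["RNA-Seq"],
     "disease_type": "Medulloblastoma"},
    {"id": "case-3", "gender": "female",
     "experimental_strategies": ["WXS"],
     "disease_type": "Acute Myeloid Leukemia"},
    {"id": "case-4", "gender": "female",
     "experimental_strategies": ["RNA-Seq"],
     "disease_type": "Breast Invasive Carcinoma"},
]

selected = [c["id"] for c in cases if matches_query(c)]
print(selected)
```

Only the first record passes all three checks; each of the others fails exactly one criterion.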

Procedure

[ 1 ] Choose the datasets to be queried

  1. Access the Data Browser.
  2. Select TCGA GRCh38 and Cavatica.
  3. Click Explore selected.

Note that by selecting Cavatica, you are selecting all paediatric cancer datasets available through the Platform for your query. If you want to query a subset of the paediatric cancer datasets, you have to select them individually.

[ 2 ] Build the query

  1. Click on the Case entity to select patients who match your query parameters.
  2. Select Demographic from the list of entities connected with the Case entity.
  3. Click +Add property below Demographic.
  4. Search for female and select Gender: female.
  5. Select File from the list of entities connected with the Case entity.
  6. Click +Add property below File.
  7. Search for rna-seq, and select Experimental strategy: RNA-seq.
  8. Select Investigation from the list of entities connected with the File entity.
  9. Click +Add property below Investigation.
  10. Search for Leukaemia, Medulloblastoma, and Rhabdoid in turn, selecting Disease type: Acute Myeloid Leukaemia, Disease type: Medulloblastoma, and Disease type: Atypical Teratoid Rhabdoid Tumor, respectively.
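
As a way to picture the result of the steps above, the query can be thought of as a small tree of entities, each carrying the property values chosen for it. The sketch below is a hypothetical in-memory representation (the dictionary layout and the `describe` helper are assumptions for illustration, not a Data Browser export format). It follows the wording of the steps, so Investigation is attached under File.

```python
# Hypothetical representation of the query built in the procedure above.
# Keys ("entity", "properties", "connected") are illustrative assumptions.
query = {
    "entity": "Case",
    "connected": [
        {
            "entity": "Demographic",
            "properties": {"Gender": ["female"]},
            "connected": [],
        },
        {
            "entity": "File",
            "properties": {"Experimental strategy": ["RNA-seq"]},
            "connected": [
                {
                    "entity": "Investigation",
                    "properties": {
                        "Disease type": [
                            "Acute Myeloid Leukaemia",
                            "Medulloblastoma",
                            "Atypical Teratoid Rhabdoid Tumor",
                        ]
                    },
                    "connected": [],
                }
            ],
        },
    ],
}


def describe(node, depth=0):
    """Return indented lines summarizing an entity and its connected entities."""
    lines = ["  " * depth + node["entity"]]
    # Multiple values for one property are alternatives, so join them with OR.
    for name, values in node.get("properties", {}).items():
        lines.append("  " * (depth + 1) + f"{name}: " + " OR ".join(values))
    for child in node.get("connected", []):
        lines.extend(describe(child, depth + 1))
    return lines


lines = describe(query)
print("\n".join(lines))
```

Printing the tree makes it easy to check that each property sits under the entity it was added to before saving the query.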

[ 3 ] Save the query and import the results to your project

  1. Click Save from the Queries drop-down menu.
  2. Name your query and add an optional description.
  3. Click Save query.
  4. Import the query results to your project.

Note that you will not be able to import restricted query result data without the right level of access. Files you cannot access are labelled with a red closed lock. This could occur, for example, with TCGA Controlled Data if you don't have permission from dbGaP. Read more about accessing data from the Data Browser.
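
The access rule described above amounts to splitting the query results into files you can import and files that stay locked. A minimal sketch, assuming each file record carries a hypothetical `required_access` field (`None` for open data) and that your permissions are available as a set of names:

```python
def split_by_access(files, granted):
    """Split file names into (importable, locked) based on granted permissions.

    `required_access` is a hypothetical field: None means open data,
    otherwise it names the permission needed to import the file.
    """
    importable, locked = [], []
    for f in files:
        required = f["required_access"]
        if required is None or required in granted:
            importable.append(f["name"])
        else:
            locked.append(f["name"])
    return importable, locked


# Illustrative records only; file names and fields are made up.
files = [
    {"name": "open_file.tsv", "required_access": None},
    {"name": "controlled_file.bam", "required_access": "dbGaP"},
]

# Without dbGaP permission, the controlled file cannot be imported.
importable, locked = split_by_access(files, granted=set())
print(importable, locked)
```

Passing `granted={"dbGaP"}` instead would place both files in the importable list.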

That's it: you've successfully built a query across several datasets!