
Internal Function to Get Mutations/CNA/Fusion By Study ID
Source: R/genomics_by_study.R
dot-get_data_by_study.Rd
Endpoints for retrieving mutation and CNA data are structurally similar.
This internal function pulls data from either endpoint. It includes logic to
guess sensible defaults for study_id and molecular_profile_id when either is NULL.
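As a rough illustration of what "guessing" could look like, the sketch below mirrors the profile naming visible in the example output on this page (for instance "prad_msk_2019" plus "cna" gives "prad_msk_2019_cna"). Both `guess_profile_id()` and `guess_study_id()` are hypothetical helpers, not functions from the package, and the suffix rule is an assumption drawn only from those examples; the real function may well resolve profiles by querying the API instead. The segment type is left out because this page does not show its profile name.

```r
# Hypothetical helper, not part of the package: it only mirrors the
# molecular profile names seen in the example output on this page.
guess_profile_id <- function(study_id,
                             data_type = c("mutation", "cna", "fusion",
                                           "structural_variant")) {
  # match.arg() validates the choice and falls back to the first option
  data_type <- match.arg(data_type)
  suffix <- switch(data_type,
    mutation = "mutations",
    cna = "cna",
    fusion = ,                                  # falls through to next value
    structural_variant = "structural_variants"
  )
  paste0(study_id, "_", suffix)
}

# Hypothetical reverse guess: strip a known suffix to recover the study ID.
# Assumes the profile ID follows the <study_id>_<suffix> pattern above.
guess_study_id <- function(molecular_profile_id) {
  sub("_(mutations|cna|structural_variants)$", "", molecular_profile_id)
}

guess_profile_id("prad_msk_2019", "fusion")
guess_study_id("prad_msk_2019_cna")
```

A pure string rule like this breaks for studies whose profiles are named differently, which is one reason a lookup against the available profiles of a study would be the more robust design.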
Usage
.get_data_by_study(
study_id = NULL,
molecular_profile_id = NULL,
data_type = c("mutation", "cna", "fusion", "structural_variant", "segment"),
base_url = NULL,
add_hugo = TRUE
)
Arguments
- study_id
A study ID to query mutations. If NULL, guesses study ID based on molecular_profile_id.
- molecular_profile_id
A molecular profile to query mutations. If NULL, guesses molecular_profile_id based on study ID.
- data_type
Specifies what type of data to return. Options are mutation, cna, fusion or structural_variant (same as fusion), and segment (copy number segmentation data).
- base_url
The database URL to query. If NULL, defaults to the URL set with set_cbioportal_db(<your_db>).
- add_hugo
Logical indicating whether HugoGeneSymbol should be added to your resulting data frame, if not already present in raw API results. Argument is TRUE by default. If FALSE, results will be returned as is (i.e. any existing Hugo Symbol columns in raw results will not be removed).
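The add_hugo behaviour described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: `add_hugo_if_missing()` is a hypothetical helper, and `gene_lookup` is an assumed two-column table mapping Entrez gene IDs to Hugo symbols (the two genes shown are taken from the example output on this page). The column names follow the ones visible in that output.

```r
# Hypothetical helper illustrating the documented add_hugo behaviour.
# `gene_lookup` is an assumed table with columns entrezGeneId, hugoGeneSymbol.
add_hugo_if_missing <- function(df, gene_lookup, add_hugo = TRUE) {
  # Return results as is when add_hugo is FALSE or the column already exists
  if (!add_hugo || "hugoGeneSymbol" %in% names(df)) {
    return(df)
  }
  # match() preserves row order and yields NA for unknown Entrez IDs
  idx <- match(df$entrezGeneId, gene_lookup$entrezGeneId)
  df$hugoGeneSymbol <- gene_lookup$hugoGeneSymbol[idx]
  df
}

gene_lookup <- data.frame(
  entrezGeneId = c(5728L, 545L),
  hugoGeneSymbol = c("PTEN", "ATR"),
  stringsAsFactors = FALSE
)

# Toy stand-in for raw API results that lack a Hugo symbol column
raw <- data.frame(
  entrezGeneId = c(545L, 5728L, 999999L),
  alteration = c(2L, -2L, 1L)
)

with_hugo <- add_hugo_if_missing(raw, gene_lookup)
as_is <- add_hugo_if_missing(raw, gene_lookup, add_hugo = FALSE)
```

Note that with add_hugo = FALSE nothing is removed: a data frame that already carries hugoGeneSymbol comes back unchanged in either mode.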
Examples
# \dontrun{
set_cbioportal_db("public")
#> ✔ You are successfully connected!
#> ✔ base_url for this R session is now set to "www.cbioportal.org/api"
.get_data_by_study(study_id = "prad_msk_2019", data_type = "cna")
#> ℹ Returning all data for the "prad_msk_2019_cna" molecular profile in the "prad_msk_2019" study
#> # A tibble: 1 × 9
#> hugoGeneSymbol entrezGeneId uniqueSampleKey uniquePatientKey
#> <chr> <int> <chr> <chr>
#> 1 PTEN 5728 c19DXzM2OTI0TF9QMDAxX2Q6cHJhZF9t… cF9DXzM2OTI0TDp…
#> # ℹ 5 more variables: molecularProfileId <chr>, sampleId <chr>,
#> # patientId <chr>, studyId <chr>, alteration <int>
.get_data_by_study(study_id = "prad_msk_2019", data_type = "mutation")
#> ℹ Returning all data for the "prad_msk_2019_mutations" molecular profile in the "prad_msk_2019" study
#> # A tibble: 26 × 28
#> hugoGeneSymbol entrezGeneId uniqueSampleKey uniquePatientKey
#> <chr> <int> <chr> <chr>
#> 1 ZFHX3 463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 2 ZFHX3 463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 3 ATR 545 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 4 BCL2 596 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 5 ETV1 2115 c19DX1A4SzNUUl9QMDAxX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 6 ETV1 2115 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 7 FAT1 2195 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 8 MSH6 2956 c19DX1A4SzNUUl9QMDAyX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 9 MSH6 2956 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 10 FOXA1 3169 c19DX0UwS0pGSl9QMDAyX2Q6cHJhZF9… cF9DX0UwS0pGSjp…
#> # ℹ 16 more rows
#> # ℹ 24 more variables: molecularProfileId <chr>, sampleId <chr>,
#> # patientId <chr>, studyId <chr>, center <chr>, mutationStatus <chr>,
#> # validationStatus <chr>, tumorAltCount <int>, tumorRefCount <int>,
#> # normalAltCount <int>, normalRefCount <int>, startPosition <int>,
#> # endPosition <int>, referenceAllele <chr>, proteinChange <chr>,
#> # mutationType <chr>, ncbiBuild <chr>, variantType <chr>, keyword <chr>, …
.get_data_by_study(study_id = "prad_msk_2019", data_type = "fusion")
#> ℹ Returning all data for the "prad_msk_2019_structural_variants" molecular profile in the "prad_msk_2019" study
#> # A tibble: 4 × 44
#> uniqueSampleKey uniquePatientKey molecularProfileId sampleId patientId studyId
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 c19DX0NBVVdUN1… cF9DX0NBVVdUNzp… prad_msk_2019_str… s_C_CAU… p_C_CAUW… prad_m…
#> 2 c19DX0RVNkVDQ1… cF9DX0RVNkVDQzp… prad_msk_2019_str… s_C_DU6… p_C_DU6E… prad_m…
#> 3 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> 4 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> # ℹ 38 more variables: site1EntrezGeneId <int>, site1HugoSymbol <chr>,
#> # site1EnsemblTranscriptId <chr>, site1Chromosome <chr>, site1Position <int>,
#> # site1Contig <chr>, site1Region <chr>, site1RegionNumber <int>,
#> # site1Description <chr>, site2EntrezGeneId <int>, site2HugoSymbol <chr>,
#> # site2EnsemblTranscriptId <chr>, site2Chromosome <chr>, site2Position <int>,
#> # site2Contig <chr>, site2Region <chr>, site2RegionNumber <int>,
#> # site2Description <chr>, site2EffectOnFrame <chr>, ncbiBuild <chr>, …
.get_data_by_study(molecular_profile_id = "prad_msk_2019_cna", data_type = "cna")
#> ℹ Returning all data for the "prad_msk_2019_cna" molecular profile in the "prad_msk_2019" study
#> # A tibble: 1 × 9
#> hugoGeneSymbol entrezGeneId uniqueSampleKey uniquePatientKey
#> <chr> <int> <chr> <chr>
#> 1 PTEN 5728 c19DXzM2OTI0TF9QMDAxX2Q6cHJhZF9t… cF9DXzM2OTI0TDp…
#> # ℹ 5 more variables: molecularProfileId <chr>, sampleId <chr>,
#> # patientId <chr>, studyId <chr>, alteration <int>
.get_data_by_study(molecular_profile_id = "prad_msk_2019_mutations", data_type = "mutation")
#> ℹ Returning all data for the "prad_msk_2019_mutations" molecular profile in the "prad_msk_2019" study
#> # A tibble: 26 × 28
#> hugoGeneSymbol entrezGeneId uniqueSampleKey uniquePatientKey
#> <chr> <int> <chr> <chr>
#> 1 ZFHX3 463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 2 ZFHX3 463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 3 ATR 545 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 4 BCL2 596 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 5 ETV1 2115 c19DX1A4SzNUUl9QMDAxX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 6 ETV1 2115 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 7 FAT1 2195 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#> 8 MSH6 2956 c19DX1A4SzNUUl9QMDAyX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 9 MSH6 2956 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 10 FOXA1 3169 c19DX0UwS0pGSl9QMDAyX2Q6cHJhZF9… cF9DX0UwS0pGSjp…
#> # ℹ 16 more rows
#> # ℹ 24 more variables: molecularProfileId <chr>, sampleId <chr>,
#> # patientId <chr>, studyId <chr>, center <chr>, mutationStatus <chr>,
#> # validationStatus <chr>, tumorAltCount <int>, tumorRefCount <int>,
#> # normalAltCount <int>, normalRefCount <int>, startPosition <int>,
#> # endPosition <int>, referenceAllele <chr>, proteinChange <chr>,
#> # mutationType <chr>, ncbiBuild <chr>, variantType <chr>, keyword <chr>, …
.get_data_by_study(molecular_profile_id = "prad_msk_2019_structural_variants", data_type = "fusion")
#> ℹ Returning all data for the "prad_msk_2019_structural_variants" molecular profile in the "prad_msk_2019" study
#> # A tibble: 4 × 44
#> uniqueSampleKey uniquePatientKey molecularProfileId sampleId patientId studyId
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 c19DX0NBVVdUN1… cF9DX0NBVVdUNzp… prad_msk_2019_str… s_C_CAU… p_C_CAUW… prad_m…
#> 2 c19DX0RVNkVDQ1… cF9DX0RVNkVDQzp… prad_msk_2019_str… s_C_DU6… p_C_DU6E… prad_m…
#> 3 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> 4 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> # ℹ 38 more variables: site1EntrezGeneId <int>, site1HugoSymbol <chr>,
#> # site1EnsemblTranscriptId <chr>, site1Chromosome <chr>, site1Position <int>,
#> # site1Contig <chr>, site1Region <chr>, site1RegionNumber <int>,
#> # site1Description <chr>, site2EntrezGeneId <int>, site2HugoSymbol <chr>,
#> # site2EnsemblTranscriptId <chr>, site2Chromosome <chr>, site2Position <int>,
#> # site2Contig <chr>, site2Region <chr>, site2RegionNumber <int>,
#> # site2Description <chr>, site2EffectOnFrame <chr>, ncbiBuild <chr>, …
# }