
Endpoints for retrieving mutation and CNA data are structurally similar. This internal function pulls data from either endpoint. When study_id or molecular_profile_id is NULL, it guesses a sensible default for the missing value from the one that was supplied.
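The guessing behaviour described above can be pictured with a small sketch. This is not the package's internal code: `guess_ids()` and `profile_suffix` are hypothetical names, and the suffixes are assumptions taken only from the profile IDs that appear in the example output on this page (`_cna`, `_mutations`, `_structural_variants`). Segment data is left out because no profile ID for it appears on this page.

```r
# Hypothetical helper, NOT part of cbioportalR. It only mirrors the naming
# pattern visible in the example output on this page.
profile_suffix <- c(
  mutation = "mutations",
  cna = "cna",
  fusion = "structural_variants",
  structural_variant = "structural_variants"
)

guess_ids <- function(study_id = NULL,
                      molecular_profile_id = NULL,
                      data_type = c("mutation", "cna", "fusion",
                                    "structural_variant")) {
  data_type <- match.arg(data_type)
  suffix <- profile_suffix[[data_type]]

  if (is.null(study_id) && is.null(molecular_profile_id)) {
    stop("Supply at least one of `study_id` or `molecular_profile_id`.")
  }

  # Missing profile: append the assumed suffix to the study ID
  if (is.null(molecular_profile_id)) {
    molecular_profile_id <- paste0(study_id, "_", suffix)
  }

  # Missing study: strip the assumed suffix from the profile ID
  if (is.null(study_id)) {
    study_id <- sub(paste0("_", suffix, "$"), "", molecular_profile_id)
  }

  list(study_id = study_id, molecular_profile_id = molecular_profile_id)
}

guess_ids(study_id = "prad_msk_2019", data_type = "cna")
guess_ids(molecular_profile_id = "prad_msk_2019_structural_variants",
          data_type = "fusion")
```

A real implementation would more likely check the guess against the profiles the database actually lists for the study, since not every study follows this naming pattern.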

Usage

.get_data_by_study(
  study_id = NULL,
  molecular_profile_id = NULL,
  data_type = c("mutation", "cna", "fusion", "structural_variant", "segment"),
  base_url = NULL,
  add_hugo = TRUE
)

Arguments

study_id

A study ID to query. If NULL, the study ID is guessed from molecular_profile_id.

molecular_profile_id

A molecular profile ID to query. If NULL, the molecular profile ID is guessed from study_id.

data_type

Specifies what type of data to return. Options are mutation, cna, fusion or structural_variant (same as fusion), and segment (copy number segmentation data).
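As a sketch of how a vector default such as the one shown under Usage is commonly resolved in R, the snippet below uses base R's `match.arg()`, which returns the first choice when the argument is left at its default and raises an error for values outside the list. `resolve_data_type()` is a hypothetical name, and treating "fusion" as an alias of "structural_variant" is an assumption based on the "same as fusion" wording above, not the package's confirmed implementation.

```r
# Hypothetical helper, NOT part of cbioportalR
resolve_data_type <- function(data_type = c("mutation", "cna", "fusion",
                                            "structural_variant", "segment")) {
  # Picks the first choice when the argument is left at its default,
  # and errors on values outside the list
  data_type <- match.arg(data_type)

  # Assumption: "fusion" is handled as an alias of "structural_variant"
  if (data_type == "fusion") {
    data_type <- "structural_variant"
  }
  data_type
}

resolve_data_type()
resolve_data_type("fusion")
resolve_data_type("segment")
```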

base_url

The database URL to query. If NULL, defaults to the URL set with set_cbioportal_db(<your_db>).

add_hugo

Logical indicating whether HugoGeneSymbol should be added to your resulting data frame, if not already present in raw API results. Argument is TRUE by default. If FALSE, results will be returned as is (i.e. any existing Hugo Symbol columns in raw results will not be removed).
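The add_hugo behaviour can be illustrated with a minimal base R sketch. Everything here is hypothetical: `add_hugo_if_missing()` and `gene_lookup` are made-up names, the lookup table holds only two genes (PTEN and ATR, with the Entrez IDs shown in the example output on this page), and the `alteration` values are invented for illustration. The package itself may resolve symbols in a different way.

```r
# Hypothetical lookup table; a real one would cover all genes
gene_lookup <- data.frame(
  entrezGeneId = c(5728L, 545L),
  hugoGeneSymbol = c("PTEN", "ATR"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, NOT part of cbioportalR
add_hugo_if_missing <- function(df, lookup = gene_lookup, add_hugo = TRUE) {
  # Return results as is when add_hugo is FALSE or the column already exists
  if (!add_hugo || "hugoGeneSymbol" %in% names(df)) {
    return(df)
  }
  # match() keeps the row order of df; unmatched IDs become NA
  idx <- match(df$entrezGeneId, lookup$entrezGeneId)
  df$hugoGeneSymbol <- lookup$hugoGeneSymbol[idx]
  df
}

# Made-up raw result without a Hugo symbol column
raw <- data.frame(entrezGeneId = c(545L, 5728L), alteration = c(2L, -2L))

add_hugo_if_missing(raw)
add_hugo_if_missing(raw, add_hugo = FALSE)
```

With add_hugo = FALSE the input comes back unchanged, which matches the "returned as is" wording above: existing columns are never removed.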

Value

A data frame of mutations, CNAs, structural variants, or copy number segments, depending on data_type.

Examples

# \dontrun{
set_cbioportal_db("public")
#>  You are successfully connected!
#>  base_url for this R session is now set to "www.cbioportal.org/api" 
.get_data_by_study(study_id = "prad_msk_2019", data_type = "cna")
#>  Returning all data for the "prad_msk_2019_cna" molecular profile in the "prad_msk_2019" study
#> # A tibble: 1 × 9
#>   hugoGeneSymbol entrezGeneId uniqueSampleKey                   uniquePatientKey
#>   <chr>                 <int> <chr>                             <chr>           
#> 1 PTEN                   5728 c19DXzM2OTI0TF9QMDAxX2Q6cHJhZF9t… cF9DXzM2OTI0TDp…
#> # ℹ 5 more variables: molecularProfileId <chr>, sampleId <chr>,
#> #   patientId <chr>, studyId <chr>, alteration <int>
.get_data_by_study(study_id = "prad_msk_2019", data_type = "mutation")
#>  Returning all data for the "prad_msk_2019_mutations" molecular profile in the "prad_msk_2019" study
#> # A tibble: 26 × 28
#>    hugoGeneSymbol entrezGeneId uniqueSampleKey                  uniquePatientKey
#>    <chr>                 <int> <chr>                            <chr>           
#>  1 ZFHX3                   463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#>  2 ZFHX3                   463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#>  3 ATR                     545 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#>  4 BCL2                    596 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#>  5 ETV1                   2115 c19DX1A4SzNUUl9QMDAxX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#>  6 ETV1                   2115 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#>  7 FAT1                   2195 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#>  8 MSH6                   2956 c19DX1A4SzNUUl9QMDAyX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#>  9 MSH6                   2956 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 10 FOXA1                  3169 c19DX0UwS0pGSl9QMDAyX2Q6cHJhZF9… cF9DX0UwS0pGSjp…
#> # ℹ 16 more rows
#> # ℹ 24 more variables: molecularProfileId <chr>, sampleId <chr>,
#> #   patientId <chr>, studyId <chr>, center <chr>, mutationStatus <chr>,
#> #   validationStatus <chr>, tumorAltCount <int>, tumorRefCount <int>,
#> #   normalAltCount <int>, normalRefCount <int>, startPosition <int>,
#> #   endPosition <int>, referenceAllele <chr>, proteinChange <chr>,
#> #   mutationType <chr>, ncbiBuild <chr>, variantType <chr>, keyword <chr>, …
.get_data_by_study(study_id = "prad_msk_2019", data_type = "fusion")
#>  Returning all data for the "prad_msk_2019_structural_variants" molecular profile in the "prad_msk_2019" study
#> # A tibble: 4 × 44
#>   uniqueSampleKey uniquePatientKey molecularProfileId sampleId patientId studyId
#>   <chr>           <chr>            <chr>              <chr>    <chr>     <chr>  
#> 1 c19DX0NBVVdUN1… cF9DX0NBVVdUNzp… prad_msk_2019_str… s_C_CAU… p_C_CAUW… prad_m…
#> 2 c19DX0RVNkVDQ1… cF9DX0RVNkVDQzp… prad_msk_2019_str… s_C_DU6… p_C_DU6E… prad_m…
#> 3 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> 4 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> # ℹ 38 more variables: site1EntrezGeneId <int>, site1HugoSymbol <chr>,
#> #   site1EnsemblTranscriptId <chr>, site1Chromosome <chr>, site1Position <int>,
#> #   site1Contig <chr>, site1Region <chr>, site1RegionNumber <int>,
#> #   site1Description <chr>, site2EntrezGeneId <int>, site2HugoSymbol <chr>,
#> #   site2EnsemblTranscriptId <chr>, site2Chromosome <chr>, site2Position <int>,
#> #   site2Contig <chr>, site2Region <chr>, site2RegionNumber <int>,
#> #   site2Description <chr>, site2EffectOnFrame <chr>, ncbiBuild <chr>, …

.get_data_by_study(molecular_profile_id = "prad_msk_2019_cna", data_type = "cna")
#>  Returning all data for the "prad_msk_2019_cna" molecular profile in the "prad_msk_2019" study
#> # A tibble: 1 × 9
#>   hugoGeneSymbol entrezGeneId uniqueSampleKey                   uniquePatientKey
#>   <chr>                 <int> <chr>                             <chr>           
#> 1 PTEN                   5728 c19DXzM2OTI0TF9QMDAxX2Q6cHJhZF9t… cF9DXzM2OTI0TDp…
#> # ℹ 5 more variables: molecularProfileId <chr>, sampleId <chr>,
#> #   patientId <chr>, studyId <chr>, alteration <int>
.get_data_by_study(molecular_profile_id = "prad_msk_2019_mutations", data_type = "mutation")
#>  Returning all data for the "prad_msk_2019_mutations" molecular profile in the "prad_msk_2019" study
#> # A tibble: 26 × 28
#>    hugoGeneSymbol entrezGeneId uniqueSampleKey                  uniquePatientKey
#>    <chr>                 <int> <chr>                            <chr>           
#>  1 ZFHX3                   463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#>  2 ZFHX3                   463 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#>  3 ATR                     545 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#>  4 BCL2                    596 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#>  5 ETV1                   2115 c19DX1A4SzNUUl9QMDAxX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#>  6 ETV1                   2115 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#>  7 FAT1                   2195 c19DX004WDQyVF9QMDAyX2Q6cHJhZF9… cF9DX004WDQyVDp…
#>  8 MSH6                   2956 c19DX1A4SzNUUl9QMDAyX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#>  9 MSH6                   2956 c19DX1A4SzNUUl9QMDAzX2Q6cHJhZF9… cF9DX1A4SzNUUjp…
#> 10 FOXA1                  3169 c19DX0UwS0pGSl9QMDAyX2Q6cHJhZF9… cF9DX0UwS0pGSjp…
#> # ℹ 16 more rows
#> # ℹ 24 more variables: molecularProfileId <chr>, sampleId <chr>,
#> #   patientId <chr>, studyId <chr>, center <chr>, mutationStatus <chr>,
#> #   validationStatus <chr>, tumorAltCount <int>, tumorRefCount <int>,
#> #   normalAltCount <int>, normalRefCount <int>, startPosition <int>,
#> #   endPosition <int>, referenceAllele <chr>, proteinChange <chr>,
#> #   mutationType <chr>, ncbiBuild <chr>, variantType <chr>, keyword <chr>, …
.get_data_by_study(molecular_profile_id = "prad_msk_2019_structural_variants", data_type = "fusion")
#>  Returning all data for the "prad_msk_2019_structural_variants" molecular profile in the "prad_msk_2019" study
#> # A tibble: 4 × 44
#>   uniqueSampleKey uniquePatientKey molecularProfileId sampleId patientId studyId
#>   <chr>           <chr>            <chr>              <chr>    <chr>     <chr>  
#> 1 c19DX0NBVVdUN1… cF9DX0NBVVdUNzp… prad_msk_2019_str… s_C_CAU… p_C_CAUW… prad_m…
#> 2 c19DX0RVNkVDQ1… cF9DX0RVNkVDQzp… prad_msk_2019_str… s_C_DU6… p_C_DU6E… prad_m…
#> 3 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> 4 c19DX1ZDNlA5QV… cF9DX1ZDNlA5QTp… prad_msk_2019_str… s_C_VC6… p_C_VC6P… prad_m…
#> # ℹ 38 more variables: site1EntrezGeneId <int>, site1HugoSymbol <chr>,
#> #   site1EnsemblTranscriptId <chr>, site1Chromosome <chr>, site1Position <int>,
#> #   site1Contig <chr>, site1Region <chr>, site1RegionNumber <int>,
#> #   site1Description <chr>, site2EntrezGeneId <int>, site2HugoSymbol <chr>,
#> #   site2EnsemblTranscriptId <chr>, site2Chromosome <chr>, site2Position <int>,
#> #   site2Contig <chr>, site2Region <chr>, site2RegionNumber <int>,
#> #   site2Description <chr>, site2EffectOnFrame <chr>, ncbiBuild <chr>, …
# }