Using the ICGC API to retrieve non-SSM data


#1

My first question: does the /mutations endpoint only return SSM mutations? If not, what mutation types does it return, and how can we filter them?

Also, assuming that is not the case, is there any way to use the portal API to grab, say, CNSM data in a similar way to how the mutations endpoint is used, rather than going through the download endpoints? I want to be able to grab the CNSM data for a given donor over some genomic range without having to download and parse the data files.


#2

Hi Andrew,

Short answer: /mutations is only meant to serve SSM mutations; we have not gotten around to supporting ‘CNSM’ yet.

Long answer: when we add support for other mutation types, each mutation type should have its own endpoint, so there will be no need to filter by mutation type. As for searching mutations by genomic range, here is an example: https://icgc.org/ZDn. The corresponding API call is: https://dcc.icgc.org/api/v1/mutations?filters={"mutation":{"location":{"is":["chr7:140452136-140454136"]}}}&from=1&include=consequences&size=10. This is how it works for SSM; when we add CNSM, it should work in a similar fashion.
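For anyone scripting this, here is a minimal sketch of how the same request URL could be assembled in Python. The endpoint, the filter structure, and the `from`, `include`, and `size` parameters are taken from the call above; the function name `build_mutations_url` and its arguments are only illustrative, and the sketch stops at building the URL rather than assuming anything about the response.

```python
import json
from urllib.parse import urlencode

# Endpoint as given in the example call above.
BASE_URL = "https://dcc.icgc.org/api/v1/mutations"


def build_mutations_url(chromosome, start, end, size=10, offset=1,
                        include="consequences"):
    """Build a /mutations URL filtered by genomic range.

    Hypothetical helper: the name and argument list are illustrative,
    only the query parameters mirror the example call.
    """
    # Location string in the same form as the example: chr7:140452136-140454136
    location = f"chr{chromosome}:{start}-{end}"
    filters = {"mutation": {"location": {"is": [location]}}}
    params = {
        # The filters value is JSON; urlencode percent-encodes it for the query string.
        "filters": json.dumps(filters, separators=(",", ":")),
        "from": offset,
        "include": include,
        "size": size,
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_mutations_url("7", 140452136, 140454136)
print(url)
```

The resulting URL can be passed to any HTTP client. Percent-encoding the JSON filter avoids problems with the braces and quotes when the call is made outside a browser.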

Hope this helps, please feel free to let us know if you have any further questions.

Cheers,
Junjun


#3

Hey Junjun,

Thanks for the quick reply.

So endpoints for other mutation types like CNSM are on the roadmap but not available just yet? Is there a rough time range where they might be implemented?

For now, is the only approach to use the /download/submit endpoint to get a download ID and then use /download/{downloadId} to retrieve the CNSM data?

Thanks,
Andrew Duncan


#4

Hi Andrew,

We don’t have a timeline for CNSM data yet. We are doing this in the GDC project. If you are curious about the CNV schema, check it out here: https://github.com/NCI-GDC/gdc-models/blob/cna-models/es-models/cnv_centric/cnv_centric.mapping.yaml

Yes, that is the only way to get CNSM data in TSV format. Also, keep in mind that the annotation information in those TSV files originates from the ICGC submitting projects. That is different from how annotation is added for SSM, which is a uniform process done at the ETL stage by the ICGC DCC.

Best regards,
Junjun


#5

Hey Junjun,
Is there any way to retrieve the TSV file directly, instead of retrieving a ZIP file that I have to unzip before reading the TSV file? For my use case I am only grabbing one of the data types (e.g. CNSM) at a time.

Thanks,
Andrew Duncan


#6

Hi Andrew,

Actually, if you’d like to download all CNSM data for an entire ICGC Data Release without filtering, there is a better alternative. You can go to the Data Releases section of the data portal, pick the ‘current’ release, choose an individual project, and locate the copy number file. Here is one example project: https://dcc.icgc.org/releases/current/Projects/BLCA-US, where the file is copy_number_somatic_mutation.BLCA-US.tsv.gz.
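Once such a file is downloaded, the per-donor, per-range lookup asked about earlier can be done locally without unpacking it first. The sketch below is only an illustration: it assumes the TSV has a header row with columns named `icgc_donor_id`, `chromosome`, `chromosome_start`, and `chromosome_end`, and that chromosomes are written without a `chr` prefix. Check the header of the actual file before relying on it. The function name `overlapping_segments` is made up for this example.

```python
import csv
import gzip


def overlapping_segments(path, donor_id, chromosome, start, end):
    """Return the rows for one donor whose segment overlaps [start, end].

    Assumed column names (verify against the file header):
    icgc_donor_id, chromosome, chromosome_start, chromosome_end.
    """
    matches = []
    # gzip.open in text mode streams the .tsv.gz without extracting it to disk.
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            if row["icgc_donor_id"] != donor_id:
                continue
            if row["chromosome"] != chromosome:
                continue
            seg_start = int(row["chromosome_start"])
            seg_end = int(row["chromosome_end"])
            # Two closed intervals overlap when each starts before the other ends.
            if seg_start <= end and seg_end >= start:
                matches.append(row)
    return matches


# Example call (the donor ID here is a hypothetical placeholder):
# rows = overlapping_segments(
#     "copy_number_somatic_mutation.BLCA-US.tsv.gz",
#     "DO0000", "7", 140452136, 140454136)
```

Each returned row is a dict keyed by the column names in the header, so the remaining CNSM fields are available without further parsing.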

Would that be a suitable solution for you?

Best,
Junjun