Hey there,
I downloaded all the available CNSM data from ICGC (11,662 donors), and I was wondering why there are 1 bp CNVs, as well as many CNVs smaller than 100 bp. What is the difference between these and mutations or indels?
e.g.
icgc_donor_id project_code icgc_specimen_id icgc_sample_id matched_icgc_sample_id submitted_sample_id submitted_matched_sample_id mutation_type copy_number segment_mean segment_median chromosome chromosome_start chromosome_end assembly_version chromosome_start_range chromosome_end_range start_probe_id end_probe_id sequencing_strategy quality_score probability is_annotated verification_status verification_platform gene_affected transcript_affected gene_build_version platform experimental_protocol base_calling_algorithm alignment_algorithm variation_calling_algorithm other_analysis_algorithm seq_coverage raw_data_repository raw_data_accession
DO11034 GBM-US SP23901 SA140828 TCGA-12-1602-01A-01D-0591-01 undetermined NA -2.0069 NA 6 78470133 78470133 GRCh37 NA NA non-NGS NA NA not annotated not tested Affymetrix Genome-Wide Human SNP Array 6.0 Genome_Wide_SNP_6 https://www.affymetrix.com/ NA TCGA TCGA-12-1602-01A-01D-0591-01
DO23028 LIHC-US SP49551 SA269377 TCGA-CC-A1HT-01A-11D-A12Y-01 undetermined NA -1.6235 NA 10 1443180 1443180 GRCh37 NA NA non-NGS NA NA not annotated not tested Affymetrix Genome-Wide Human SNP Array 6.0 Genome_Wide_SNP_6 https://www.affymetrix.com/ NA TCGA TCGA-CC-A1HT-01A-11D-A12Y-01
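
For anyone who wants to reproduce the count, here is a minimal sketch of how such short segments could be flagged. It assumes the file is tab-separated with the header shown above, and that `chromosome_start` and `chromosome_end` are inclusive coordinates, so a segment with start equal to end is 1 bp long. The function names, the 100 bp cutoff parameter, and the third sample row are hypothetical and only there for illustration.

```python
import csv
import io

# Reduced, tab-separated sample with only the columns needed here.
# The first two rows reuse coordinates from the rows above;
# the third row is made up to show a segment that is not flagged.
SAMPLE = (
    "icgc_donor_id\tchromosome\tchromosome_start\tchromosome_end\tsegment_mean\n"
    "DO11034\t6\t78470133\t78470133\t-2.0069\n"
    "DO23028\t10\t1443180\t1443180\t-1.6235\n"
    "DO_EXAMPLE\t1\t1000000\t1500000\t0.3\n"
)


def segment_length(row):
    """Segment length in bp, assuming inclusive start and end coordinates."""
    return int(row["chromosome_end"]) - int(row["chromosome_start"]) + 1


def short_segments(handle, max_length=100):
    """Yield (donor, chromosome, length) for segments shorter than max_length bp."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        length = segment_length(row)
        if length < max_length:
            yield row["icgc_donor_id"], row["chromosome"], length


hits = list(short_segments(io.StringIO(SAMPLE)))
for donor, chrom, length in hits:
    print(donor, chrom, length)
```

For the real download, the same function could be given an open file handle instead of the `io.StringIO` sample.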