I am using GISTIC2.0 to summarize copy number results from targeted exome sequencing of 30 samples. Copy number ratios and segmentation were computed with CNVkit, using its circular binary segmentation (CBS) method. I did not provide a marker file as input to GISTIC2.0.
I got over 1500 amplified genes and over 500 deleted genes in the files "amp_genes_conf_90" and "del_genes_conf_90". Many of those genes are not covered by the regions in my original exome-seq bait file, although they do fall within the final segments from CNVkit.
Should I filter the amplified/deleted genes against the gene list from my bait file and keep only the overlapping genes as the final amp/del gene list?
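To make the filtering I have in mind concrete, here is a minimal Python sketch. It assumes the gene symbols have already been pulled out of the GISTIC2.0 gene files and the bait file into plain lists; the function name `split_by_bait_coverage` and the example gene lists are hypothetical, and the matching is a simple case-insensitive comparison of symbols, so aliases or outdated symbols would not be matched.

```python
def split_by_bait_coverage(gistic_genes, bait_genes):
    """Split GISTIC-reported genes into those present in the bait gene list
    and those that are not (hypothetical helper, case-insensitive match)."""
    # Normalize the bait gene symbols once for fast lookups.
    bait = {g.strip().upper() for g in bait_genes if g.strip()}
    covered, uncovered = [], []
    for gene in gistic_genes:
        name = gene.strip()
        if not name:
            continue  # skip empty entries
        if name.upper() in bait:
            covered.append(name)
        else:
            uncovered.append(name)
    return covered, uncovered


# Made-up example lists, for illustration only.
amp_genes = ["EGFR", "MYC", "LOC100", "CCND1"]
bait_genes = ["EGFR", "CCND1", "TP53"]

covered, uncovered = split_by_bait_coverage(amp_genes, bait_genes)
print("in bait file:", covered)
print("not in bait file:", uncovered)
```

Keeping the uncovered genes in a separate list, rather than discarding them, would let me report how many calls rest only on segments extended across unbaited regions.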
My second question is about the list of genes in the GISTIC2.0 output file "focal_data_by_genes". These 23K+ genes come from the annotation file (e.g., hg19.mat), correct?
Below are the parameters I used in the GISTIC2.0 run:
Amplification Threshold = 0.3
Deletion Threshold = 0.4
Confidence Level = 0.90
Gene GISTIC = 0
Gene collapse method = extreme
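For reference, here is a rough Python sketch of how I believe these settings map onto the GISTIC2.0 command-line flags (`-ta`, `-td`, `-conf`, `-genegistic`, `-gcm`). The executable name `gistic2`, the helper `build_gistic_args`, and all file paths are placeholders and assumptions, not my actual invocation; the sketch only builds the argument list and does not run anything.

```python
# Settings from the list above, keyed by what I understand to be the
# corresponding GISTIC2.0 flag names.
params = {
    "ta": 0.3,          # amplification threshold
    "td": 0.4,          # deletion threshold
    "conf": 0.90,       # confidence level
    "genegistic": 0,    # gene GISTIC off
    "gcm": "extreme",   # gene collapse method
}


def build_gistic_args(base_dir, seg_file, refgene_file, params):
    """Assemble a GISTIC2.0 argument list (hypothetical helper)."""
    # "gistic2" is an assumed executable name; adjust to the local install.
    args = ["gistic2", "-b", base_dir, "-seg", seg_file, "-refgene", refgene_file]
    for flag, value in params.items():
        args += [f"-{flag}", str(value)]
    return args


# Placeholder paths, for illustration only. No marker file (-mk) is passed,
# matching the run described above.
args = build_gistic_args("gistic_out", "cnvkit_segments.seg", "hg19.mat", params)
print(" ".join(args))
```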
Thank you very much!