How to select genes with multi-exon 3'UTR


In my GFF file, how do I select genes that have more than one exon in the 3'UTR and generate a list of them?


rna-seq


genome


gff


gene


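Using a GTF and bioalcidaejdk (lindenb.github.io/jvarkit/BioAlcidaeJdk.html): keep the transcripts whose 3'UTR spans more than one interval, then print the gene ID, gene name, transcript ID and number of 3'UTR intervals, sorted by that number.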

 java -jar dist/bioalcidaejdk.jar -F GTF -e 'stream()
   .flatMap(G->G.getTranscripts().stream())
   .filter(T->T.getExonCount()>1 && T.getTranscriptUTR3().isPresent())
   .map(T->T.getTranscriptUTR3().get())
   .filter(UTR->UTR.getIntervals().size()>1)
   .forEach(U->println(U.getTranscript().getGene().getId()+" "+U.getTranscript().getGene().getGeneName()+" "+U.getTranscript().getId()+" "+U.getIntervals().size()));' chr22.gtf.gz | sort -t ' ' -k4,4n
(...)
ENSG00000093009.11 CDC45 ENST00000438587.6 17
ENSG00000100412.17 ACO2 ENST00000676714.1 17
ENSG00000100412.17 ACO2 ENST00000678819.1 17
ENSG00000100429.18 HDAC10 ENST00000626012.2 17
ENSG00000184381.20 PLA2G6 ENST00000668499.1 17
ENSG00000286070.2 AP000356.5 ENST00000652248.1 17
ENSG00000099949.21 LZTR1 ENST00000642151.1 19
ENSG00000100023.20 PPIL2 ENST00000680434.1 19
ENSG00000100106.22 TRIOBP ENST00000344404.10 19
ENSG00000100325.15 ASCC2 ENST00000458594.5 19
ENSG00000100412.17 ACO2 ENST00000677698.1 19
ENSG00000100023.20 PPIL2 ENST00000417788.5 20
ENSG00000100150.20 DEPDC5 ENST00000642771.1 20
ENSG00000100150.20 DEPDC5 ENST00000645494.1 20
ENSG00000242259.9 C22orf39 ENST00000509549.5 21
ENSG00000254413.8 CHKB-CPT1B ENST00000453634.5 21
ENSG00000100150.20 DEPDC5 ENST00000644162.1 24
ENSG00000100150.20 DEPDC5 ENST00000645755.1 28
ENSG00000284431.1 AL022238.3 ENST00000639722.1 29
ENSG00000133454.16 MYO18B ENST00000539302.5 32
ENSG00000100150.20 DEPDC5 ENST00000642684.1 37
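
If the annotation really is GFF3 rather than GTF, the same idea can be sketched in plain Python without jvarkit. This is only a rough sketch under some assumptions that are not in the original answer: each 3'UTR segment is written as its own `three_prime_UTR` line, that line has a `Parent=` attribute pointing at the transcript, and the transcript line has `ID=` and `Parent=` attributes pointing at the gene. The function name `genes_with_multi_exon_3utr` and the example records are made up for illustration. Files that label both UTRs with a generic `UTR` feature type (GENCODE GTF does this, as far as I know) would need extra logic to tell 5' from 3'.

```python
from collections import defaultdict


def genes_with_multi_exon_3utr(gff_lines):
    """Return {gene_id: {transcript_id: n_segments}} for transcripts
    whose 3'UTR is split into more than one segment.

    Assumes GFF3-style attributes (key=value pairs separated by ';').
    """
    utr_counts = defaultdict(int)  # transcript ID -> number of 3'UTR segments
    parent_of = {}                 # feature ID -> ID of its parent feature

    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete feature line
        ftype = cols[2]
        attr = dict(
            kv.strip().split("=", 1) for kv in cols[8].split(";") if "=" in kv
        )
        if ftype.lower() == "three_prime_utr":
            # one line per 3'UTR segment; count it for every parent transcript
            for parent in attr.get("Parent", "").split(","):
                if parent:
                    utr_counts[parent] += 1
        elif "ID" in attr and "Parent" in attr:
            # remember child -> parent links (e.g. transcript -> gene)
            parent_of[attr["ID"]] = attr["Parent"].split(",")[0]

    result = defaultdict(dict)
    for tx, n in utr_counts.items():
        if n > 1:
            gene = parent_of.get(tx, tx)  # fall back to the transcript ID
            result[gene][tx] = n
    return dict(result)


# Hypothetical records, only to show the expected input shape.
example = [
    "##gff-version 3",
    "chr1\tdemo\tgene\t1\t1000\t.\t+\t.\tID=gene1",
    "chr1\tdemo\tmRNA\t1\t1000\t.\t+\t.\tID=tx1;Parent=gene1",
    "chr1\tdemo\tthree_prime_UTR\t700\t800\t.\t+\t.\tID=utr1;Parent=tx1",
    "chr1\tdemo\tthree_prime_UTR\t900\t1000\t.\t+\t.\tID=utr2;Parent=tx1",
    "chr1\tdemo\tmRNA\t1\t1000\t.\t+\t.\tID=tx2;Parent=gene1",
    "chr1\tdemo\tthree_prime_UTR\t900\t1000\t.\t+\t.\tID=utr3;Parent=tx2",
]
print(genes_with_multi_exon_3utr(example))  # {'gene1': {'tx1': 2}}
```

For a real file, pass an open file handle instead of the list (use `gzip.open(path, "rt")` for a compressed file). The keys of the returned dictionary are the gene list asked for in the question; the inner dictionaries keep the per-transcript counts, comparable to the last column of the output above.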

