How to write a script to identify the associated gene ID

Below are a few lines from a large annotation file. I need to extract identifiers from it according to my input, so please suggest a small script. The input and desired output are given at the bottom of the post. The problem may look difficult, but in brief I just want to find the gene ID in one annotation version that corresponds to a gene ID in another, because online resources do not have a repository for this species. Any help is appreciated.

SDRB02000004.1  Genbank gene    6018    10396   .   +   .   gene_id "TEA_012962"; transcript_id ""; gbkey "Gene"; gene_biotype "protein_coding"; locus_tag "TEA_012962"; 
SDRB02000004.1  Genbank transcript  6018    10396   .   +   .   gene_id "TEA_012962"; transcript_id "gnl|WGS:SDRB|TEA014503.1"; gbkey "mRNA"; locus_tag "TEA_012962"; orig_protein_id "gnl|WGS:SDRB|TEA014503.1:cds_7"; orig_transcript_id "gnl|WGS:SDRB|TEA014503.1"; product "hypothetical protein"; transcript_biotype "mRNA"; 
SDRB02000004.1  Genbank exon    6018    6864    .   +   .   gene_id "TEA_012962"; transcript_id "gnl|WGS:SDRB|TEA014503.1"; locus_tag "TEA_012962"; orig_protein_id "gnl|WGS:SDRB|TEA014503.1:cds_7"; orig_transcript_id "gnl|WGS:SDRB|TEA014503.1"; product "hypothetical protein"; transcript_biotype "mRNA"; exon_number "1"; 
SDRB02000232.1  Genbank stop_codon  994202  994204  .   +   0   gene_id "TEA_014895"; transcript_id "gnl|WGS:SDRB|TEA016705.1"; gbkey "CDS"; locus_tag "TEA_014895"; orig_transcript_id "gnl|WGS:SDRB|TEA016705.1"; product "hypothetical protein"; protein_id "THG23623.1"; exon_number "19";
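
For records like the ones above, one possible approach is to pull the `gene_id` and `transcript_id` attributes from every line and keep the part of the transcript ID after the last `|`. The following is a minimal Python sketch under two assumptions: the attributes always appear in the `key "value";` form shown here, and the short transcript ID (for example `TEA014503.1`) is the identifier wanted. The function names, the script name `map_ids.py`, and the input file names are hypothetical.

```python
import re
import sys
from collections import defaultdict

# Attribute patterns as they appear in the sample records of this post.
# \b keeps "transcript_id" from matching inside "orig_transcript_id".
GENE_RE = re.compile(r'\bgene_id "([^"]*)"')
TRANSCRIPT_RE = re.compile(r'\btranscript_id "([^"]*)"')


def build_gene_to_transcripts(lines):
    """Map each gene_id to the short transcript IDs seen with it."""
    mapping = defaultdict(list)
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        gene = GENE_RE.search(line)
        transcript = TRANSCRIPT_RE.search(line)
        if not gene or not transcript or not transcript.group(1):
            continue  # "gene" rows carry an empty transcript_id
        # gnl|WGS:SDRB|TEA014503.1 -> TEA014503.1
        short_id = transcript.group(1).rsplit("|", 1)[-1]
        if short_id not in mapping[gene.group(1)]:
            mapping[gene.group(1)].append(short_id)
    return dict(mapping)


def lookup(gene_ids, mapping):
    """Return (gene_id, comma-joined transcript IDs or 'NA') per query."""
    return [(g, ",".join(mapping.get(g, [])) or "NA") for g in gene_ids]


# Hypothetical usage: python map_ids.py annotation.gtf genes.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as gtf:
        gene_map = build_gene_to_transcripts(gtf)
    with open(sys.argv[2]) as genes:
        wanted = [row.strip() for row in genes if row.strip()]
    for gene_id, transcripts in lookup(wanted, gene_map):
        print(gene_id, transcripts, sep="\t")
```

Here `genes.txt` is assumed to hold one gene ID per line. A gene with several transcripts yields a comma-separated list, and gene IDs that do not occur in the annotation file are reported as `NA`.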

My input and desired output look like the following:
```
Input (common gene name)    Output (special gene name)
TEA_012962                  TEA014503.1
TEA_014895                  TEA016705.1
```


