How can I identify the associated gene ID with a script?

Below are a few lines from a large text file. I need to extract the IDs for some genes from it according to my input, and would appreciate a small script for this. The input and desired output are specified at the bottom of the post. The problem may look hard, but in brief I just want to find the associated gene ID in one annotation version from the ID used in another, because online sites do not have a repository for this species. Please help if possible. I know R programming, but I am very new to bash scripting, and I am stuck in the middle of an analysis.

```
SDRB02000004.1  Genbank gene    6018    10396   .   +   .   gene_id "TEA_012962"; transcript_id ""; gbkey "Gene"; gene_biotype "protein_coding"; locus_tag "TEA_012962";
SDRB02000004.1  Genbank transcript  6018    10396   .   +   .   gene_id "TEA_012962"; transcript_id "gnl|WGS:SDRB|TEA014503.1"; gbkey "mRNA"; locus_tag "TEA_012962"; orig_protein_id "gnl|WGS:SDRB|TEA014503.1:cds_7"; orig_transcript_id "gnl|WGS:SDRB|TEA014503.1"; product "hypothetical protein"; transcript_biotype "mRNA";
SDRB02000004.1  Genbank exon    6018    6864    .   +   .   gene_id "TEA_012962"; transcript_id "gnl|WGS:SDRB|TEA014503.1"; locus_tag "TEA_012962"; orig_protein_id "gnl|WGS:SDRB|TEA014503.1:cds_7"; orig_transcript_id "gnl|WGS:SDRB|TEA014503.1"; product "hypothetical protein"; transcript_biotype "mRNA"; exon_number "1";
SDRB02000232.1  Genbank stop_codon  994202  994204  .   +   0   gene_id "TEA_014895"; transcript_id "gnl|WGS:SDRB|TEA016705.1"; gbkey "CDS"; locus_tag "TEA_014895"; orig_transcript_id "gnl|WGS:SDRB|TEA016705.1"; product "hypothetical protein"; protein_id "THG23623.1"; exon_number "19";
```
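
One possible way to build the lookup from lines like these is sketched below in Python. It is a rough sketch, not a tested pipeline: it assumes every attribute has the form `key "value";`, that the wanted ID is the part of `transcript_id` after the last `|`, and that rows with an empty `transcript_id` (the `gene` rows) can be skipped. The function names `gene_to_transcripts` and `main`, and the two file arguments, are made up for illustration.

```python
import re
import sys

# Matches GTF-style attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def gene_to_transcripts(lines):
    """Map each gene_id to its short transcript IDs, in order of first appearance."""
    mapping = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        attrs = dict(ATTR_RE.findall(line))
        gene = attrs.get("gene_id")
        transcript = attrs.get("transcript_id")
        if not gene or not transcript:
            continue  # "gene" rows carry an empty transcript_id
        # Assumption: keep only the part after the last "|",
        # e.g. "gnl|WGS:SDRB|TEA014503.1" becomes "TEA014503.1".
        short = transcript.rsplit("|", 1)[-1]
        ids = mapping.setdefault(gene, [])
        if short not in ids:
            ids.append(short)
    return mapping


def main(gtf_path, wanted_path):
    """Print one tab-separated line per gene ID listed in wanted_path."""
    with open(gtf_path) as gtf:
        mapping = gene_to_transcripts(gtf)
    with open(wanted_path) as wanted:
        for gene in (w.strip() for w in wanted):
            if gene:
                # Genes missing from the annotation are reported as NA.
                print(gene, ",".join(mapping.get(gene, ["NA"])), sep="\t")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Saved as, say, `map_ids.py` (a hypothetical name), it could be run as `python map_ids.py annotation.gtf genes.txt`, where `genes.txt` holds one gene ID such as `TEA_012962` per line. A gene with several transcripts would get all of them, comma-separated, on one line.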

My input and desired output look like the following:
```
Input (common gene name)    Output (special gene name)
TEA_012962                  TEA014503.1
TEA_014895                  TEA016705.1
```

