2 hours ago by 2001linana

Hi. I downloaded a sequence data file (2.7 GB) from this link: www.covid19dataportal.org/sequences?db=embl-covid19.
It is a .txt file, and the lines for the first record (sequence) are as follows:

ID   MW281864; SV 1; linear; genomic RNA; STD; VRL; 29871 BP.
XX
AC   MW281864;
XX
DT   24-NOV-2020 (Rel. 144, Created)
DT   11-DEC-2020 (Rel. 144, Last updated, Version 2)
XX
DE   Severe acute respiratory syndrome coronavirus 2 isolate
DE   SARS-CoV-2/human/West Bank/Jericho_SARS-CoV-2/2020, complete genome.
XX
KW   .
XX
OS   Severe acute respiratory syndrome coronavirus 2
OC   Viruses; Riboviria; Orthornavirae; Pisuviricota; Pisoniviricetes;
OC   Nidovirales; Cornidovirineae; Coronaviridae; Orthocoronavirinae;
OC   Betacoronavirus; Sarbecovirus.
XX
RN   [1]
RP   1-29871
RA   Nasereddin A., Ereqat S., Al-Jawabreh A.;
RT   "Genetic epidemiology of severe acute respiratory syndrome coronavirus 2 in
RT   Palestine";
RL   Unpublished.
XX
RN   [2]
RP   1-29871
RA   Nasereddin A., Ereqat S., Al-Jawabreh A.;
RT   ;
RL   Submitted (21-NOV-2020) to the INSDC.
RL   Al-Quds Nutrition and Health Research Institute, Al-Quds University,
RL   Abudeis, Jerusalem 91220, Palestine
XX
DR   MD5; 32ad2f322f6c67d3d001cad2f292e154.
XX
CC   ##Assembly-Data-START##
CC   Assembly Method       :: GALAXY v. 19.09.rc1
CC   Sequencing Technology :: Illumina
CC   ##Assembly-Data-END##
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..29871
FT                   /organism="Severe acute respiratory syndrome coronavirus 2"
FT                   /host="Homo sapiens"
FT                   /isolate="SARS-CoV-2/human/West
FT                   Bank/Jericho_SARS-CoV-2/2020"
FT                   /mol_type="genomic RNA"
FT                   /country="West Bank:Jericho"
FT                   /isolation_source="nasal swab"
FT                   /collection_date="2020-11-07"
FT                   /db_xref="taxon:2697049"
FT   gene            237..21526
FT                   /gene="ORF1ab"
FT   CDS             join(237..13439,13439..21526)
FT                   /codon_start=1
FT                   /ribosomal_slippage
FT                   /gene="ORF1ab"
FT                   /product="ORF1ab polyprotein"
FT                   /protein_id="QPG02368.1"
FT                   /translation="MESLVPGF......"
FT   CDS             237..13454
FT                   /codon_start=1
FT                   /gene="ORF1ab"
FT                   /product="ORF1a polyprotein"
FT                   /protein_id="QPG02369.1"
FT                   /translation="MESLVPGF......"
FT   mat_peptide     237..776
FT                   /gene="ORF1ab"
FT                   /product="leader protein"
FT   mat_peptide     777..2690
FT                   /gene="ORF1ab"
FT                   /product="nsp2"
FT   mat_peptide     2691..8525
FT                   /gene="ORF1ab"
FT                   /product="nsp3"
FT   mat_peptide     8526..10025
FT                   /gene="ORF1ab"
FT                   /product="nsp4"
FT   mat_peptide     10026..10943
FT                   /gene="ORF1ab"
FT                   /product="3C-like proteinase"
FT   mat_peptide     10944..11813
FT                   /gene="ORF1ab"
FT                   /product="nsp6"
FT   mat_peptide     11814..12062
FT                   /gene="ORF1ab"
FT                   /product="nsp7"
FT   mat_peptide     12063..12656
FT                   /gene="ORF1ab"
FT                   /product="nsp8"
FT   mat_peptide     12657..12995
FT                   /gene="ORF1ab"
FT                   /product="nsp9"
FT   mat_peptide     12996..13412
FT                   /gene="ORF1ab"
FT                   /product="nsp10"
FT   mat_peptide     join(13413..13439,13439..16207)
FT                   /gene="ORF1ab"
FT                   /product="RNA-dependent RNA polymerase"
FT   mat_peptide     13413..13451
FT                   /gene="ORF1ab"
FT                   /product="nsp11"
FT   stem_loop       13447..13474
FT                   /gene="ORF1ab"
FT                   /note="Coronavirus frameshifting stimulation element
FT                   stem-loop 1"
FT   stem_loop       13459..13513
FT                   /gene="ORF1ab"
FT                   /note="Coronavirus frameshifting stimulation element
FT                   stem-loop 2"
FT   mat_peptide     16208..18010
FT                   /gene="ORF1ab"
FT                   /product="helicase"
FT   mat_peptide     18011..19591
FT                   /gene="ORF1ab"
FT                   /product="3'-to-5' exonuclease"
FT   mat_peptide     19592..20629
FT                   /gene="ORF1ab"
FT                   /product="endoRNAse"
FT   mat_peptide     20630..21523
FT                   /gene="ORF1ab"
FT                   /product="2'-O-ribose methyltransferase"
FT   gene            21534..25355
FT                   /gene="S"
FT   CDS             21534..25355
FT                   /codon_start=1
FT                   /gene="S"
FT                   /product="surface glycoprotein"
FT                   /protein_id="QPG02370.1"
FT                   /translation="MFVFLVLLPLVS......"
FT   gene            25364..26191
FT                   /gene="ORF3a"
FT   CDS             25364..26191
FT                   /codon_start=1
FT                   /gene="ORF3a"
FT                   /product="ORF3a protein"
FT                   /protein_id="QPG02371.1"
FT                   /translation="MDLFMRIFTIG......"
FT   gene            26216..26443
FT                   /gene="E"
FT   CDS             26216..26443
FT                   /codon_start=1
FT                   /gene="E"
FT                   /product="envelope protein"
FT                   /protein_id="QPG02372.1"
FT                   /translation="MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLCAYCCN
FT                   IVNVSLVKPSFYVYSRVKNLNSSRVPDLLV"
FT   gene            26494..27162
FT                   /gene="M"
FT   CDS             26494..27162
FT                   /codon_start=1
FT                   /gene="M"
FT                   /product="membrane glycoprotein"
FT                   /protein_id="QPG02373.1"
FT                   /translation="MADSNGT......"
FT   gene            27173..27358
FT                   /gene="ORF6"
FT   CDS             27173..27358
FT                   /codon_start=1
FT                   /gene="ORF6"
FT                   /product="ORF6 protein"
FT                   /protein_id="QPG02374.1"
FT                   /translation="MFHLVDFQVTIAEILLIIMRTFKVSIWNLDYIINLIIKNLSKSLT
FT                   ENKYSQLDEEQPMEID"
FT   gene            27365..27730
FT                   /gene="ORF7a"
FT   CDS             27365..27730
FT                   /codon_start=1
FT                   /gene="ORF7a"
FT                   /product="ORF7a protein"
FT                   /protein_id="QPG02375.1"
FT                   /translation="MKIILFLALITLATCELYHYQECVRGTTVLLKEPCSSGTYEGNSP
FT                   FHPLADNKFALTCFSTQFAFACPDGVKHVYQLRARSVSPKLFIRQEEVQELYSPIFLIV
FT                   AAIVFITLCFTLKRKTE"
FT   gene            27727..27858
FT                   /gene="ORF7b"
FT   CDS             27727..27858
FT                   /codon_start=1
FT                   /gene="ORF7b"
FT                   /product="ORF7b"
FT                   /protein_id="QPG02376.1"
FT                   /translation="MIELSLIDFYLCFLAFLLFLVLIMLIIFWFSLELQDHNETCHA"
FT   gene            27865..28230
FT                   /gene="ORF8"
FT   CDS             27865..28230
FT                   /codon_start=1
FT                   /gene="ORF8"
FT                   /product="ORF8 protein"
FT                   /protein_id="QPG02377.1"
FT                   /translation="MKFLVFLGIIKTVAAFHQECSLQSCTQHQPYVVDDPCPIHFYSKW
FT                   YIRVGARKSAPLIELCVDEAGSKSPIQYIDIGNYTVSCLPFTINCQEPKLGSLVVRCSF
FT                   YEDFLEYHDVRVVLDFI"
FT   gene            28245..29504
FT                   /gene="N"
FT   CDS             28245..29504
FT                   /codon_start=1
FT                   /gene="N"
FT                   /product="nucleocapsid phosphoprotein"
FT                   /protein_id="QPG02378.1"
FT                   /translation="MSDNGPQN......"
FT   gene            29529..29645
FT                   /gene="ORF10"
FT   CDS             29529..29645
FT                   /codon_start=1
FT                   /gene="ORF10"
FT                   /product="ORF10 protein"
FT                   /protein_id="QPG02379.1"
FT                   /translation="MGYINVFAFPFTIYSLLLCRMNSRNYIAQVDVVNFNLT"
FT   stem_loop       29580..29615
FT                   /gene="ORF10"
FT                   /note="Coronavirus 3' UTR pseudoknot stem-loop 1"
FT   stem_loop       29600..29628
FT                   /gene="ORF10"
FT                   /note="Coronavirus 3' UTR pseudoknot stem-loop 2"
FT   stem_loop       29699..29739
FT                   /note="Coronavirus 3' stem-loop II-like motif (s2m)"
XX
SQ   Sequence 29871 BP; 8945 A; 5480 C; 5849 G; 9597 T; 0 other;
     aaccaaccaa ctttcgatct cttgtagatc tgttctctaa acgaacttta aaatctgtgt        60
     ggctgtcact cggctgcatg cttagtgcac tcacgcagta taattaataa ctaattactg       120
     tcgttgacag gacacgagta actcgtctat cttctgcagg ctgcttacgg tttcgtccgt       180
     gttgcagccg atcatcagca ......
Could anyone please explain what the parts of this record mean? Many thanks for your time and attention.
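
For anyone reading along, the record above looks like an EMBL flat-file entry: each line starts with a two-letter code (ID, AC, DE, FT, SQ, ...), `XX` lines are spacers, the indented lines after `SQ` hold the nucleotide sequence, and a `//` line normally ends each entry. Below is a minimal sketch, using only the Python standard library, that groups the lines of each record by their code. The function name `iter_records` and the `LINE_CODES` table are made up for illustration, and the sketch assumes the full file follows the same layout as the first record shown here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List

# Meaning of the two-letter line codes that appear in the record above.
LINE_CODES = {
    "ID": "identification: accession, sequence version, topology, molecule type, data class, division, length",
    "AC": "accession number",
    "DT": "dates (created / last updated)",
    "DE": "description",
    "KW": "keywords",
    "OS": "organism species",
    "OC": "organism classification (taxonomic lineage)",
    "RN": "reference number",
    "RP": "reference positions",
    "RA": "reference authors",
    "RT": "reference title",
    "RL": "reference location (journal or submission details)",
    "DR": "database cross-reference",
    "CC": "comments",
    "FH": "feature table header",
    "FT": "feature table (genes, CDS, mature peptides, ...)",
    "SQ": "sequence header (length and base counts)",
}


def iter_records(lines: Iterable[str]) -> Iterator[Dict[str, List[str]]]:
    """Yield one dict per record, mapping each line code to its line contents.

    Sequence lines (which have no code, only leading spaces) are collected
    under the key "sequence" with spaces and the position counter removed.
    """
    record: Dict[str, List[str]] = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("//"):          # end of one record
            if record:
                yield dict(record)
            record = defaultdict(list)
            continue
        if not line.strip() or line.startswith("XX"):
            continue                       # blank line or spacer
        if line.startswith("  "):          # sequence data line
            parts = line.split()
            if parts and parts[-1].isdigit():
                parts = parts[:-1]         # drop the trailing position counter
            record["sequence"].append("".join(parts))
        else:
            record[line[:2]].append(line[5:])
    if record:                             # file ended without a final "//"
        yield dict(record)
```

Because the function reads line by line, a 2.7 GB file does not need to fit in memory. With a hypothetical file name, `with open("sequences.txt") as fh: first = next(iter_records(fh))` would return the first record, after which `first["DE"]` holds the description lines and `"".join(first["sequence"])` the genome as one string. Dedicated parsers for this format exist and are probably a better choice for real analysis; this is only meant to show how the record is laid out.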


