2 hours ago by evirolve

Hi folks, I'm struggling to find guidance for my specific issue and hoping someone can give some pointers toward a first step in either finding a solution or confirming there isn't one.

I've just gotten RNA-Seq Illumina data back from libraries I prepped myself with a stranded prep kit, but when I map the reads, every contig gets roughly 50/50 plus-strand and minus-strand reads. This should not be the case for most transcripts, so something went wrong at library prep, post-sequencing processing, or mapping. I'm not sure how to assess where the problem is or whether it's recoverable.

Here are some relevant details:

Library prep kit: NEBNext Ultra II Directional RNA Library Prep Kit, which uses the dUTP strand-marking strategy

RNA source: Total RNA, double rRNA depleted (euk/prok) prior to library prep

Sequencing: Illumina HiSeq, PE 150 bp reads, 8 libraries multiplexed in one lane (used NEBNext multiplex oligos set 1). The data demultiplexed fine. I've got de novo Trinity assemblies for 6 of the libraries so far, and they look great.

Mapping: I'm using the raw reads provided by my sequencing facility; I haven't done any formatting or converting that might drop information. Mapping inputs are paired fq.gz files. The software I'm using is BBMap 38.35. I did a control mapping run with the same parameters on an older stranded dataset in fq.gz format, and that one works as expected, so I'm pretty sure BBMap is not the issue.

Mapping command:

bbmap.sh in1=BW1_1_R1.fq.gz in2=BW1_1_R2.fq.gz ref=ref_contigs.fasta covstats=covstats_BW1_1.txt outm=BW1_1_mapped.fq
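One way to look at the plus/minus split more closely is to tally strands separately for each mate. In a properly aligned pair the two mates land on opposite strands, so a count that pools both mates sits near 50/50 whatever the library type. The sketch below is only illustrative and assumes the alignments have also been written as SAM, for example by adding BBMap's `out=` option with a hypothetical file name such as `BW1_1.sam`. The function name `tally_strands` is made up for this example.

```python
from collections import Counter


def tally_strands(sam_lines):
    """Count mapped primary alignments by (contig, mate, strand).

    Assumes standard tab-separated SAM records; the mate and the strand
    are read from the FLAG field (column 2).
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete SAM record
            continue
        flag = int(fields[1])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800).
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        contig = fields[2]
        mate = "R2" if flag & 0x80 else "R1"   # 0x80 = second in pair
        strand = "-" if flag & 0x10 else "+"   # 0x10 = reverse strand
        counts[(contig, mate, strand)] += 1
    return counts


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for key, n in sorted(tally_strands(handle).items()):
                print(*key, n, sep="\t")
```

If read 1 falls mostly on one strand of a given contig and read 2 mostly on the other, the strand information is still present in the data. If read 1 on its own is split evenly on well-covered contigs, the problem more likely lies upstream of mapping.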

Is there a simple check I can do just looking at my fastq text files? If my service facility trimmed adaptors, could that result in loss of strandedness information? Any thoughts are much appreciated. Thank you!
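As a rough sketch of a check that needs only the fastq files, the script below counts how many reads still contain the start of the adaptor sequence, which would indicate whether the facility trimmed adaptors. It assumes four-line fastq records and uses `AGATCGGAAGAGC`, the shared prefix of the NEBNext/TruSeq adaptor read-through on both reads. The function names and the 100,000-read cap are arbitrary choices for illustration.

```python
import gzip

# Shared start of the adaptor read-through for read 1 and read 2
# (assumption: NEBNext/TruSeq-style adaptors).
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def open_fastq(path):
    """Open a plain or gzip-compressed fastq file as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def adapter_fraction(lines, adapter=ADAPTER_PREFIX, max_reads=100000):
    """Return (reads containing the adaptor prefix, reads examined).

    Only the sequence line of each four-line record is inspected.
    Matching is exact, so reads with sequencing errors inside the
    adaptor are missed and the count is a lower bound.
    """
    total = hits = 0
    for i, line in enumerate(lines):
        if i % 4 != 1:                    # sequence is the 2nd line of 4
            continue
        total += 1
        if adapter in line:
            hits += 1
        if total >= max_reads:
            break
    return hits, total


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open_fastq(path) as handle:
            hits, total = adapter_fraction(handle)
        print(path, hits, total, sep="\t")
```

A count near zero suggests the reads were already trimmed. A noticeable fraction of hits suggests they are untrimmed and that some inserts are shorter than the read length.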

Below is a preview of my fastq data.

First read set:

head -n 20 BW1_1_R1.fq
@E00489:538:H75T7CCX2:1:1101:13068:1415 1:N:0:NTCACG
NATCTGTGTTACGTCATTAACTTGGGCGCTGTAACATAGCGTTTGGTTCATCCCACAGCACTAGTTCTGCAGATAGGAAGAGCACACGTCTGAACGCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTGAAAAATATTTTTCTAA
+
#AAAFJJF7JJFJFFAFJJJ-FJ<FJJJJ7-7FJ<FJFAJJF-7<FFJF<JAFF7F<AJAJ<F<--<7F<FF---<A<FJAJFJ<-A<FJAA-<F-AJ<A<7<JJA7A<<<-7-7<-AJJ7AFFJF<JAFJAAAF7-7---77AAJ----
@E00489:538:H75T7CCX2:1:1101:8532:1432 1:N:0:NTCACG
NTTCAACGGGTTTGTATCCCTAGCCAGTTTCCCGTTCTCTTTGGCTCATTATTTAGTGTGCTTTTCATCTTGCTATCACGGTCCTTGTTCGCTTTTGGTCTCATGGCGATATTTAGACTTAGATGGTGTTTTTTTCGCACTTTGTGCCCC
+
#-AAFJ<JJ<7-AF7JFJ<F7FJA<<77<-<<FA-777--<<-<--7--7<-F-<-<F-<-7AJ-77---F-7--<--A7-A-7AAAJ7<<7A77--<--7--<AJ-A-A<-<<7A-<A<---777-<F-A-7A----------7---7-
@E00489:538:H75T7CCX2:1:1101:18923:1538 1:N:0:NTCACG
NCGCATATCTAAAGTTATGTTTTTTAAGCTTATAGGTTTATAATACCCATTAACTTGCAAGCAGGATAGACTTCTTGGTCCGTGTTTCTAGTCGGGTCCCGAGGGTATCTCGGTTGTATCATTGTTGATCAAATAATTATTAACAATCGA
+
#AAFFJJAFJJAFJFJFJAJAJJJJJJJJFFJJ-FF7-A<--7FFFJ7A<-<-F<<-<-<7-F-FAJFF--FAFJAJFJAJJJJFJF7-A7--<F<FFFJJ7-7FJ7F-AJA-A<JJFJ-<7FAJJA7JFJ-77FAAF-AF--7AFFF77
@E00489:538:H75T7CCX2:1:1101:16853:1678 1:N:0:NTCACG
NCGGGTTTGAATCCTTAGACTGTCTCACGCATTATTTGCCTCTTTATTCAGGGTGCCTTTCTCCTTTCCCTTCATGTACTTGTTCGTTCTTGGACTTATGGCGATATTTAGTCTTAGAGGTAGAGATCGGATGAGCACATGTTTGACCGC
+
#-AAFFAJFJJFFAFFJJJJFJJJJJJJJ77-<-F-<F-FFFA-7-FAJ-<-FJJ7--FF7--<--F--<7----<<AJJJ-7FAJ-<-<-<---A-77AJF<AA-F<<FJA<AA-7-7---A7AFFAJFJ-7-7--777---7-A----
@E00489:538:H75T7CCX2:1:1101:14367:1696 1:N:0:NTCACG
NGGCGCCTTAACATAGCGTTTGGTTCATCCTACAGCACCAGTTCTGCTTACCAAAACTTGGCCCTTTATGCACATATATATCTTTTACTAAATGATAATAATAAAATTTTCTTTATAACATTAACATCATTAAAGCATGTTATGTAACGT
+
#AAAFJFJJAJFJJ-FAFJJFJJJJJJJJF7JJJ--FFAJ<-A-F7JJF-<-F7--FFJ<F-7---<<-<<F<-FFJJJJJ-JJJFJ<AF<7-<F-AJFFJJAAJJJJJJJAFFJJJ----AFJF-F<FFJ77-7-AAFJA-A7A7--7F

and second read set:

head -n 20 BW1_1_R2.fq
@E00489:538:H75T7CCX2:1:1101:13068:1415 2:N:0:NTCACG
NCAGAACTGGTGCTGTGGGATGAACCAAACGCTATGTTAAGGCGCCCAAGTTAATGACGTAACACAGATAAGATCGGAAGAGCGTTGTGTAGGGAAAGAGTGTAGATCTTGGTGGTCGCCGTATCATTAATAAAATTTTTTTTAAGCAGA
+
#AAAFJAAFJJJJJJFJJJFJJJJFJJAJJF<F<JJFJJFJJJJF777AFJF<JJJJFJJFFJJFJJ7JFJJA<-<AA-FJJ<FA-7FF-FJFJJJ<A7AJJJJJJF7F-<A7A7AFJ--7A--A-A<F<-77FA-<<FFFJ---7--7A
@E00489:538:H75T7CCX2:1:1101:8532:1432 2:N:0:NTCACG
NATGCTTGATAATGCAGCTTATTGTGGGTGGTAAACTCCATTTAAGGCTTAAAATTGTTATGAGACGGATAGCGAACTTATACCGTGAGGGAAAGTTGAAAAGTATTATGAATAGAGAGGTAAGTAGTTTGTGAAATTGTCTATGGTTTC
+
#--A-AFFJJJ<-A7---A-F--77FF<-AAFFA<-------<F<-<-7-------7<----FJ--7-7-<F---7--7--<-7F-A77-F--7A7-A--<-A-7---7-7-----7-7-7AF-7-A--7-A-FJ-7-7----7-A-F--
@E00489:538:H75T7CCX2:1:1101:18923:1538 2:N:0:NTCACG
NAATTTCTAATGTTTGTATAAATGACTGATGAATATTTTCAAAATAATTATATTGCATTATTCATAATGCACAATTTCGATTGTTAATAATTATTTGATCAACAATGATACAATTGAGATACCTTCGGGACCCGTCTTGAAACACGGACC
+
#AAAFJJJJJJJJJJJJJJJJFJJJJJJJJJFJJJJJJJJJJJJJJF-JJFJFJJFF7FJJJJJJJJJJJJFAAJAFJJFJJJJJJJJAFJJ-FJJJJFJJJJJJJJFJAJJJJJJJJJJJFFJJJJFJJ-AFFJJJFJJJJJFAAJJ)<
@E00489:538:H75T7CCX2:1:1101:16853:1678 2:N:0:NTCACG
NTCCTTCTAAGGCTAAATATCGCCATGAGACCGATAGCGAACAAGTACCGTGAGGGAAAGTTGAAAAGCACTCTGAATAGAGAGTCAAATAGTACGTGAAACTGTCTAGGGATTCAAACCCGCAGATCGGAAGAGCGTCGTGTAGGGAAA
+
#-AAFJFFJJJJJFFJFJJJJJJJJJJJJJJAA7FJJFJ-FAFJJAF<<FFF-JJJFFFJ-F<FFJJJJJJFFAF<JJJJJJ<JJAAJFJJJ<FFJJJ-FFA-AFJJJJFJF<AAFJJJFJJ<<J<<AFF-AFJJJFJJJJJJJJJJF7<
@E00489:538:H75T7CCX2:1:1101:14367:1696 2:N:0:NTCACG
NTATTCTCAAACTATAAATGGGTACGTTACATAACATGCTTTAATGATGTTATTGTTATAATGAAAATTTTATTATTTTCATTTAGTAAAAGATATATATGTGCTTAGTGGGCCAAGTTTTGGTAAGCAGAACTGGTGCTGTGGGATGTA
+
#AAAFJJFJJJJJJFJJJJJJJJJFFJJJJJJJ<JJJJJAJJJFFFFFJJJJAJJJJJJJJJJJ<JJ<JJJJJJJFJJJJJJJJJJJJAJJJFJJJJJFFJJA<JJJJJ<A<-7-JJJFJJJJFF7<FF7-JFFJJJ<<AJJA<7<FA-A


