I seek advice on the best strategy to salvage high-quality reads from a 10X single-cell RNA-seq and ATAC-seq experiment that partially failed on the HiSeq 4000.
The issue is mainly with the scRNA-seq data.
On the flow cell we ran 5 lanes for RNA and 3 lanes for ATAC. The failure occurred because each modality requires a different run configuration due to indexing differences between them: scRNA-seq uses a single index, whereas scATAC-seq is dual-indexed, and (we now know!) 10X do not recommend mixing single- and dual-indexed samples on the same flow cell.
Hindsight aside, we ran with dual-index parameters, which resulted in the ATAC-seq data looking great, but for the RNA lanes, the quality scores for the reads across the whole top half of the flow cell were abysmal. Why? Although 10X themselves have not been able to replicate this issue in-house, it appears to be caused by a loss of focus on the upper surface of the flow cell after the i5 read. Others have mentioned the same issue elsewhere.
Our current strategy to salvage the high-quality reads is to extract the raw data from the sequencer for the good half of one of the RNA-seq lanes and create a new FASTQ file, to see whether Cell Ranger accepts it. But I'm wondering: is this the best strategy? Or is there a way to extract the reads with high quality scores from the FASTQ files that we have already generated?
If the answer is the latter, I'm unsure how to do this, considering that the forward and reverse reads for single-cell data contain different information. This may be a trivial issue.
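To make the second option concrete, here is a minimal Python sketch of what I imagine filtering the existing FASTQ files would involve. It walks R1 and R2 in lockstep so that the barcode/UMI read and the cDNA read stay paired, and keeps a pair only if it comes from the chosen flow cell surface and, optionally, if both reads pass a mean-quality threshold. It assumes Casava 1.8+ style headers (`@instrument:run:flowcell:lane:tile:x:y ...`) and that the first digit of the four-digit tile number encodes the surface (1 = top, 2 = bottom); both are assumptions to check against our own files. The function names and parameters are hypothetical, and I have not verified that Cell Ranger accepts the output.

```python
from itertools import zip_longest


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a text handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def surface_of(header):
    """Return the surface digit taken from the tile field of the header.

    Assumption: the tile is the fifth colon-separated field and its first
    digit is the surface (1 = top, 2 = bottom).
    """
    tile = header.split(" ")[0].split(":")[4]
    return tile[0]


def mean_quality(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_pairs(r1_in, r2_in, r1_out, r2_out, keep_surface="2", min_mean_q=None):
    """Write only the read pairs that pass the filters; return how many were kept."""
    kept = 0
    for rec1, rec2 in zip_longest(read_fastq(r1_in), read_fastq(r2_in)):
        if rec1 is None or rec2 is None:
            raise ValueError("R1 and R2 contain different numbers of reads")
        # The read ID (text before the first space) must match between mates.
        if rec1[0].split(" ")[0] != rec2[0].split(" ")[0]:
            raise ValueError("R1 and R2 are out of sync at " + rec1[0])
        if surface_of(rec1[0]) != keep_surface:
            continue
        if min_mean_q is not None:
            # Drop the pair if either mate falls below the threshold.
            if min(mean_quality(rec1[3]), mean_quality(rec2[3])) < min_mean_q:
                continue
        r1_out.write("\n".join(rec1) + "\n")
        r2_out.write("\n".join(rec2) + "\n")
        kept += 1
    return kept
```

The handles can be anything that reads and writes text, for example `gzip.open(path, "rt")` and `gzip.open(path, "wt")` for compressed files. Because a pair is kept or dropped as a unit, R1 and R2 stay synchronized; any index FASTQ (I1) would need the same treatment.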
Any advice on this issue would be greatly appreciated.