Dear BioStars community,
I am working for the first time with 'raw' sequencing data (in the format of fastq files). The data is single end GBS data produced with two restriction enzymes.
The sequencing centre provided the data already demultiplexed, but with the barcodes still present inline at the start of each read.
Here are the first two lines from one fastq file:
@HISEQ:658:CDPMCANXX:6:1101:8843:1997 1:N:0:
NACAGCAGACAGTGCAGTTTTACCTCAGAAACCACATATGCATGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGAT
The metadata file provides a barcode9I (GACAGCAGACAGTGC) and a barcode (GACAGCAGACAG) for this individual.
First of all, what is the difference between the two?
Furthermore, how can I remove the barcodes as part of the process_radtags module, considering that the data has already been demultiplexed (i.e., one fastq file per individual)?
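
To make the first question concrete, here is a small Python sketch comparing the two barcode strings with the read shown above. It is not part of any pipeline, and the variable and function names are my own. It treats an N in the read as a wildcard, which is my assumption for handling the uncalled first base.

```python
# Values copied from the fastq record and the metadata file above.
read = ("NACAGCAGACAGTGCAGTTTTACCTCAGAAACCACATATGCATGAGATCGGAAGAGCGG"
        "TTCAGCAGGAATGCCGAGACCGAT")
barcode9i = "GACAGCAGACAGTGC"  # "barcode9I" column in the metadata
barcode = "GACAGCAGACAG"       # "barcode" column in the metadata


def matches_prefix(read_seq, candidate):
    """Return True if read_seq starts with candidate.

    An 'N' in the read is accepted as matching any base (assumption).
    """
    if len(read_seq) < len(candidate):
        return False
    return all(r == c or r == "N" for r, c in zip(read_seq, candidate))


# Bases present in barcode9I but not in barcode.
extra = barcode9i[len(barcode):]

# The five bases of the read that follow the shorter barcode.
after_barcode = read[len(barcode):len(barcode) + 5]

# The read with the shorter barcode removed by hand.
trimmed = read[len(barcode):]

print("barcode9I starts with barcode:", barcode9i.startswith(barcode))
print("extra bases in barcode9I:", extra)
print("read starts with barcode:", matches_prefix(read, barcode))
print("read starts with barcode9I:", matches_prefix(read, barcode9i))
print("bases after barcode:", after_barcode)
```

On these values, barcode9I is the barcode plus three extra bases (TGC), and the read continues with TGCAG after the 12-base barcode. I assume those bases belong to the restriction-site remnant rather than the barcode itself, but I have not been able to confirm this.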