I have multiple files that originally contained many sequences and their headers. I have managed to remove the sequences, so each file now contains only header lines.

Each file now looks like this:

Ncov - date_date_ - xxx| info I want |xxx - blah

Ncov - date_date_- xxx| info I want |xxx - blah

I would then like to extract the "info I want" field (the text between the two | characters) from every header and save those values as a single column in a new file, one file per input. That way I can compare the files for % similarity while using much less memory. However, I can't get my program to do this correctly or efficiently.

How would I write the code for this extraction?
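For reference, here is a rough sketch in Python of the kind of extraction I mean. It assumes the wanted field is always the text between the first and second | on a line, and that lines without two | characters can be skipped. The function names (extract_field, iter_fields, extract_file) are just placeholders I made up for illustration.

```python
from pathlib import Path
from typing import Iterable, Iterator, Optional


def extract_field(header: str, sep: str = "|") -> Optional[str]:
    """Return the text between the first and second separator, or None."""
    parts = header.split(sep)
    if len(parts) < 3:  # fewer than two separators: nothing to extract
        return None
    return parts[1].strip()


def iter_fields(lines: Iterable[str]) -> Iterator[str]:
    """Yield the extracted field for each line that has one."""
    for line in lines:
        field = extract_field(line)
        if field:  # skips blank lines and headers without the field
            yield field


def extract_file(src: Path, dst: Path) -> int:
    """Write one extracted field per line to dst; return how many were written."""
    count = 0
    # Reading line by line keeps memory use low even for large files.
    with src.open() as fin, dst.open("w") as fout:
        for field in iter_fields(fin):
            fout.write(field + "\n")
            count += 1
    return count
```

Calling extract_file(Path("input.txt"), Path("output.txt")) on each file (both file names are hypothetical) would give one single-column file per input, which could then be compared for similarity.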

