Hello. I've been trying to use awk to solve the following problem for some time with no luck, and I was hoping someone here could give me some clues as to what I'm doing wrong.

I have a file that looks like this:

INPUT:

>chr8:76290516-76290880
   578  T
   579  G
   580  A
>chr14:22131464-22132025
   468  T
   469  G
   470  A
>chr12:33695439-33695441
   468  T
   469  G
   470  A

Each record in the file has a header line that starts with >. I would like to add a new column that is essentially a line counter: it starts on the first line after the header and begins counting from the number that follows the : in the header. Each record should be counted independently, so the counter restarts at every header.

I have tried to do this in steps: first adding plain counts as a third column, and then offsetting those counts by the value taken from the header. I have had no luck even with the first step, getting the counts into the third column, using the command below:

awk '{FS = "/n"}{RS = ">"}{if(!/^>/){print $1, $2, NF, $3 }}' input.txt

DESIRED OUTPUT:

>chr8:76290516-76290518
   578  T   76290516
   579  G   76290517
   580  A   76290518
>chr14:22131464-22131466
   468  T   22131464
   469  G   22131465
   470  A   22131466
>chr12:33695439-33695441
   321  T   33695439
   322  G   33695440
   333  A   33695441
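
To make the intended logic concrete, here is a rough sketch of the counting rule described above. It assumes the start coordinate is always the number between the : and the - in the header, and that the sequence name itself contains neither character. The function name add_coords is just something made up for this example. The sketch prints each header unchanged; rewriting the end coordinate in the header, as in the desired output above, would need the lines of a record to be buffered first and is not attempted here.

```shell
# Hypothetical helper: append a running coordinate to every data line.
# Assumes headers look like ">name:start-end" with no ':' or '-' in name.
add_coords() {
    awk '
    /^>/ {
        # Header: split on ":" or "-" and keep the start coordinate.
        split($0, parts, /[:-]/)
        pos = parts[2]
        print            # header is passed through unchanged
        next
    }
    {
        # Data line: append the current coordinate, then advance it.
        printf "%s   %d\n", $0, pos++
    }
    ' "$@"
}
```

It would be called as add_coords input.txt, or with the data piped in on standard input. The counter is reset every time a new header is seen, which is what treats each record independently.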

Any pointers would be very much appreciated. Thank you so much!!

EDIT: As Mensur Dlakic pointed out, I had not originally included my failed attempt at solving this problem, so I have edited the post to add it. Thank you.


