Perl script for RNA G-quadruplex


Hello researchers,

I am stuck on a Perl script for RNA G-quadruplexes. I want to find and count the total number of specific unique sequences with G-runs of 3 and loops of 7 only. I used the grep function with a regular expression on an array, with a FASTA file as input, and at the end I counted the matches of the regular expression. The Perl script runs, but the answers are not correct: the count comes out the same for every regular expression I try. Can anyone please help me with this?

Can anyone share a Perl script that gives the total count of unique/specific sequences with G3 and L7 only?

I used the most common regular expression: ([gG]{3,}\w{1,7}){3,}[gG]{3,}

I tried the simple syntax: $count = grep(/regular_expression/, @array)
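To show what I mean by that syntax, here is a minimal sketch of counting array elements that match the pattern with grep. The sequences in @seqs and the variable names are made up for illustration; they are not my data or my real script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical example sequences (made up for illustration only)
my @seqs = (
    "GGGAUGGGCAGGGUUGGG",   # four G-runs separated by short loops
    "AUGCAUGCAUGC",         # no G-runs at all
    "gggaagggaagggaaggg",   # lowercase, covered by [gG]
    "GGGAUGGGCAGGG",        # only three G-runs
);

# The pattern from this post, with the backslash in \w written out
my $g4 = qr/([gG]{3,}\w{1,7}){3,}[gG]{3,}/;

# In scalar context, grep returns the number of elements that matched
my $count = grep { /$g4/ } @seqs;

print "Sequences with a match: $count\n";   # prints 2 for the list above
```

Note that \w also matches G, so with this pattern the loops themselves may contain G, and each G-run may be longer than 3.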

Full script I used:


#To count total transcripts containing G-Quadruplexes
#Input filename

print "Please enter file name: ";

$name =<>;

chomp $name;

open(OUT, ">$name.OUTPUT") or die "Cannot open $name.OUTPUT: $!";
open(FASTA,$name) or die;

@data =<FASTA>;
$data = join('',@data); #Convert to string

@data2 = split("\n",$data); #Explode on newline into array elements
@unique = grep(!$seen{$_}++,@data2); #Extract unique elements from @data2

$unique = join('',@unique); #Convert to string
@uniqueid = split('',$unique); #Explode string back into individual array elements.

#Initialize count
$countid = 0;

foreach $id (@uniqueid) {
    if ($id eq "N") {
        $countid++;
    }
}

print "\n\nNumber of transcripts is : $countid";
print OUT "Number of unique transcripts is : $countid";







