I ran infer_experiment.py from the RSeQC package to determine the strandedness of an aligned BAM file, so that I can set the -s option of featureCounts correctly. I used the following command:
infer_experiment.py -r hg38.bed -i xxy2.sort.bam

The output was:

This is PairEnd Data
Fraction of reads failed to determine: 0.0020
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0906
Fraction of reads explained by "1+-,1-+,2++,2--": 0.9073

But I am confused by the result. The RSeQC documentation shows paired-end, non-strand-specific data as:

This is PairEnd Data
Fraction of reads failed to determine: 0.0172
Fraction of reads explained by "1++,1--,2+-,2-+": 0.4903
Fraction of reads explained by "1+-,1-+,2++,2--": 0.4925

and paired-end, strand-specific data as:

This is PairEnd Data
Fraction of reads failed to determine: 0.0072
Fraction of reads explained by "1++,1--,2+-,2-+": 0.9441
Fraction of reads explained by "1+-,1-+,2++,2--": 0.0487

In the paired-end, non-strand-specific case, the output is explained about equally by "1++,1--,2+-,2-+" and "1+-,1-+,2++,2--", since the two fractions are similar.

In the paired-end, strand-specific case, the output is explained by "1++,1--,2+-,2-+", since it accounts for the major fraction.

But my output is mostly explained by "1+-,1-+,2++,2--", the opposite pattern.
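
To make the comparison concrete, here is a minimal Python sketch of how I am reading these fractions. The function names (`parse_fractions`, `guess_strandedness`) and the 0.8 cutoff are my own and purely illustrative. The mapping onto the featureCounts -s values (0 = unstranded, 1 = stranded, 2 = reversely stranded) is my tentative reading of the documentation, not something infer_experiment.py reports itself.

```python
import re

# Patterns exactly as printed by infer_experiment.py for paired-end data.
FORWARD = "1++,1--,2+-,2-+"  # read 1 maps to the same strand as the gene
REVERSE = "1+-,1-+,2++,2--"  # read 1 maps to the opposite strand


def parse_fractions(report: str) -> dict:
    """Extract {pattern: fraction} from infer_experiment.py text output."""
    pairs = re.findall(r'explained by "([^"]+)":\s*([0-9.]+)', report)
    return {pattern: float(value) for pattern, value in pairs}


def guess_strandedness(report: str, cutoff: float = 0.8) -> int:
    """Return a candidate featureCounts -s value (assumed mapping).

    The cutoff is arbitrary; fractions between the extremes
    (e.g. 0.6 vs 0.4) are ambiguous and fall through to 0 here.
    """
    fractions = parse_fractions(report)
    if fractions.get(FORWARD, 0.0) >= cutoff:
        return 1  # stranded
    if fractions.get(REVERSE, 0.0) >= cutoff:
        return 2  # reversely stranded
    return 0  # unstranded (or undetermined)


my_report = '''This is PairEnd Data
Fraction of reads failed to determine: 0.0020
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0906
Fraction of reads explained by "1+-,1-+,2++,2--": 0.9073'''

print(guess_strandedness(my_report))  # prints 2
```

Under these assumptions my data would be treated as reversely stranded, but I would like confirmation that this reading is right.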

Given this, which strand-specificity value should I pass to the -s option of featureCounts?
Any help is appreciated.


