how to extract REF and ALT DNA string around SNV
I want to reproduce the operation about Motif identification at positions of SNVs, which is the part of "The chromatin accessibility landscape of primary human cancers"
The detailed method is below:
To identify motifs whose binding affinity is predicted to be most perturbed by a given SNV, we
constructed a REF and VAR string (+/- 10 bp from the SNV, 21 bp in length) using the hg38
BSGenome reference sequence. We then used motifmatchr with CIS-BP motifs to calculate the
motif positions and motif scores (representing the significance of the motif match) using
motifmatchr “matchMotifs(positions = “out”, p.cutoff=0.01)” in both the REF and VAR string.
Then, we intersected the motif positions with the SNV in both the REF and VAR string to ensure
the motifs overlap the SNV. For each motif present, the motif’s score differential was computed if
the motif was present at the same position in both the REF and VAR. If there was no motif present
at the same position in both the REF and VAR, the differential was compared to a motif score of
For now, I just wan to know how to constructed a REF and VAR string (+/- 10 bp from the SNV, 21 bp in length). I try to use
BSgenome, but it need the
SNPlocs-class, which is hard to get for me. So I am wondering whether some one can give me some other advices.
PS: the bioconductor support link I posted about
SNPlocs-class is here
• 74 views