Mapping RNA-seq reads from a BED file to genes
I have a BED file with the columns: chrom, read_start, read_end, id, strand,
where every row represents a read from an RNA-seq data set (e.g. Roadmap).
I would like to map every row to a gene (Ensembl ID / gene name). I have an annotation file taken from Ensembl (ftp.ensembl.org/pub/current_gtf/homo_sapiens/). Is it an appropriate strategy to find the rows in the annotation file that intersect the current BED row, and then keep the rows whose feature type (the third GTF column) is "gene"?
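To make the idea concrete, here is a minimal sketch of the strategy in plain Python. It rests on several assumptions that are not confirmed above: the annotation is a standard tab-separated GTF with `gene_id` and `gene_name` attributes, the BED coordinates are 0-based half-open while the GTF coordinates are 1-based inclusive, and the BED file may use `chr1`-style names while Ensembl uses `1`. The function names `parse_gtf_genes` and `genes_for_read` are made up for illustration.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs such as: gene_id "ENSG00000000003";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_gtf_genes(lines):
    """Collect rows whose feature type (column 3) is "gene", grouped by chromosome."""
    genes = defaultdict(list)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        # Convert the 1-based inclusive GTF interval to 0-based half-open,
        # so it can be compared directly with BED coordinates.
        genes[fields[0]].append(
            (int(fields[3]) - 1, int(fields[4]), fields[6],
             attrs.get("gene_id"), attrs.get("gene_name"))
        )
    return genes


def genes_for_read(genes, chrom, start, end, strand=None):
    """Return (gene_id, gene_name) for every gene overlapping one BED row."""
    # Assumption: a leading "chr" is the only naming difference between the files.
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    hits = []
    for g_start, g_end, g_strand, gene_id, gene_name in genes.get(chrom, []):
        overlaps = start < g_end and g_start < end
        strand_ok = strand is None or strand == g_strand
        if overlaps and strand_ok:
            hits.append((gene_id, gene_name))
    return hits
```

A read can overlap several genes or none, so the result is a list rather than a single gene. Passing `strand=None` ignores strand, which may be appropriate for unstranded libraries. The linear scan over all genes on a chromosome is only meant to show the logic; for a full data set an interval index or a dedicated tool such as `bedtools intersect` would likely be much faster.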