G_correction.Rd
Correct overrepresentation of 5' G bases added during reverse transcription.
G_correction(experiment, assembly = NULL)
Argument | Description |
---|---|
experiment | TSRexploreR object. |
assembly | Genome assembly in FASTA or BSgenome format. |
TSRexploreR object with G-corrected TSS GRanges.
An artifact of most TSS mapping methods is an extra G base immediately upstream of the true TSS, presumably templated by the 5' cap during reverse transcription. Soft-clipping analysis can remove such Gs when they do not match the genome; however, when an added G happens to match the genomic base at that position, it aligns normally and cannot be distinguished from a true TSS. To account for this artifact, TSRexploreR first determines the frequency of reads with a soft-clipped G in a given sample. Then, for each read with a non-soft-clipped G at its 5' end, a Bernoulli trial is performed, with that frequency used as the probability of "success" (removal of the 5' G).
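The two steps described above can be sketched in base R. This is only an illustration of the sampling idea on toy data, not TSRexploreR's internal code; the data frame `reads` and the columns `softclipped_g`, `templated_g`, and `remove_g` are hypothetical names introduced for this example.

```r
# Hypothetical illustration of the correction logic; not TSRexploreR internals.
set.seed(1)  # make the random draws reproducible

# Toy per-read flags (assumed layout): does the read start with a
# soft-clipped G, or with a G that matched the genome during alignment?
reads <- data.frame(
  softclipped_g = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
  templated_g   = c(FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)
)

# Step 1: frequency of reads with a soft-clipped G in this sample.
p_remove <- mean(reads$softclipped_g)

# Step 2: one Bernoulli trial per read with a non-soft-clipped 5' G,
# using the frequency from step 1 as the probability of "success".
n_templated <- sum(reads$templated_g)
remove_g <- rep(FALSE, nrow(reads))
remove_g[reads$templated_g] <- rbinom(n_templated, size = 1, prob = p_remove) == 1

# Reads flagged TRUE are the ones whose 5' G would be removed.
reads$remove_g <- remove_g
```

Because the trials are random, which individual reads are flagged varies between runs unless a seed is set; only the expected fraction of removed Gs is determined by the soft-clipped G frequency.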
import_bams to import BAMs.
bam_file <- system.file("extdata", "S288C.bam", package="TSRexploreR")
assembly <- system.file("extdata", "S288C_Assembly.fasta", package="TSRexploreR")
samples <- data.frame(sample_name="S288C", file_1=bam_file, file_2=NA)

exp <- tsr_explorer(sample_sheet=samples, genome_assembly=assembly) %>%
  import_bams(paired=TRUE)
#> Warning: NAs introduced by coercion

exp <- G_correction(exp, assembly=assembly)