plot_dinucleotide_frequencies.Rd
Analysis of -1 and +1 dinucleotide frequencies.
plot_dinucleotide_frequencies(
  experiment,
  genome_assembly = NULL,
  samples = "all",
  threshold = NULL,
  use_normalized = FALSE,
  dominant = FALSE,
  data_conditions = NULL,
  ncol = 3,
  return_table = FALSE,
  ...
)
Argument | Description |
---|---|
experiment | TSRexploreR object. |
genome_assembly | Genome assembly in FASTA or BSgenome format. |
samples | A vector of sample names to analyze. |
threshold | TSSs or TSRs with a score below this value will not be considered. |
use_normalized | Whether to use the normalized (TRUE) or raw (FALSE) counts. |
dominant | If TRUE, only the highest-scoring TSS per gene, transcript, or TSR (or the highest-scoring TSR per gene or transcript) is considered. |
data_conditions | Apply advanced conditions to the data. |
ncol | Integer specifying the number of columns to arrange multiple plots. |
return_table | Return a table of results instead of a plot. |
... | Arguments passed to geom_col. |
A ggplot2 object containing the dinucleotide frequency plot. If 'return_table' is TRUE, a data.frame of the underlying data is returned instead.
Many organisms show particular base preferences at the -1 and +1 positions, where +1 is the TSS and -1 is the position immediately upstream. This plotting function returns a ggplot2 barplot of -1 and +1 dinucleotide frequencies.
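To make the -1/+1 definition concrete, here is a minimal base R sketch that tallies dinucleotide frequencies from a toy sequence. The sequence, the TSS coordinates, and the variable names are made up for illustration, and only the plus strand is handled; this is not how TSRexploreR computes the frequencies internally.

```r
# Toy data (hypothetical): a short plus-strand sequence and 1-based TSS positions.
genome_seq <- "ACGTATGCATTAGCA"
tss_positions <- c(5, 9, 12)  # each value is a +1 position

# The dinucleotide spans position -1 (TSS - 1) through +1 (the TSS itself).
dinucleotides <- substring(genome_seq, tss_positions - 1, tss_positions)

# Count each dinucleotide and convert the counts to frequencies.
freqs <- prop.table(table(dinucleotides))
print(freqs)
```

In this toy case two of the three TSSs sit on a "TA" dinucleotide and one on "CA", so a barplot of `freqs` would show "TA" as the preferred -1/+1 pair.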
'genome_assembly' must be a valid genome assembly in either FASTA or BSgenome format. FASTA-formatted genome assemblies should have the file extension '.fasta' or '.fa'. BSgenome assemblies are precompiled Bioconductor libraries for common organisms.
Several arguments control how the data are structured for plotting. 'threshold' defines the minimum number of raw counts a TSS or TSR must have to be considered. 'dominant' specifies whether only the dominant TSS, as marked by the 'mark_dominant' function, should be considered. For TSSs this can be either dominant per TSR or per gene. 'data_conditions' allows for advanced filtering, ordering, and grouping of data.
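The effect of 'threshold' and 'dominant' can be sketched on a small table in base R. The data.frame, its column names (`gene`, `pos`, `score`), and the filtering code below are illustrative assumptions, not the package's internal representation or implementation.

```r
# Hypothetical TSS table: one row per TSS with its gene and raw count.
tss <- data.frame(
  gene  = c("geneA", "geneA", "geneB", "geneB", "geneC"),
  pos   = c(100, 105, 200, 210, 300),
  score = c(12, 3, 8, 20, 2)
)

# threshold: drop TSSs whose score is below the cutoff.
threshold <- 3
kept <- tss[tss$score >= threshold, ]

# dominant: keep only the highest-scoring remaining TSS within each gene.
dominant_tss <- kept[kept$score == ave(kept$score, kept$gene, FUN = max), ]
print(dominant_tss)
```

Here geneC is removed entirely by the threshold, and geneA and geneB each contribute a single TSS, so the dinucleotide tally would be based on two positions rather than five.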
data(TSSs_reduced)
assembly <- system.file("extdata", "S288C_Assembly.fasta", package="TSRexploreR")
exp <- TSSs_reduced %>%
  tsr_explorer(genome_assembly=assembly) %>%
  format_counts(data_type="tss")
p <- plot_dinucleotide_frequencies(exp)