Analysis of -1 and +1 dinucleotide frequencies.

plot_dinucleotide_frequencies(
  experiment,
  genome_assembly = NULL,
  samples = "all",
  threshold = NULL,
  use_normalized = FALSE,
  dominant = FALSE,
  data_conditions = NULL,
  ncol = 3,
  return_table = FALSE,
  ...
)

Arguments

experiment

TSRexploreR object.

genome_assembly

Genome assembly in FASTA or BSgenome format.

samples

A vector of sample names to analyze.

threshold

TSSs or TSRs with a score below this value will not be considered.

use_normalized

Whether to use the normalized (TRUE) or raw (FALSE) counts.

dominant

If TRUE, only the highest-scoring TSS per gene, transcript, or TSR (or the highest-scoring TSR per gene or transcript) is considered.

data_conditions

Apply advanced conditions to the data.

ncol

Integer specifying the number of columns to arrange multiple plots.

return_table

Return a table of results instead of a plot.

...

Additional arguments passed to geom_col.

Value

A ggplot2 object containing the dinucleotide frequency plot. If 'return_table' is TRUE, a data.frame of the underlying data is returned instead.

Details

It has been shown in many organisms that particular base preferences exist at the -1 and +1 positions, where +1 is the TSS and -1 is the position immediately upstream. This plotting function returns a ggplot2 barplot of -1 and +1 dinucleotide frequencies.
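As a rough illustration of what is being counted, the base R sketch below extracts the -1/+1 dinucleotide around each TSS in a toy sequence and tabulates the frequencies. It is not the package's implementation: the sequence, the TSS table, and the helper name 'dinucleotide_at' are all hypothetical, and the minus-strand handling (reverse complement of the TSS and the base after it) is an assumption about the usual convention.

```r
# Toy sequence and TSS positions (1-based); all values are made up for illustration.
genome_seq <- "ACGTACATGCA"
tss <- data.frame(
  pos    = c(2, 6, 8, 10),
  strand = c("+", "+", "-", "+"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the -1/+1 dinucleotide for a single TSS.
dinucleotide_at <- function(seq, pos, strand) {
  if (strand == "+") {
    # -1 is the base immediately upstream (pos - 1); +1 is the TSS itself.
    substr(seq, pos - 1, pos)
  } else {
    # On the minus strand, upstream lies at pos + 1, so take the pair
    # (pos, pos + 1) and reverse-complement it.
    pair <- substr(seq, pos, pos + 1)
    comp <- chartr("ACGT", "TGCA", pair)
    paste(rev(strsplit(comp, "")[[1]]), collapse = "")
  }
}

tss$dinucleotide <- mapply(
  dinucleotide_at,
  pos = tss$pos,
  strand = tss$strand,
  MoreArgs = list(seq = genome_seq),
  USE.NAMES = FALSE
)

# Relative frequency of each observed dinucleotide.
freqs <- prop.table(table(tss$dinucleotide))
```

A table like 'freqs' is the kind of summary that a barplot of dinucleotide frequencies would display, one bar per dinucleotide.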

'genome_assembly' must be a valid genome assembly in either FASTA or BSgenome format. FASTA-formatted genome assemblies should have the file extension '.fasta' or '.fa'. BSgenome assemblies are precompiled Bioconductor packages for common organisms.

Several arguments control how the data are structured before plotting. 'threshold' defines the minimum number of raw counts a TSS or TSR must have to be considered. 'dominant' specifies whether only the dominant TSS, as marked by the 'mark_dominant' function, should be considered. For TSSs, this can be the dominant TSS per TSR or per gene. 'data_conditions' allows for advanced filtering, ordering, and grouping of the data.
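To make the effect of 'threshold' and 'dominant' concrete, here is a minimal base R sketch of the two filters applied to a made-up TSS table. The column names ('gene', 'pos', 'score') and the order of operations (threshold first, then dominant) are assumptions for illustration only and do not reflect the package's internal data structures.

```r
# Hypothetical TSS table; column names are assumptions, not the package's schema.
tss <- data.frame(
  gene  = c("g1", "g1", "g2", "g2", "g3"),
  pos   = c(100, 105, 300, 310, 500),
  score = c(12, 3, 7, 20, 2),
  stringsAsFactors = FALSE
)

threshold <- 3

# Drop TSSs with a score below the threshold.
kept <- tss[tss$score >= threshold, ]

# Within each gene, keep only the highest-scoring remaining TSS.
is_dominant <- kept$score == ave(kept$score, kept$gene, FUN = max)
dominant <- kept[is_dominant, ]
```

In this toy table, gene 'g3' drops out entirely because its only TSS falls below the threshold, and the other two genes each contribute one TSS.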

Examples

data(TSSs_reduced)
assembly <- system.file("extdata", "S288C_Assembly.fasta", package="TSRexploreR")
exp <- TSSs_reduced %>%
  tsr_explorer(genome_assembly=assembly) %>%
  format_counts(data_type="tss")
p <- plot_dinucleotide_frequencies(exp)