Make a plot to explore threshold values.

plot_threshold_exploration(
  experiment,
  max_threshold = 25,
  steps = 1,
  samples = "all",
  use_normalized = FALSE,
  ncol = 1,
  point_size = 1,
  return_table = FALSE,
  ...
)

Arguments

experiment

TSRexploreR object.

max_threshold

Thresholds from 1 to max_threshold will be explored.

steps

Increment between successive threshold values, from 1 up to max_threshold.

samples

A vector of sample names to analyze.

use_normalized

Whether to use the normalized (TRUE) or raw (FALSE) counts.

ncol

Integer specifying the number of columns to arrange multiple plots.

point_size

The size of the points on the plot.

return_table

Return a table of results instead of a plot.

...

Arguments passed to geom_point.

Value

ggplot2 object containing the threshold exploration plot. If 'return_table' is TRUE, a table of the results is returned instead.

Details

All TSS mapping methods produce spurious TSSs. For the most part, these spurious TSSs tend to be weak and somewhat uniformly distributed throughout promoters and gene bodies. This background can therefore be mitigated by requiring a minimum read threshold for a TSS to be considered in downstream analyses.

This plotting function generates a line plot, where the x-axis is the naive read threshold and the y-axis is the proportion of TSSs within annotated gene/transcript promoters. Additionally, the point color represents the absolute number of genes with at least 1 surviving TSS after filtering. 'max_threshold' controls the maximum threshold value explored, and 'steps' is the value that is used to increment between 1 and 'max_threshold'.
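The two metrics described above can be illustrated with a small base R sketch. This is not the package's implementation: the toy data frame `tss`, its columns (`score`, `in_promoter`, `gene`), and the helper `explore_thresholds` are hypothetical names used only for illustration, and the sketch assumes a TSS survives when its read count is at least the threshold.

```r
# Toy TSS table (hypothetical data, not produced by TSRexploreR).
tss <- data.frame(
  score       = c(1, 1, 2, 2, 3, 5, 8, 12, 1, 4),
  in_promoter = c(FALSE, FALSE, TRUE, FALSE, TRUE,
                  TRUE, TRUE, TRUE, FALSE, TRUE),
  gene        = c("A", "B", "A", "C", "B", "A", "D", "D", "E", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each threshold, keep TSSs whose score is at
# least the threshold, then compute the promoter-proximal fraction and
# the number of genes that still have a surviving TSS.
explore_thresholds <- function(tss, max_threshold = 25, steps = 1) {
  thresholds <- seq(1, max_threshold, by = steps)
  rows <- lapply(thresholds, function(th) {
    kept <- tss[tss$score >= th, , drop = FALSE]
    data.frame(
      threshold     = th,
      n_tss         = nrow(kept),
      frac_promoter = if (nrow(kept) > 0) mean(kept$in_promoter) else NA_real_,
      n_genes       = length(unique(kept$gene))
    )
  })
  do.call(rbind, rows)
}

res <- explore_thresholds(tss, max_threshold = 5, steps = 1)
print(res)
```

In this toy example the promoter-proximal fraction rises as the threshold increases while the gene count falls, which is the trade-off the plot is meant to expose. A larger `steps` value simply evaluates fewer thresholds between 1 and `max_threshold`.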

At a certain threshold there are diminishing returns, where an increase in threshold results in little increase in promoter-proximal fraction, but a precipitous loss in number of genes with a TSS. A threshold should be chosen that balances these two competing metrics. For STRIPE-seq, we have found that a threshold of 3 often provides an appropriate balance between a high promoter-proximal fraction (>= 0.8) and the number of unique genes or transcripts with at least one unique TSS.
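One simple way to turn this trade-off into a starting point is to take the smallest threshold whose promoter-proximal fraction reaches a target value. The base R sketch below is only a heuristic for illustration, not a procedure prescribed by the package: the table `res`, its numbers, and the helper `pick_threshold` are hypothetical, and the default target of 0.8 is borrowed from the fraction mentioned above.

```r
# Hypothetical exploration results, one row per threshold
# (illustrative numbers only, not real data).
res <- data.frame(
  threshold     = 1:6,
  frac_promoter = c(0.55, 0.70, 0.82, 0.85, 0.86, 0.87),
  n_genes       = c(5200, 5000, 4800, 4100, 3500, 2900)
)

# Hypothetical helper: smallest threshold whose promoter-proximal
# fraction is at least `target`; NA if no threshold qualifies.
pick_threshold <- function(res, target = 0.8) {
  ok <- res$threshold[!is.na(res$frac_promoter) & res$frac_promoter >= target]
  if (length(ok) == 0) NA_integer_ else min(ok)
}

chosen <- pick_threshold(res)

# Per-step change in each metric, to inspect the diminishing returns:
# gain in promoter-proximal fraction versus genes lost.
gain_fraction <- diff(res$frac_promoter)
genes_lost    <- -diff(res$n_genes)
```

Choosing the smallest qualifying threshold keeps as many genes as possible once the target fraction is met. The per-step differences are worth inspecting alongside it, since a threshold just past the target may cost many genes for very little gain.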

See also

apply_threshold to permanently filter TSSs below a threshold value.

Examples

data(TSSs_reduced)
annotation <- system.file("extdata", "S288C_Annotation.gtf", package="TSRexploreR")
exp <- TSSs_reduced %>%
  tsr_explorer(genome_annotation=annotation) %>%
  format_counts(data_type="tss") %>%
  annotate_features(data_type="tss")
#> Import genomic features from the file as a GRanges object ...
#> OK
#> Prepare the 'metadata' data frame ...
#> OK
#> Make the TxDb object ...
#> Warning: The "phase" metadata column contains non-NA values for features of type
#> stop_codon. This information was ignored.
#> OK
p <- plot_threshold_exploration(exp)