plot_threshold_exploration.Rd
Make a plot to explore threshold values.
plot_threshold_exploration(
  experiment,
  max_threshold = 25,
  steps = 1,
  samples = "all",
  use_normalized = FALSE,
  ncol = 1,
  point_size = 1,
  return_table = FALSE,
  ...
)
Argument | Description |
---|---|
experiment | TSRexploreR object. |
max_threshold | Thresholds from 1 to max_threshold will be explored. |
steps | Increment between successive threshold values, starting at 1 and ending at max_threshold. |
samples | A vector of sample names to analyze, or "all" to analyze every sample. |
use_normalized | Whether to use the normalized (TRUE) or raw (FALSE) counts. |
ncol | Integer specifying the number of columns to arrange multiple plots. |
point_size | The size of the points on the plot. |
return_table | Return a table of results instead of a plot. |
... | Arguments passed to geom_point. |
ggplot2 object containing the threshold exploration plot, or a table of the underlying results if return_table is TRUE.
All TSS mapping methods produce spurious TSSs. For the most part, these spurious TSSs tend to be weak and somewhat uniformly distributed throughout promoters and gene bodies. This background can therefore be mitigated by requiring a minimum read threshold for a TSS to be considered in downstream analyses.
This plotting function generates a line plot, where the x-axis is the naive read threshold and the y-axis is the proportion of TSSs within annotated gene/transcript promoters. Additionally, the point color represents the absolute number of genes with at least 1 surviving TSS after filtering. 'max_threshold' controls the maximum threshold value explored, and 'steps' is the value that is used to increment between 1 and 'max_threshold'.
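The two metrics described above can be sketched in base R on a toy table. This is only an illustration of the idea, not the package's internal implementation: the data frame `tss`, its columns (`score`, `gene`, `in_promoter`), and all values are made up for the example, and the package may define the gene count or promoter window differently.

```r
# Toy TSS table (made-up values, for illustration only).
# 'score' is the read count at a TSS, 'gene' is the associated gene,
# and 'in_promoter' flags TSSs that fall within an annotated promoter.
tss <- data.frame(
  score       = c(1, 1, 2, 2, 3, 5, 8, 12, 20, 1),
  gene        = c("A", "B", "B", "C", "A", "C", "D", "A", "D", "E"),
  in_promoter = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE)
)

# Thresholds run from 1 to max_threshold in increments of steps.
max_threshold <- 5
steps <- 2
thresholds <- seq(1, max_threshold, by = steps)

# For each threshold, keep TSSs with at least that many reads, then
# compute the promoter-proximal fraction and the number of genes
# that still have at least one surviving TSS.
explore <- do.call(rbind, lapply(thresholds, function(th) {
  kept <- tss[tss$score >= th, ]
  data.frame(
    threshold         = th,
    promoter_fraction = mean(kept$in_promoter),
    n_genes           = length(unique(kept$gene))
  )
}))

print(explore)
```

In this toy data, raising the threshold from 1 to 3 removes the weak TSSs outside promoters, so the promoter-proximal fraction rises while the number of genes with a surviving TSS falls.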
At a certain threshold there are diminishing returns, where an increase in threshold results in little increase in promoter-proximal fraction, but a precipitous loss in number of genes with a TSS. A threshold should be chosen that balances these two competing metrics. For STRIPE-seq, we have found that a threshold of 3 often provides an appropriate balance between a high promoter-proximal fraction (>= 0.8) and the number of unique genes or transcripts with at least one unique TSS.
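One possible way to turn this trade-off into a rule is to take the smallest threshold whose promoter-proximal fraction reaches a chosen cutoff, since lower thresholds retain more genes. The sketch below assumes a results table with hypothetical column names (`threshold`, `promoter_fraction`, `n_genes`) and purely illustrative numbers; it is a heuristic for discussion, not something the package does for you.

```r
# Hypothetical exploration results (illustrative numbers, not real data).
explore <- data.frame(
  threshold         = 1:6,
  promoter_fraction = c(0.55, 0.72, 0.83, 0.86, 0.87, 0.88),
  n_genes           = c(5200, 5000, 4800, 4300, 3700, 3100)
)

# Cutoff for the promoter-proximal fraction (0.8, as suggested in the text).
min_fraction <- 0.8

# Keep thresholds that meet the cutoff, then pick the smallest one,
# which retains the most genes among the passing thresholds.
passing <- explore[explore$promoter_fraction >= min_fraction, ]
chosen <- if (nrow(passing) > 0) min(passing$threshold) else NA

print(chosen)
```

Inspecting the plot remains the better guide in practice, because the point at which gene loss becomes steep depends on the dataset.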
apply_threshold
to permanently filter TSSs below a threshold value.
data(TSSs_reduced)
annotation <- system.file("extdata", "S288C_Annotation.gtf", package="TSRexploreR")
exp <- TSSs_reduced %>%
  tsr_explorer(genome_annotation=annotation) %>%
  format_counts(data_type="tss") %>%
  annotate_features(data_type="tss")
#> Warning: The "phase" metadata column contains non-NA values for features of type
#> stop_codon. This information was ignored.
p <- plot_threshold_exploration(exp)