Analyze sample similarity with correlation analysis.

plot_correlation(
  experiment,
  data_type = c("tss", "tsr", "tss_features", "tsr_features"),
  samples = "all",
  correlation_metric = "pearson",
  threshold = NULL,
  n_samples = 1,
  use_normalized = TRUE,
  font_size = 12,
  cluster_samples = FALSE,
  heatmap_colors = NULL,
  show_values = TRUE,
  return_matrix = FALSE,
  ...
)

Arguments

experiment

TSRexploreR object.

data_type

Whether to correlate TSSs ('tss'), TSRs ('tsr'), or TSS or TSR counts summarized by feature ('tss_features' or 'tsr_features').

samples

A vector of sample names to analyze, or 'all' to use every sample.

correlation_metric

Whether to use 'spearman' or 'pearson' correlation.

threshold

TSSs or TSRs with a score below this value will not be considered.

n_samples

Minimum number of samples in which a TSS or TSR must have a score at or above 'threshold' to be retained.

use_normalized

Whether to use the normalized (TRUE) or raw (FALSE) counts.

font_size

The font size for the heatmap tiles.

cluster_samples

Logical for whether hierarchical clustering should be performed on rows and columns.

heatmap_colors

Vector of colors for heatmap.

show_values

Logical for whether to show correlation values on the heatmap.

return_matrix

Logical for whether to return the correlation matrix instead of plotting the correlation heatmap.

...

Additional arguments passed to ComplexHeatmap::Heatmap.

Value

ggplot2 object of correlation heatmap, or correlation matrix if 'return_matrix' is TRUE.

Details

Correlation plots are a good way to assess sample similarity. They are useful for checking replicate concordance and for an initial assessment of differences between samples from different conditions. By default ('use_normalized' is TRUE), this function generates a correlation heatmap from a previously normalized count matrix, for example one normalized with TMM or MOR.

Pearson correlation is recommended for samples from the same technology, where the magnitudes of the values for each feature are expected to be roughly linearly related. Spearman correlation is recommended when comparing samples from different technologies, such as STRIPE-seq vs. CAGE, where the ranks of the features, rather than their specific values, are expected to be roughly linearly related.
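The difference between the two metrics, and the effect of 'threshold' and 'n_samples', can be illustrated with base R alone. The following is a minimal sketch on simulated counts, not the package's internal implementation: the toy matrix, the variable names, and the exact filtering rule (keep a feature if it scores at or above the threshold in at least 'n_samples' samples) are assumptions made for illustration.

```r
# Hypothetical toy data: 200 features x 4 samples of simulated counts.
# rep1 and rep2 share an underlying signal; other1 and other2 are independent.
set.seed(1)
base <- rnbinom(200, mu = 50, size = 2)
counts <- cbind(
  rep1   = base + rpois(200, 5),
  rep2   = base + rpois(200, 5),
  other1 = rnbinom(200, mu = 50, size = 2),
  other2 = rnbinom(200, mu = 50, size = 2)
)

# Assumed filtering rule: keep features with a score at or above
# `threshold` in at least `n_samples` samples.
threshold <- 10
n_samples <- 1
keep <- rowSums(counts >= threshold) >= n_samples
filtered <- counts[keep, , drop = FALSE]

# Sample-by-sample correlation matrices (columns are samples).
pearson  <- cor(filtered, method = "pearson")
spearman <- cor(filtered, method = "spearman")

# Spearman correlation is Pearson correlation computed on per-sample ranks.
rank_pearson <- cor(apply(filtered, 2, rank), method = "pearson")
same_as_spearman <- isTRUE(all.equal(spearman, rank_pearson))

print(round(pearson, 2))
print(round(spearman, 2))
```

Because Spearman only uses ranks, it is insensitive to technology-specific differences in scale that preserve feature ordering, which is the rationale given above for preferring it across technologies.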

See also

normalize_counts for TSS and TSR normalization.

Examples

data(TSSs)
exp <- TSSs %>%
  tsr_explorer %>%
  format_counts(data_type="tss") %>%
  normalize_counts(data_type="tss", method="CPM")
p <- plot_correlation(exp, data_type="tss")