TSS Clustering — tss_clustering • TSRexploreR

Basic distance- and threshold-based clustering of transcription start sites (TSSs).

tss_clustering(
  experiment,
  samples = "all",
  threshold = NULL,
  n_samples = NULL,
  max_distance = 25,
  max_width = NULL,
  singlet_threshold = NULL
)

Arguments

experiment	TSRexploreR object.
samples	A vector of sample names to analyze.
threshold	TSSs or TSRs with a score below this value will not be considered.
n_samples	Keep a TSS only if its score is at least 'threshold' in at least 'n_samples' samples.
max_distance	Maximum allowable distance between TSSs for clustering.
max_width	Maximum allowable TSR width.
singlet_threshold	TSRs of width 1 must have a score greater than or equal to this threshold to be kept.
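
The interaction between 'threshold' and 'n_samples' can be illustrated with a small base R sketch. It is not the package's internal code: the count matrix, its row names, and the variable names are made up for illustration, and it assumes the filter is applied per TSS position across samples, as the argument descriptions suggest.

```r
# Hypothetical count matrix: rows are TSS positions, columns are samples.
# All values and names here are illustrative, not package data.
counts <- matrix(
  c(5, 0, 4,
    1, 2, 0,
    3, 3, 3,
    0, 9, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("chr1:100:+", "chr1:105:+", "chr1:120:+", "chr1:300:+"),
    c("s1", "s2", "s3")
  )
)

threshold <- 3
n_samples <- 2

# Count, for each TSS, the samples in which its score reaches the threshold,
# then keep the TSS only if that count is at least n_samples.
keep <- rowSums(counts >= threshold) >= n_samples

counts[keep, , drop = FALSE]
```

Here the TSSs at positions 100 and 120 are kept, because each reaches a score of 3 in at least two samples. The TSS at position 300 is dropped even though one sample has a high score.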

Value

TSRexploreR object with TSRs added to GRanges and data.table counts.

Details

Genes rarely have a single TSS; more often they have a cluster of TSSs. This function clusters TSSs into transcription start regions (TSRs). A TSS is considered for clustering only if its score is greater than or equal to 'threshold' in at least 'n_samples' samples. Qualifying TSSs that are less than or equal to 'max_distance' from each other are clustered into one TSR, and a TSR cannot encompass more than 'max_width' bases. TSRs of width 1 (singlets) can additionally be held to a global score cutoff using 'singlet_threshold'.
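
The clustering idea described above can be sketched in base R for a single sample on one chromosome and strand. This is a minimal illustration under stated assumptions, not the implementation used by tss_clustering: the helper name cluster_tss_sketch is hypothetical, "distance" is taken to mean the gap between adjacent surviving TSSs, the TSR score is taken to be the sum of its TSS scores, and 'max_width' and 'n_samples' are left out.

```r
# Hypothetical helper, not part of TSRexploreR.
# pos:   integer TSS positions on one chromosome and strand
# score: TSS scores, in the same order as pos
cluster_tss_sketch <- function(pos, score, threshold = 3, max_distance = 25,
                               singlet_threshold = NULL) {
  # Drop TSSs below the score threshold, then sort by position.
  keep <- score >= threshold
  pos <- pos[keep]
  score <- score[keep]
  ord <- order(pos)
  pos <- pos[ord]
  score <- score[ord]

  if (length(pos) == 0) {
    return(data.frame(start = numeric(0), end = numeric(0),
                      width = numeric(0), score = numeric(0)))
  }

  # Assumption: a new cluster starts whenever the gap to the previous
  # surviving TSS exceeds max_distance.
  cluster_id <- cumsum(c(TRUE, diff(pos) > max_distance))

  start <- as.vector(tapply(pos, cluster_id, min))
  end <- as.vector(tapply(pos, cluster_id, max))
  # Assumption: the TSR score is the sum of its member TSS scores.
  total <- as.vector(tapply(score, cluster_id, sum))

  tsr <- data.frame(start = start, end = end,
                    width = end - start + 1, score = total)

  # Width-1 TSRs must reach singlet_threshold to be kept.
  if (!is.null(singlet_threshold)) {
    tsr <- tsr[tsr$width > 1 | tsr$score >= singlet_threshold, , drop = FALSE]
    rownames(tsr) <- NULL
  }
  tsr
}

tss_pos <- c(100, 105, 120, 300, 310, 900)
tss_score <- c(10, 2, 7, 5, 4, 3)
tsrs <- cluster_tss_sketch(tss_pos, tss_score, threshold = 3,
                           max_distance = 25, singlet_threshold = 5)
tsrs
```

With these made-up inputs, the TSS at position 105 is removed by the threshold, positions 100 and 120 form one TSR, positions 300 and 310 form another, and the isolated TSS at 900 is discarded because its score of 3 is below the singlet threshold of 5.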

Examples

library(TSRexploreR)

data(TSSs_reduced)

exp <- TSSs_reduced %>%
  tsr_explorer %>%
  format_counts(data_type="tss")
exp <- tss_clustering(exp, threshold=3)
#> Warning: Arguments in '...' ignored
#> Warning: Arguments in '...' ignored