Basic distance- and threshold-based clustering of TSSs.

tss_clustering(
  experiment,
  samples = "all",
  threshold = NULL,
  n_samples = NULL,
  max_distance = 25,
  max_width = NULL,
  singlet_threshold = NULL
)

Arguments

experiment

TSRexploreR object.

samples

A vector of sample names to analyze.

threshold

TSSs or TSRs with a score below this value will not be considered.

n_samples

Keep a TSS only if its score is at least 'threshold' in at least 'n_samples' samples.

max_distance

Maximum allowable distance between TSSs for clustering.

max_width

Maximum allowable TSR width.

singlet_threshold

TSRs of width 1 must have a score greater than or equal to this threshold to be kept.

Value

TSRexploreR object with TSRs added to GRanges and data.table counts.

Details

Genes rarely have a single TSS; more often they have a cluster of TSSs. This function clusters TSSs into Transcription Start Regions (TSRs). TSSs are clustered if their score is greater than or equal to 'threshold' in at least 'n_samples' samples and they are no more than 'max_distance' from each other. A cluster of TSSs cannot span more than 'max_width' bases. A global threshold for TSRs of width 1 can be applied using 'singlet_threshold'.
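The rules described above can be illustrated with a minimal base R sketch. This is not the package's implementation: keep_tss_sketch() and cluster_tss_sketch() are hypothetical helpers, the input is plain vectors for a single chromosome and strand rather than a TSRexploreR object, and the TSR score is assumed to be the sum of its TSS scores. 'max_width' is left out because this page does not say how clusters that exceed it are handled.

```r
# Hypothetical helper (not part of TSRexploreR): flag TSSs whose score is
# >= threshold in at least n_samples samples.
# score_matrix has one row per TSS and one column per sample.
keep_tss_sketch <- function(score_matrix, threshold, n_samples) {
  rowSums(score_matrix >= threshold) >= n_samples
}

# Hypothetical helper (not part of TSRexploreR): cluster TSS positions from
# one chromosome and strand by distance.
cluster_tss_sketch <- function(pos, score, threshold = NULL,
                               max_distance = 25, singlet_threshold = NULL) {
  # Drop TSSs scoring below the threshold.
  if (!is.null(threshold)) {
    keep <- score >= threshold
    pos <- pos[keep]
    score <- score[keep]
  }

  # Nothing left to cluster.
  if (length(pos) == 0) {
    return(data.frame(start = numeric(0), end = numeric(0),
                      score = numeric(0), width = numeric(0)))
  }

  # Sort by position so that neighbours are adjacent in the vector.
  ord <- order(pos)
  pos <- pos[ord]
  score <- score[ord]

  # Start a new cluster whenever the gap to the previous TSS
  # is larger than max_distance.
  cluster_id <- cumsum(c(TRUE, diff(pos) > max_distance))

  # Summarise each cluster as a TSR.
  # Assumption: the TSR score is the sum of its TSS scores.
  tsrs <- data.frame(
    start = as.vector(tapply(pos, cluster_id, min)),
    end   = as.vector(tapply(pos, cluster_id, max)),
    score = as.vector(tapply(score, cluster_id, sum))
  )
  tsrs$width <- tsrs$end - tsrs$start + 1

  # Drop width-1 TSRs (singlets) scoring below singlet_threshold.
  if (!is.null(singlet_threshold)) {
    singlet_fail <- tsrs$width == 1 & tsrs$score < singlet_threshold
    tsrs <- tsrs[!singlet_fail, ]
    rownames(tsrs) <- NULL
  }

  tsrs
}

# Toy data: six TSSs in one sample.
pos   <- c(100, 105, 120, 200, 400, 410)
score <- c(5, 2, 7, 4, 3, 9)

tsrs <- cluster_tss_sketch(pos, score, threshold = 3,
                           max_distance = 25, singlet_threshold = 5)

# Toy data: three TSSs scored in two samples.
score_matrix <- matrix(c(5, 2, 7,
                         0, 4, 3), ncol = 2)
keep <- keep_tss_sketch(score_matrix, threshold = 3, n_samples = 2)
```

In this toy input, the TSS at position 105 is removed by 'threshold', the remaining TSSs at 100 and 120 merge because they are 20 bases apart, and the lone TSS at 200 is dropped by 'singlet_threshold', leaving two TSRs (100 to 120 and 400 to 410). Only the third row of the score matrix reaches the threshold in both samples.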

Examples

data(TSSs_reduced)

exp <- TSSs_reduced %>%
  tsr_explorer %>%
  format_counts(data_type="tss")

exp <- tss_clustering(exp, threshold=3)
#> Warning: Arguments in '...' ignored
#> Warning: Arguments in '...' ignored