tss_shift.Rd
Analyze TSS shifts between samples within a consensus TSR set.
tss_shift(
  experiment,
  sample_1,
  sample_2,
  comparison_name,
  tss_threshold = NULL,
  max_distance = 100,
  min_threshold = 10,
  n_resamples = 1000L,
  fdr_cutoff = 0.05
)
Argument | Description |
---|---|
experiment | TSRexploreR object. |
sample_1 | First sample to compare. Vector with sample names for TSSs and TSRs, with names 'TSS' and 'TSR'. |
sample_2 | Second sample to compare. Vector with sample names for TSSs and TSRs, with names 'TSS' and 'TSR'. |
comparison_name | Name assigned to the results in the TSRexploreR object. |
tss_threshold | Minimum number of raw counts required at a TSS for it to be considered in the shifting analysis. |
max_distance | TSRs less than this distance apart will be merged. |
min_threshold | Minimum number of raw counts required in each TSR for both samples. |
n_resamples | Number of resamplings for permutation test. |
fdr_cutoff | Differential features not meeting this significance threshold will not be considered. |
TSRexploreR object with shifting scores added.
This function assesses the difference between the TSS distributions of two distinct samples within a set of consensus TSRs by calculating a signed version of the earth mover's distance (EMD) that we term the earth mover's score (EMS). For this approach, we imagine that the two TSS distributions in question are piles of dirt, and ask how much dirt from one pile we would need to move, how far, and in which direction, to mimic the distribution of the other sample. The resulting EMS is between -1 and 1, with larger magnitudes indicating larger shifts and the sign indicating direction (negative values indicate upstream shifts and positive values indicate downstream shifts). The positive and negative components of the EMS are also reported. Lastly, the function uses a permutation test to calculate a p-value for the null hypothesis that there is no difference between the two samples, and reports an FDR-corrected p-value calculated by the Benjamini-Hochberg procedure.
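To make the dirt-pile picture concrete, the following is a minimal sketch of one way to compute a signed score with these properties, together with a permutation p-value and Benjamini-Hochberg adjustment. It is not taken from the package source: the helper names `ems_sketch` and `perm_pvalue`, the normalization by the number of positions, and the read-label shuffling scheme are all assumptions made for illustration, and the internal calculation may differ.

```r
# Hypothetical helpers for illustration only; not the package's internal code.
# x and y are count vectors over the same positions, ordered upstream to downstream.
ems_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) > 1, sum(x) > 0, sum(y) > 0)
  # Difference of cumulative distributions: positive where sample 1 has already
  # accumulated more mass, meaning sample 2's mass sits further downstream.
  d <- cumsum(x / sum(x)) - cumsum(y / sum(y))
  # Assumed normalization: dividing by the number of steps bounds scores to [-1, 1].
  span <- length(x) - 1
  pos <- sum(pmax(d, 0)) / span  # downstream component (>= 0)
  neg <- sum(pmin(d, 0)) / span  # upstream component (<= 0)
  c(ems = pos + neg, positive = pos, negative = neg)
}

# Assumed permutation scheme: pool individual reads, shuffle sample labels,
# and compare the absolute score of each shuffle with the observed one.
perm_pvalue <- function(x, y, n_resamples = 1000L) {
  reads <- c(rep(seq_along(x), x), rep(seq_along(y), y))
  n_x <- sum(x)
  observed <- abs(ems_sketch(x, y)[["ems"]])
  perm <- replicate(n_resamples, {
    idx <- sample(length(reads), n_x)
    abs(ems_sketch(tabulate(reads[idx], length(x)),
                   tabulate(reads[-idx], length(x)))[["ems"]])
  })
  (1 + sum(perm >= observed)) / (1 + n_resamples)
}

set.seed(1)
control   <- c(10, 5, 1, 0, 0)  # toy TSR with signal at the upstream end
treatment <- c(0, 0, 1, 5, 10)  # same TSR with signal at the downstream end

shifted <- ems_sketch(control, treatment)    # positive: downstream shift
flipped <- ems_sketch(treatment, control)    # same magnitude, negative sign
p_shift <- perm_pvalue(control, treatment)
p_same  <- perm_pvalue(control, control)     # identical samples, no shift
adj     <- p.adjust(c(p_shift, p_same), method = "BH")
```

Swapping the two samples flips the sign of the score but not its magnitude, which is why the control should be passed first when the direction of the shift matters.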
This function also calculates unsigned EMD, which indicates how much TSS "mass" has been moved between two distributions without regard to direction. This is useful in the detection of "balanced" shifts, wherein approximately equal mass is moved upstream and downstream. Examples of this are TSR splitting or merging and a change in TSR shape (e.g., peaked to broad). These will generally be marked by a low EMS, balanced positive and negative scores, and a high EMD. As for EMS, raw and FDR-corrected p-values are reported for EMD.
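A balanced shift can be illustrated with a toy TSR that splits into two peaks. The sketch below uses a hypothetical helper, `emd_components`, with the same assumed normalization by the number of positions; it is an illustration of the idea rather than the package's implementation.

```r
# Hypothetical helper for illustration only; not the package's internal code.
emd_components <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) > 1, sum(x) > 0, sum(y) > 0)
  # Positive entries: mass that must move downstream; negative: upstream.
  d <- cumsum(x / sum(x)) - cumsum(y / sum(y))
  span <- length(x) - 1
  pos <- sum(pmax(d, 0)) / span
  neg <- sum(pmin(d, 0)) / span
  # The signed score lets the two directions cancel; the unsigned distance does not.
  c(ems = pos + neg, positive = pos, negative = neg, emd = pos - neg)
}

peaked <- c(0, 0, 20, 0, 0)   # control: a single sharp peak
split  <- c(10, 0, 0, 0, 10)  # treatment: the peak splits to both flanks

balanced <- emd_components(peaked, split)
print(balanced)
```

In this toy case half of the mass moves upstream and half moves downstream, so the positive and negative components cancel in the signed score while the unsigned distance remains large.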
'sample_1' and 'sample_2' identify the two samples to compare; each should be a named vector giving the sample names to use for TSSs and TSRs, with names 'TSS' and 'TSR'. 'sample_1' should be the control and 'sample_2' the treatment sample. The results will be stored back in the TSRexploreR object with the name given by 'comparison_name'. 'tss_threshold' applies a global threshold to remove TSSs below a certain score, and 'min_threshold' is the minimal score that both TSRs must have to be considered. 'max_distance' is the maximum distance between two TSRs to be considered for shifting.
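The interplay of 'tss_threshold' and 'min_threshold' can be sketched on a toy table. The column names ('tsr', 'sample', 'score'), the order of the two filters, and the use of inclusive comparisons are assumptions made for illustration; the function's internal filtering may differ.

```r
# Toy per-TSS scores for two consensus TSRs in two samples (hypothetical layout).
tss <- data.frame(
  tsr    = rep(c("tsr_1", "tsr_2"), each = 4),
  sample = rep(c("control", "control", "treatment", "treatment"), times = 2),
  score  = c(12, 2, 9, 4, 8, 1, 15, 3)
)

tss_threshold <- 3   # drop individual TSSs scoring below this value
min_threshold <- 10  # each TSR must reach this total in BOTH samples

# Step 1: global per-TSS filter.
kept <- tss[tss$score >= tss_threshold, ]

# Step 2: total the remaining score per TSR and sample.
totals <- tapply(kept$score, list(kept$tsr, kept$sample), sum)
totals[is.na(totals)] <- 0  # a TSR with no surviving TSSs in a sample totals zero

# Step 3: keep only TSRs that pass min_threshold in every sample.
eligible <- rownames(totals)[apply(totals >= min_threshold, 1, all)]
print(eligible)
```

Here 'tsr_2' is excluded because its control total falls below 'min_threshold' once the low-scoring TSS is removed, even though its treatment total is high.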
data(TSSs)
assembly <- system.file("extdata", "S288C_Assembly.fasta", package = "TSRexploreR")
samples <- data.frame(
  sample_name = c(sprintf("S288C_D_%s", seq_len(3)), sprintf("S288C_WT_%s", seq_len(3))),
  file_1 = rep(NA, 6),
  file_2 = rep(NA, 6),
  condition = c(rep("Diamide", 3), rep("Untreated", 3))
)
exp <- TSSs %>%
  tsr_explorer(sample_sheet = samples, genome_assembly = assembly) %>%
  format_counts(data_type = "tss") %>%
  tss_clustering(threshold = 3) %>%
  merge_samples(data_type = "tss", merge_group = "condition") %>%
  merge_samples(data_type = "tsr", merge_group = "condition")
#> Warning: Arguments in '...' ignored
exp <- tss_shift(
  exp,
  sample_1 = c(TSS = "S288C_WT_1", TSR = "S288C_WT_1"),
  sample_2 = c(TSS = "S288C_D_1", TSR = "S288C_D_1"),
  comparison_name = "Untreated_vs_Diamide",
  max_distance = 100,
  min_threshold = 10,
  n_resamples = 1000L
)
#> Warning: Arguments in '...' ignored
#> Warning: Some sequences have fewer than nthresh scores for at least one sample.
#> These are ignored and returned as NA.