Creates a signature representing the association between gene expression (or other molecular profile) and drug dose response, for use in drug sensitivity analysis. — drugSensitivitySig,PharmacoSet-method • PharmacoGx

Given a PharmacoSet of the sensitivity experiment type and a list of drugs, the function computes a signature for the effect of gene expression (or another molecular profile) on the drug sensitivity of a cell line. It returns the estimated coefficient, the t-statistic, the p-value, and the false discovery rate associated with that coefficient in a three-dimensional array, with genes in the first dimension, drugs in the second, and the selected return values in the third.

Usage

# S4 method for PharmacoSet
drugSensitivitySig(
  object,
  mDataType,
  drugs,
  features,
  cells,
  tissues,
  sensitivity.measure = "auc_recomputed",
  molecular.summary.stat = c("mean", "median", "first", "last", "or", "and"),
  sensitivity.summary.stat = c("mean", "median", "first", "last"),
  returnValues = c("estimate", "pvalue", "fdr"),
  sensitivity.cutoff,
  standardize = c("SD", "rescale", "none"),
  molecular.cutoff = NA,
  molecular.cutoff.direction = c("less", "greater"),
  nthread = 1,
  parallel.on = c("drug", "gene"),
  modeling.method = c("anova", "pearson"),
  inference.method = c("analytic", "resampling"),
  verbose = TRUE,
  ...
)

Arguments

object: PharmacoSet a PharmacoSet of the sensitivity experiment type
mDataType: character which one of the molecular data types to use in the analysis, out of dna, rna, rnaseq, snp, cnv
drugs: character a vector of drug names for which to compute the signatures. Should match the names used in the PharmacoSet.
features: character a vector of features for which to compute the signatures. Should match the names used in the corresponding molecular data in the PharmacoSet.
cells: character allows choosing exactly which cell lines to include for the signature fitting. Should be a subset of sampleNames(pSet)
tissues: character a vector of which tissue types to include in the signature fitting. Should be a subset of sampleInfo(pSet)$tissueid
sensitivity.measure: character which measure of the drug dose sensitivity should the function use for its computations? Use the sensitivityMeasures function to find out what measures are available for each PSet.
molecular.summary.stat: character What summary statistic should be used to summarize duplicates for cell line molecular profile measurements?
sensitivity.summary.stat: character What summary statistic should be used to summarize duplicates for cell line sensitivity measurements?
returnValues: character Which of estimate, t-stat, p-value and fdr should the function return for each gene-drug pair?
sensitivity.cutoff: numeric Allows the user to binarize the sensitivity data using this threshold.
standardize: character One of "SD", "rescale", or "none", for the form of standardization of the data to use. If "SD", the data is scaled so that SD = 1. If "rescale", the data is scaled so that the 95% interquantile range lies in [0,1]. If "none", no rescaling is done.
molecular.cutoff: Allows the user to binarize the molecular data using this threshold.
molecular.cutoff.direction: character One of "less" or "greater"; sets the direction of binarization.
nthread: numeric if multiple cores are available, how many cores should the computation be parallelized over?
parallel.on: One of "gene" or "drug", chooses the level at which to parallelize computation (by gene, or by drug).
modeling.method: One of "anova" or "pearson". If "anova", nested linear models (including and excluding the molecular feature) adjusted for tissue of origin are fit after the data is standardized, and ANOVA is used to estimate significance. If "pearson", partial correlations adjusted for tissue of origin are fit to the data, and a Pearson t-test (or permutation test) is used. Note that the difference is in whether standardization is done across the whole dataset (anova) or within each tissue (pearson), as well as the test applied.
inference.method: Should "analytic" or "resampling" (permutation testing + bootstrap) inference be used to estimate significance? For permutation testing, QUICK-STOP is used to adaptively stop permutations. Resampling is currently only implemented for the "pearson" modelling method.
verbose: logical 'TRUE' if warnings and other informative messages should be displayed
...: additional arguments not currently fully supported by the function
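
The standardize and modeling.method descriptions above can be illustrated with a minimal base R sketch on simulated data. This is not the package's internal code: the helper names (scale_sd, rescale_95), the simulated variables, and the reading of "rescale" as mapping the 2.5% and 97.5% quantiles to 0 and 1 are all assumptions made for illustration.

```r
# Simulated data only; none of these objects come from PharmacoGx.
set.seed(1)
n <- 60
tissue <- factor(rep(c("lung", "breast", "skin"), each = n / 3))
expr <- rnorm(n)                         # one molecular feature
auc  <- 0.4 * expr + rnorm(n, sd = 0.5)  # simulated drug response

# "SD"-style standardization: scale so that the standard deviation is 1.
scale_sd <- function(x) x / sd(x, na.rm = TRUE)

# "rescale"-style standardization (assumed reading): map the 2.5% and
# 97.5% quantiles to 0 and 1.
rescale_95 <- function(x) {
  q <- quantile(x, probs = c(0.025, 0.975), na.rm = TRUE, names = FALSE)
  (x - q[1]) / (q[2] - q[1])
}

expr_sd <- scale_sd(expr)
auc_sd  <- scale_sd(auc)

# "anova"-style analysis: nested linear models without and with the
# molecular feature, both adjusted for tissue, compared by an F-test.
fit_null <- lm(auc_sd ~ tissue)
fit_full <- lm(auc_sd ~ tissue + expr_sd)
cmp <- anova(fit_null, fit_full)

estimate <- unname(coef(fit_full)["expr_sd"])
pvalue   <- cmp[["Pr(>F)"]][2]

# "pearson"-style analysis: partial correlation adjusted for tissue,
# computed here by correlating the tissue-adjusted residuals.
r_partial <- cor(resid(lm(auc ~ tissue)), resid(lm(expr ~ tissue)))
```

Because the two nested models differ by a single term, the F-test p-value here coincides with the t-test p-value of the expr_sd coefficient in the full model.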

Value

array a 3D array with genes in the first dimension, drugs in the second, and return values in the third.
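
As a rough illustration of the layout described above, the following base R sketch builds a mock array with the same three dimensions and shows how it can be indexed by name. The gene and drug names are hypothetical, the values are random, and computing the FDR per drug across genes is an assumption; the sketch also assumes the returned object can be indexed like a plain array.

```r
genes <- c("geneA", "geneB", "geneC")    # hypothetical feature names
drugs <- c("drug1", "drug2")             # hypothetical drug names
stats <- c("estimate", "pvalue", "fdr")  # the selected return values

set.seed(42)
sig <- array(
  NA_real_,
  dim = c(length(genes), length(drugs), length(stats)),
  dimnames = list(genes, drugs, stats)
)
sig[, , "estimate"] <- rnorm(6)
sig[, , "pvalue"]   <- runif(6)
# Assumption: FDR is computed per drug, across genes (Benjamini-Hochberg).
sig[, , "fdr"] <- apply(sig[, , "pvalue"], 2, p.adjust, method = "fdr")

# All statistics for one gene-drug pair.
pair_stats <- sig["geneA", "drug1", ]

# P-values of every gene for one drug, and the genes ranked by them.
drug2_p <- sig[, "drug2", "pvalue"]
ranked_genes <- names(sort(drug2_p))
```

Indexing by dimnames rather than by position keeps the code readable when returnValues is changed and the third dimension is reordered or shortened.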

Examples

data(GDSCsmall)
drug.sensitivity <- drugSensitivitySig(GDSCsmall,
  mDataType = "rna",
  nthread = 1, features = fNames(GDSCsmall, "rna")[1]
)
#> Summarizing rna molecular data for:	GDSC
#> Computing drug sensitivity signatures...
print(drug.sensitivity)
#> PharmacoSet Name:  GDSC 
#> Signature Type:  Sensitivity 
#> Date Created:  Tue Apr 16 19:03:58 2024 
#> Number of Drugs:  139 
#> Number of Genes/Probes:  1