Finds markers (differentially expressed genes) for each of the identity classes in a dataset

FindAllMarkers(
  object,
  assay = NULL,
  features = NULL,
  logfc.threshold = 0.25,
  test.use = "wilcox",
  slot = "data",
  min.pct = 0.1,
  min.diff.pct = -Inf,
  node = NULL,
  verbose = TRUE,
  only.pos = FALSE,
  max.cells.per.ident = Inf,
  random.seed = 1,
  latent.vars = NULL,
  min.cells.feature = 3,
  min.cells.group = 3,
  pseudocount.use = 1,
  return.thresh = 0.01,
  ...
)

Arguments

object

A Seurat object

assay

Assay to use in differential expression testing

features

Genes to test. Default is to use all genes

logfc.threshold

Limit testing to genes which show, on average, at least X-fold difference (log-scale) between the two groups of cells. Default is 0.25. Increasing logfc.threshold speeds up the function, but can miss weaker signals.

test.use

Denotes which test to use. Available options are:

  • "wilcox" : Identifies differentially expressed genes between two groups of cells using a Wilcoxon Rank Sum test (default)

  • "bimod" : Likelihood-ratio test for single cell gene expression, (McDavid et al., Bioinformatics, 2013)

  • "roc" : Identifies 'markers' of gene expression using ROC analysis. For each gene, evaluates (using AUC) a classifier built on that gene alone, to classify between two groups of cells. An AUC value of 1 means that expression values for this gene alone can perfectly classify the two groupings (i.e. each of the cells in cells.1 exhibits a higher level than each of the cells in cells.2). An AUC value of 0 also means there is perfect classification, but in the other direction. A value of 0.5 implies that the gene has no predictive power to classify the two groups. Returns a 'predictive power' (abs(AUC-0.5) * 2) ranked matrix of putative differentially expressed genes.

  • "t" : Identify differentially expressed genes between two groups of cells using the Student's t-test.

  • "negbinom" : Identifies differentially expressed genes between two groups of cells using a negative binomial generalized linear model. Use only for UMI-based datasets.

  • "poisson" : Identifies differentially expressed genes between two groups of cells using a Poisson generalized linear model. Use only for UMI-based datasets.

  • "LR" : Uses a logistic regression framework to determine differentially expressed genes. Constructs a logistic regression model predicting group membership based on each feature individually and compares this to a null model with a likelihood ratio test.

  • "MAST" : Identifies differentially expressed genes between two groups of cells using a hurdle model tailored to scRNA-seq data. Utilizes the MAST package to run the DE testing.

  • "DESeq2" : Identifies differentially expressed genes between two groups of cells based on a model using DESeq2, which uses a negative binomial distribution (Love et al., Genome Biology, 2014). This test does not support pre-filtering of genes based on average difference (or percent detection rate) between cell groups. However, genes may be pre-filtered based on their minimum detection rate (min.pct) across both cell groups. To use this method, please install DESeq2, using the instructions at https://bioconductor.org/packages/release/bioc/html/DESeq2.html
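The "predictive power" reported by the "roc" test can be illustrated with a small base R sketch. This is not Seurat's implementation: the helper names auc_single_gene and predictive_power and the toy expression values are assumptions made for illustration, and the AUC is computed directly from pairwise comparisons between the two groups.

```r
# AUC for a single gene: the probability that a randomly chosen cell from
# group 1 has higher expression than a randomly chosen cell from group 2
# (ties count as one half). Hypothetical helper, not part of Seurat.
auc_single_gene <- function(expr.1, expr.2) {
  greater <- outer(expr.1, expr.2, FUN = ">")
  ties <- outer(expr.1, expr.2, FUN = "==")
  mean(greater + 0.5 * ties)
}

# Predictive power as described above: abs(AUC - 0.5) * 2
predictive_power <- function(auc) {
  abs(auc - 0.5) * 2
}

cells.1 <- c(2.1, 3.4, 2.8, 4.0)  # toy expression values, group 1
cells.2 <- c(0.0, 0.5, 1.2, 0.0)  # toy expression values, group 2

auc.up <- auc_single_gene(cells.1, cells.2)    # every cell in group 1 is higher
auc.down <- auc_single_gene(cells.2, cells.1)  # same gene, groups swapped
auc.flat <- auc_single_gene(c(1, 1), c(1, 1))  # identical values in both groups

print(c(auc.up, auc.down, auc.flat))
print(predictive_power(c(auc.up, auc.down, auc.flat)))
```

Both a perfect classifier (AUC of 1) and a perfectly inverted one (AUC of 0) receive the maximum power, while an AUC of 0.5 receives none, which is why the ranking uses the distance from 0.5 rather than the AUC itself.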

slot

Slot to pull data from; note that if test.use is "negbinom", "poisson", or "DESeq2", slot will be set to "counts"

min.pct

Only test genes that are detected in a minimum fraction of min.pct cells in either of the two populations. Meant to speed up the function by not testing genes that are very infrequently expressed. Default is 0.1

min.diff.pct

Only test genes that show a minimum difference in the fraction of detection between the two groups. Set to -Inf by default
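The two detection-rate filters (min.pct and min.diff.pct) can be sketched in base R as follows. This is an illustration of the idea only: the helper prefilter_genes and the toy detection matrix are hypothetical, and whether Seurat treats the thresholds as inclusive or strict at the boundary is not specified here, so the toy values are chosen to stay away from the boundaries.

```r
# Toy detection matrix: rows are genes, columns are cells (TRUE = detected).
detected <- rbind(
  geneA = c(TRUE,  TRUE,  TRUE,  FALSE, FALSE, FALSE, FALSE, FALSE),
  geneB = c(TRUE,  FALSE, FALSE, FALSE, TRUE,  FALSE, FALSE, FALSE),
  geneC = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
)
group <- rep(c("g1", "g2"), each = 4)

# Fraction of cells in each group in which the gene is detected
pct.1 <- rowMeans(detected[, group == "g1", drop = FALSE])
pct.2 <- rowMeans(detected[, group == "g2", drop = FALSE])

# Hypothetical helper: keep genes detected often enough in either group
# (min.pct) and with a large enough detection difference (min.diff.pct).
prefilter_genes <- function(pct.1, pct.2, min.pct = 0.1, min.diff.pct = -Inf) {
  pct.max <- pmax(pct.1, pct.2)
  pct.diff <- abs(pct.1 - pct.2)
  names(which(pct.max >= min.pct & pct.diff >= min.diff.pct))
}

print(prefilter_genes(pct.1, pct.2))                      # defaults
print(prefilter_genes(pct.1, pct.2, min.diff.pct = 0.2))  # stricter filter
```

With the defaults, only the never-detected gene is dropped; raising min.diff.pct additionally drops the gene that is detected equally often in both groups.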

node

A node to find markers for and all its children; requires BuildClusterTree to have been run previously; replaces FindAllMarkersNode

verbose

Print a progress bar once expression testing begins

only.pos

Only return positive markers (FALSE by default)

max.cells.per.ident

Downsample each identity class to a maximum number of cells. Default is no downsampling (set to Inf)

random.seed

Random seed for downsampling
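How a fixed seed makes downsampling reproducible can be sketched in base R. The helper downsample_cells and the toy cell names are hypothetical and only illustrate the roles of max.cells.per.ident and random.seed; Seurat's internal sampling may differ in detail.

```r
# Hypothetical helper: cap the number of cells in one identity class.
downsample_cells <- function(cells, max.cells.per.ident = Inf, random.seed = 1) {
  # Nothing to do when the class is already small enough (always true for Inf)
  if (length(cells) <= max.cells.per.ident) {
    return(cells)
  }
  # Fixing the seed makes the random subset identical across runs
  set.seed(random.seed)
  sample(cells, size = max.cells.per.ident)
}

cells <- paste0("cell", 1:20)  # toy cell names for one identity class

subset.a <- downsample_cells(cells, max.cells.per.ident = 5, random.seed = 1)
subset.b <- downsample_cells(cells, max.cells.per.ident = 5, random.seed = 1)

print(subset.a)
print(identical(subset.a, subset.b))
```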

latent.vars

Variables to test, used only when test.use is one of 'LR', 'negbinom', 'poisson', or 'MAST'

min.cells.feature

Minimum number of cells expressing the feature in at least one of the two groups; currently only used for the Poisson and negative binomial tests

min.cells.group

Minimum number of cells in one of the groups

pseudocount.use

Pseudocount to add to averaged expression values when calculating logFC. 1 by default.
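The effect of the pseudocount on the log fold change can be illustrated with a short base R sketch. It assumes log-normalized input (natural log of 1 + normalized counts) and a natural-log fold change; the exact formula and log base depend on the Seurat version and the slot used, so the helper avg_logfc is an assumption for illustration rather than the package's code.

```r
# Hypothetical helper: log fold change of average expression between two
# groups, with a pseudocount added to each average before taking the log.
avg_logfc <- function(x.1, x.2, pseudocount.use = 1) {
  # expm1() undoes the assumed log1p() normalization before averaging
  log(mean(expm1(x.1)) + pseudocount.use) -
    log(mean(expm1(x.2)) + pseudocount.use)
}

x.1 <- log1p(c(3, 3, 3))  # toy gene expressed in every cell of group 1
x.2 <- log1p(c(0, 0, 0))  # and absent from every cell of group 2

with.pseudocount <- avg_logfc(x.1, x.2)                          # finite
without.pseudocount <- avg_logfc(x.1, x.2, pseudocount.use = 0)  # infinite

print(with.pseudocount)
print(without.pseudocount)
print(abs(with.pseudocount) >= 0.25)  # compare against logfc.threshold
```

Without a pseudocount, a gene that is absent from one group yields an infinite fold change; adding the pseudocount keeps the value finite so that it can be compared against logfc.threshold.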

return.thresh

Only return markers that have a p-value < return.thresh, or a power > return.thresh (if the test is ROC)
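The return.thresh rule can be sketched in base R on toy result tables. The helper filter_markers, the toy values, and the column name power for the ROC output are assumptions for illustration; p_val matches the column shown in the Examples output.

```r
# Hypothetical helper: keep rows passing return.thresh. For the ROC test the
# threshold applies to the predictive power, otherwise to the p-value.
filter_markers <- function(markers, return.thresh = 0.01, test.use = "wilcox") {
  if (test.use == "roc") {
    markers[markers$power > return.thresh, , drop = FALSE]
  } else {
    markers[markers$p_val < return.thresh, , drop = FALSE]
  }
}

# Toy results (made-up values)
wilcox.res <- data.frame(gene = c("G1", "G2", "G3"),
                         p_val = c(1e-5, 0.2, 0.009),
                         stringsAsFactors = FALSE)
roc.res <- data.frame(gene = c("G1", "G2"),
                      power = c(0.8, 0.005),
                      stringsAsFactors = FALSE)

print(filter_markers(wilcox.res)$gene)
print(filter_markers(roc.res, test.use = "roc")$gene)
```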

...

Arguments passed to other methods and to specific DE methods

Value

Matrix containing a ranked list of putative markers, and associated statistics (p-values, ROC score, etc.)
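A common follow-up is to take the strongest positive markers per cluster from the returned table. The base R sketch below assumes a result with the cluster, gene, and avg_logFC columns shown in the Examples output; the helper top_markers and the toy values are hypothetical.

```r
# Hypothetical helper: top n positive markers per cluster by avg_logFC.
top_markers <- function(markers, n = 2) {
  pos <- markers[markers$avg_logFC > 0, , drop = FALSE]
  # Sort by cluster, then by decreasing fold change within each cluster
  pos <- pos[order(pos$cluster, -pos$avg_logFC), , drop = FALSE]
  # Rank rows within each cluster and keep the first n
  rank.in.cluster <- ave(seq_len(nrow(pos)), pos$cluster, FUN = seq_along)
  pos[rank.in.cluster <= n, , drop = FALSE]
}

# Toy marker table (made-up values)
markers <- data.frame(
  cluster = factor(c(0, 0, 0, 1, 1, 1)),
  gene = c("G1", "G2", "G3", "G4", "G5", "G6"),
  avg_logFC = c(1.5, -2.0, 0.8, 0.3, 2.2, 1.1),
  stringsAsFactors = FALSE
)

print(top_markers(markers, n = 1)$gene)
print(top_markers(markers, n = 2)$gene)
```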

Examples

# Find markers for all clusters
suppressWarnings(all.markers <- FindAllMarkers(object = pbmc_small))
#> Calculating cluster 0
#> Calculating cluster 1
#> Calculating cluster 2
head(x = all.markers)
#>                 p_val avg_logFC pct.1 pct.2    p_val_adj cluster     gene
#> HLA-DPB1 9.572778e-13 -4.034691 0.083 0.909 2.201739e-10       0 HLA-DPB1
#> HLA-DRB1 7.673127e-12 -3.760972 0.083 0.864 1.764819e-09       0 HLA-DRB1
#> HLA-DPA1 3.673172e-11 -3.032128 0.111 0.864 8.448296e-09       0 HLA-DPA1
#> HLA-DRA  1.209114e-10 -2.954974 0.417 0.909 2.780962e-08       0  HLA-DRA
#> HLA-DRB5 9.547049e-10 -3.019608 0.056 0.773 2.195821e-07       0 HLA-DRB5
#> HLA-DQB1 3.035198e-08 -3.000755 0.028 0.659 6.980956e-06       0 HLA-DQB1
if (FALSE) {
  # Pass a value to node as a replacement for FindAllMarkersNode
  pbmc_small <- BuildClusterTree(object = pbmc_small)
  all.markers <- FindAllMarkers(object = pbmc_small, node = 4)
  head(x = all.markers)
}