LDAcoop

What this package is about

The LDA is the gold standard for quantification of the clonogenic capacity of non-adherent cells. The most common method for calculating this capacity from an LDA experiment assumes implicitly independence of all cells. Interestingly, this assumption does not hold for many cell lines (i.e. cooperatively growing cells).

The effect of cellular cooperation

Figure: Fitting LDA data; assuming independence (blue), taking cooperativity into account (grey)
Figure: Fitting LDA data; assuming independence (blue), taking cooperativity into account (grey)

Cellular cooperation induces a non-linearity to the dose-response-curve (the (log) fraction of non-responding wells over the number of cells seeded) and thereby biases the estimate of the gold standard analysis method, which assumes linearity. In other words: when surrounded by many other cells, the combined clonogenic activity of 100 cells is higher than the activity of 100 separated single cells. In terms of single cell activity, 100 cells act as if they were more.

Thus, the gold standard analysis method is syntactically applicable but its result is biased (i.e. meaningless) for cooperatively growing cells.

Robust analysis in presence (and absence) of cellular cooperation

Therefore, this package equips you with the tools you need to robustly quantify the clonogenic capacity of cell lines as well as to quantify the clonogenic survival of those cell lines after treatment.

How to use this package

As known from tools for quantification of the clonogenic capacity from LDA data, the input is a data frame (table, matrix) with entries for

All commands to get tables with numbers or plots of the activity are explained below.

A full experiment

Let’s assume, you have conducted an experiment with a set of experimental conditions and biological replicates.

Your data will look something like this

#install.packages("LDAcoop")
library(LDAcoop)
data(LDAdata)
head(LDAdata)
#>        name replicate Group S-value # Tested # Clonal growth
#> 1 MDA.MB321         1     0      32       12              12
#> 2 MDA.MB321         1     0      16       12              12
#> 3 MDA.MB321         1     0       8       12              12
#> 4 MDA.MB321         1     0       4       12              12
#> 5 MDA.MB321         1     0       2       12              10
#> 6 MDA.MB321         1     0       1       12               7

A table of clonogenic activities, cooperativity coefficients and survival fractions can be generated from such a data table as follows:

BT20 <- subset.data.frame(x = LDAdata,
                          subset = name == "BT.20")
BT20 <- BT20[,c("S-value","# Tested","# Clonal growth","Group","replicate")]
round(LDA_table(x = BT20),digits = 3)
#>  reference class is 0
#>   treatment     act act.CI.lb act.CI.ub     b b.pvalue    SF SF.CI.lb SF.CI.ub
#> 1         0  29.613    24.943    35.339 1.135    0.176    NA       NA       NA
#> 2         1  33.420    28.330    39.632 1.220    0.043 0.886    0.696    1.129
#> 3         2  40.632    34.391    48.223 1.206    0.062 0.729    0.572    0.929
#> 4         4  72.997    62.611    85.477 1.434    0.003 0.406    0.321    0.513
#> 5         6 180.201   152.880   213.427 1.234    0.038 0.164    0.129    0.209
#> 6         8 423.447   375.448   482.003 2.266    0.000 0.070    0.057    0.086

In this table, we find

A plot of the experiment and the estimated survival fractions can be generated as follows:

LDA_plot(LDA_tab = BT20,uncertainty.band = T)

In case the figure looks overcrowded, uncertainty.band = FALSE can switch off the confidence bands and xlim (e.g. xlim = c(0,100)) can be used, to adjust the plotted range of seeded cells.


Mathematical background

Introduction

Clonogenicity refers to the capacity of single cells to grow into new colonies. Since not every single cell will successfully form a new colony, in a given culture the fraction of cells that do so is often of interest. This fraction is then called clonogenic activity. Mathematically it is a probability \(p\). Often the activity is communicated by the reciprocal, which corresponds to a number of cells (e.g. activity \(p = 1/42\) written as \(a = 1/p = 42\) for ‘one active cell among \(a = 42\) non-active cells’).

In the LDA, distinct numbers of cells are seeded in wells. The number of seeded cells per well is often called dose. Since, we do not want to get confused with the irradiation dose, which is a frequent treatment in the field of radiation oncology, we denote the number of cells seeded with \(s\). The readout of each well in the LDA is dichotomous (growth / no growth). So, from the binomially distributed number of active cells in a well \(X\) where

\[P(X=k) = \left( \begin{matrix} s \\ k \end{matrix}\right) p^{k} (1-p)^{s-k} \]

only \(X=0\) and \(X \neq 0\) can be observed experimentally. Thus, it is about the frequency of positive wells as a function of the number of cells seeded (\(k/n = f(s)\)). For dealing with high cell numbers, the binomial distribution is approximated by the asymptotically identical (for \(s \to \infty\)) Poisson distribution (with \(\lambda = s \cdot p\)): \[P_{\lambda}(X=k) = \frac{\lambda^k}{k!} e^{-\lambda}.\]

The basic concept for estimating the clonogenic activity of a cell suspension through diluting the number of cells in different wells is the following:

So taking the graph ‘observed fraction of positive wells over the number of cells per well’, we find the clonogenic activity as the number of cells (x-axis) just where the curve passes the expectation of 37% negative wells.

Mathematical model

Let \(n\) be the number of wells, and let \(\mu\) denote the expected fraction of positive wells. Following the single-hit Poisson model (SHPM, \(P_{\lambda}(k=0) = e^{-p s}\)), the expectation is: \[\mu = 1-e^{-ps}.\]

The number of failures \(r\) is again binomially distributed

\[ P(Y=r) = \left( \begin{matrix} n \\ r \end{matrix}\right) (1-\mu)^{r} \mu^{n-r}. \]

We find (for \(\alpha = log(p)\), \(\xi = log(s)\))

\[\begin{align*} \Leftrightarrow 1-\mu &= e^{- p s}\\ \Leftrightarrow log(1-\mu) &= - p s \\ \Leftrightarrow -log(1-\mu) &= p \cdot s \\ \Leftrightarrow log(-log(1-\mu)) &= \alpha + \xi \end{align*}\]

Therefore, technically the identification of the required number of cells is done via a generalized linear regression model (binomial family, cloglog link function). The clonogenic activity is \(p = e^{\alpha}\).

Cooperativity

In the presented model, \(p\) is independent from \(s\). So implicitly, it is assumed, that 100 single cells in single wells have the identical chance to result in at least one colony as a single well with those 100 cells would have.

But this is often not the case. Cells export or release substances into the medium and thereby communicate and cooperate. The combined chance of a group of cells to grow into a clone often exceeds the sum of the individual chances. (See CFAcoop, for a number of cells seeded \(S\) and a number of resulting colonies \(C\) we find \(C = a \cdot S^b\).)

Thus, allowing for the same communication and cooperativity, we replace \(\lambda = p \cdot s\) with \(\lambda = p \cdot s^b\) analogue to CFA. The modification reads in both cases as ‘a number of \(s\) cells shows the same activity as if they were \(s^b\) single cells’.

With this generalization we find

\[\begin{align*} P_{\lambda}(k=0) &= e^{-\lambda}\\ \Leftrightarrow 1-\mu &= e^{- p s^b}\\ \Leftrightarrow log(1-\mu) &= - p s^b \\ \Leftrightarrow log(-log(1-\mu)) &= \alpha + \xi \cdot b. \end{align*}\]

Interestingly, this equation is the same as in the special case of non-cooperativity, just without the restriction of \(b=1\), which corresponds to the non-cooperative case. The degree of cooperativity is quantified by the coefficient \(b\).

Activity

When the probability of a cell to grow into a colony is not independent from the number of surrounding cells, the earlier definition of clonogenic activity (the chance of an isolated single cell) is pointless. Thus, the approach to deal with cooperatively growing cells, requires a generalized definition for clonogenic activity.

For a fixed outcome of e.g. \(\lambda = 1\), which is equivalent with \(\mu \approx 0.63\), we still may calculate the required number of cells to be seeded from the regression coefficients \[s_{\lambda = 1} = e^{\frac{-\alpha}{b}} = \left(p^{-1}\right)^{\frac{1}{b}}.\] On average, one out of these cells will grow into a colony. Therefore, we define this number \(s_{\lambda=1}\) as the (generalized) clonogenic activity.

This is analogue to the CFA approach for cellular cooperation (which sets the reference at an expectation of \(20\) colonies.). Please note, the non-cooperative case (\(b=1\)) results in the identical activity values as from the previous definition.

Example

Given the data of an LDA experiment, we find the clonogenic activity where expected. The only change in contrast to conventional LDA data analysis is the curve, which replaces the linear model.

cell.line <- unique(LDAdata$name)[2]
AD <- subset.data.frame(LDAdata, subset = (name==cell.line) & 
                         (replicate==1) & (Group == 2))[,4:6]
LDA_plot(LDA_tab = AD,uncertainty = "act", uncertainty.band = T)

LDA_table(x = AD)
#> $`activity^-1 [N]`
#> [1] 37.432
#> 
#> $`confidence interval`
#> [1] 25.742 54.771
#> 
#> $`cooperativity coefficient b`
#> [1] 0.992
#> 
#> $`p-value cooperativity`
#> [1] 0.962
full_model_fit <- LDA_activity_single(x = AD)

In case, there are biological replicates, the replicate may be indicated by the fifth column. Thereby the frequency of responding wells can be plotted separately.

AD <- subset.data.frame(LDAdata, subset = (name==cell.line) &
                         (Group == 2))[,c(4:6,3,2)]
LDA_plot(LDA_tab = AD,uncertainty = "act", uncertainty.band = T)

LDA_table(x = AD[,1:3],ref_class = 0)
#> $`activity^-1 [N]`
#> [1] 40.632
#> 
#> $`confidence interval`
#> [1] 34.391 48.223
#> 
#> $`cooperativity coefficient b`
#> [1] 1.206
#> 
#> $`p-value cooperativity`
#> [1] 0.062

Please note that the starting motivation of a concept ‘one active cell among how many?’ is still the same. It is just restricted to the special outcome of 37% negative wells. The choice of ‘outcome of 37% negative wells’ is quite artificial and the activity for higher or lower active well ratios would change. But when it comes to survival fractions, this shift is mostly cut out and thereby this approach filters (in parts) a cooperativity-bias in those survival fractions.

Survival fraction

Even though the concept of clonogenic activity is somewhat fuzzy in the presence of cellular cooperation, the effect of treatments on this clonogenic capacity can be investigated quite accurately by the same way it is done for the CFA.

The only requirement is a statement on the outcome of interest. At CFA, a number of 20 colonies can be agreed upon - and a comparison of the number of required cells seeded with and without treatment is reasonable.

Analogously, the ratio of cell numbers required to observe the same \(\mu\) (e.g. \(\mu = 1-e^{-1}\)) is of interest, when investigating the effect of certain treatments.

\[SF_x = \frac{s_0(\lambda_0 = 1)}{s_x(\lambda_x = 1)}\] can be calculated from the fitted GLM parameters as \[SF_x = exp\left( \frac{log(p_x)}{b_x} - \frac{log(p_0)}{b_0}\right)\]

Therefore, the interpretation of survival fractions is straight forward. If the number of cells needed to be seeded with treatment is ten times the number without treatment, the survival fraction is calculated as \(0.1\). Note that it is implicitly assumed, that those cells that are non-survivors of the treatment, do not contribute to the cooperative stimulation of the survivors. Otherwise the treatment effect will be underestimated.

Uncertainty analysis

Uncertainties for the parameters of the generalized linear model are returned by the fitting procedure and can be transferred to the clonogenic activity via the general predict.glm function.

Thus, uncertainties of clonogenic activities are calculated the same way as the activities are (cutting \(y=-1\)).

Uncertainties of the survival fractions can be generally assessed by two ways

In cases of extreme cooperativity, method (b) can be numerically unstable and therefore we recommend the use of method (a) instead.

Analysis of the calculated uncertainty bounds of the survival fractions of the 10 non-extreme cell lines under all treatments (see data delivered with this package), we find a mean ratio (uncertainty-bound method(a) / uncertainty-bound method(b)) of 0.9969 and a standard deviation of 0.0049. Thus, the deviation of those two methods is very stable in the order of 1%.

Example

LDA_table_act <- LDA_table(x = BT20,uncertainty = "act")
#>  reference class is 0
cbind(round(LDA_table_act[,1:5],digits = 1),
      round(LDA_table_act[,6:9],digits = 4))
#>   treatment   act act.CI.lb act.CI.ub   b b.pvalue     SF SF.CI.lb SF.CI.ub
#> 1         0  29.6      24.9      35.3 1.1   0.1762     NA       NA       NA
#> 2         1  33.4      28.3      39.6 1.2   0.0430 0.8861   0.6955   1.1288
#> 3         2  40.6      34.4      48.2 1.2   0.0620 0.7288   0.5717   0.9294
#> 4         4  73.0      62.6      85.5 1.4   0.0033 0.4057   0.3214   0.5125
#> 5         6 180.2     152.9     213.4 1.2   0.0378 0.1643   0.1291   0.2092
#> 6         8 423.4     375.4     482.0 2.3   0.0000 0.0699   0.0565   0.0863
LDA_table_ep <- LDA_table(x = BT20,uncertainty = "ep")
#>  reference class is 0
cbind(round(LDA_table_ep[,1:5],digits = 1),
      round(LDA_table_ep[,6:9],digits = 4))
#>   treatment   act act.CI.lb act.CI.ub   b b.pvalue     SF SF.CI.lb SF.CI.ub
#> 1         0  29.6      24.9      35.3 1.1   0.1762     NA       NA       NA
#> 2         1  33.4      28.3      39.6 1.2   0.0430 0.8861   0.6959   1.1282
#> 3         2  40.6      34.4      48.2 1.2   0.0620 0.7288   0.5720   0.9286
#> 4         4  73.0      62.6      85.5 1.4   0.0033 0.4057   0.3213   0.5121
#> 5         6 180.2     152.9     213.4 1.2   0.0378 0.1643   0.1292   0.2091
#> 6         8 423.4     375.4     482.0 2.3   0.0000 0.0699   0.0565   0.0866