## Analysis of Molecular Variance (AMOVA)

## Law of total variance

Suppose that X and Y are random variables on the same probability space, and the variance of Y is finite. Then,

$Var(Y) = E[Var(Y|X)] + Var(E[Y|X])$
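The identity can be checked numerically on a small discrete example. The sketch below is only an illustration: the function name `decompose` and the toy data are made up here, $X$ is assumed to be drawn with probability proportional to group size, and all variances are population variances (dividing by n, not n - 1), which is what makes the identity hold exactly.

```python
from statistics import fmean, pvariance


def decompose(groups):
    """Split Var(Y) into E[Var(Y|X)] and Var(E[Y|X]) for grouped data.

    `groups` maps each value of X to the list of Y values observed with it.
    P(X = x) is taken to be proportional to the size of group x.
    """
    n = sum(len(ys) for ys in groups.values())
    weights = {x: len(ys) / n for x, ys in groups.items()}
    cond_mean = {x: fmean(ys) for x, ys in groups.items()}      # E[Y | X = x]
    cond_var = {x: pvariance(ys) for x, ys in groups.items()}   # Var(Y | X = x)

    grand_mean = sum(weights[x] * cond_mean[x] for x in groups)
    # E[Var(Y|X)]: average of the conditional variances
    within = sum(weights[x] * cond_var[x] for x in groups)
    # Var(E[Y|X]): spread of the conditional means around the grand mean
    between = sum(weights[x] * (cond_mean[x] - grand_mean) ** 2 for x in groups)
    # Var(Y): variance of all Y values pooled together
    total = pvariance([y for ys in groups.values() for y in ys])
    return within, between, total


# Hypothetical toy data: three values of X with unequal group sizes.
groups = {"a": [1.0, 3.0], "b": [4.0, 6.0, 8.0], "c": [10.0]}
within, between, total = decompose(groups)
print(within, between, within + between, total)
```

The last two printed numbers should agree up to floating-point rounding.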

Intuitively, we can think of a bivariate Gaussian distribution P(X, Y):

In this case, we can get a distribution $P(Y|X=X_i)$, as in the figure below:

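When the Gaussian is axis-aligned, so that the conditional mean $E[Y|X=X_i]$ is the same for every $X_i$, it makes intuitive sense that $Var(Y)$ should be the average of the individual conditional variances, i.e. $Var(Y) = E[Var(Y|X)]$.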

Now, let's see what happens if we rotate the bivariate Gaussian distribution:

We can see that $Var(Y)$ depends not only on the individual variances of the conditional distributions $P(Y|X=X_i)$, but also on how far apart those distributions sit along the Y axis. In other words, it also depends on how much the conditional means $E[Y|X=X_i]$ vary across the values of $X$, which is exactly $Var(E[Y|X])$.
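The bivariate Gaussian case can also be worked out in closed form. The parametrization below is an assumption added for illustration (it is not in the text above): means $\mu_X, \mu_Y$, standard deviations $\sigma_X, \sigma_Y$, and correlation $\rho$. The standard conditional formulas give

$E[Y|X=x] = \mu_Y + \rho \frac{\sigma_Y}{\sigma_X}(x - \mu_X), \quad Var(Y|X=x) = \sigma_Y^2 (1 - \rho^2)$

so the two terms are

$E[Var(Y|X)] = \sigma_Y^2 (1 - \rho^2), \quad Var(E[Y|X]) = \rho^2 \frac{\sigma_Y^2}{\sigma_X^2} Var(X) = \rho^2 \sigma_Y^2$

and they add up to

$E[Var(Y|X)] + Var(E[Y|X]) = \sigma_Y^2 = Var(Y)$

With $\rho = 0$ the second term vanishes and only the average conditional variance remains. A tilted ellipse corresponds to $\rho \neq 0$, where part of $Var(Y)$ comes from the conditional means moving with $X$.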

As an aside, there is a general variance decomposition formula for two or more conditioning variables. For two variables $X_1$ and $X_2$ it reads:

$Var(Y) = E[Var(Y|X_1, X_2)] + E[Var(E[Y|X_1, X_2]|X_1)] + Var(E[Y|X_1])$

which follows from the law of total conditional variance:

$Var(Y|X_1) = E[Var(Y|X_1, X_2)|X_1] + Var(E[Y|X_1, X_2] | X_1).$
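As a sketch of the missing step: start from the law of total variance with $X = X_1$,

$Var(Y) = E[Var(Y|X_1)] + Var(E[Y|X_1])$

then take the expectation of the law of total conditional variance and use the tower property $E[E[\cdot|X_1]] = E[\cdot]$ on its first term:

$E[Var(Y|X_1)] = E[E[Var(Y|X_1, X_2)|X_1]] + E[Var(E[Y|X_1, X_2]|X_1)] = E[Var(Y|X_1, X_2)] + E[Var(E[Y|X_1, X_2]|X_1)]$

Substituting this into the first line gives the three-term decomposition.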

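In AMOVA terms, with $X$ denoting the population label, the variance components are:

- Between populations: the variance of the population means, $Var(E[Y|X])$.
- Between samples within populations: the variance of the samples within each population, averaged over populations, $E[Var(Y|X)]$.
- Within samples: arises only for diploid data. The variance among the alleles within a sample (?)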
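One way to read these three components is through the three-term decomposition above, taking $X_1$ as the population and $X_2$ as the sample, with $Y$ a numeric value per allele. This mapping is an assumption made for illustration, and the sketch below is not the AMOVA estimator itself (which works with sums of squared distances and degrees of freedom). It only shows that the three population-variance terms add up to $Var(Y)$ on hypothetical diploid data. The names `nested_decomposition` and `weighted_pvariance` and the data are made up here.

```python
from statistics import fmean, pvariance


def weighted_pvariance(values, weights):
    """Population variance of `values` under weights proportional to `weights`."""
    total_w = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total_w
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total_w


def nested_decomposition(data):
    """data: {population: {sample: [allele values]}} (hypothetical layout).

    Returns (within samples, between samples within populations,
    between populations), each weighted by the number of alleles.
    """
    n = sum(len(a) for samples in data.values() for a in samples.values())
    within_samples = 0.0
    between_samples = 0.0
    pop_means, pop_sizes = [], []
    for samples in data.values():
        sizes = [len(a) for a in samples.values()]
        means = [fmean(a) for a in samples.values()]
        pop_n = sum(sizes)
        # E[Var(Y | X1, X2)]: allele variance inside each sample
        within_samples += sum(len(a) / n * pvariance(a) for a in samples.values())
        # E[Var(E[Y | X1, X2] | X1)]: spread of sample means inside this population
        between_samples += pop_n / n * weighted_pvariance(means, sizes)
        pop_means.append(sum(s * m for s, m in zip(sizes, means)) / pop_n)
        pop_sizes.append(pop_n)
    # Var(E[Y | X1]): spread of the population means
    between_pops = weighted_pvariance(pop_means, pop_sizes)
    return within_samples, between_samples, between_pops


# Hypothetical diploid data: two alleles per sample, coded 0.0 or 1.0.
data = {
    "pop1": {"s1": [0.0, 0.0], "s2": [0.0, 1.0], "s3": [1.0, 1.0]},
    "pop2": {"s4": [1.0, 1.0], "s5": [0.0, 1.0]},
}
parts = nested_decomposition(data)
total = pvariance([y for s in data.values() for a in s.values() for y in a])
print(parts, sum(parts), total)
```

For haploid data each sample holds a single allele, so the within-sample term is zero and only the first two components remain.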