Home

Simulations to compare the results of Exchangeable Effects and Exchangeable Z-scores model

Compare EE and EZ

The correct model has higher likelihood in simulations.
Simulations to compare R vs Rcpp:

Compare R vs Rcpp

In the simulation, Rcpp is faster then R in posterior calculation. Rcpp and R have similar computational time for likelihood calculation. However, from my experience, Rcpp tends to use shorter time in likelihood computation.
Simulations to find more latent factors than the number of covariates (p)

Flash Simulation

When the var_type='constant', the flash results in 6 factors, whereas the data has 5 conditions.

Some factors contain +ve in some conditions and -ve in others. These are not separated into 2 factors.

Here is the MASH analyses:

Generate the covariance matrices:

Firstly, we do the factor analysis on the data using flash. Flash

Then, we generate the potential covariance structures:
- EZ covariance matrices
- EE covariance matrices
I realized a question here. We run the cov_ed using mash data and Ulist. The mash data has Bhat, Shat, Shat_alpha, V, alpha. In cov_ed, it only use the Bhat and Shat. Therefore, it assumes the errors are independent (V is identity). The cov_ed algorithm does not account for the V matrix.

We use extreme deconvolution to find the covariance candidates. Maybe, the ignorance of the dependency in ed does not matter. Will including the dependency improve the initial covariance matrices?
Fit the mash model
- EZ V1 mash model
- EE V1 mash model
Changing the parameter alpha could obtain EE or EZ fitted model.
Changing the parameter vhat could obtain different V matrices.
- vhat = 0, V is identity matrix
- vhat = 1, V is the estimated correlation matrix
V is the correlation matrix here: \[ \hat{\beta}_{j}|\beta_{j}, S_{j} \sim N(\hat{\beta}_{j}; \beta_{j}, S_{j}VS_{j}) \]
Results:
The estimated weights for covariance matrices are different for EE and EZ model when V is the estimated correlation. When V is Identity, the estimated weights are similar. But the measurements are correlated due to sample overlap (all based on same 134 individuals), I think it is more reasonable to use the estimated correlation from the data (V1).

Among these models, EZ V1 has the highest log likelihood value.

Using Mash with V be the estimated correlation, we found more eQTLs, the number of eQTLs are similar using EE and EZ model. Based on the EZ V1 model, the majority weights for the covariance matrix is on the null matrix (0, no effects, 84.6%). The remaining weights are on the covariance matrix capturing the pattern that the standardized effects are strong positively correlated among all treatments (14.5%). The covariance component \(11'\) (5.4%) shows that the standardized effects are similar. The other covariance component has the format \(D11'D\) (9.1%), D is a diagnoal matrix, which means the standardized effects are different in size, but they are strongly correlated (\(\beta_{lpa6h} = d * \beta_{ctrl}\) etc.).

The mash fit favors component where standardized effects are very similar, which does not mean the raw effects are similar. The similarity of standardized effects could imply the similarity of raw effects if and only if the standard errors of the estimated effects are all similar. The derivation is in Mash Posterior under non-equal \(s_{jr}\). In this dataset, the \(s_{jr}\) are not similar across treatments. The equal effect covariance structure (correlation) is for the standardized effects. The raw effects have the same correlation structure, but with different covariance, different mean.

We found a lot reQTLs (raw effect), treatment specific reQTLs and time point specific reQTLs using the linear transformation. We found more reQTLs, treatment specific reQTLs and time point specific reQTLs than the paper’s result.

However, the number of reQTLs from EE model are much less than EZ model. I don’t know the reason for this.
If we define the reQTL as different standardized effects, the result is EZ V1 model reQTL.

The number of reQTLs are fewer than the original reQTLs.
Qualitative Interaction:

Among the eQTLs that significant among all treatments (4792), there are 4 of them have effects in different directions in different conditions. Interaction

In the null data, there is no qualitative interaction cases. Therefore, the estimated weights for the mash model can not capture the qualitative interaction covariance structure. Posterior Check
Using flash results to construct mashr covariance matrices
- Method 1: \[z_{i} = \sum_{k=1}^{K}l_{ik} f_{k}\] \[U_{i} = z_{i}z_{i}'\] Flash individual
  
  The number of matrices is too large. The current algorithm cannot handle it. The matrices \(U_{i}\) for two genes will be similar if they have similar \(z_{i}\) vectors. So we try to cluster \(z_{i}\) vectors. Flash individual reduce
  
  There is only one cluster identified here.
- Method 2: Suppose the rows of L come from a mixture of multivariate normals with covariances \(U_{1}, \cdots, U_{n}\). \[l_{i} \sim \sum_{m=1}^{n} \pi_{m} N_{K}(0, U_{m})\] The detail of this method is in Flash L MMVN.
  
  The mash model including these covariance matrices:
  
  EZ covariance matrices + loading
  
  EZ V1 Load model
  
  EZ V1 Load model result
The plot of logFC:

The y axis is logFC = \(log_{2} \frac{mean \ sti}{mean \ base}\). The x axis is the group of genes: ‘eQTL in both baseline and the stimulate’, ‘eQTL in baseline’, ‘eQTL in stimulate’.

EZ V1 eQTL logFC

The logFC in different groups are similar.

Here, the x axis is: ‘eQTL only in baseline’, ‘eQTL only in stimulate’.

EZ V1 eQTL logFC ONLY thresh=0.05

EZ V1 eQTL logFC ONLY thresh=0.01

EZ V1 eQTL logFC ONLY thresh=0.001

The logFC in ‘eQTL only in stimulate’ group is slightly higher.

The differences become clearer as we concentrate on more convincingly baseline/stimulate-specific eQTLs.

Method: Within the baseline/stimulate-specific group, select the more convincingly eQTLSs by decreasing the statistical significance cutoff.
contrast method simple simulation examples:
- Without signal: simple contrast example
- With signal: simple contrast example2
The contrast matrix to subtract the mean:
- Without signal: Compare with mean without signal
- With signal: Compare with mean with signal
Including more covariance structures basd on flash output.

Flash_2

This R Markdown site was created with workflowr