11.5 Exercises

Easy

  • Find features associated with iCluster and MFA factors, and visualize the feature weights.

Intermediate

  • Normalizing the data matrices by their \(\lambda_1\)’s, as in MFA, assumes that we wish to assign each data type the same importance in the downstream analysis. This leads to a natural generalization in which the different data types may be weighted differently. Provide an implementation of weighted MFA where each data type may be assigned its own weight.
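
As a starting point for this exercise, the unweighted normalization step can be sketched as follows. This is a minimal illustration, not the weighted variant the exercise asks for. It assumes that \(\lambda_1\) is taken to be the leading singular value of each data matrix (the text above only names \(\lambda_1\)), and the matrices `x1` and `x2` and the helper `lambda1` are hypothetical names introduced here.

```r
set.seed(1)

# Two hypothetical data types measured on the same 5 samples (columns).
x1 <- matrix(rnorm(20), nrow = 4, ncol = 5)
x2 <- matrix(rnorm(30, sd = 10), nrow = 6, ncol = 5)

# Leading singular value of a matrix
# (assumed here to play the role of lambda_1).
lambda1 <- function(m) svd(m, nu = 0, nv = 0)$d[1]

# Scale each data type by its own lambda_1, then stack the features row-wise.
blocks <- list(x1, x2)
scaled <- lapply(blocks, function(m) m / lambda1(m))
stacked <- do.call(rbind, scaled)

# After scaling, every block has a leading singular value of 1,
# so no single data type dominates the stacked matrix.
sapply(scaled, lambda1)
```

Because each block ends up on the same scale, any per-data-type weight would have to be applied on top of this step.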

  • In order to use NMF algorithms on data which can be negative, we need to split each feature into two new non-negative features: one holding the positive part and one holding the magnitude of the negative part. Implement the following function, and check that the included test passes:

split_neg_columns <- function(x) {
    # your code here
}

# a test that shows the function above works
test_split_neg_columns <- function() {
    input <- as.data.frame(cbind(c(1,2,1), c(0,1,-2)))
    output <- as.data.frame(cbind(c(1,2,1), c(0,0,0), c(0,1,0), c(0,0,2)))
    stopifnot(all(output == split_neg_columns(input)))
}
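
The idea behind the split can be illustrated on a single numeric vector. This is only a sketch of the concept, not the data-frame function requested above; the names `x`, `x_pos`, and `x_neg` are hypothetical, and the vector is the second input column from the test.

```r
# Illustration on one numeric vector (not the exercise's data-frame function).
x <- c(0, 1, -2)

# Positive part: keep values above zero, set the rest to zero.
x_pos <- pmax(x, 0)

# Negative part: keep magnitudes of values below zero, set the rest to zero.
x_neg <- pmax(-x, 0)

# Both parts are non-negative, and their difference recovers the original.
stopifnot(all(x_pos >= 0), all(x_neg >= 0))
stopifnot(all(x_pos - x_neg == x))
```

The two resulting vectors correspond to the last two columns of the expected output in the test above.
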
  • The iCluster+ algorithm has some parameters which may be tuned for maximum performance. The iClusterPlus package has a method, iClusterPlus::tune.iClusterPlus, which does this automatically based on the Bayesian Information Criterion (BIC). Run the example above and find the optimal lambda and alpha values.

  • Another covariate in the metadata of these tumors is their CpG Island Methylator Phenotype (CIMP). This is a phenotype carried by a group of colorectal cancers that display hypermethylation of promoter CpG island sites, resulting in the inactivation of some tumor suppressors. This is also assayed using an external test. Do any of the multi-omics methods surveyed find a latent variable that is associated with the tumor’s CIMP phenotype?

Advanced

  • Does MFA give a disentangled representation?
  • Does iCluster give disentangled representations? Why do you think that is?