Abstract: Matrix factorization, or latent variable analysis, is a powerful approach for identifying trends in data and reducing the dimension of high-dimensional data. Whilst most data scientists are familiar with principal component analysis (PCA), this talk will describe extensions of PCA that can be used to examine trends and extract the components that capture the most variance across many datasets. I will compare matrix factorization approaches for the integrative analysis of multiple datasets (including canonical correlation analysis, multiple factor analysis, and joint non-negative matrix factorization) and describe how we apply these methods to identify biomarkers of disease in oncology.
Bio: Experienced computational biologist and R/Bioconductor developer whose research seeks to uncover the molecular changes that give rise to or promote cancer development. Aedin's team curates GeneSigDB, a database of over 3,500 gene signatures (or gene sets), and develops gene set-based approaches for large-scale integrated data analysis.
Specialties: Bioconductor, R, bioinformatics, genomics, multivariate analysis, biostatistician, microarray, gene expression, computational biology, breast cancer, ovarian cancer.
Aedin Culhane, PhD
Computational Biologist | Dana-Farber Cancer Institute