Abstract: Novel single-cell transcriptome sequencing assays allow researchers to measure gene expression levels at the resolution of single cells. They offer an unprecedented opportunity to investigate fundamental biological questions at the cellular level, such as stem cell differentiation or the discovery and characterization of rare cell types. The majority of the computational methods for analyzing single-cell RNA-Seq data are implemented in R, making it a natural tool to start working with single-cell transcriptomic data. Using real single-cell datasets, this workshop provides a step-by-step tutorial on the methodology and associated R packages for the following four main tasks: (1) normalization, (2) dimensionality reduction, (3) clustering, and (4) differential expression analysis.
Lesson 1: What is Single-Cell RNA-Seq?
You will learn what single-cell RNA-Seq is and why it is such a powerful technique. By the end of this lesson, you'll also know how to load, create, and access single-cell datasets in R.
Lesson 2: Quality Control and Normalization
We'll go over the first steps of the workflow to analyze single-cell RNA-Seq data: quality control and normalization. These two steps are meant to remove technical artifacts and biases so that in the following lessons we can focus on the biological signal of interest.
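As a rough illustration of what normalization does, here is a minimal sketch in plain Python (the workshop itself uses R packages). It assumes a simple counts-per-million scaling followed by a log transform, which is one common choice and not necessarily the method taught in the lesson; the function name `log_cpm` and the toy counts are hypothetical.

```python
import math


def log_cpm(counts):
    """Scale each cell to counts per million, then apply log2(x + 1).

    counts: list of cells, each a list of raw per-gene counts.
    This is an illustrative assumption, not the workshop's own method.
    """
    normalized = []
    for cell in counts:
        library_size = sum(cell)  # total counts observed in this cell
        if library_size == 0:
            # Empty cells would normally be filtered out during quality control.
            raise ValueError("cell has zero total counts")
        normalized.append(
            [math.log2(c / library_size * 1e6 + 1) for c in cell]
        )
    return normalized


# Two toy cells with the same composition but a 10x difference in depth.
counts = [
    [10, 0, 90],
    [100, 0, 900],
]
result = log_cpm(counts)
```

After scaling, the two toy cells have the same profile even though the second was sequenced ten times deeper, which is the kind of technical difference normalization is meant to remove.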
Lesson 3: Visualization and Dimensionality Reduction
In single-cell data, each cell is described by one value per gene, so the number of dimensions is the number of genes. The goal of dimensionality reduction is to reduce this to a much smaller number of dimensions, either to visualize the data in two dimensions or to prepare the dataset for subsequent steps like clustering.
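To make the idea concrete, the sketch below reduces a tiny cells-by-genes matrix to two dimensions with principal component analysis (PCA), computed by power iteration in plain Python. PCA is used here as an assumed example of a dimensionality reduction method; the workshop's R packages and methods may differ, and the helper names (`center`, `top_component`, `pca_2d`) and the toy data are hypothetical.

```python
import math


def center(matrix):
    """Subtract each gene's mean so every column has mean zero."""
    n_cells = len(matrix)
    means = [sum(col) / n_cells for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


def top_component(rows, iterations=200):
    """Leading principal axis of centered rows, found by power iteration."""
    n_genes = len(rows[0])
    # Deterministic, non-uniform start vector (an arbitrary choice).
    v = [float(j + 1) for j in range(n_genes)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in rows]
        new_v = [
            sum(s * row[j] for s, row in zip(scores, rows))
            for j in range(n_genes)
        ]
        norm = math.sqrt(sum(x * x for x in new_v))
        if norm == 0:
            break  # no variance left along any reachable direction
        v = [x / norm for x in new_v]
    return v


def pca_2d(matrix):
    """Project cells (rows) onto their first two principal components."""
    rows = center(matrix)
    components = []
    for _ in range(2):
        v = top_component(rows)
        scores = [sum(x * w for x, w in zip(row, v)) for row in rows]
        components.append(scores)
        # Deflate: remove the variation already explained by this axis.
        rows = [
            [x - s * w for x, w in zip(row, v)]
            for row, s in zip(rows, scores)
        ]
    return [list(pair) for pair in zip(*components)]


# Four toy cells measured on three genes; cells 0-1 and 2-3 form two groups.
cells = [
    [1, 0, 5],
    [2, 0, 5],
    [9, 8, 5],
    [10, 8, 5],
]
embedding = pca_2d(cells)  # one (PC1, PC2) pair per cell
```

In this toy example the first coordinate separates the two groups of cells, while the second captures the small remaining variation within each group.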
Lesson 4: Cell Clustering and Differential Expression Analysis
We will cluster cells with similar gene expression profiles and then perform differential expression (DE) analysis to find genes differentially expressed between known groups of cells. We will then visualize DE genes with volcano plots and heatmaps.
Prerequisites: A little experience with R, Bioconductor, and data visualization (e.g., using the R package ggplot2).
Bio: Fanny Perraudeau is a Senior Data Scientist at Pendulum, where she manages, designs, and implements novel genomics algorithms and bioinformatics pipelines to further improve the analysis of Pendulum microbiome data. In addition, she runs statistical analyses to aid the company’s therapeutic discovery efforts. She holds a master's degree from Ecole Polytechnique, France, and a PhD in Biostatistics from the University of California, Berkeley, with a Designated Emphasis in Computational and Genomic Biology, under the supervision of Professor Sandrine Dudoit. Much of her work is motivated by the development and application of statistical methods and software for the analysis of biomedical and genomic data, especially metagenomics and single-cell RNA-Seq.