A Sequence of Our Videos for Beginners

Many have asked about a course on R programming and bioinformatics analysis. We both love the idea, but because the landscape of analysis tools and pipelines is ever-changing, we don't think such a course could ever be finished. Instead, we put together this list for anyone starting out in R programming and bioinformatics.

Hope it helps :)

Chapter 1: Basics in R Programming and Data Visualization

Running Basic Statistical Analysis

R is one of the most popular tools for statistical analysis, and it is also one of the few open-source tools available on the market. Unlike tools such as SPSS, R uses a command-line interface, which can be scary for people new to the platform. However, running a test in R is surprisingly easy, and most basic analyses can be completed with a single function.
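As a small illustration of the "single function" point, here is a minimal sketch of a two-sample t-test in base R. The data are simulated purely for this example and the variable names are our own assumptions; this is not necessarily the dataset or the test used in the video.

```r
# Simulated data for illustration only (not from the video).
set.seed(42)
control   <- rnorm(20, mean = 10, sd = 2)
treatment <- rnorm(20, mean = 12, sd = 2)

# A Welch two-sample t-test is a single function call.
result <- t.test(treatment, control)
print(result)

# The result is a list-like object, so individual values can be extracted.
p_value     <- result$p.value
group_means <- result$estimate
```

Other basic tests in base R follow the same pattern, for example `wilcox.test()` or `chisq.test()`.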

Link to Self Checking Questions: https://forms.gle/A3GP8A61JvUP1TJaA

Using ggplot2 in R for Data Visualization

ggplot2 is one of the most used tools for data visualization in R, if not the most used. It is quick and simple to get started with, but also incredibly powerful once you have mastered the syntax.

Chapter 2: RNA-Seq Analysis Pipeline and DEGs Analysis

RNA-Seq: From FASTQ to DEGs

This project set out to find a way to isolate DEGs (differentially expressed genes) from online database data using only open-source tools, without necessarily relying on an all-in-one platform like Galaxy. The R language and the packages used here are all open source, so they can be downloaded, stored on a local computer relatively easily, and recovered when needed, even if Bioconductor no longer supports downloading the datasets or libraries.

Gene Set Enrichment Analysis + Over Representation Analysis

Enrichment analysis is very common in omics studies. Did you know that, from the same differential expression analysis result, we can obtain two different types of enrichment results? This video guides you through how and why to perform them.
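To make the over-representation idea concrete, here is a minimal base R sketch. It asks whether a gene set contains more DEGs than expected by chance, using the hypergeometric distribution. All counts below are made up for illustration, and the video may well use a dedicated package instead. Gene Set Enrichment Analysis works differently (it uses a ranked list of all genes rather than a DEG cutoff) and is not shown here.

```r
# All numbers below are hypothetical, chosen only for illustration.
universe_size <- 20000  # genes tested in the experiment
set_size      <- 150    # genes annotated to one gene set
deg_count     <- 500    # genes called differentially expressed
overlap       <- 12     # DEGs that also belong to the gene set

# Over-representation p-value: P(X >= overlap) under the
# hypergeometric distribution.
p_ora <- phyper(overlap - 1,
                m = set_size,
                n = universe_size - set_size,
                k = deg_count,
                lower.tail = FALSE)

# The same question as a one-sided Fisher's exact test on a 2x2 table
# (rows: DEG / not DEG, columns: in set / not in set).
contingency <- matrix(c(overlap,
                        set_size - overlap,
                        deg_count - overlap,
                        universe_size - set_size - deg_count + overlap),
                      nrow = 2)
p_fisher <- fisher.test(contingency, alternative = "greater")$p.value

print(c(hypergeometric = p_ora, fisher = p_fisher))
```

In a real analysis this test is repeated for every gene set, so the p-values would then need a multiple-testing correction such as `p.adjust(..., method = "BH")`.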

Chapter 3: Single Cell RNA-Seq and Analysis

Single Cell RNA-Seq: Data Import, Clustering and Marker Identification

Single-cell RNA sequencing has been a powerful tool for understanding the interactions within a group of cells that are close together. In the past, dilution effects were also a problem in RNA-Seq experiments: a small group of cells expressing a different set of genes would get "diluted" by the surrounding cells, e.g. the guard cells in a leaf compared with the rest of the cell types.

In the video, I adapt a script developed by Satijalab for a first look at a single-cell RNA-seq experiment, including how the sample can be clustered into different cell types using specific marker genes and how to make great visualizations of the analysis results.

Single Cell RNA-Seq: Subsetting Objects

Subsetting makes the object smaller and faster to work with in scRNA-Seq. In many analyses, it is critical for identifying interesting patterns in the dataset.

Single Cell RNA-Seq: Comparison with Integration

We often need to combine datasets from two separate conditions with very different sequencing pipelines and outputs. They might also differ greatly in results and sensitivity.

Ultimately, the details of experimental design would fill another book, so assuming the experimental setup for the two samples is correct and comparable, how can you integrate the two datasets and compare gene expression between them?

Single Cell RNA-Seq: Case Study

I reproduced the single-cell RNA-seq results of a Nature Communications paper using the Seurat, fgsea, Monocle3, and Slingshot packages in R. This video is great for you if you're not sure how to make sense of or interpret the outputs of a typical scRNA-seq analysis.

Chapter 4: Concepts and Algorithms

FOR Loops and Efficient Coding

Looping is one of the most common things we do in computer science and data science. Whether we are plotting the same type of graph over many datasets or applying the same function to all the columns, underneath many complicated high-level functions we are always looping. Here, I share some tips and tricks I found online, along with some personal best practices from my years of coding.
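As a rough illustration of applying the same function over all the columns, here is a small base R sketch comparing three ways to compute column means. The toy data frame is our own, and these are common R idioms rather than necessarily the exact tips covered in the video.

```r
# Toy data frame for illustration only.
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))

# 1. Growing a vector inside the loop: works, but the vector is
#    copied on every iteration, which gets slow for large inputs.
means_grow <- c()
for (col in names(df)) {
  means_grow <- c(means_grow, mean(df[[col]]))
}

# 2. Pre-allocating the result and filling it by index.
means_prealloc <- numeric(ncol(df))
for (i in seq_along(df)) {
  means_prealloc[i] <- mean(df[[i]])
}

# 3. Letting vapply() do the looping; numeric(1) declares that each
#    call must return exactly one number.
means_vapply <- vapply(df, mean, numeric(1))

print(means_vapply)
```

All three give the same values. The `vapply()` version also keeps the column names and fails loudly if a column returns something unexpected.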

Introduction to WGCNA

A quick, bite-size intro to Weighted Gene Co-expression Network Analysis (WGCNA). It is a common and important method for the network analysis of genomic data.
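To give a feel for the core idea, here is a heavily simplified base R sketch: correlate genes across samples, raise the absolute correlations to a power to get a weighted network, then cluster genes into modules. The expression data are simulated, and the power of 6 and the fixed number of modules are assumptions chosen for this example. The actual WGCNA package does more than this (for instance, it chooses the power from the data and uses topological overlap), so treat this as a concept sketch only.

```r
# Simulated expression data for illustration only:
# 30 samples, 10 genes, where two hidden signals each drive 5 genes.
set.seed(1)
n_samples <- 30
signal1 <- rnorm(n_samples)
signal2 <- rnorm(n_samples)
expr <- cbind(
  sapply(1:5, function(i) signal1 + rnorm(n_samples, sd = 0.3)),
  sapply(1:5, function(i) signal2 + rnorm(n_samples, sd = 0.3))
)
colnames(expr) <- paste0("gene", 1:10)

# Weighted adjacency: absolute correlation raised to a power.
# The power (assumed to be 6 here) shrinks weak correlations
# toward zero while keeping strong ones.
beta <- 6
adjacency <- abs(cor(expr))^beta

# Cluster genes on 1 - adjacency and cut the tree into modules.
dissimilarity <- as.dist(1 - adjacency)
tree <- hclust(dissimilarity, method = "average")
modules <- cutree(tree, k = 2)  # k = 2 is an assumption for this toy data

print(table(modules))
```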