RNAseq analysis

fredhutch.io's materials for courses on RNAseq concepts and skills

Comparison of other RNAseq course materials

Fredhutch.io’s Galaxy materials

A quick start guide to RNA-sequencing analysis in Galaxy. Covers everything from importing data through differential gene expression analysis.

Scope

Outline

  1. FH Galaxy server login information
  2. Importing data to Galaxy
  3. Combining datasets in Galaxy
  4. Using UCSC to get a gene annotation
  5. Read mapping with TopHat
  6. Counting reads with htseq-count
  7. Differential gene expression analysis with DESeq2

Software

Gavin Ha’s lectures and R labs for TFCB

Lecture materials from the UW Tools For Computational Biology course. Covers Bioconductor packages for working with genomic data, inspecting and querying genomic data, and identifying and annotating genomic variants.

Scope

Outline

  1. Genomic data analysis
  2. Using GenomicRanges to store and query genomic data
  3. Finding the overlap between two genomic sequences
  4. Sequence data analysis
  5. Loading and querying BAM files using Rsamtools
  6. Computing pile up statistics
  7. Read Variant Call Format (VCF) Files
  8. Read and extract contents of VCF
  9. Reading variants from VCF
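
To make item 3 of this outline concrete, here is a minimal sketch of an overlap query between two genomic intervals. It is plain Python rather than the GenomicRanges API taught in the course. The `Interval` type, the `overlap` function, and the 1-based inclusive coordinate convention are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    """A genomic interval; coordinates are assumed 1-based and inclusive."""
    chrom: str
    start: int
    end: int


def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the region shared by a and b, or None if they do not overlap."""
    if a.chrom != b.chrom:
        return None  # intervals on different chromosomes never overlap
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    if start > end:
        return None  # the intervals are disjoint
    return Interval(a.chrom, start, end)


# Hypothetical example: a gene and a read on the same chromosome.
gene = Interval("chr1", 100, 200)
read = Interval("chr1", 150, 250)
print(overlap(gene, read))
```

Bioconductor's GenomicRanges runs the same kind of query across many ranges at once, which is why the course uses it instead of hand-written comparisons.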

Software

David Coffey’s RNAseq repository, for an authentic workflow

A series of shell and R scripts used to process RNA sequencing data

Scope

Outline

  1. Downloading raw fastq files from the NCBI sequence read archive (http://www.ncbi.nlm.nih.gov/sra) or generating your own sequencing files.
  2. Alignment to a reference genome. Unaligned reads may then be aligned to alternative genomes, such as a pathogen genome.
  3. Merging (for multilane samples) and processing
  4. The resulting BAM files can be run through a series of additional analyses, such as GATK variant detection and STAR fusion gene detection.
  5. Quality control analyses may also be performed on fastq files using FastQC and bam files using RNAseQC.

Software

Amy P’s repository with code and documentation for Pathways/SHIP, for materials translatable to high school students

Scope

Outline

Software

Alex’s Lemonade Stand RNAseq materials

A single module in a series from The Alex’s Lemonade Stand Foundation Childhood Cancer Data Lab

Scope

Outline

  1. Installing and setting up a Docker container
  2. Accessing data on flash drives
  3. Intro to R and intermediate R (Tidyverse)
  4. QC, trim, and quantification using Salmon
  5. Gene level summary using tximport
  6. RNA-seq EDA
  7. Differential gene expression analysis
  8. Normalizing count matrix
  9. Single cell - processing 10x raw data
  10. Single cell - dimensionality reduction
  11. Machine learning - data prep, clustering, PLIER
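
As a rough illustration of the gene-level summary step (item 5), the sketch below sums transcript-level counts into gene-level counts using a transcript-to-gene mapping. It is plain Python, not the tximport API. The function name, the identifiers, and the choice to skip unmapped transcripts are assumptions of the sketch.

```python
from collections import defaultdict


def summarize_to_gene(tx_counts, tx2gene):
    """Sum transcript-level counts into gene-level counts.

    tx_counts: dict mapping transcript ID -> estimated count
    tx2gene:   dict mapping transcript ID -> gene ID
    """
    gene_counts = defaultdict(float)
    for tx, count in tx_counts.items():
        gene = tx2gene.get(tx)
        if gene is None:
            continue  # assumption: transcripts without a gene mapping are skipped
        gene_counts[gene] += count
    return dict(gene_counts)


# Hypothetical identifiers, for illustration only.
tx_counts = {"txA1": 10.5, "txA2": 4.5, "txB1": 7.0, "txX": 3.0}
tx2gene = {"txA1": "geneA", "txA2": "geneA", "txB1": "geneB"}
print(summarize_to_gene(tx_counts, tx2gene))
```

The real tool also tracks transcript lengths and abundances, so this shows only the summing idea.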

Software

Cornell RNAseq course

RNA Seq analysis workshop course materials.

Scope

Outline

  1. Set up on the command line - create directory structure, download fastq
  2. QC raw reads with FastQC
  3. Alignment with STAR
  4. Interacting with BAM/SAM files using samtools
  5. Visual inspection with IGV
  6. Read in feature counts to R
  7. Use DESeq2 to normalize read counts for differences in seq depth and transform reads to the log2 scale.
  8. Differential gene analysis with DESeq2
  9. GO term enrichment
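
The normalization in item 7 can be sketched in a few lines. The code below is a simplified, pure-Python illustration of the median-of-ratios idea behind DESeq2's size factors, followed by a log2 transform. It is not the DESeq2 implementation, and the function names and the pseudocount of 1 are assumptions.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one size factor per sample with the median-of-ratios idea.

    counts: list of genes, each a list of raw counts (one per sample).
    Genes containing a zero count are skipped, so at least one gene must
    have nonzero counts in every sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c <= 0 for c in row):
            continue  # the log geometric mean is undefined for this gene
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


def normalized_log2(counts, pseudocount=1.0):
    """Divide counts by size factors, then take log2(x + pseudocount)."""
    factors = size_factors(counts)
    return [
        [math.log2(c / f + pseudocount) for c, f in zip(row, factors)]
        for row in counts
    ]


# Hypothetical data: the second sample was sequenced twice as deeply.
counts = [[10, 20], [100, 200], [5, 10]]
print(size_factors(counts))
print(normalized_log2(counts))
```

In this toy input the second sample's size factor is twice the first's, so each gene ends up with the same value in both samples after normalization.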

Software

Griffith Lab RNAseq course

An in depth course covering all aspects of RNA-seq analysis.

Scope

Outline

  1. Course set up (aws, unix, tool installation)
  2. Intro to RNA seq theory
  3. General goals/themes in RNA seq analysis workflow
  4. Intro to BAM/SAM formats
  5. Visualization of alignment in IGV
  6. BAM read counting
  7. Expression estimation for known genes and transcripts
  8. Differential Expression analysis
  9. Downstream interpretation of expression
  10. Alignment free estimation of expression with Kallisto/Sleuth
  11. Isoform discovery with StringTie
  12. Differential splicing analysis with Ballgown
  13. Examine and visualize junction counts
  14. De novo assembly with Trinity
  15. Transcript annotation with Trinotate
  16. ScRNAseq applications/advantages/challenges
  17. 10x/CellRanger overview
  18. Custom scRNAseq analysis in R

Software

Harvard

Harvard offers a series of RNAseq classes that use various approaches and infrastructure. The synopsis here includes:

Scope

Outline

Overview:

HPC:

Other materials:

Software

Overview: none

HPC:

nf-core

Scope

Nextflow pipeline

Outline and software

This was copied and pasted from the outline:

  1. Download FastQ files via SRA, ENA or GEO ids and auto-create input samplesheet (ENA FTP; if required)
  2. Merge re-sequenced FastQ files (cat)
  3. Read QC (FastQC)
  4. UMI extraction (UMI-tools)
  5. Adapter and quality trimming (Trim Galore!)
  6. Removal of ribosomal RNA (SortMeRNA)
  7. Choice of multiple alignment and quantification routes:
    • STAR -> Salmon
    • STAR -> RSEM
    • HISAT2 -> NO QUANTIFICATION
  8. Sort and index alignments (SAMtools)
  9. UMI-based deduplication (UMI-tools)
  10. Duplicate read marking (picard MarkDuplicates)
  11. Transcript assembly and quantification (StringTie)
  12. Create bigWig coverage files (BEDTools, bedGraphToBigWig)
  13. Extensive quality control:
    • RSeQC
    • Qualimap
    • dupRadar
    • Preseq
    • DESeq2
  14. Pseudo-alignment and quantification (Salmon; optional)
  15. Present QC for raw read, alignment, gene biotype, sample similarity, and strand-specificity checks (MultiQC, R)

RNAseq 123

Scope

DGE with Bioconductor

Outline