Whole Genome Sequencing Data Analysis

Introduction

Next-generation sequencing is widely used for whole genome resequencing projects. Once a reference sequence is available for an organism, you can use next-generation sequencing for comparative sequencing or resequencing to characterize genetic variation among individuals of the same species or between related species.

Whole genome resequencing can detect SNPs (single nucleotide polymorphisms), small indels (a few base pairs), large indels (a few kilobase pairs to a few hundred kilobase pairs), gene fusions, and other structural variations (e.g., duplications, inversions).
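As a rough illustration of how these variant classes differ, the minimal Python sketch below labels a variant from its reference and alternate alleles, written in the left-anchored style used by VCF files. The function name classify_variant and the 50 bp boundary between "small" and "large" indels are assumptions made for this example only; the appropriate cutoff depends on the project and the calling tools used.

```python
def classify_variant(ref: str, alt: str, small_indel_max_bp: int = 50) -> str:
    """Label a variant by comparing reference and alternate allele lengths.

    small_indel_max_bp is an illustrative cutoff (an assumption), not a
    fixed standard.
    """
    if ref == alt:
        raise ValueError("reference and alternate alleles are identical")
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) == len(alt):
        # Same length but more than one base: a multi-nucleotide substitution
        return "MNP"
    size = abs(len(alt) - len(ref))
    kind = "insertion" if len(alt) > len(ref) else "deletion"
    scale = "small" if size <= small_indel_max_bp else "large"
    return f"{scale} {kind}"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A")]:
        print(ref, alt, classify_variant(ref, alt))
```

Gene fusions, duplications, and inversions cannot be recognized from allele lengths alone; they are normally detected from read-pair orientation, split reads, or read depth, which this sketch does not attempt.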

If you are interested in only certain regions, or certain types of regions, of a genome, please refer to Targeted ReSeq/Exome.

Workflow

The following analysis steps are common in Whole Genome Resequencing projects. One of our expert bioinformaticians will work closely with you to identify the custom analysis workflow most appropriate for your project.

1) Experiment design consultation
2) Data QC and clean up
3) Alignment to a reference with mapping statistics
4) Local realignment
5) SNP and small indel calling
6) SNP/small indel characterization
7) SNP-based genotyping
8) Association study and linkage disequilibrium analysis
9) Structural variation detection and characterization
10) Written project report with analysis methods, publication-ready graphics, and references

Turn-around Time

Upon data receipt, we usually finish a typical Whole Genome Resequencing analysis project in 3-5 days. The actual turn-around time, however, depends heavily on the number of samples, the amount of data, and project complexity.

Publications

The publications below are representative research or review papers that will help you understand how Whole Genome Resequencing is employed in biomedical research.

  • Hillier, L. et al. (2008) Whole-genome sequencing and variant discovery in C. elegans. Nature Methods 5: 183–188.

FAQ

What kind of reads should I use for my Whole Genome Resequencing experiment?
We recommend paired-end reads of 50 bp or longer for Whole Genome Resequencing projects so that large indels and other structural variations can be reliably detected. Certain projects may use single-end reads as an economical alternative if the main objective is to detect SNPs and small indels.

How much coverage do I need for my Whole Genome Resequencing experiment?
In general, 30-50x coverage is needed for a Whole Genome Resequencing experiment. For certain biomarker discovery projects, 5-10x coverage may be sufficient. For tumor-related Whole Genome Resequencing projects, 100x or more is needed. We encourage you to contact one of our expert bioinformaticians to discuss the optimal coverage for your Whole Genome Resequencing project.
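To translate a coverage target into a sequencing order, a common back-of-the-envelope estimate is coverage = (number of reads x read length) / genome size. The Python sketch below rearranges that formula; the function names and the 100 Mb genome in the example are hypothetical, and the estimate ignores duplicates, unmapped reads, and uneven coverage, so real projects usually need a margin on top of it.

```python
import math


def reads_needed(genome_size_bp: int, target_coverage: float, read_length_bp: int) -> int:
    """Estimate the number of reads required to reach a target mean coverage.

    Rearranges: coverage = (reads * read_length) / genome_size.
    """
    total_bases = genome_size_bp * target_coverage
    return math.ceil(total_bases / read_length_bp)


def read_pairs_needed(genome_size_bp: int, target_coverage: float, read_length_bp: int) -> int:
    """Same estimate for paired-end sequencing, where each fragment yields two reads."""
    return math.ceil(reads_needed(genome_size_bp, target_coverage, read_length_bp) / 2)


if __name__ == "__main__":
    # Hypothetical 100 Mb genome, 30x coverage, 100 bp reads
    print(reads_needed(100_000_000, 30, 100))
    print(read_pairs_needed(100_000_000, 30, 100))
```

For the hypothetical inputs shown, the estimate is 30 million reads, or 15 million read pairs for paired-end sequencing.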