PacBio Sequencing
Overview
PacBio (Pacific Biosciences) uses SMRT sequencing (Single Molecule Real-Time) to produce long reads - often 10,000 to 25,000+ base pairs.
How It Works
The Setup: ZMW (Zero-Mode Waveguide)
PacBio uses tiny wells called ZMWs - holes narrower than the wavelength of the laser light, so only the very bottom of each well is illuminated.
At the bottom of each well:
- A single DNA polymerase is fixed in place
- A single DNA template is threaded through it
The Chemistry: Real-Time Detection
- Fluorescent nucleotides (A, T, G, C - each with a different color) float in solution
- When the polymerase grabs the correct nucleotide, it holds it in the detection zone
- The laser excites the fluorescent tag and a detector records its color - we see which base is being added
- The polymerase incorporates the nucleotide and releases the fluorescent tag
- Repeat - watching DNA synthesis in real-time
Key difference from Illumina: We watch a single polymerase molecule working continuously, rather than clusters of identical molecules extended one base at a time in sync.
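The detection loop above can be sketched as a toy base caller. This is only an illustration, not PacBio's actual signal processing: the dye colors, the `min_duration_ms` threshold, and the function name `call_bases` are all hypothetical.

```python
# Toy base caller: turn a stream of fluorescence pulses into bases.
# The color assignments and the duration threshold are made-up values
# for illustration only.
COLOR_TO_BASE = {"red": "A", "green": "C", "yellow": "G", "blue": "T"}


def call_bases(pulses, min_duration_ms=10.0):
    """Return the bases for pulses long enough to count as incorporations.

    pulses: iterable of (color, duration_ms) tuples in time order.
    A nucleotide held by the polymerase glows for longer than one that
    merely diffuses through the detection zone, so short pulses are
    treated as background and skipped.
    """
    called = []
    for color, duration_ms in pulses:
        if duration_ms >= min_duration_ms:
            called.append(COLOR_TO_BASE[color])
    return "".join(called)


pulses = [("red", 45.0), ("blue", 2.0), ("green", 60.0), ("yellow", 38.0)]
print(call_bases(pulses))  # ACG
```

The short blue pulse is dropped, which mirrors the idea that only a nucleotide held in the detection zone produces a usable signal.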
Why Long Reads?
The circular template trick:
PacBio uses SMRTbell templates - double-stranded DNA with hairpin adapters ligated to both ends, so the two strands form one closed circle.
    ╭──────────────╮
    │              │
────┤   Template   ├────
    │              │
    ╰──────────────╯
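The polymerase goes around and around, reading the same template multiple times. The read length itself comes from the polymerase, which can keep synthesizing for tens of kilobases without falling off; the circle lets that one long read pass over the same insert repeatedly.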
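A minimal sketch of what one long polymerase read around a SMRTbell looks like, and how it splits into per-pass subreads. The `ADAPTER` string is a placeholder rather than a real adapter sequence, and the function names are hypothetical; real software finds adapters by alignment, not exact string matching.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
ADAPTER = "NNNNNN"  # placeholder marker, NOT a real hairpin adapter sequence


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def polymerase_read(insert, laps):
    """Simulate an error-free read of `laps` full trips around the circle.

    One lap covers the forward strand, a hairpin adapter, the reverse
    strand, and the other hairpin adapter.
    """
    lap = insert + ADAPTER + reverse_complement(insert) + ADAPTER
    return lap * laps


def split_subreads(read):
    """Cut the long read at adapters; each piece is one pass over the insert."""
    return [piece for piece in read.split(ADAPTER) if piece]


insert = "ATGCCCAAA"
subreads = split_subreads(polymerase_read(insert, laps=2))
print(subreads)
# ['ATGCCCAAA', 'TTTGGGCAT', 'ATGCCCAAA', 'TTTGGGCAT']

# Every second pass reads the opposite strand, so flip it before comparing.
oriented = [s if i % 2 == 0 else reverse_complement(s)
            for i, s in enumerate(subreads)]
print(all(s == insert for s in oriented))  # True
```

Two laps give four passes over the insert, alternating between the two strands.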
Error Correction: Why High Accuracy?
Raw reads have a ~10-15% error rate (mostly insertions/deletions).
But because the polymerase circles the template multiple times, we get multiple reads of the same sequence.
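A rough model of why extra passes help, under strong simplifying assumptions: errors are independent between passes, every position is decided by a simple majority vote, and a position is only called wrong when more than half of the passes are wrong there. The 13% per-pass error is just a value picked from the range above, and real consensus callers are more sophisticated than a majority vote, so treat the numbers as an intuition aid only.

```python
import math


def majority_error(n_passes, per_pass_error):
    """Probability that more than half of n_passes are wrong at one position.

    Assumes independent errors with the same probability in every pass
    (a simplification). Use an odd n_passes to avoid tied votes.
    """
    first_losing = n_passes // 2 + 1
    return sum(
        math.comb(n_passes, k)
        * per_pass_error ** k
        * (1 - per_pass_error) ** (n_passes - k)
        for k in range(first_losing, n_passes + 1)
    )


def phred(error_probability):
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(error_probability)


for n in (1, 3, 5, 9, 15):
    err = majority_error(n, 0.13)
    print(f"{n:>2} passes: error ~ {err:.2e} (Q{phred(err):.0f})")
```

On the Phred scale, 99.9% accuracy corresponds to an error probability of 0.001, that is Q30.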
CCS (Circular Consensus Sequencing):
- Align all passes of the same template
- Errors are mostly random, so they rarely land on the same position in several passes and get outvoted
- Result: HiFi reads, typically >99.9% accuracy
Pass 1:    ATGC-CCAAA
Pass 2:    ATGCCC-AAA
Pass 3:    ATGCCCAAAA
Pass 4:    ATGCCC-AAA
           ──────────
Consensus: ATGCCCAAA ✓
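The vote in the example above can be reproduced with a column-by-column majority call. This is a minimal sketch that assumes the passes are already aligned to the same length with `-` for gaps; the function name `column_consensus` is hypothetical, and actual CCS software does the alignment itself and uses a statistical model rather than a plain vote.

```python
from collections import Counter


def column_consensus(aligned_passes):
    """Majority-vote consensus over pre-aligned, equal-length passes."""
    length = len(aligned_passes[0])
    if any(len(p) != length for p in aligned_passes):
        raise ValueError("all aligned passes must have the same length")

    calls = []
    for column in zip(*aligned_passes):
        # Pick the most frequent symbol in this column (a base or a gap).
        symbol, _count = Counter(column).most_common(1)[0]
        calls.append(symbol)

    # Gap calls mean "no base here", so drop them from the final sequence.
    return "".join(calls).replace("-", "")


passes = [
    "ATGC-CCAAA",
    "ATGCCC-AAA",
    "ATGCCCAAAA",
    "ATGCCC-AAA",
]
print(column_consensus(passes))  # ATGCCCAAA
```

The extra A in pass 3 is outvoted by gaps in that column, so it does not reach the consensus.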
When to Use PacBio
Ideal for:
- De novo genome assembly
- Resolving repetitive regions
- Detecting structural variants
- Full-length transcript sequencing
- Phasing haplotypes
Not ideal for:
- Large-scale population studies (cost)
- When short reads are sufficient