Nanopore Sequencing

Overview

Oxford Nanopore uses tiny protein pores embedded in a membrane to read DNA directly - no amplification, no fluorescence.


How It Works

The Setup: Membrane with Nanopores

A membrane separates two chambers of salt solution, and a voltage is applied across it. Embedded in the membrane are protein nanopores - tiny holes just big enough for single-stranded DNA to pass through. The voltage drives a steady current of ions through each open pore.

     Voltage applied across membrane
              ─────────────
                   ↓
    ════════════╤═════╤════════════  ← Membrane
                │ ◯ ◯ │              ← Nanopores
    ════════════╧═════╧════════════
                   ↑
              DNA threads through

The Detection: Measuring Current

  1. DNA strand is fed through the pore by a motor protein, which also controls the speed
  2. As the strand passes through, the bases inside the pore partially block the ion current
  3. Each base (A, T, G, C) has a different size/shape
  4. Different bases therefore block the current by different amounts
  5. We measure the changes in current over time to identify the bases

Key insight: No labels, no cameras, no lasers - just electrical signals!
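
To make the idea concrete, here is a minimal Python sketch of an idealized trace. It assumes one fixed current level per base and a constant number of samples per base. The level values and the names TOY_LEVELS_PA and simulate_trace are hypothetical, chosen only for illustration; they are not measured pore characteristics.

```python
import random

# Hypothetical per-base current levels in picoamps (illustrative only).
TOY_LEVELS_PA = {"A": 80.0, "C": 95.0, "G": 110.0, "T": 125.0}

def simulate_trace(sequence, samples_per_base=5, noise_sd=0.0, seed=0):
    """Return a list of current samples for a DNA sequence.

    Each base contributes `samples_per_base` samples at its toy level,
    plus optional Gaussian noise with standard deviation `noise_sd`.
    """
    rng = random.Random(seed)
    trace = []
    for base in sequence:
        level = TOY_LEVELS_PA[base]
        for _ in range(samples_per_base):
            trace.append(level + rng.gauss(0.0, noise_sd))
    return trace

# Noise-free trace for a short sequence: three samples per base.
trace = simulate_trace("AATG", samples_per_base=3)
print(trace)
```

With noise_sd left at zero the trace is a clean staircase, one step per base. Real traces are not this clean.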


The Signal: It's Noisy

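The raw signal is messy: several bases sit in the pore at once, so each current level reflects a short run of bases rather than a single one, and random fluctuations are layered on top: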

Current
   │
   │ ▄▄▄   ▄▄    ▄▄▄▄   ▄▄   ▄▄▄
   │█   █▄█  █▄▄█    █▄█  █▄█   █▄▄
   │
   └───────────────────────────────── Time

   Base: A  A  T  G   C  C  G  A

Machine learning (neural networks) decodes this noisy signal into base calls, a step known as basecalling.
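
As a rough illustration of what decoding means, the sketch below uses a far simpler stand-in for a neural network: it averages the signal over fixed-width windows and picks the nearest reference level. It assumes one base per window, a known window width, and hypothetical current levels (TOY_LEVELS_PA). None of these assumptions hold for real data, which is exactly why real basecallers use neural networks.

```python
import random
from statistics import mean

# Hypothetical per-base current levels in picoamps (illustrative only).
TOY_LEVELS_PA = {"A": 80.0, "C": 95.0, "G": 110.0, "T": 125.0}

def noisy_trace(sequence, samples_per_base, noise_sd, seed):
    """Simulate a trace: one toy level per base plus Gaussian noise."""
    rng = random.Random(seed)
    return [
        TOY_LEVELS_PA[base] + rng.gauss(0.0, noise_sd)
        for base in sequence
        for _ in range(samples_per_base)
    ]

def toy_basecall(trace, samples_per_base):
    """Average each fixed-width window, then pick the nearest toy level."""
    calls = []
    for start in range(0, len(trace), samples_per_base):
        window_mean = mean(trace[start:start + samples_per_base])
        nearest = min(
            TOY_LEVELS_PA,
            key=lambda base: abs(TOY_LEVELS_PA[base] - window_mean),
        )
        calls.append(nearest)
    return "".join(calls)

trace = noisy_trace("AATGCCGA", samples_per_base=20, noise_sd=2.0, seed=1)
print(toy_basecall(trace, samples_per_base=20))
```

Averaging works here only because the toy noise is small compared with the gap between levels and the window boundaries are known in advance. In a real trace the DNA moves at a varying speed and neighbouring bases shape each level, so the boundaries and the levels both have to be inferred.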


Why Nanopore?

Ultra-Long Reads

  • Typical: 10-50 kb
  • Record: >4 Mb (yes, megabases!)
  • Limited only by DNA fragment length, not the technology

Cheap and Portable

  • MinION device fits in your hand, costs ~$1000
  • Can sequence in the field (disease outbreaks, remote locations)
  • Real-time data - see results as sequencing happens

Direct Detection

  • Can detect modified bases (methylation) directly
  • No PCR amplification needed
  • Can sequence RNA directly (no cDNA conversion)

Error Rate and Correction

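Raw (single-read) accuracy: ~93-97%, depending on chemistry and basecaller version (improving with each update)

Error type: Mostly indels (insertions and deletions), especially in homopolymers (runs of the same base, such as AAAAAA)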
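
Accuracy figures like these are often quoted as Phred quality scores, where Q = -10 * log10(error probability). The short Python sketch below does the conversion. The function name accuracy_to_phred is just an illustrative choice, and the 99% row is included only for comparison.

```python
import math

def accuracy_to_phred(accuracy):
    """Convert per-base accuracy (0-1) to a Phred score: Q = -10*log10(error)."""
    error_probability = 1.0 - accuracy
    return -10.0 * math.log10(error_probability)

for acc in (0.93, 0.97, 0.99):
    print(f"{acc:.0%} accuracy is about Q{accuracy_to_phred(acc):.1f}")
```

So 93% accuracy is roughly Q11.5, 97% is roughly Q15.2, and each additional 10 points of Q means ten times fewer errors.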

Improving Accuracy

1. Higher coverage: Multiple reads cover the same region, so random errors are outvoted when the reads are combined into a consensus

2. Duplex sequencing: DNA is double-stranded - sequence both strands and combine:

Forward strand:  ATGCCCAAA
                 |||||||||
Reverse strand:  TACGGGTTT  (complement)

→ Consensus: Higher accuracy

3. Better basecallers: Neural networks keep improving, so accuracy increases with software updates
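
The first two ideas can be sketched in a few lines of Python. This is a toy under strong assumptions: the reads are already aligned, all the same length, and contain only substitution errors. Real nanopore errors are mostly indels, so real pipelines need an alignment step first, and real duplex basecalling is done by dedicated models rather than a simple vote. The function names and example reads are made up for illustration.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the opposite strand, read in its own 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def majority_consensus(aligned_reads):
    """Column-wise majority vote over equal-length, pre-aligned reads."""
    consensus = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)

# Five reads of the same region; two contain a single substitution error.
reads = ["ATGCCCAAA", "ATGCCGAAA", "ATGCCCAAA", "ATGACCAAA", "ATGCCCAAA"]
print(majority_consensus(reads))  # ATGCCCAAA

# A reverse-strand read is flipped back before being combined with the forward read.
print(reverse_complement("TTTGGGCAT"))  # ATGCCCAAA
```

Each error appears in only one read, so the other four outvote it at that position. Errors that recur at the same position in most reads, as homopolymer errors tend to, are not removed by voting alone.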