NGS Technologies I & II — Exam Practice

📝NGS Technologies I & II — MCQ + Open Questions
Q1 Easy
Sanger dideoxy sequencing is classified as which generation of sequencing?
A. Zeroth generation sequencing
B. First generation sequencing
C. Second generation sequencing
D. Third generation sequencing
Explanation
Sanger dideoxy sequencing is explicitly classified as "first generation sequencing." NGS platforms (Illumina, Ion Torrent, 454, SOLiD) are second generation, while long-read technologies (PacBio, Nanopore) are often called third generation.
Q2 Medium
Which of the following factors is NOT listed as a key consideration when choosing a sequencing technology?
A. Error rate
B. Turnaround time
C. Number of fluorescent labels used
D. Data output
Explanation
The three key factors for choosing a sequencing technology are: error rate, turnaround time, and data output. The number of fluorescent labels is a platform-specific technical detail, not a primary decision factor.
Q3 Medium
Moore's Law states that the number of transistors on a chip doubles approximately every:
A. 6 months
B. 12 months
C. 36 months
D. 24 months
Explanation
Moore's Law states that the number of transistors on a chip doubles every 24 months (2 years). Some definitions use 18 months instead. Importantly, NGS cost reduction has outpaced even Moore's Law.
Q4 Medium
Compared to the pre-NGS era, how has the distribution of cost and effort in genomic experiments changed?
A. Cost has shifted from data production to planning and data analysis
B. Cost has shifted from data analysis to data production
C. Cost remains equally distributed among all three phases
D. Planning is no longer necessary with NGS technologies
Explanation
Before NGS, data production was the most expensive and time-consuming phase. With NGS, data production became cheap and fast, so the major costs and effort shifted to experimental planning (more crucial than ever due to the volume of data possible) and data analysis (requires more time and expertise to handle massive datasets).
Q5 Tricky
The decrease in sequencing cost over the past two decades has:
A. Closely followed Moore's Law predictions
B. Outpaced Moore's Law
C. Been slower than Moore's Law
D. Followed a linear rather than exponential trend
Explanation
The decrease in sequencing costs has outpaced Moore's Law. Moore's Law predicts that transistor counts (and, loosely, computing power) double roughly every 2 years; NGS technology has advanced even faster, so data generation in biology has grown beyond the pace seen in computer science. The cost of sequencing a human genome dropped from ~$100 million to under $1,000.
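As a rough arithmetic check of this claim, the sketch below compares the fold change a Moore's Law pace would give with the fold change implied by the genome cost figures quoted above. The 20-year span is an assumption taken from the question's "past two decades", and the function name is illustrative.

```python
import math

def moore_fold_change(years, doubling_period_years=2.0):
    """Fold improvement if a quantity doubles every doubling_period_years."""
    return 2 ** (years / doubling_period_years)

# Assumption: a 20-year span ("past two decades" in the question).
moore_fold = moore_fold_change(20)

# Cost figures quoted in the explanation: ~$100 million down to ~$1,000.
sequencing_fold = 100_000_000 / 1_000

# Express both fold changes as a number of halvings/doublings.
moore_halvings = math.log2(moore_fold)
sequencing_halvings = math.log2(sequencing_fold)

print(f"Moore's Law pace: {moore_fold:.0f}-fold ({moore_halvings:.1f} doublings)")
print(f"Sequencing cost:  {sequencing_fold:.0f}-fold ({sequencing_halvings:.1f} halvings)")
```

Under these assumptions a Moore's Law pace gives about a thousand-fold change, while the quoted cost drop is a hundred-thousand-fold.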
Q6 Easy
What is described as the main cause of library preparation failure?
A. Incorrect adapter sequences
B. Contamination with RNA
C. Inaccurate quantification of starting DNA
D. Use of degraded DNA polymerase
Explanation
The lecture explicitly states: "Inaccurate quantification is the main cause of library preparation failure." Accurate quantification of starting DNA is critical to ensure proper library preparation for NGS.
Q7 Easy
The Ion S5 sequencer detects nucleotide incorporation by measuring:
A. Fluorescent light emission
B. pH changes caused by H⁺ ion release
C. Changes in electrical current through a nanopore
D. Bioluminescent signal from luciferase
Explanation
Ion S5 uses semiconductor sequencing. When a nucleotide is incorporated by DNA polymerase, hydrogen ions (H⁺) are released, causing a pH drop in the well. An ion-sensitive layer detects this change and converts it into a voltage signal. This is fundamentally different from optical (fluorescence/bioluminescence) detection used by Illumina or 454.
Q8 Medium
What is the major source of error in Ion Torrent sequencing?
A. Fluorescent label cross-talk between channels
B. Bridge amplification artifacts
C. Ligation probe mismatches
D. Homopolymer regions where signal intensity does not scale linearly
Explanation
Ion Torrent's major error source is homopolymer regions (stretches of repeated identical bases like AAAA or TTT). The signal strength correlates with the number of bases incorporated, but doesn't scale linearly. For example, distinguishing between 3 and 4 consecutive T's can be ambiguous because the peak height falls between expected values. Fluorescent cross-talk applies to Illumina, bridge amplification is Illumina's method, and ligation probes are used by SOLiD.
Q9 Medium
In Ion Torrent sequencing, polyclonal reads occur when:
A. More than one DNA species occupies the same well
B. The same DNA fragment is sequenced multiple times
C. A polymerase stalls during nucleotide incorporation
D. Two primers bind to the same template simultaneously
Explanation
Polyclonal reads occur when more than one DNA species occupies a single well. This produces mixed/overlapping signals, making base-calling unreliable. In ionograms, polyclonal reads show very few or no empty spaces because bases are incorporated continuously from different templates. These reads are filtered out during quality control.
Q10 Medium
Which clonal amplification method is used by the Ion S5 platform?
A. Bridge amplification on a flow cell
B. Rolling circle amplification
C. Emulsion PCR on beads
D. Isothermal amplification in nanowells
Explanation
Ion S5 uses emulsion PCR (emPCR). DNA fragments are attached to beads and encapsulated in oil-water droplets. Each droplet ideally contains a single DNA fragment, primers, nucleotides, and polymerase. The fragments are amplified to create clonal populations on the beads, which are then loaded into chip wells. Bridge amplification is the method used by Illumina.
Q11 Tricky
A key advantage of the Ion S5 over Illumina is that Ion S5:
A. Has higher accuracy in homopolymer regions
B. Uses natural, unmodified nucleotides, reducing chemical costs
C. Produces longer reads than any other platform
D. Can detect base modifications directly during sequencing
Explanation
Ion S5 uses natural, unmodified nucleotides (no fluorescent labels or terminators), which minimizes chemical costs and simplifies chemistry. Illumina uses chemically modified nucleotides with fluorescent labels and reversible terminators, which are more expensive. Ion Torrent actually has LOWER accuracy in homopolymer regions than Illumina, and long-read platforms (PacBio, Nanopore) produce much longer reads.
Q12 Tricky
In an ionogram, an empty space between peaks indicates:
A. A polyclonal well producing mixed signals
B. A homopolymer region was encountered
C. The sequencing quality dropped below the threshold
D. The added nucleotide did not match the template during that flow
Explanation
In an ionogram, empty spaces (gaps) mean that during that nucleotide flow, no bases were incorporated — the added nucleotide didn't match the template. Polyclonal reads actually show the opposite: very few or no empty spaces because bases from mixed templates are incorporated almost continuously.
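A toy model of an ideal ionogram can make the gaps and homopolymer peaks concrete. This is only a sketch: it assumes a simple repeating T, A, C, G flow order (real instruments may use a different one), an error-free clonal well, and an illustrative function name and example read.

```python
def ideal_ionogram(read, flow_order="TACG", max_flows=400):
    """Return (flowed base, bases incorporated) for each flow of an error-free read.

    read: the sequence being synthesized, 5' to 3'.
    flow_order: assumed repeating order in which nucleotides are flowed.
    """
    signals = []
    pos = 0
    flow = 0
    while pos < len(read) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        count = 0
        # Without a terminator, every consecutive matching base is
        # incorporated in the same flow (homopolymer -> taller peak).
        while pos < len(read) and read[pos] == base:
            count += 1
            pos += 1
        signals.append((base, count))  # count == 0 is an empty space
        flow += 1
    return signals

print(ideal_ionogram("TTTAG"))
```

For the read TTTAG the C flow incorporates nothing, which is the empty space in the ionogram, while the T flow incorporates three bases at once, which is the homopolymer peak whose height must be estimated from signal intensity.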
Q13 Medium
During Ion Torrent library preparation, barcodes are used to:
A. Tag each sample with a unique sequence for multiplexed sequencing and later demultiplexing
B. Increase the length of the sequencing reads
C. Detect homopolymer errors during data analysis
D. Provide primer binding sites for emulsion PCR
Explanation
Barcodes are unique DNA sequences that tag each sample, allowing multiple samples to be pooled in a single sequencing run (multiplexing). After sequencing, reads are sorted back to their original sample based on the barcode (demultiplexing). This is separate from adapter sequences that provide primer binding sites.
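Demultiplexing can be sketched in a few lines. The example below is a simplification, assuming the barcode sits at the very start of each read and must match exactly; the barcode sequences, sample names, and function name are hypothetical.

```python
from collections import defaultdict

def demultiplex(reads, barcodes):
    """Group reads by the barcode found at the start of each read.

    reads: iterable of sequence strings.
    barcodes: dict mapping barcode sequence -> sample name (hypothetical).
    Reads whose prefix matches no barcode are collected under 'unassigned'.
    """
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                # Strip the barcode so only the insert sequence remains.
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            by_sample["unassigned"].append(read)
    return dict(by_sample)

barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
reads = ["ACGTGGGTTT", "TGCAAACCC", "GGGGAAAA"]
result = demultiplex(reads, barcodes)
print(result)
```

Production demultiplexers typically also tolerate a small number of barcode mismatches, which this sketch does not attempt.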
Q14 Easy
In Ion Torrent sequencing, reads shorter than how many bases are automatically filtered out?
A. 10 bases
B. 15 bases
C. 25 bases
D. 50 bases
Explanation
Reads smaller than 25 bases are filtered automatically because they are too short to be aligned reliably to a reference genome. This threshold may need adjustment for specific applications like miRNA sequencing, where the target molecules are naturally very short.
Q15 Easy
The Roche 454 sequencing platform uses which detection method?
A. Pyrosequencing: light produced by a luciferase-catalyzed reaction
B. Semiconductor detection of pH changes
C. Fluorescent reversible terminator chemistry
D. Sequencing by ligation with fluorescent probes
Explanation
Roche 454 uses pyrosequencing. When a nucleotide is incorporated, pyrophosphate (PPi) is released. ATP sulfurylase converts PPi to ATP, which luciferase then uses to produce light. A CCD camera detects this light and integrates it as a peak in a pyrogram. This is an optical (light-based) detection method, unlike Ion Torrent's electronic detection.
Q16 Hard
In 454 pyrosequencing, which enzyme is responsible for degrading unincorporated nucleotides and excess ATP between cycles?
A. Luciferase
B. DNA polymerase
C. ATP sulfurylase
D. Apyrase
Explanation
Apyrase is the nucleotide-degrading enzyme that continuously degrades excess ATP and unincorporated dNTPs. The four enzymes in pyrosequencing are: DNA polymerase (incorporates nucleotides), ATP sulfurylase (converts PPi to ATP), luciferase (produces light from ATP), and apyrase (degrades excess ATP and dNTPs before the next cycle).
Q17 Hard
In Roche 454 template preparation, the ratio of DNA fragments to agarose beads during emulsion PCR is approximately:
A. 1:10
B. 1:1
C. 10:1
D. 100:1
Explanation
In 454 emulsion PCR, DNA fragments and agarose beads (with complementary oligonucleotides) are mixed in an approximately 1:1 ratio. This ensures that most beads ideally capture a single DNA fragment. The mixture is then encapsulated by vigorous vortexing into aqueous micelles surrounded by oil for PCR amplification, resulting in beads decorated with ~1 million copies of the original fragment.
Q18 Medium
Approximately how many copies of the original DNA fragment are generated on each bead in 454 emulsion PCR?
A. ~100
B. ~10,000
C. ~1 million
D. ~1 billion
Explanation
Each bead is decorated with approximately 1 million copies of the original single-stranded fragment. This high copy number is necessary to provide sufficient signal strength during the pyrosequencing reaction to detect and record nucleotide incorporation events.
Q19 Easy
ABI SOLiD uses which sequencing approach?
A. Sequencing by synthesis using fluorescent reversible terminators
B. Sequencing by ligation using fluorescently labeled di-base probes
C. Pyrosequencing with bioluminescent detection
D. Semiconductor detection of hydrogen ions
Explanation
ABI SOLiD (Sequencing by Oligonucleotide Ligation and Detection) uses a sequencing-by-ligation approach. A set of four fluorescently labeled di-base probes compete for ligation to the sequencing primer. This is fundamentally different from polymerase-based methods (Illumina, Ion Torrent, 454).
Q20 Hard
The SOLiD system achieves high accuracy through its two-base encoding system. How many rounds of primer reset are performed for each sequence tag?
A. Two
B. Three
C. Four
D. Five
Explanation
Five rounds of primer reset are completed for each sequence tag. Through this process, virtually every base is interrogated in two independent ligation reactions by two different primers. This dual interrogation is fundamental to the high accuracy (up to 99.99% with Exact Call Chemistry) of the SOLiD system.
Q21 Medium
What is a significant limitation of the ABI SOLiD platform?
A. Short read lengths (35–50 bp) and complex color-space data analysis
B. High homopolymer error rates
C. Very low throughput
D. Inability to perform paired-end sequencing
Explanation
SOLiD's reads are relatively short (typically 35–50 bp), which limits genome assembly and analysis of repetitive regions. Additionally, the output is in color-space encoding that must be converted to nucleotide sequences, adding complexity to data analysis. SOLiD actually has very high throughput and can do paired-end sequencing. Homopolymer errors are characteristic of Ion Torrent and 454, not SOLiD.
Q22 Tricky
The SOLiD Exact Call Chemistry (ECC) module can achieve up to 99.99% accuracy when used:
A. Without any reference, in base-space mode
B. Only with paired-end reads
C. In combination with a reference genome
D. Only for reads shorter than 25 bp
Explanation
The Exact Call Chemistry (ECC) module achieves up to 99.99% accuracy when used in combination with a reference genome. Without a reference, the ECC module can still output data in base space (rather than color space), but it does not reach the same level of accuracy.
Q23 Easy
Illumina sequencing uses which cluster generation method?
A. Emulsion PCR on beads
B. Bridge amplification on a flow cell
C. Rolling circle amplification
D. Isothermal strand displacement
Explanation
Illumina uses bridge amplification on a flow cell. Single-stranded DNA molecules bind to complementary oligos on the flow cell surface, fold over to form bridges with adjacent primers, and are amplified to form clonal clusters. Emulsion PCR is used by Ion Torrent, 454, and SOLiD.
Q24 Medium
An Illumina flow cell is best described as:
A. A silicon chip with millions of microwells
B. A 96-well microtiter plate for PCR
C. A membrane with embedded protein nanopores
D. A thick glass slide with channels/lanes coated with a lawn of oligos complementary to library adapters
Explanation
An Illumina flow cell is a thick glass slide with channels or lanes. Each lane is randomly coated with a lawn of oligonucleotides complementary to library adapters. This surface enables the capture and bridge amplification of library fragments. Silicon chips with microwells describe Ion Torrent, and protein nanopore membranes describe Oxford Nanopore.
Q25 Medium
In Illumina bridge amplification, what happens immediately after the double-stranded bridge is denatured?
A. Two copies of covalently bound single-stranded templates are produced
B. The sequencing primer hybridizes immediately
C. The reverse strand is cleaved and washed away
D. Fluorescent nucleotides are added for sequencing
Explanation
After the double-stranded bridge is denatured, the result is two copies of covalently bound single-stranded templates. These single-stranded molecules can then flip over to hybridize to adjacent primers, and the bridge amplification cycle continues until a full cluster is formed. Reverse strand cleavage and sequencing primer hybridization occur later, after cluster generation is complete.
Q26 Medium
Each Illumina cluster represents:
A. A single DNA molecule with no amplification
B. A mixture of different DNA fragments from the library
C. Thousands of copies of the same DNA strand in a 1–2 micron spot
D. Approximately 1 million copies of the DNA fragment
Explanation
Each Illumina cluster represents thousands of copies of the same DNA strand positioned in a 1–2 micron spot. Clusters appear as bright spots on fluorescent images. The high copy number provides sufficient signal intensity for detection. The ~1 million copies figure applies to 454 beads, not Illumina clusters.
Q27 Medium
In Illumina sequencing by synthesis (SBS), what ensures that only one nucleotide is incorporated per cycle?
A. Nucleotides are added one at a time in separate flows
B. The DNA polymerase has built-in proofreading activity
C. DNA ligase prevents further extension
D. Each nucleotide has a reversible chemical terminator that blocks further incorporation
Explanation
Illumina uses fluorescent reversible terminator chemistry. All four nucleotides are added simultaneously, but each is chemically blocked (has a reversible terminator) to prevent additional incorporations in the same cycle. After imaging, the fluorescent label and chemical block are enzymatically removed, allowing the next cycle. This is a key difference from Ion Torrent, which adds nucleotides one type at a time and may incorporate multiple identical bases.
Q28 Tricky
Why does Illumina SBS have higher accuracy in homopolymer regions than Ion Torrent?
A. Illumina uses a more sensitive camera system
B. Only one nucleotide can be incorporated per cycle due to the reversible terminator, so each base is read individually
C. Illumina reads are inherently longer than Ion Torrent reads
D. Illumina uses a two-base encoding system similar to SOLiD
Explanation
Illumina's reversible terminator ensures only one nucleotide is incorporated per cycle, regardless of the template sequence. Even in a homopolymer run (e.g., AAAA), each A is incorporated and read in a separate cycle. Ion Torrent flows one nucleotide type at a time without a terminator, so multiple identical bases can be incorporated simultaneously, and the signal intensity must be used to infer the count — which is error-prone.
Q29 Hard
In Illumina 2-channel SBS chemistry (e.g., NextSeq 500), how is a G base identified?
A. No signal in either channel (neither red nor green)
B. Green signal only
C. Red signal only
D. Both red and green signals simultaneously
Explanation
In 2-channel chemistry: T = green only, C = red only, A = both red + green, G = neither (no signal). This reduces imaging requirements from 4 channels to 2, allowing faster scanning with simpler optics. G is essentially a "dark" base inferred from the absence of signal.
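The 2-channel scheme is a lookup from a pair of on/off signals to a base. The sketch below encodes exactly the mapping stated in the explanation; representing each cycle as a (red, green) pair of booleans and the function name are illustrative choices, not an instrument format.

```python
# (red signal present, green signal present) -> base
TWO_CHANNEL = {
    (False, True): "T",   # green only
    (True, False): "C",   # red only
    (True, True): "A",    # both red and green
    (False, False): "G",  # dark in both channels
}

def call_bases_two_channel(cycles):
    """Translate one cluster's per-cycle (red, green) signals into a sequence."""
    return "".join(TWO_CHANNEL[(red, green)] for red, green in cycles)

cycles = [(True, True), (False, False), (False, True), (True, False)]
print(call_bases_two_channel(cycles))
```

Two binary channels give four distinguishable states, which is why two images per cycle are enough to call four bases.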
Q30 Hard
In Illumina 1-channel (single-color) SBS chemistry, which base is identified by appearing green in the first imaging but dark (no signal) in the second imaging?
A. T
B. C
C. A
D. G
Explanation
In 1-channel chemistry: Step 1 (first image): A and T emit green, C and G are dark. Step 2 (chemistry): A loses its green dye, T stays green, C gets activated to green, G stays dark. Step 3 (second image): A = green→black, T = green→green, C = black→green, G = black→black. So A is identified by going from green to dark.
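The same idea works for 1-channel chemistry, except that the two binary observations come from two images taken before and after the chemistry step. The sketch below encodes the table from the explanation; the boolean representation and function name are illustrative.

```python
# (signal in first image, signal in second image) -> base
ONE_CHANNEL = {
    (True, False): "A",   # green, then dark (dye removed)
    (True, True): "T",    # green in both images
    (False, True): "C",   # dark, then green (dye activated)
    (False, False): "G",  # dark in both images
}

def call_base_one_channel(first_image, second_image):
    """Call one base from the signal state in the two images of a cycle."""
    return ONE_CHANNEL[(first_image, second_image)]

print(call_base_one_channel(True, False))
```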
Q31 Medium
In the Illumina cluster generation workflow, what happens immediately after reverse strand cleavage?
A. Bridge amplification continues
B. Free 3' ends are blocked to prevent unwanted DNA priming
C. Fluorescent nucleotides are added
D. The flow cell is scanned for cluster positions
Explanation
After linearization and reverse strand cleavage (leaving only forward strands), free 3' ends are blocked to prevent unwanted DNA priming. Only after blocking does the sequencing primer hybridize to the adapter sequence for Read 1. The full sequence is: bridge amplification → denaturation → linearization → reverse strand cleavage → 3' blocking → primer hybridization → sequencing.
Q32 Medium
What is Exclusion Amplification (ExAmp) in the context of Illumina sequencing?
A. A method to exclude polyclonal reads during analysis
B. A technique to filter out short reads
C. An alternative sequencing chemistry for long reads
D. An improved method for cluster generation on flow cells
Explanation
Exclusion Amplification (ExAmp) is an improved cluster generation method used on newer Illumina platforms. It is designed to generate higher-quality, more evenly spaced clusters on flow cells, improving data quality and throughput compared to traditional bridge amplification alone.
Q33 Easy
What is paired-end sequencing?
A. Sequencing both ends of a DNA fragment, producing two reads per molecule
B. Sequencing the same fragment twice for error correction
C. Sequencing two different samples simultaneously on one flow cell
D. Using two different sequencing chemistries on the same library
Explanation
Paired-end sequencing sequences both ends of a DNA fragment, generating two reads per molecule — one from each end. Although the middle portion remains unsequenced, the two reads are physically linked (from the same fragment), providing crucial positional information for alignment and variant detection.
Q34 Medium
Which type of variant is particularly difficult to detect with single-end reads but becomes detectable with paired-end sequencing?
A. Single nucleotide polymorphisms (SNPs)
B. Point mutations
C. Insertion-deletion (indel) variants and structural rearrangements
D. Heterozygous genotypes
Explanation
Paired-end sequencing facilitates detection of insertion-deletion (indel) variants, structural rearrangements, gene fusions, and novel transcripts, which is "not possible with single-read data." SNPs and point mutations can be detected with single-end reads. The paired positional information allows mapping discordant read pairs to identify structural changes.
Q35 Tricky
During paired-end sequencing on Illumina, how is the Read 2 template generated?
A. The original forward template is used again with a different primer
B. The template loops over to form a bridge, is re-amplified and linearized, and the forward strand is cleaved, leaving the reverse strand as template
C. A separate library is prepared and loaded onto the same flow cell
D. The sequencing primer is simply moved to the opposite end of the same strand
Explanation
For Read 2 in paired-end sequencing: (1) The Read 1 sequenced strand is stripped off. (2) Template strands and lawn primers are unblocked. (3) The single-stranded template loops over to form a bridge by hybridizing with a lawn primer. (4) The primer is extended, creating a new double-stranded bridge. (5) Bridges are linearized and the original forward template is cleaved. (6) The reverse strand remains as the template for Read 2 sequencing.
Q36 Easy
Which statement about paired-end sequencing is TRUE?
A. It requires twice the amount of DNA compared to single-read sequencing
B. It requires restriction digestion of the DNA
C. Only specific Illumina platforms support paired-end sequencing
D. It requires the same amount of DNA as single-read sequencing
Explanation
Paired-end sequencing uses the same amount of DNA as single-read genomic DNA or cDNA sequencing. It does not require methylation of DNA or restriction digestion, and all Illumina NGS systems are capable of paired-end sequencing. It's a simple modification to the standard library preparation process.
Q37 Easy
In Illumina library preparation, "indexing" refers to:
A. Adding unique barcode sequences to library fragments for sample identification during multiplexing
B. Creating a reference index for read alignment
C. Numbering each cluster position on the flow cell
D. Measuring the fragment size distribution of the library
Explanation
Indexing (also called barcoding) is the process of adding unique DNA sequences (indexes) to library fragments during preparation. This allows multiple samples to be pooled and sequenced together (multiplexing) on the same flow cell or lane, and later computationally separated (demultiplexed) based on their unique index sequences.
Q38 Easy
Which file format stores sequencing reads along with per-base quality scores but no alignment information?
A. BAM
B. VCF
C. FASTQ
D. BED
Explanation
FASTQ is a text-based format that stores reads and per-base quality scores, but contains no alignment information. BAM stores reads plus alignment information in binary format. VCF stores variant calls (SNPs, indels). BED defines genomic features/regions in the reference genome.
Q39 Medium
A BAM file differs from a FASTQ file in that it:
A. Is a plain text file that can be directly viewed
B. Contains only quality scores without the actual sequences
C. Stores variant calls for SNPs and indels
D. Is a compressed binary file containing both reads and alignment information, with an index for fast access
Explanation
BAM (Binary Alignment Map) is a compressed binary format containing both reads and alignment information. It uses an index file to give fast access to small sections of the file, but it cannot be viewed directly as text (it requires specialized tools or a genome browser). FASTQ is plain text and contains no alignment data. Variant calls are stored in VCF files, not in BAM files.
Q40 Easy
The VCF (Variant Call Format) file is used to represent:
A. Raw sequencing reads with quality scores
B. SNPs, indels, and structural variation calls
C. Genomic feature annotations like genes and regulatory elements
D. Read alignment positions in binary format
Explanation
VCF (Variant Call Format) is a standardized text file format for representing SNP, indel, and structural variation calls — differences between the sequenced sample and the reference genome. FASTQ stores raw reads, BAM stores alignments, and BED defines genomic features/regions.
Q41 Medium
A BED (Browser Extensible Data) file is best described as:
A. A tab-delimited text file that defines genomic features or regions added to a reference file
B. A compressed binary format for read alignments
C. A text file storing only DNA sequences without quality information
D. A format for storing raw fluorescence intensity data
Explanation
A BED file is a tab-delimited text file that defines a feature track — specifying genomic features or regions such as genes or regulatory elements. BED files are added to a reference file to annotate or highlight specific parts of the reference genome. Option C describes FASTA format.
Q42 Medium
How many lines represent each read in a FASTQ file?
A. 2 lines (header + sequence)
B. 3 lines (header + sequence + quality)
C. 4 lines (identifier + sequence + separator + quality scores)
D. Variable, depending on read length
Explanation
Each read in FASTQ is represented by exactly 4 lines: Line 1 (@Read_ID) = Identifier, Line 2 = DNA sequence, Line 3 (+) = Separator, Line 4 = Quality scores encoded as ASCII characters. This compact format combines sequence and quality information efficiently.
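The four-line layout maps directly onto a small parser. This is a minimal sketch that assumes each sequence and quality string sits on a single line, as in the layout described above; the function name and the example record are made up for illustration.

```python
import io

def read_fastq(handle):
    """Yield (identifier, sequence, quality) tuples from a FASTQ text handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], sequence, quality

# Hypothetical single-read file held in memory.
example = io.StringIO("@Read_1\nACGTAC\n+\nIIIIH#\n")
records = list(read_fastq(example))
print(records)
```

The length check reflects the fact that line 4 carries exactly one quality character per base on line 2.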
Q43 Easy
A Phred quality score (Q score) of 20 corresponds to:
A. An error rate of 1 in 10 (90% accuracy)
B. An error rate of 1 in 100 (99% accuracy)
C. An error rate of 1 in 1,000 (99.9% accuracy)
D. An error rate of 1 in 10,000 (99.99% accuracy)
Explanation
Q = -10 × log₁₀(e), where e is the error probability. For Q=20: 20 = -10 × log₁₀(e), so log₁₀(e) = -2, meaning e = 0.01 = 1 in 100, giving 99% accuracy. Q10 = 1/10 (90%), Q30 = 1/1,000 (99.9%), Q40 = 1/10,000 (99.99%).
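The formula and its inverse can be checked numerically. The two helper functions below are illustrative names, not part of any library.

```python
import math

def phred_to_error_probability(q):
    """Invert Q = -10 * log10(e) to get the error probability e."""
    return 10 ** (-q / 10)

def error_probability_to_phred(e):
    """Compute Q = -10 * log10(e) for an error probability e."""
    return -10 * math.log10(e)

for q in (10, 20, 30, 40):
    e = phred_to_error_probability(q)
    print(f"Q{q}: error probability {e:g}, accuracy {100 * (1 - e):g}%")
```

Each step of 10 in Q divides the error probability by 10, which is why Q20, Q30, and Q40 correspond to 99%, 99.9%, and 99.99% accuracy.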
Q44 Medium
In the Sanger FASTQ format, Phred quality scores are encoded using ASCII characters in which range?
A. ASCII 0 to 93
B. ASCII 0 to 126
C. ASCII 64 to 126
D. ASCII 33 to 126
Explanation
Sanger FASTQ format encodes Phred quality scores from 0 to 93 using ASCII characters 33 to 126 (Q score = ASCII value − 33). The Solexa/Illumina early format used a different offset (ASCII 64). A tip: if you see characters with ASCII codes higher than 90 in the quality string, the file is likely in the older Solexa/Illumina format.
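Decoding a quality string is a per-character subtraction of the offset. The sketch below uses the Sanger offset of 33 by default and also implements the rule of thumb from the explanation (a character with ASCII code above 90 suggests the older format); both function names are illustrative, and the heuristic is only a hint, not a definitive format test.

```python
def decode_quality(quality_string, offset=33):
    """Convert an ASCII-encoded quality string into a list of Phred scores."""
    return [ord(ch) - offset for ch in quality_string]

def looks_like_old_illumina(quality_string, cutoff=90):
    """Heuristic from the text: any ASCII code above the cutoff hints at offset 64."""
    return any(ord(ch) > cutoff for ch in quality_string)

print(decode_quality("I5#!"))
print(looks_like_old_illumina("I5#!"))
print(looks_like_old_illumina("hhhgf"))
```

In the first string, "I" has ASCII code 73, so it decodes to Q40 with offset 33; "!" has code 33 and decodes to Q0.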
Q45 Tricky
How can you distinguish a Solexa/Illumina FASTQ file from a standard Sanger FASTQ file by examining the quality string?
A. Solexa/Illumina files may contain characters with ASCII code higher than 90
B. Solexa/Illumina files use numeric quality scores instead of ASCII characters
C. Sanger files always start with the '@' symbol while Solexa uses '>'
D. Solexa files contain only uppercase letters in the quality string
Explanation
The lecture explicitly states: "Although Solexa/Illumina read file looks pretty much like FASTQ, they are different in that the qualities are scaled differently. In the quality string, if you can see a character with its ASCII code higher than 90, probably your file is in the Solexa/Illumina format." Both formats use '@' as a read identifier and ASCII-encoded quality scores, but the offset differs.
Q46 Medium
Illumina sequencing platforms typically achieve quality scores around:
A. Q10 (90% accuracy)
B. Q20 (99% accuracy)
C. Q30 (99.9% accuracy) or better
D. Q40 (99.99% accuracy)
Explanation
Illumina sequencing typically achieves around Q30 (1 error in 1,000 bases) or better. Ion Torrent averages around Q20 due to homopolymer difficulties, with improvements pushing closer to Q30. Q30 is a common quality benchmark in the field.
Q47 Medium
The "moving window" trimming approach in read filtering works by:
A. Removing the first 25 bases of every read
B. Keeping only reads above a specific length threshold
C. Randomly sampling reads to reduce file size
D. Trimming reads at the position where base quality drops below a certain threshold
Explanation
A moving window approach automatically trims reads when base quality drops below a certain threshold. It slides a window along the read and cuts at the position where quality deteriorates. Quality typically drops toward the ends of reads, so this helps retain the high-quality portion while discarding unreliable bases.
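One way to sketch a moving-window trimmer is shown below. It is an assumption-laden simplification: it scans from the 5' end, uses the mean quality of a fixed-size window, and cuts at the start of the first window that falls below the threshold. The window size, threshold, and function name are illustrative, and real trimming tools differ in their details.

```python
def sliding_window_trim(sequence, qualities, window=4, threshold=20):
    """Cut the read at the start of the first window whose mean quality is below threshold.

    sequence: read bases as a string.
    qualities: list of Phred scores, one per base.
    """
    last_start = max(len(qualities) - window + 1, 1)
    for start in range(last_start):
        chunk = qualities[start:start + window]
        if chunk and sum(chunk) / len(chunk) < threshold:
            return sequence[:start], qualities[:start]
    return sequence, qualities  # quality never dropped below the threshold

seq = "ACGTACGTAC"
quals = [38, 38, 37, 36, 35, 30, 22, 12, 8, 5]
trimmed_seq, trimmed_quals = sliding_window_trim(seq, quals)
print(trimmed_seq, trimmed_quals)
```

In this example the window starting at the sixth base has a mean quality of 18, so the read is cut there and the first five bases are kept.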
Q48 Easy
The FASTA format differs from FASTQ in that FASTA:
A. Includes per-base quality scores
B. Contains only sequence information without quality scores
C. Is a binary format that cannot be viewed directly
D. Stores alignment information along with the sequence
Explanation
FASTA is a simple plain-text format that stores only the sequence information (header line starting with '>' followed by the sequence). It does not include quality scores. FASTQ extends this by adding per-base quality scores encoded as ASCII characters. FASTA was widely used before high-throughput sequencing became common.
Q49 Easy
Oxford Nanopore sequencing works by detecting:
A. Changes in electrical current as nucleic acids pass through a protein nanopore
B. Fluorescent signals from labeled nucleotides
C. pH changes from hydrogen ion release
D. Light produced by luciferase enzyme activity
Explanation
Nanopore sequencing monitors changes in electrical current as single-stranded DNA or RNA passes through a protein nanopore embedded in a membrane. The current changes are decoded to determine the nucleotide sequence. This is fundamentally different from synthesis-based or ligation-based detection methods.
Q50 Medium
A unique feature of Oxford Nanopore Technology is its ability to:
A. Achieve error rates below 0.01%
B. Amplify DNA before reading each molecule
C. Sequence RNA directly without conversion to cDNA
D. Produce paired-end reads on ultra-long fragments
Explanation
Oxford Nanopore can sequence RNA directly — no need to convert it to cDNA first. This is a unique capability. Nanopore also requires no DNA amplification. Its error rate is around 5% (not below 0.01%), and since it produces continuous long reads, paired-end sequencing is not needed.
Q51 Tricky
The Nanopore does NOT read individual nucleotides one at a time. Instead, it:
A. Reads an entire chromosome in one signal event
B. Detects only purine vs pyrimidine groups
C. Requires fluorescent labeling to distinguish bases
D. Detects a signal affected by a short sequence (k-mer), making signal-to-sequence conversion complex
Explanation
The nanopore detects a signal affected by a short sequence (k-mer) of nucleotides that are simultaneously within or near the pore, not individual bases. This makes the signal-to-sequence conversion computationally complex, as the current change at any point depends on multiple adjacent nucleotides. This contributes to the relatively higher error rate.
Q52 Medium
The MinION from Oxford Nanopore Technologies has up to how many nanopore channels?
A. 126
B. 512
C. 3,000
D. 144,000
Explanation
MinION has up to 512 nanopore channels. PromethION has up to 3,000 channels per flow cell and up to 48 flow cells (total up to 144,000 channels). GridION has five individually addressable flow cell positions compatible with MinION and Flongle flow cells.
Q53 Medium
Which Oxford Nanopore device is best suited for smaller samples, quality checking, or targeted regions?
A. Flongle
B. MinION
C. PromethION
D. GridION Mk1
Explanation
Flongle is an adapter for MinION designed for smaller tests. It's suitable for quality checks, amplicons, smaller genomes, targeted regions, or diagnostics. It's a single-use, on-demand, cost-efficient solution when you have smaller samples or prefer running single samples rather than multiplexing.
Q54 Hard
Why is long-read sequencing (Nanopore/PacBio) NOT suitable for ancient or highly degraded DNA?
A. The cost per base is too high for degraded samples
B. Degraded DNA causes the nanopore to clog
C. High error rates combined with sensitivity to DNA quality make results unreliable on degraded templates
D. Degraded DNA fragments are too long for the sequencing chemistry
Explanation
Long-read technologies are not suitable for ancient/degraded DNA due to higher error rates combined with sensitivity to DNA quality. Ancient DNA is fragmented and chemically damaged, which compounds the already higher error rates of these platforms. These methods require high-quality, intact DNA to take full advantage of their long-read capability.
Q55 Tricky
Why might the Nagoya Protocol be relevant when using a portable Nanopore sequencer in the field?
A. It requires all sequencing data to be made publicly available
B. It restricts the export of biological resources across national borders, meaning sequencing must often be done locally
C. It mandates the use of specific sequencing platforms in clinical settings
D. It prohibits the use of portable sequencers outside laboratory environments
Explanation
The Nagoya Protocol governs access to genetic resources and the sharing of benefits from their use. In practice it restricts the export of biological resources (tissue, blood, DNA) across national borders to prevent unauthorized use of genetic resources with commercial or research value. You can bring the sequencer to the sample, which is where Nanopore's portability becomes valuable, but you may not be allowed to export the sample itself, so sequencing must often be done locally.
Q56 Easy
PacBio's SMRT sequencing immobilizes DNA polymerases in tiny wells called:
ANanopores
BMicrotiter wells
CIon semiconductor wells
DZero-Mode Waveguides (ZMWs)
Explanation
PacBio SMRT sequencing uses Zero-Mode Waveguides (ZMWs) — tiny wells at the bottom of which a DNA polymerase is attached. Fluorescently labeled nucleotides (with six phosphate groups) are incorporated continuously, and a camera detects the color and timing of each incorporation event in real time.
Q57 Medium
PacBio's Circular Consensus Sequencing (CCS) achieves high accuracy by:
AReading the same circular template multiple times and averaging the results to generate HiFi reads
BUsing two-base encoding similar to SOLiD
CAdding reversible terminators to slow down the polymerase
DUsing bridge amplification to generate clonal clusters
Explanation
CCS creates a circular DNA template (SMRTbell) by ligating hairpin adapters to both ends. The polymerase makes multiple passes around the circle, reading the same sequence repeatedly. These multiple subreads are aligned and averaged to generate a high-fidelity (HiFi) consensus read with >99.9% accuracy. Because errors are random, the same error is unlikely to recur at the same position across passes.
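
The effect of repeated passes can be illustrated with a toy simulation. This is a minimal sketch, not PacBio's actual CCS algorithm: it assumes substitution errors only, subreads that are already aligned and of equal length, and a simple per-position majority vote. The function names, the 10% per-pass error rate, and the 15 passes are illustrative assumptions.

```python
import random
from collections import Counter

BASES = "ACGT"

def noisy_pass(template, error_rate, rng):
    """Simulate one pass over the template with random substitution errors."""
    read = []
    for base in template:
        if rng.random() < error_rate:
            # Replace the true base with one of the three other bases.
            read.append(rng.choice([b for b in BASES if b != base]))
        else:
            read.append(base)
    return "".join(read)

def consensus(subreads):
    """Take the most common base at each position across all subreads."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*subreads)
    )

def accuracy(read, template):
    """Fraction of positions where the read matches the template."""
    return sum(a == b for a, b in zip(read, template)) / len(template)

rng = random.Random(0)  # fixed seed so the run is repeatable
template = "".join(rng.choice(BASES) for _ in range(2000))

# Assumed values for illustration: 15 passes, 10% error per pass.
subreads = [noisy_pass(template, 0.10, rng) for _ in range(15)]

single_acc = accuracy(subreads[0], template)
ccs_acc = accuracy(consensus(subreads), template)
print(f"single pass accuracy: {single_acc:.4f}")
print(f"consensus accuracy:   {ccs_acc:.4f}")
```

Because each pass makes its errors at different positions, the majority vote recovers the true base almost everywhere, even though every individual pass is noisy.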
Q58 Medium
A key difference between PacBio SMRT sequencing and Illumina SBS is that PacBio:
AUses a termination step to control nucleotide addition
BDoes not use fluorescently labeled nucleotides
CIncorporates nucleotides continuously without a termination step
DRequires cluster amplification before sequencing
Explanation
Unlike Illumina (which uses reversible terminators to add one base per cycle), PacBio has no termination step — nucleotides are incorporated continuously in real time. PacBio does use fluorescently labeled nucleotides (with six phosphate groups), and it performs single-molecule sequencing without prior amplification.
Q59 Hard
PacBio SMRT sequencing can detect DNA modifications (e.g., methylation) because:
AModified bases emit a different fluorescent color
BModified bases alter the interpulse duration (time between incorporations), which is detected in real time
CModified bases cause a pH change that differs from unmodified bases
DA separate bisulfite treatment step is required before sequencing
Explanation
PacBio detects DNA modifications through "interpulse duration" — the time between base incorporation events. When the polymerase encounters a modified base, the incorporation time is longer. The SMRT system records the color and duration of emitted light in real time, so the interpulse duration can indicate DNA modification events directly, without separate chemical treatment.
Q60 Medium
Which of the following is an advantage of PacBio HiFi reads?
ALowest cost per gigabase among all platforms
BHighest throughput (most reads per run)
CShortest library preparation time
DUniform coverage with little or no GC bias due to PCR-free library prep
Explanation
PacBio HiFi reads provide uniform coverage with little or no sequence bias (e.g., GC bias) thanks to PCR-free library preparation and random error profiles. PacBio has higher cost per run and lower throughput than Illumina, and library preparation is more technically demanding, not simpler.
Q61 Hard
Match the signal type to the correct platform: Fluorescence (optical), pH sensing (electronic), Bioluminescence (optical), Electrical current changes.
AIllumina = Fluorescence, Ion Torrent = pH, 454 = Bioluminescence, Nanopore = Electrical current
BIllumina = Bioluminescence, Ion Torrent = pH, 454 = Fluorescence, Nanopore = Electrical current
CIllumina = Fluorescence, Ion Torrent = Electrical current, 454 = pH, Nanopore = Bioluminescence
DIllumina = pH, Ion Torrent = Fluorescence, 454 = Bioluminescence, Nanopore = Electrical current
Explanation
Illumina uses fluorescent reversible terminator chemistry (optical detection). Ion Torrent detects pH changes via an ion-sensitive semiconductor (electronic). Roche 454 uses pyrosequencing with luciferase-generated bioluminescence (optical). Oxford Nanopore detects changes in electrical current through a protein pore (electronic). PacBio also uses fluorescence but in a real-time, single-molecule format.
Q62 Medium
Raw sequencing data from a single run can be approximately 2.5 terabytes, which after processing into FASTQ format reduces to about:
A500 gigabytes
B100 gigabytes
C30 gigabytes
D1 gigabyte
Explanation
Raw data can be ~2.5 terabytes, but once processed into FASTQ format, it reduces to about 30 gigabytes. This massive reduction highlights the importance of data management — keeping only essential files and compressing data efficiently, since storing terabytes of raw data long-term is impractical.
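
The size of that reduction is easy to check with a quick calculation. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes) and uses the approximate figures quoted above. The variable names are illustrative.

```python
RAW_BYTES = 2.5e12    # ~2.5 TB of raw run output (decimal units assumed)
FASTQ_BYTES = 30e9    # ~30 GB after conversion to FASTQ

reduction_factor = RAW_BYTES / FASTQ_BYTES
fraction_kept = FASTQ_BYTES / RAW_BYTES

print(f"reduction factor: about {reduction_factor:.0f}x")
print(f"FASTQ is about {fraction_kept:.1%} of the raw data size")
```

In other words, the FASTQ output is only a little over 1% of the raw data volume.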
Q63 — Open Calculation
A base has a Phred quality score of Q30. What is the error probability and the corresponding base call accuracy? Show your calculation using the Q score formula.
✓ Model Answer

The Q score formula is: Q = −10 × log₁₀(e), where e is the error probability.

Given Q = 30:
30 = −10 × log₁₀(e)
log₁₀(e) = −3
e = 10⁻³ = 0.001
Error rate = 1 in 1,000 bases
Base call accuracy = 1 − 0.001 = 0.999 = 99.9%

Q30 is considered a standard quality benchmark for Illumina sequencing, meaning that on average, only 1 in 1,000 base calls is incorrect.
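
The same conversion can be written as a short script. This is a minimal sketch of the formula above. The function names `phred_to_error` and `error_to_phred` are illustrative and do not come from any particular library.

```python
import math

def phred_to_error(q):
    """Error probability for a Phred score: e = 10^(-Q/10)."""
    return 10 ** (-q / 10)

def error_to_phred(e):
    """Phred score for an error probability: Q = -10 * log10(e)."""
    return -10 * math.log10(e)

for q in (10, 20, 30, 40):
    e = phred_to_error(q)
    print(f"Q{q}: error probability = {e:g}, accuracy = {(1 - e) * 100:g}%")
```

Each step of 10 in Q lowers the error probability by a factor of 10, so Q20 corresponds to 1 error in 100 base calls and Q40 to 1 in 10,000.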

Q64 — Open Short Answer
Compare the clonal amplification methods used by Illumina (bridge amplification) and Ion Torrent (emulsion PCR). Describe the key steps of each and explain how each approach generates clonal populations of DNA templates for sequencing.
✓ Model Answer

Illumina — Bridge Amplification:

1. Single-stranded library fragments hybridize to complementary oligonucleotides (lawn primers) on a glass flow cell surface via their adapters.

2. DNA polymerase synthesizes a complementary strand; the original template is washed away.

3. The newly synthesized strand folds over ("bridges") to hybridize with an adjacent complementary primer on the flow cell.

4. Polymerase extends the primer, creating a double-stranded bridge.

5. The bridge is denatured, yielding two covalently attached single-stranded copies.

6. The cycle repeats, exponentially amplifying copies in a localized cluster (1–2 micron spot with thousands of identical copies).

7. After amplification, clusters are linearized by cleaving and washing away the reverse strands, and the free 3' ends are blocked before sequencing.

Ion Torrent — Emulsion PCR:

1. Library fragments are mixed with beads coated with complementary oligonucleotides, ideally at a 1:1 ratio (one fragment per bead).

2. Beads and fragments are emulsified into oil-water droplets, creating millions of individual micro-reactors.

3. Each droplet contains a single bead, a DNA fragment, primers, nucleotides, and polymerase.

4. Standard PCR amplification occurs inside each droplet, generating clonal populations on each bead.

5. Beads with amplified DNA are enriched (Ion Sphere Particle Enrichment) and deposited into chip wells for sequencing.

Key difference: Bridge amplification occurs on a flat surface (flow cell) generating spatially separated clusters, while emulsion PCR occurs in liquid microdroplets generating bead-bound clonal populations that are then loaded into wells.

Q65 — Open Tricky
Explain how Illumina's 2-channel and 1-channel SBS chemistries can distinguish all four nucleotides (A, T, C, G) using fewer imaging channels than the traditional 4-channel approach. What are the advantages and trade-offs of the 1-channel system?
✓ Model Answer

2-Channel Chemistry (e.g., NextSeq 500): Uses two colors (red and green). T = green only; C = red only; A = both red and green; G = neither color (dark). Each cycle requires two images, one per channel, and the combination identifies the base.

1-Channel Chemistry: Uses only a single color (green) but requires two imaging steps per cycle with an intervening chemical modification:

Step 1 — Incorporation: All 4 nucleotides are added. A and T emit green; C and G are dark.

Step 2 — First imaging: Green = A or T; Dark = C or G.

Step 3 — Chemical modification: A loses its green dye (cleaved off); T retains green; C is activated to fluoresce green; G remains dark.

Step 4 — Second imaging: A = green→dark; T = green→green (stays); C = dark→green; G = dark→dark (stays).
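
Both encodings amount to a two-bit lookup: two on/off observations per cycle give four combinations, one per base. The sketch below is a simplified illustration that assumes each signal has already been thresholded to on or off. Real base callers work on continuous intensities. The table and function names are illustrative.

```python
# 2-channel: one observation per color, keyed as (red, green).
TWO_CHANNEL = {
    (False, True): "T",    # green only
    (True, False): "C",    # red only
    (True, True): "A",     # both red and green
    (False, False): "G",   # dark in both channels
}

# 1-channel: one color, two images, keyed as (first_image, second_image).
ONE_CHANNEL = {
    (True, False): "A",    # green, then dark (dye cleaved off)
    (True, True): "T",     # green in both images
    (False, True): "C",    # dark, then green (activated)
    (False, False): "G",   # dark in both images
}

def call_base_2ch(red, green):
    """Decode one cycle of 2-channel data."""
    return TWO_CHANNEL[(bool(red), bool(green))]

def call_base_1ch(first_image, second_image):
    """Decode one cycle of 1-channel data."""
    return ONE_CHANNEL[(bool(first_image), bool(second_image))]

two_channel_calls = "".join(
    call_base_2ch(r, g) for r, g in [(1, 1), (0, 1), (1, 0), (0, 0)]
)
one_channel_calls = "".join(
    call_base_1ch(a, b) for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]
)
print(two_channel_calls, one_channel_calls)
```

The 2-channel system gets its two bits from two colors in the same cycle, while the 1-channel system gets them from the same color imaged before and after the chemical modification.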

Advantages of 1-channel: Uses only one dye and one detector, reducing instrument cost and complexity. No need for multicolor scanning or complex optics.

Trade-offs: Requires two chemical processing steps and two images per cycle, making each cycle slightly longer. The chemistry is more complex and nucleotide reagents are more expensive than in multi-channel systems.