Lecture 1 — Introduction: Foundational Genetics & Genomics Concepts

Q1 Easy
Which branch of genetics uses statistical models to estimate the genetic contribution to variation in traits controlled by multiple genes?
A. Classical genetics
B. Molecular genetics
C. Quantitative genetics
D. Population genetics
Explanation
Quantitative genetics analyzes traits controlled by multiple genes (e.g., height, milk production) and uses statistical models to estimate the genetic contribution to phenotypic variation. Population genetics studies allele frequency changes, classical genetics focuses on Mendelian inheritance patterns, and molecular genetics investigates gene structure/function at the DNA/RNA/protein level.
Q2 Medium
Which domain of genetics links genetics with evolutionary biology by studying the distribution and change of allele frequencies?
A. Molecular genetics
B. Population genetics
C. Quantitative genetics
D. Classical genetics
Explanation
Population genetics studies the distribution and change of allele frequencies within populations, directly linking genetics with evolutionary biology. Don't confuse it with quantitative genetics, which focuses on multi-gene traits and variance components.
Q3 Tricky
According to the lecture, which domain of genetics tends to be less central in genomics projects that focus on individual genes or regions?
A. Classical genetics
B. Molecular genetics
C. Population genetics
D. Quantitative genetics
Explanation
The lecture specifically states: "Quantitative genetics, while important for understanding polygenic traits, tends to be less central in genomics projects that focus on individual genes or regions." This is a subtle point easily overlooked. Classical and molecular genetics form the "foundational pillars" of applied genomics.
Q4 Easy
Classical genetics was developed before the molecular nature of DNA was understood. What did it primarily rely on to infer genetic laws?
A. Phenotypic traits and breeding analysis
B. DNA sequencing and molecular markers
C. Statistical models of allele frequencies
D. Protein expression analysis
Explanation
Classical genetics focused on observing phenotypic traits (visible differences such as eye color or body shape) and inferring rules of inheritance through carefully planned breeding experiments and offspring analysis. DNA sequencing came much later.
Q5 Medium
What is the relationship between the distance of two genes on the same chromosome and the probability of crossing over?
A. Greater distance = lower probability of crossing over
B. Greater distance = higher probability of crossing over
C. Distance has no effect on crossing over frequency
D. Only genes on different chromosomes undergo crossing over
Explanation
Genes that are farther apart on the same chromosome have a higher probability of recombination (crossing over) during meiosis. Genes close together tend to be inherited together because a crossover event between them is less likely. This principle is the foundation of genetic mapping.
Q6 Tricky
Why is the distance between genes on the same chromosome important when planning a genomics project, according to Professor Fontanesi?
A. It determines the total genome size
B. It affects protein expression levels
C. It is related to the level of recombination
D. It determines the mutation rate
Explanation
The lecture explicitly notes: "Fontanesi suggests looking at the distance between hereditary elements on the same chromosome when planning a project, because distance is related to the level of recombination." This is practical advice for designing genomics experiments — closely linked loci behave differently from distant ones.
Q7 Medium
Why is polyploidy a challenge in applied genomics?
A. Polyploid organisms cannot reproduce sexually
B. Polyploid organisms have smaller genomes that are harder to detect
C. Polyploidy eliminates crossing over during meiosis
D. Multiple gene copies make it harder to determine which copy is responsible for a given trait
Explanation
In polyploid organisms (tetraploid, hexaploid, etc.), each gene may exist in multiple copies. This makes it significantly harder to determine which copy is responsible for a given trait, complicating both experimental design and data analysis. It may even make it impossible to resolve which part of the DNA corresponds to which parental genome.
Q8 Easy
A hexaploid organism has how many sets of chromosomes?
A. Six
B. Four
C. Three
D. Two
Explanation
Hexaploid = six sets of chromosomes. Diploid = two, tetraploid = four. Polyploidy is common in plants (many crops are tetraploid or hexaploid) and creates complexity in genomic analysis.
Q9 Tricky
When planning a genomics project on cattle, which factor most limits the speed of experimental progress?
A. Large genome size
B. Long reproductive cycle and generation time
C. High ploidy level
D. Lack of available reference genomes
Explanation
Classical genetics relies on generational cycles, making time a key constraint. Cattle have a long reproductive cycle, so it's not feasible to expect rapid results. Cattle are diploid (not polyploid), and reference genomes exist. The lecture specifically highlights this as a practical consideration.
Q10 Easy
What is the genotypic ratio in the F2 generation of a monohybrid cross between two heterozygous individuals (Tt × Tt)?
A. 3:1
B. 1:1
C. 1:2:1
D. 9:3:3:1
Explanation
The genotypic ratio from Tt × Tt is 1 TT : 2 Tt : 1 tt = 1:2:1. The 3:1 ratio is the phenotypic ratio (3 tall : 1 dwarf) — a common trap! The 9:3:3:1 ratio applies to a dihybrid cross.
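The 1:2:1 versus 3:1 distinction can be checked by enumerating the cross. The following is a minimal Python sketch, assuming complete dominance of T over t; the function name `cross` is illustrative and not from the lecture.

```python
from collections import Counter
from itertools import product

def cross(parent1: str, parent2: str) -> Counter:
    """Count offspring genotypes for a single-gene cross, e.g. 'Tt' x 'Tt'."""
    offspring = Counter()
    # Each parent contributes one allele per gamete (Law of Segregation).
    for a, b in product(parent1, parent2):
        # Sort so 'tT' and 'Tt' count as the same genotype
        # (uppercase letters sort before lowercase ones).
        offspring["".join(sorted((a, b)))] += 1
    return offspring

genotypes = cross("Tt", "Tt")

# Assumption: complete dominance, so any genotype carrying 'T' is tall.
tall = sum(n for g, n in genotypes.items() if "T" in g)
dwarf = genotypes["tt"]

print(dict(genotypes))   # genotype counts (1 TT : 2 Tt : 1 tt)
print(tall, dwarf)       # phenotype counts (3 tall : 1 dwarf)
```

The same four equally likely allele combinations give both ratios; the only difference is whether TT and Tt are counted separately (genotype) or together (phenotype).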
Q11 Medium
In Mendel's dihybrid cross (RrYy × RrYy), the F2 phenotypic ratio 9:3:3:1 is observed. What key condition must be true for this ratio to appear?
A. Both genes must show incomplete dominance
B. Both genes must be on the same chromosome
C. One gene must be epistatic to the other
D. The two genes must assort independently (unlinked)
Explanation
The 9:3:3:1 ratio only appears when the genes are on different chromosomes or far enough apart on the same chromosome to assort independently. The lecture notes that "Mendel was lucky" — the two traits he studied were on different chromosomes. If genes were linked (close together on the same chromosome), the ratio would deviate.
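To see where 9:3:3:1 comes from, the sketch below enumerates an RrYy × RrYy cross in Python. It assumes independent assortment (all four gamete types equally likely) and complete dominance at both genes; the helper names are hypothetical. With linked genes the four gamete types would not be equally frequent, and the counts would deviate.

```python
from collections import Counter
from itertools import product

def gametes(genotype: str) -> list:
    """Gametes of a two-gene genotype such as 'RrYy'.

    Assumption: independent assortment, so every combination of one
    allele per gene is equally likely.
    """
    gene1, gene2 = genotype[:2], genotype[2:]
    return [a + b for a, b in product(gene1, gene2)]

def phenotype(gamete1: str, gamete2: str) -> str:
    """Phenotype label under complete dominance at both genes."""
    seed_shape = "R" if "R" in gamete1[0] + gamete2[0] else "r"
    seed_color = "Y" if "Y" in gamete1[1] + gamete2[1] else "y"
    return seed_shape + seed_color

# 4 gametes x 4 gametes = 16 equally likely offspring combinations.
counts = Counter(
    phenotype(g1, g2) for g1, g2 in product(gametes("RrYy"), repeat=2)
)
print(dict(counts))
```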
Q12 Tricky
Why did the lecture describe Mendel as "lucky" in his experimental design for studying independent assortment?
A. The traits he studied happened to be on different chromosomes
B. He used a species with unusually short generation time
C. All his traits showed complete dominance without exceptions
D. Pea plants are polyploid, giving clearer segregation patterns
Explanation
The lecture explicitly states Mendel was lucky because the two traits he studied (seed shape and seed color) were on different chromosomes, allowing them to assort independently. If they had been linked (close together on the same chromosome), the 9:3:3:1 ratio would not have appeared, making the underlying pattern harder to detect. Pea plants are diploid, not polyploid.
Q13 Medium
If the tall (T) and dwarf (t) alleles in Mendel's pea plants had shown incomplete dominance instead of complete dominance, what phenotype would heterozygous (Tt) plants display?
A. Tall
B. Medium height (intermediate)
C. Dwarf
D. Both tall and dwarf simultaneously
Explanation
The lecture states: "If the 'tall' and 'dwarf' alleles had shown incomplete dominance, Mendel would have observed pea plants with medium height, somewhere between the tall and dwarf phenotypes." Incomplete dominance produces an intermediate phenotype, while codominance (D) would show both phenotypes fully expressed (like AB blood type).
Q14 Easy
According to Mendel's Law of Segregation, how many alleles does each gamete carry for a given gene?
A. Two, one from each parent
B. It depends on the ploidy level
C. One
D. Two identical copies
Explanation
The Law of Segregation states that during gametogenesis, each individual's pair of alleles segregates, meaning only ONE allele is passed into each gamete. A diploid individual carries two alleles per gene, but each gamete receives only one.
Q15 Medium
A plant with genotype TT is grown in a physically restricted environment. What is the expected outcome?
A. It will always reach full tall height because it is homozygous dominant
B. It will become dwarf because the environment overrides the genotype
C. Its genotype will change to Tt due to environmental pressure
D. It may not reach its full height despite having the genetic potential for tallness
Explanation
The lecture states that a plant with the tall allele "grown in a physically restricted environment may not reach its full height." This illustrates environmental effects on phenotypic expression — the genotype doesn't change (C is wrong), but the phenotype can be modified. The environment doesn't override genetics completely (B), but it can limit expression (D is correct).
Q16 Easy
Which blood type is an example of codominance?
A. AB blood type
B. Type O blood
C. Type A blood (heterozygous)
D. Rh-negative blood
Explanation
The AB blood type is the classic example of codominance mentioned in the lecture — both A and B alleles are fully expressed in the heterozygote. Neither allele masks the other.
Q17 Easy
In a pedigree, what does a half-filled symbol represent?
A. An affected individual
B. A carrier of a trait (usually recessive)
C. A deceased individual
D. An individual of unknown sex
Explanation
A half-filled symbol = carrier of a trait. Fully filled = affected individual. A rhombus represents unknown/unspecified sex. These symbols are standard in pedigree notation.
Q18 Medium
Why does inbreeding increase the expression of recessive traits in a population?
A. It increases the mutation rate at recessive loci
B. It changes recessive alleles into dominant alleles
C. It increases the probability of homozygosity through identical-by-descent alleles
D. It eliminates dominant alleles from the population
Explanation
Inbred individuals may inherit two alleles that are identical by descent (IBD) — both alleles come from a common ancestor. IBD alleles increase the probability of homozygosity, making recessive traits more likely to be expressed. Inbreeding doesn't change alleles or increase mutation rates.
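The effect of IBD alleles on homozygosity can be illustrated numerically. The sketch below uses the standard textbook genotype frequencies under an inbreeding coefficient F (the probability that an individual's two alleles are identical by descent). This formula is not stated in the lecture and is included here as an assumption; the allele frequency and F value are made-up illustration values.

```python
def genotype_freqs_with_inbreeding(p: float, f: float) -> dict:
    """Genotype frequencies for allele frequency p and inbreeding coefficient f.

    Standard result (assumed here, not from the lecture):
      AA = p^2 + f*p*q,  Aa = 2*p*q*(1 - f),  aa = q^2 + f*p*q
    With f = 0 this reduces to Hardy-Weinberg proportions.
    """
    q = 1.0 - p
    return {
        "AA": p * p + f * p * q,
        "Aa": 2 * p * q * (1 - f),
        "aa": q * q + f * p * q,
    }

# Hypothetical rare recessive allele: q = 0.1
random_mating = genotype_freqs_with_inbreeding(p=0.9, f=0.0)
inbred = genotype_freqs_with_inbreeding(p=0.9, f=0.25)

print(random_mating["aa"], inbred["aa"])
```

In this illustration the allele frequencies are identical in both cases; only the share of homozygotes changes, which is why recessive traits are expressed more often under inbreeding.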
Q19 Medium
Which of the following is NOT a limitation of pedigree analysis in human studies?
A. Low reproductive rate
B. Uncontrolled matings
C. Long generation time
D. Inability to trace monogenic diseases
Explanation
The three limitations listed in the lecture are: low reproductive rate, uncontrolled matings, and long generation time (A, B, C). However, pedigree analysis remains a core tool for tracing monogenic diseases, identifying carriers, and diagnosing X-linked or mitochondrial disorders — so D is not a limitation; it's actually a strength.
Q20 Easy
In a PLINK .ped file, what does column 6 represent?
A. Sex of the individual
B. Phenotype (usually 1 = control, 2 = case)
C. Maternal ID
D. Allele data for the first SNP
Explanation
In the PLINK .ped file: col 1 = Family ID, col 2 = Individual ID, col 3 = Paternal ID, col 4 = Maternal ID, col 5 = Sex (1=M, 2=F), col 6 = Phenotype (1=control, 2=case, -9/0=missing), col 7+ = allele data. Sex is column 5, not 6.
Q21 Tricky
In a PLINK .ped file, the sex column uses the coding: 1 = Male, 2 = Female, 0 = Unknown. What value is used for missing phenotype information?
A. 0 only
B. -1
C. -9 or 0
D. NA
Explanation
Missing phenotype data in PLINK is coded as -9 or 0. This is tricky because 0 is used for unknown sex AND can indicate missing phenotype. In the sex column: 0 = unknown sex. In the phenotype column: 1 = control, 2 = case, -9/0 = missing. The .fam file description specifies: '-9'/'0'/non-numeric = missing data for case/control.
Q22 Medium
Which PLINK file specifies chromosome number, SNP ID, genetic distance, and physical base-pair position?
A. .map or .bim file
B. .ped file
C. .fam file
D. .bed file
Explanation
The .map (or .bim in binary format) file contains genomic position information: chromosome number, SNP ID, genetic distance, and physical base-pair position. The .ped file contains individual-level metadata and genotypes. The .fam file contains sample information (first 6 columns of .ped). The .bed file is binary genotype data.
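As an illustration of the four fields listed above, here is a minimal Python sketch that parses one line of a four-column .map file. The SNP ID and coordinates in the example line are invented, and the record and function names are hypothetical. Note that a .bim file carries two additional allele columns after these four, so this sketch handles .map lines only.

```python
from typing import NamedTuple

class MapRecord(NamedTuple):
    chrom: str               # chromosome number or code
    snp_id: str              # variant identifier
    genetic_distance: float  # genetic map position
    bp_position: int         # physical base-pair position

def parse_map_line(line: str) -> MapRecord:
    """Parse one whitespace-separated line of a four-column .map file."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    chrom, snp_id, distance, position = fields
    return MapRecord(chrom, snp_id, float(distance), int(position))

# Hypothetical example line (not real data).
record = parse_map_line("1 rs0001 0.5 123456")
print(record)
```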
Q23 Medium
Which of the following is NOT a listed application of PLINK?
A. Running quality control on genomic data
B. Studying population structure
C. Identifying genetic variants associated with diseases
D. Performing de novo genome assembly
Explanation
PLINK's applications include: identifying variants associated with diseases, analyzing heritability, studying population structure, running QC (missingness, heterozygosity, HWE), and feeding data into advanced models. De novo genome assembly is NOT a PLINK function — that requires specialized assemblers like SPAdes, Canu, or similar tools.
Q24 Easy
During which phase of mitosis are chromosomes best visualized in cytogenetics?
A. Interphase
B. Metaphase
C. Anaphase
D. Telophase
Explanation
Chromosomes are best visualized during metaphase because they are in their most condensed state. This makes them visible under the microscope and allows techniques such as Giemsa staining or FISH to distinguish individual chromosomes.
Q25 Medium
Even with whole genome sequencing available, cytogenetics still plays a crucial role in which of the following?
A. Identifying SNPs at single-base resolution
B. Sequencing mitochondrial genomes
C. Diagnosing chromosomal abnormalities and genome assembly validation
D. Measuring allele frequencies in populations
Explanation
The lecture lists cytogenetics' current roles as: clinical genetics (diagnosing syndromes like Down syndrome), cancer genomics (chromosomal rearrangements), evolutionary biology (comparing karyotypes), and genome assembly validation. SNP identification and allele frequency measurement are done with sequencing/genotyping tools, not cytogenetics.
Q26 Easy
What does linkage disequilibrium (LD) describe?
A. The non-random association of alleles at two or more loci in a population
B. The physical distance between two genes in base pairs
C. The random segregation of alleles during meiosis
D. The linkage between a gene and its protein product
Explanation
LD refers to the non-random association of alleles at two or more loci — certain allele combinations occur together more (or less) frequently than expected if they were independent. This is a key concept for GWAS and population genomics.
Q27 Tricky
Which statement about linkage disequilibrium (LD) is correct?
A. LD is the same thing as physical linkage
B. LD can only exist between loci on the same chromosome
C. Recombination increases LD over time
D. LD ≠ physical linkage, but physical linkage contributes to LD
Explanation
The lecture explicitly states: "LD ≠ physical linkage, but physical linkage contributes to LD." LD can also be caused by population structure, genetic drift, selection, and population admixture — factors that can create LD even between loci on different chromosomes. Recombination breaks down LD (not increases it).
Q28 Medium
Which of the following is NOT listed as a cause of linkage disequilibrium?
A. Physical proximity of loci on the same chromosome
B. High recombination rates between loci
C. Genetic drift and small population sizes
D. New mutations arising on specific genetic backgrounds
Explanation
High recombination rates BREAK DOWN LD, they do not cause it. Causes of LD include: physical proximity (low recombination), small population sizes, genetic drift, selection, population admixture, and new mutations on specific backgrounds.
Q29 Medium
In the LD numerical example, if P(A) = 0.5 and P(B) = 0.5, what is the expected frequency of haplotype AB under linkage equilibrium?
A. 0.25
B. 0.50
C. 0.40
D. 0.10
Explanation
Under linkage equilibrium (independence), P(AB) = P(A) × P(B) = 0.5 × 0.5 = 0.25. The observed frequency was 0.4, which is higher than expected, indicating LD. The difference between observed and expected haplotype frequencies is the hallmark of LD.
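The gap between observed and expected haplotype frequency is usually summarized as D = P(AB) − P(A)·P(B). The Python sketch below computes D for the numbers in this example, and also the commonly used r² statistic. The r² formula is standard in population genetics but is not quoted from the lecture, so treat it as an added assumption.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple:
    """Return (D, r_squared) for a pair of biallelic loci.

    D   = P(AB) - P(A) * P(B)
    r^2 = D^2 / (P(A) * (1 - P(A)) * P(B) * (1 - P(B)))
    """
    d = p_ab - p_a * p_b
    r_squared = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared

# Values from the example: P(A) = 0.5, P(B) = 0.5, observed P(AB) = 0.4
d, r_squared = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
print(round(d, 4), round(r_squared, 4))
```

D = 0 corresponds to linkage equilibrium; here D is positive, meaning the AB haplotype occurs more often than independence would predict.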
Q30 Medium
The lactose tolerance haplotype block near the LCT gene has been maintained in high LD. What evolutionary force is responsible?
A. Genetic drift
B. Random mutation
C. Positive selection
D. Gene flow
Explanation
The lecture states that SNPs near the LCT gene "form a haplotype block that has been maintained due to positive selection (people with this haplotype digest lactose better)." Positive selection favors beneficial alleles and their linked variants, preserving the LD structure.
Q31 Easy
What does 1 centimorgan (cM) represent?
A. 1 million base pairs of DNA
B. A 10% chance of recombination per generation
C. The distance equal to one gene length
D. A 1% chance of recombination per generation between two loci
Explanation
1 centimorgan (cM) = 1% chance of recombination between two loci per generation. On average, 1 out of 100 meioses will result in recombination between loci that are 1 cM apart. Named after Thomas Hunt Morgan. Note: 1 cM does NOT necessarily equal 1 Mb — the relationship between genetic and physical distance varies across the genome.
Q32 Hard
Two loci are 50 cM apart on the same chromosome. How do they behave in terms of inheritance?
A. They assort independently, as if on different chromosomes
B. They are always inherited together
C. They recombine 100% of the time
D. They cannot be mapped using recombination frequency
Explanation
The maximum observable recombination frequency is 50%. Loci that are 50 cM or more apart are treated as recombining about 50% of the time, which is indistinguishable from independent assortment (as if they were on different chromosomes), so they do NOT behave as linked at this distance. Note: 50 cM ≠ 100% recombination.
Q33 Medium
In the ZW sex determination system (birds), which sex is heterogametic?
A. Males (ZW)
B. Both sexes equally
C. Females (ZW)
D. Neither: sex is determined by environment
Explanation
In birds and some reptiles (ZW system): Males = ZZ (homogametic), Females = ZW (heterogametic). This is the opposite of mammals where males (XY) are heterogametic. Don't mix them up!
Q34 Tricky
What is the pseudoautosomal region (PAR)?
A. A region on autosomes that behaves like a sex chromosome
B. The only region of sex chromosomes where crossing over occurs during meiosis
C. A duplicated region found on all chromosomes
D. A region that determines sex in the ZW system
Explanation
The PAR is the only part of the sex chromosomes that behaves like an autosome: it is where the X and Y chromosomes pair and cross over during meiosis. It is located at the tips of the X and Y chromosomes. Option A is a clever distractor: the PAR is a region of the sex chromosomes that behaves like an autosome, not the other way around.
Q35 Hard
In Hymenoptera (e.g., honeybees), males are haploid and females are diploid. How are males produced?
A. From fertilized eggs with a special sex-determining gene
B. From eggs exposed to high temperature during development
C. From fertilized eggs that lose one set of chromosomes
D. From unfertilized eggs
Explanation
In the haplo-diploid system of Hymenoptera: males develop from unfertilized eggs and are haploid (one set of chromosomes), while females develop from fertilized eggs and are diploid (two sets). This is a unique sex determination mechanism distinct from XY and ZW systems.
Q36 Medium
Which of the following is NOT an assumption of Hardy-Weinberg Equilibrium?
A. Infinitely large population size
B. Random mating
C. Overlapping generations
D. No mutation, migration, or selection
Explanation
HWE requires that generations do NOT overlap (parents do not mate with offspring). "Overlapping generations" violates the HWE assumptions. The full list: diploid, sexual reproduction, non-overlapping generations, random mating, infinite population, equal allele frequencies between sexes, no evolutionary forces.
Q37 Easy
If the frequency of allele A is p = 0.6 and allele a is q = 0.4, what is the expected frequency of heterozygotes (Aa) under Hardy-Weinberg equilibrium?
A. 0.48
B. 0.36
C. 0.24
D. 0.16
Explanation
Under HWE, heterozygote frequency = 2pq = 2 × 0.6 × 0.4 = 0.48. AA = p² = 0.36, aa = q² = 0.16. Check: 0.36 + 0.48 + 0.16 = 1.00 ✓
Q38 Tricky
In the classroom experiment, 21 students were assigned genotypes: 8 AA, 6 AB, 7 BB. Compared to HWE expectations, the number of heterozygotes is:
A. Higher than expected, suggesting outbreeding
B. Exactly as expected: the population is in HWE
C. Cannot be determined without knowing allele frequencies
D. Lower than expected, consistent with inbreeding or non-random mating
Explanation
From the data: p(A) = (2 × 8 + 6) / (2 × 21) = 22/42 ≈ 0.52, so q(B) ≈ 0.48. Expected heterozygotes = 2pq × 21 = 2(0.52)(0.48)(21) ≈ 10.5. Observed = 6. So there are fewer heterozygotes than expected, which the lecture attributes to "random assignments (not truly random mating), small sample size, and inbreeding, which increases the proportion of homozygous genotypes."
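The arithmetic for the classroom data can be reproduced with a short Python sketch. The function name is illustrative. The final "deficit" value (1 − observed/expected heterozygotes) is a commonly used summary of heterozygote shortage that is not mentioned in the lecture, so it is included here only as an assumption.

```python
def expected_heterozygotes(n_aa: int, n_ab: int, n_bb: int) -> tuple:
    """Return (p, expected heterozygote count) from observed genotype counts."""
    n = n_aa + n_ab + n_bb
    # Each AA individual carries two A alleles, each AB carries one.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    return p, 2 * p * q * n

observed_het = 6
p, expected_het = expected_heterozygotes(n_aa=8, n_ab=6, n_bb=7)

# Assumption: relative heterozygote deficit, 1 - observed/expected.
deficit = 1 - observed_het / expected_het

print(round(p, 3), round(expected_het, 2), round(deficit, 3))
```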
Q39 Medium
What is the difference between dominance and epistasis?
A. Dominance is between populations; epistasis is within populations
B. Dominance is interaction between alleles at the same gene; epistasis is interaction between alleles at different genes
C. Dominance is always complete; epistasis can be partial
D. Dominance affects phenotype; epistasis only affects genotype
Explanation
Dominance (intragenic interaction) occurs between alleles of the SAME gene. Epistasis (intergenic interaction) occurs between DIFFERENT genes. Both affect phenotype. Dominance is not always complete — it can be incomplete or show codominance.
Q40 Easy
A heritability (h²) value of 0.65 for height would be classified as:
A. Low heritability
B. Medium heritability
C. High heritability
D. Heritability cannot exceed 0.5
Explanation
Heritability ranges: Low < 0.1, Medium = 0.1–0.4, High > 0.4. A value of 0.65 is high heritability. The lecture notes stature has h² ≈ 0.5–0.7, which is classified as high. Heritability can range from 0 to 1.
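The three ranges can be written as a small lookup. This Python sketch simply encodes the thresholds quoted above; the function name is hypothetical, and placing exactly 0.1 and 0.4 in the "medium" class follows the "0.1–0.4" wording.

```python
def classify_heritability(h2: float) -> str:
    """Classify h^2 using the lecture's ranges: <0.1 low, 0.1-0.4 medium, >0.4 high."""
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    if h2 < 0.1:
        return "low"
    if h2 <= 0.4:
        return "medium"
    return "high"

print(classify_heritability(0.65))
```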
Q41 Easy
Who coined the term "genome" and in what year?
A. Thomas Hunt Morgan, 1910
B. Hans Winkler, 1920
C. Gregor Mendel, 1866
D. Francis Collins, 2003
Explanation
The term "genome" was defined in 1920 by Hans Winkler as the set of genes in a haploid set of chromosomes. Today the term encompasses all DNA in a cell (nuclear, mitochondrial, etc.).
Q42 Medium
How many genomes do plants typically have?
A. One (nuclear)
B. Two (nuclear + mitochondrial)
C. Two (nuclear + chloroplast)
D. Three (nuclear + mitochondrial + chloroplast)
Explanation
The lecture states "Plants typically have 3 genomes": nuclear, mitochondrial, and chloroplast. Animals have 2 (nuclear + mitochondrial). This is an easy detail to overlook.
Q43 Tricky
The Human Genome Project required an accuracy standard of fewer than:
A. 1 error per 10,000 bases
B. 1 error per 1,000 bases
C. 1 error per 100,000 bases
D. 1 error per 1,000,000 bases
Explanation
The lecture states the HGP required "an accuracy standard of fewer than one error per 10,000 bases." This is a specific number from the slides that a professor might test.
Q44 Medium
Which of the following is an example of an extremophile from the Archaea domain?
A. Escherichia coli
B. Saccharomyces cerevisiae
C. Methanogens
D. Drosophila melanogaster
Explanation
The lecture lists Archaea as extremophiles including: Thermophiles (heat-loving), Halophiles (salt-loving), and Methanogens (methane-producing). E. coli is a bacterium, yeast is a eukaryote, and Drosophila is an insect.
Q45 Medium
What is the Whole Genome Shotgun (WGS) approach?
A. Sequencing individual chromosomes one at a time
B. Random sequencing of DNA to reconstruct full genomes without prior knowledge of DNA location
C. Sequencing only protein-coding regions of the genome
D. Targeted sequencing of specific disease-associated genes
Explanation
WGS involves random sequencing of DNA fragments to reconstruct the full genome without needing prior knowledge of where each fragment comes from. This is a core sequencing strategy covered throughout the course.
Q46 Tricky
Which of the following metadata can sometimes be inferred from genomic data by comparing sequences to annotated reference datasets?
A. Sex
B. Stature
C. Diet
D. Location of sample collection
Explanation
The lecture states that some features like sex "can sometimes be inferred by comparing sequences to annotated reference datasets" (e.g., by checking X/Y chromosome coverage), while others like stature "are harder to predict purely from genomic data." Stature is a complex trait influenced by many genes and environment.
Q47 — Open Short Answer
List the four main domains of genetics and briefly describe the focus of each.
✓ Model Answer

1. Classical Genetics (Transmission/Formal Genetics): Focuses on how traits are passed from parents to offspring using breeding experiments and phenotypic analysis to infer genetic laws (e.g., Mendel's laws).

2. Molecular Genetics: Investigates the structure and function of genes at the molecular level (DNA, RNA, proteins), including gene expression, mutation, and gene regulation.

3. Population Genetics: Studies the distribution and change of allele frequencies within populations, linking genetics with evolutionary biology.

4. Quantitative Genetics: Analyzes traits controlled by multiple genes using statistical models to estimate genetic contribution to phenotypic variation (e.g., height, milk production).

Q48 — Open Calculation
In a population of 200 individuals, you observe the following genotypes: 90 AA, 40 Aa, 70 aa. Calculate the allele frequencies of A and a, determine the expected genotype frequencies under HWE, and state whether this population is in Hardy-Weinberg equilibrium.
✓ Model Answer

Step 1: Calculate allele frequencies

Total alleles = 200 × 2 = 400
Copies of A = (90 × 2) + (40 × 1) = 180 + 40 = 220
p = freq(A) = 220 / 400 = 0.55
q = freq(a) = 1 − 0.55 = 0.45

Step 2: Expected genotype frequencies under HWE

AA = p² = 0.55² = 0.3025 → expected count = 0.3025 × 200 = 60.5
Aa = 2pq = 2 × 0.55 × 0.45 = 0.495 → expected count = 0.495 × 200 = 99
aa = q² = 0.45² = 0.2025 → expected count = 0.2025 × 200 = 40.5

Step 3: Comparison

Observed: 90 AA, 40 Aa, 70 aa
Expected: 60.5 AA, 99 Aa, 40.5 aa

There is a large excess of homozygotes and a deficit of heterozygotes (40 observed vs. 99 expected). The population is NOT in HWE. This deviation could be caused by inbreeding, population structure, selection, or non-random mating.
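The comparison in Step 3 can be made quantitative with a chi-square goodness-of-fit test. The lecture answer does not use this test, so the Python sketch below is an added illustration under that assumption. It uses 1 degree of freedom (3 genotype classes − 1 − 1 estimated allele frequency) and the standard 5% critical value of 3.841; the function and constant names are hypothetical.

```python
def hwe_chi_square(n_hom1: int, n_het: int, n_hom2: int) -> float:
    """Chi-square statistic comparing observed genotype counts with HWE expectations."""
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)
    q = 1 - p
    observed = (n_hom1, n_het, n_hom2)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Critical value of the chi-square distribution, df = 1, alpha = 0.05.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841

chi2 = hwe_chi_square(n_hom1=90, n_het=40, n_hom2=70)
rejects_hwe = chi2 > CRITICAL_VALUE_DF1_ALPHA_05

print(round(chi2, 2), rejects_hwe)
```

A statistic far above the critical value supports the conclusion reached by inspection: the observed counts are not consistent with Hardy-Weinberg proportions.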

Q49 — Open Short Answer
Explain what linkage disequilibrium (LD) is, how it differs from physical linkage, and give one real-world example mentioned in the lecture.
✓ Model Answer

Definition: Linkage disequilibrium (LD) is the non-random association of alleles at two or more loci in a population. Certain allele combinations occur together more (or less) frequently than expected under independence.

LD vs. Physical Linkage: LD ≠ physical linkage. Physical linkage refers to genes being on the same chromosome. Physical linkage contributes to LD (close genes recombine less), but LD can also arise from other forces: small population size, genetic drift, selection, population admixture, or new mutations. Conversely, physically linked genes can have low LD if enough recombination has occurred over time.

Real-world example: Lactose tolerance in humans — a variant near the LCT gene (lactose digestion) is in high LD with nearby SNPs, forming a haplotype block maintained by positive selection because individuals with this haplotype digest lactose better.

Q50 — Open Short Answer
Describe the structure of a PLINK .ped file. What information does each column contain (columns 1–7+)?
✓ Model Answer

The .ped file is a text file with no header, where each line corresponds to one individual. The columns are:

Column 1 — Family ID: Identifier for the family, used to group related individuals.

Column 2 — Individual ID: Unique identifier for each individual.

Column 3 — Paternal ID: Father's ID (0 if unknown).

Column 4 — Maternal ID: Mother's ID (0 if unknown).

Column 5 — Sex: 1 = Male, 2 = Female, 0 = Unknown.

Column 6 — Phenotype: 1 = control, 2 = case, -9 or 0 = missing.

Column 7+ — Allele data: Genotype information with two alleles per locus (e.g., A A, G T). The number of loci can be as many as the dataset supports.

Genomic positions for each locus are specified in the associated .map or .bim file (chromosome, SNP ID, genetic distance, physical position).
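The column layout above can be sketched as a small Python parser. This is an illustration only, not PLINK's own code: the record and function names are hypothetical, the example line is invented, and the phenotype decoding assumes case/control coding (any other value is treated as missing here).

```python
from typing import List, NamedTuple, Tuple

SEX_CODES = {"1": "male", "2": "female", "0": "unknown"}
PHENOTYPE_CODES = {"1": "control", "2": "case", "0": "missing", "-9": "missing"}

class PedRecord(NamedTuple):
    family_id: str
    individual_id: str
    paternal_id: str   # "0" if unknown
    maternal_id: str   # "0" if unknown
    sex: str
    phenotype: str
    genotypes: List[Tuple[str, str]]  # one (allele1, allele2) pair per locus

def parse_ped_line(line: str) -> PedRecord:
    """Parse one whitespace-separated .ped line (no header expected)."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("a .ped line needs at least 6 columns")
    fid, iid, pat, mat, sex, pheno = fields[:6]
    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("allele columns must come in pairs")
    # Columns 7+ hold two alleles per locus: pair them up in order.
    genotypes = list(zip(alleles[0::2], alleles[1::2]))
    return PedRecord(
        fid, iid, pat, mat,
        SEX_CODES.get(sex, "unknown"),
        # Assumption: case/control coding; other values count as missing.
        PHENOTYPE_CODES.get(pheno, "missing"),
        genotypes,
    )

# Hypothetical example line: a male case with two genotyped loci.
record = parse_ped_line("FAM1 IND1 0 0 1 2 A A G T")
print(record)
```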

Q51 — Open Tricky
Explain the formula P = G + E in quantitative genetics. What are the three components of the genotype effect, and how does heritability relate to this equation?
✓ Model Answer

The equation: Phenotype = Genotype effect + Environmental effect (P = G + E). In terms of variances, Var(P) = Var(G) + Var(E), assuming genotype and environment do not covary or interact.

Three components of the genotype effect:

1. Additive genetic effect: The sum of individual allele effects across all loci contributing to the trait.

2. Dominance effect (intragenic): Interaction between alleles at the same gene (e.g., how Tt differs from the average of TT and tt).

3. Epistatic effect (intergenic): Interaction between alleles at different genes.

Heritability (h²): Measures how much of the phenotypic variation in a population is due to genetic differences. It is calculated by comparing related individuals (since unrelated individuals don't share genetic background). Ranges: Low (<0.1), Medium (0.1–0.4), High (>0.4). It tells us what proportion of Var(P) is attributable to Var(G).
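The relationship between the variance components and heritability can be sketched in Python. The variance values below are invented for illustration. The code also separates two standard textbook definitions that the answer above does not distinguish, added here as an assumption: broad-sense heritability, Var(G)/Var(P), and narrow-sense heritability, which uses only the additive variance.

```python
def heritability(var_additive: float, var_dominance: float,
                 var_epistatic: float, var_environment: float) -> dict:
    """Broad- and narrow-sense heritability from variance components.

    Assumes Var(P) = Var(G) + Var(E), i.e. no genotype-environment
    covariance or interaction.
    """
    var_g = var_additive + var_dominance + var_epistatic
    var_p = var_g + var_environment
    return {
        "broad_sense": var_g / var_p,          # Var(G) / Var(P)
        "narrow_sense": var_additive / var_p,  # Var(A) / Var(P)
    }

# Hypothetical variance components (arbitrary units).
result = heritability(var_additive=5.0, var_dominance=1.0,
                      var_epistatic=1.0, var_environment=3.0)
print(result)
```

With these made-up numbers, 70% of phenotypic variance is genetic in total, while 50% is additive.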