Fowzan Alkuraya, Alfaisal University
We currently know that a small number of variants are benign, and a smaller number are pathogenic. The idea is to drive this towards knowing the effect of every possible variant. Even if we could classify every variant, the classification would be outdated shortly. However, we can use phenotype, which keeps up with the gene pool: that way we can ask how genotype translates to phenotype. It’s not really that easy…
The formidable challenge of heterozygosity.
We are robust to heterozygous mutations, obviously.
Gene-level challenge: Is the gene dispensable? Is there a non-disease phenotype? Is it a recessive disease phenotype?
Variant-level challenge: Some we’ll never see because they’re embryonically lethal. Some may never be clinically consequential. Non-coding variants? Truncating variants in dominant genes with no phenotype?
Fortunately, it’s all in the same species! And, if we can show something is pathogenic, we can know that for next time.
Exploiting the special structure of the Saudi population to improve our understanding of the human genome:
- High rates of consanguinity – endless source of homozygotes.
- Large family size – great segregation power
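To put a rough number on why consanguinity yields so many homozygotes, here is a minimal sketch using the standard population-genetics relation for homozygote frequency under inbreeding, q² + F·q·(1 − q). The allele frequency is a hypothetical value chosen for illustration, and F = 1/16 is the textbook inbreeding coefficient for offspring of first cousins. None of these figures come from the talk.

```python
def homozygote_frequency(q: float, inbreeding_f: float = 0.0) -> float:
    """Expected frequency of homozygotes for an allele of frequency q,
    given inbreeding coefficient F: q^2 + F * q * (1 - q)."""
    return q * q + inbreeding_f * q * (1.0 - q)


# Hypothetical rare allele; not a figure from the talk.
q = 0.001

random_mating = homozygote_frequency(q)
first_cousin_offspring = homozygote_frequency(q, inbreeding_f=1 / 16)
fold_increase = first_cousin_offspring / random_mating

print(f"random mating:          {random_mating:.2e}")
print(f"first-cousin offspring: {first_cousin_offspring:.2e}")
print(f"fold increase:          {fold_increase:.1f}")
```

The rarer the allele, the larger the relative enrichment, which is why rare recessive variants are so much easier to observe in the homozygous state in this setting.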
Examples of the discovery of novel disease genes.
The workflow: use predictive tools, frequency data, model organisms, etc. Use family data to identify how the variant exerts its effect.
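A workflow of this kind could be sketched roughly as below. This is only an illustration, not the speaker's pipeline: the `Candidate` class, the field names, the 0.001 frequency cutoff and the example family are all hypothetical. Because prediction tools can be wrong, the sketch uses the prediction only to rank candidates and relies on frequency and family segregation for filtering.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    variant_id: str
    allele_frequency: float    # population frequency of the alternate allele
    predicted_damaging: bool   # output of some in-silico predictor (hypothetical)
    genotypes: dict            # family member -> alternate allele count (0, 1, 2)


def segregates_recessively(genotypes: dict, affected: set) -> bool:
    """True if every affected member is homozygous for the alternate allele
    and no unaffected member is."""
    all_affected_hom = all(genotypes.get(member) == 2 for member in affected)
    no_unaffected_hom = all(
        count != 2 for member, count in genotypes.items() if member not in affected
    )
    return all_affected_hom and no_unaffected_hom


def prioritise(candidates: list, affected: set, max_af: float = 0.001) -> list:
    """Keep rare variants that segregate with disease; rank predicted-damaging first."""
    kept = [
        c for c in candidates
        if c.allele_frequency <= max_af
        and segregates_recessively(c.genotypes, affected)
    ]
    return sorted(kept, key=lambda c: not c.predicted_damaging)


# Hypothetical family: two affected children, one unaffected carrier sibling.
affected = {"child1", "child2"}
fits = {"father": 1, "mother": 1, "child1": 2, "child2": 2, "child3": 1}
fails = {"father": 1, "mother": 0, "child1": 1, "child2": 2, "child3": 0}

candidates = [
    Candidate("v1", 0.0002, False, fits),
    Candidate("v2", 0.0001, True, fits),
    Candidate("v3", 0.0500, True, fits),    # too common
    Candidate("v4", 0.0001, True, fails),   # does not segregate
]

ranked = prioritise(candidates, affected)
print([c.variant_id for c in ranked])
```

Large families help here because every additional genotyped sibling is another chance to exclude a candidate that does not segregate.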
At the end of the day, this data can be shared so that everyone can benefit from this knowledge.
In the second example, finding novel “lethal” genes. Can’t do it statistically because these are so rare. Best hope is to observe biallelic variants in non-viable embryonic tissue. Showed a case in which a homozygous variant was present in all non-viable embryos from a single family. Able to do that without knowing anything about the biology of the gene.
What do they do with it? They put it out so everyone can share in the knowledge. You never know which family is going to be making life-altering decisions based on the variant.
Published it – turned out to be the most frequent mutation in fetal losses in the Saudi population. Turned out to involve an important endothelial protein (cerebral haemorrhages).
Now in ClinVar.
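One way to see why the lethal-gene observation is persuasive without any knowledge of the gene's biology: if both parents are heterozygous carriers, plain Mendelian segregation gives each embryo a 1/4 chance of being homozygous, so finding the variant homozygous in every non-viable embryo quickly becomes unlikely by chance. The sketch below is only that back-of-the-envelope calculation, assuming independent embryos and carrier parents. The embryo counts are hypothetical, since the notes do not record how many were tested.

```python
def chance_all_homozygous(n_embryos: int) -> float:
    """Probability that n embryos of two carrier parents are all homozygous
    for the variant purely by chance (1/4 per embryo, assumed independent)."""
    return 0.25 ** n_embryos


# Hypothetical embryo counts; the talk notes do not give the real number.
for n in (1, 2, 3, 4):
    print(n, chance_all_homozygous(n))
```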
Example where it’s hard to understand the mechanism of the disease, and an example where prediction tools aren’t able to get it right.
How many variants are we just missing because they’re in the dark matter of the genome? (Variants in non-coding parts of the genome) / (variants in the coding part) = ?
We don’t know either of these numbers, so the challenge of non-coding mutations is a hard problem. Homozygosity mapping to the rescue.
104 families with a recessive phenotype that maps to a single locus. 101 of 104 were found to have genic mutations. The vast majority of disease-causing mutations are in genes, then.
Good news: presumed non-genic mutations are <3%.
Bad news: many others will be missed for other reasons.
Demonstrated this with a sample cohort (33 families).
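For reference, the “<3%” figure follows directly from the 104-family homozygosity-mapping numbers quoted above. The variable names below are just for illustration.

```python
families_mapped = 104   # families mapping to a single locus
genic = 101             # families in which a genic mutation was found

presumed_non_genic = families_mapped - genic
non_genic_fraction = presumed_non_genic / families_mapped

print(f"{presumed_non_genic} of {families_mapped} = {non_genic_fraction:.1%}")
```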
Catalogue of biallelic LOF variants in well-phenotyped individuals. Able to find several genes that have been erroneously linked to disease phenotypes.
[My paraphrasing: So, in the end, we should all be concerned about all of the variants, and getting them right.]