A slot machine model for the formation of mRNAs in primates

NOV. 01, 2017
Peking University, October 31, 2017: In eukaryotes, genetic information flows to proteins through an intermediate product, messenger RNA (mRNA), which is first produced by transcription and then undergoes sequence removal, or splicing, before becoming a mature transcript. The parts retained in the mRNA are defined as exons and the removed parts as introns. These RNA processing events are tightly regulated, each occurring with a frequency between 0 and 1, and a single gene typically has multiple alternative processing events. Traditional technologies can quantify the frequency of each alternative processing event, but they are hard pressed to delineate globally how multiple alternative events combine on the same mRNA molecule, because of their limited scope. Consider the train ride from Beijing to Shanghai as an analogy: a traditional approach can calculate the proportion of trains stopping at each station along the route, but to know the combination of stops made by a given train, a system with global scope is required to monitor each train over the whole journey.

Recently, the team headed by Dr. Li Chuan-Yun from the Institute of Molecular Medicine, Peking University, performed a comparative study in human and monkey to investigate the combinatorial mode of alternative events at the whole-gene level. Using single molecule long-read sequencing, they substantially expanded the repertoire of alternative RNA processing events in primates, and found that the combination of these events is largely independent along the length of the gene, leading to thousands of novel gene products missed by current annotations. They further found that this independent combination has contributed to a large repertoire of human-specific isoforms encoded by 502 genes, linking them to human-specific functions.
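
To make the "slot machine" idea concrete, here is a minimal Python sketch of what independent combination implies: if each alternative event occurs with its own frequency regardless of the others, the expected frequency of any full-length combination is simply the product of the per-event frequencies. The event names and frequencies below are made up for illustration and are not taken from the study, and the function `isoform_frequencies` is a hypothetical helper, not the authors' method.

```python
from itertools import product


def isoform_frequencies(event_freqs):
    """Expected frequency of every combination of events, ASSUMING independence.

    event_freqs maps a (hypothetical) event name to the frequency, between
    0 and 1, with which that event occurs on an mRNA molecule.
    Returns a dict keyed by a tuple of booleans (event occurred or not),
    in the same order as the keys of event_freqs.
    """
    names = list(event_freqs)
    result = {}
    # Enumerate every on/off combination, like the reels of a slot machine.
    for states in product((True, False), repeat=len(names)):
        p = 1.0
        for name, occurred in zip(names, states):
            f = event_freqs[name]
            p *= f if occurred else 1.0 - f
        result[states] = p
    return result


# Illustrative, made-up frequencies for three alternative events in one gene.
events = {
    "exon2_skipping": 0.5,
    "alt_3prime_site_exon5": 0.25,
    "intron7_retention": 0.1,
}
freqs = isoform_frequencies(events)

# Print the combinations from most to least frequent.
for states, p in sorted(freqs.items(), key=lambda kv: -kv[1]):
    label = ", ".join(n for n, s in zip(events, states) if s) or "no events"
    print(f"{p:.4f}  {label}")
```

With three events the sketch already yields eight possible products from a single gene. Measuring only the three per-event frequencies, as traditional approaches do, cannot confirm whether real molecules follow this product rule; checking it requires reads that span the whole transcript.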

The work has a major impact on our functional understanding of transcriptome output. First, given the prevalence of unknown gene products missed by current “gold standard” annotations in primates, the reference transcript may not represent the major gene product encoded by the host gene. Second, direct generalization of findings on gene function from model animals, such as mouse and rat, to human may require additional validation, because while model animals encode genes similar to those of human, their transcription products may have entirely different structures.

The paper describing these interesting findings was recently published in Molecular Biology and Evolution as a “Fast Track” cover story, with Dr. Zhang Shi-Jian and Dr. Wang Chenqu as joint first authors and Dr. Li Chuan-Yun as the corresponding author. “This work stands to be a landmark contribution to the literature on the general topic of isoform evolution,” commented the editor of the journal.

Edited by: Zhang Jiang
 Institute of Molecular Medicine