The complexity of the genome gives rise to global optimization problems that test the limits of our computational power. One such problem is the inference of isoform function.
Isoform function prediction is the problem of deciding which biological functions are performed by specific gene products. The lack of labeled training data makes an unsupervised approach that shares information between the gene products highly desirable. We developed a parallelized Expectation-Maximization algorithm that makes use of mini-batches as an effective strategy for tackling the computational complexity of the problem and the very large input size. The algorithm optimizes a model that jointly predicts the sequence similarity of all pairs of isoforms given the number of shared functions.
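To make the mini-batch idea concrete, the following is a minimal sketch of a stepwise (mini-batch) Expectation-Maximization loop on a toy problem. It is not the model described above: it assumes, purely for illustration, that each isoform pair has a single similarity score drawn from one of two Gaussians depending on a hidden binary label (whether the pair shares a function). The function name `minibatch_em`, its parameters, and the step-size schedule are all hypothetical choices made for this example.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def minibatch_em(scores, batch_size=64, epochs=20, seed=0):
    """Toy stepwise EM for a two-component 1-D Gaussian mixture.

    Assumption (illustrative only): component 1 stands for pairs that
    share a function, component 0 for pairs that do not.
    Returns (pi, means, variances), where pi is the weight of component 1.
    """
    rng = random.Random(seed)
    data = list(scores)
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)

    # Initial parameters: means at the extremes, shared broad variance.
    pi = 0.5
    means = [min(data), max(data)]
    variances = [max(overall_var, 1e-6)] * 2

    stats = None  # running per-point sufficient statistics for each component
    step = 0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]

            # E-step on the mini-batch only: expected sufficient statistics.
            batch_stats = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
            for x in batch:
                p0 = (1.0 - pi) * gaussian_pdf(x, means[0], variances[0])
                p1 = pi * gaussian_pdf(x, means[1], variances[1])
                total = p0 + p1
                r1 = p1 / total if total > 0.0 else 0.5
                for k, r in enumerate((1.0 - r1, r1)):
                    batch_stats[k][0] += r / len(batch)
                    batch_stats[k][1] += r * x / len(batch)
                    batch_stats[k][2] += r * x * x / len(batch)

            # Blend the batch statistics into the running statistics
            # with a decaying step size (an assumed schedule).
            step += 1
            eta = (step + 2) ** -0.7
            if stats is None:
                stats = batch_stats
            else:
                for k in range(2):
                    for j in range(3):
                        stats[k][j] = (1.0 - eta) * stats[k][j] + eta * batch_stats[k][j]

            # M-step: closed-form update from the running statistics.
            for k in range(2):
                weight, sum_x, sum_xx = stats[k]
                if weight > 1e-12:
                    means[k] = sum_x / weight
                    variances[k] = max(sum_xx / weight - means[k] ** 2, 1e-6)
            pi = stats[1][0]
    return pi, means, variances
```

Because each update touches only one mini-batch, the cost per step is independent of the total number of pairs, and the E-step over the items in a batch could be distributed across workers. In this sketch, calling `minibatch_em(scores)` on a list of similarity scores returns the estimated mixture weight, means, and variances.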
One of the projects in which we used our isoform function inference methodology is a study of the coupling between gene expression and splicing. We found that many exons whose splicing is coupled to the transcription of their gene are included in isoforms that have the potential to prevent genomic instability. Furthermore, their inclusion pattern suggests synchronization with the cell cycle. We also observed that this functionality is dysregulated in cancer.
Publications:
An expectation–maximization framework for comprehensive prediction of isoform-specific functions
Alternative splicing is coupled to gene expression in a subset of variably expressed genes (preprint)