Introduction
We introduce Mut2Vec, a novel computational pipeline that generates distributed representations of mutations, and we experimentally validate the efficacy of the resulting representations. We expect Mut2Vec to be useful in biomedical applications such as cancer analysis and complex drug sensitivity problems.
Pre-trained Mut2Vec
We provide Mut2Vec in two annotation versions: Ensembl Gene ID (ENSG) and HUGO Gene Nomenclature Committee gene symbol (HGNC).
Mut2Vec Release Data (.txt)

Date | ENSG | HGNC |
---|---|---|
06. 26. 2017 | Download | Download |
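
The layout of the released .txt files is not described on this page. As a minimal sketch, assuming a word2vec-style text layout (one gene identifier per line followed by its vector components, with an optional `<count> <dim>` header line), the vectors could be loaded and compared as follows. The function names `load_vectors` and `cosine` are hypothetical, and the toy values are invented purely for illustration.

```python
import math
from io import StringIO
from typing import Dict, List, TextIO


def load_vectors(handle: TextIO) -> Dict[str, List[float]]:
    """Parse lines of the form '<gene> <v1> ... <vD>' (assumed layout)."""
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        # Skip an optional word2vec-style header line '<count> <dim>'.
        if not vectors and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy stand-in for a downloaded file; these numbers are NOT real Mut2Vec values.
toy_file = StringIO("3 2\nTP53 1.0 0.0\nKRAS 0.0 1.0\nEGFR 1.0 1.0\n")
vecs = load_vectors(toy_file)
print(cosine(vecs["TP53"], vecs["EGFR"]))
```

For a real download, replace the `StringIO` object with `open(path)` on the ENSG or HGNC file. If the actual layout differs from the assumption above, only `load_vectors` needs to change.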
Driver Candidates
Using IntOGen driver mutations, we computed a driver enrichment p-value for each cluster with a hypergeometric test and extracted the clusters whose p-value was below 5e-2 (0.05).
Driver Candidates

Date | .tsv (tab-separated) |
---|---|
06. 26. 2017 | Download |
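
The enrichment test described above can be sketched as a one-sided hypergeometric tail probability: the chance of seeing at least the observed number of drivers in a cluster if the cluster were drawn at random from all mutated genes. This is an illustrative sketch rather than the pipeline's actual code. The function name, argument names, and toy counts are assumptions, and only the 0.05 threshold comes from the text.

```python
from math import comb

ALPHA = 5e-2  # significance threshold stated in the text


def driver_enrichment_pvalue(population: int, drivers_total: int,
                             cluster_size: int, drivers_in_cluster: int) -> float:
    """P(X >= drivers_in_cluster) for X ~ Hypergeometric(population, drivers_total, cluster_size)."""
    upper = min(cluster_size, drivers_total)
    total = comb(population, cluster_size)
    # Sum the probability mass of every outcome at least as extreme as the observed one.
    tail = sum(
        comb(drivers_total, i) * comb(population - drivers_total, cluster_size - i)
        for i in range(drivers_in_cluster, upper + 1)
    )
    return tail / total


# Toy counts (hypothetical): 20 genes overall, 5 known drivers, a cluster of 5 genes.
p_three = driver_enrichment_pvalue(20, 5, 5, 3)  # 3 drivers in the cluster
p_four = driver_enrichment_pvalue(20, 5, 5, 4)   # 4 drivers in the cluster
print(p_three < ALPHA, p_four < ALPHA)
```

With these toy counts, three drivers in the cluster give a p-value of about 0.073, which does not pass the threshold, while four drivers give about 0.0049, which does.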
Mut2Vec Training Pipeline
Visualization of Driver/Passenger Mutation Vectors
Red dots are drivers and blue dots are passengers. The score in each figure is the normalized mutual information (NMI).
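
As a rough illustration of the score, NMI can be computed between the driver/passenger labels and a cluster assignment of the same mutations. The page does not say which normalization was used, so the sketch below assumes the arithmetic mean of the two entropies as the denominator. The function names and toy labels are hypothetical.

```python
from collections import Counter
from math import log
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in nats) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Mutual information (in nats) between two labelings of the same items."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """NMI with arithmetic-mean normalization (an assumption, see the text above)."""
    denom = (entropy(a) + entropy(b)) / 2
    if denom == 0:
        return 1.0  # convention chosen here: two trivial labelings count as identical
    return mutual_information(a, b) / denom


# Toy labels (hypothetical), not taken from the Mut2Vec figures.
truth = ["driver", "driver", "passenger", "passenger"]
aligned = [0, 0, 1, 1]    # clusters match the driver/passenger split
unrelated = [0, 1, 0, 1]  # clusters carry no information about the split
print(nmi(truth, aligned), nmi(truth, unrelated))
```

A score near 1 means the clustering separates drivers from passengers almost perfectly, and a score near 0 means the clusters say nothing about driver status.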