CanPredict



Reference: Kaminker J.S., Zhang Y., Waugh A., Haverty P.M., Peters B., Sebisanovic D., Stinson J., Forrest W.F., Bazan F., Seshagiri S., Zhang Z. Distinguishing cancer-associated missense mutations from common polymorphisms. Cancer Research (2007) 67 (2) 465-473.
Hosted: University of California, San Francisco, Computer Graphics Laboratory (http://www.cgl.ucsf.edu/Research/genentech/canpredict/index.html)

Summary:
CanPredict classifies missense variants as likely cancer-associated or not. It uses a random forest, a machine learning algorithm, trained on the scores produced by three predictive methods.

Methodology:
The random forest is trained on the outputs of three predictive methods:
• SIFT scores of cancer-related variants.
• Pfam-based LogR.E-value scores of cancer-related variants.
• A Gene Ontology (GO)-based measure of how characteristic the gene is of a cancer gene.
A query variant is scored by each of the three methods, and the random forest uses those scores to classify it as likely cancer-related or not.
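
The idea of a random forest voting over three per-variant scores can be sketched as follows. This is only an illustration, not CanPredict's implementation: it is a deliberately simplified forest of depth-one trees (stumps), each fitted on a bootstrap sample and a randomly chosen feature, and every name and number in it is hypothetical. The toy training values are invented; the only property taken from a real method is that low SIFT scores indicate predicted damaging substitutions.

```python
import random

# Feature order for each row: (SIFT score, LogR.E-value, GO-based score).
# All values below are invented toy data, not real CanPredict training data.
ROWS = [
    (0.00, 2.1, 1.8), (0.01, 1.7, 2.2), (0.02, 2.5, 1.5), (0.04, 1.9, 2.0),
    (0.30, -0.5, 0.2), (0.55, 0.1, -0.3), (0.80, -1.2, 0.4), (1.00, 0.3, -0.1),
]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = cancer-related, 0 = not


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def majority(labels, default):
    """Most common label, or `default` when the list is empty."""
    if not labels:
        return default
    return int(2 * sum(labels) >= len(labels))


def fit_stump(rows, labels, rng):
    """Fit a one-split tree on a single randomly chosen feature."""
    feature = rng.randrange(len(rows[0]))
    default = majority(labels, 0)
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or impurity < best[0]:
            best = (impurity, feature, threshold,
                    majority(left, default), majority(right, default))
    return best[1:]  # (feature, threshold, left_label, right_label)


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], rng))
    return forest


def predict(forest, row):
    """Majority vote of all stumps: True means 'likely cancer-related'."""
    votes = sum(left if row[f] <= t else right for f, t, left, right in forest)
    return 2 * votes > len(forest)


forest = fit_forest(ROWS, LABELS)
print(predict(forest, (0.0, 3.0, 3.0)))    # query resembling the toy cancer rows
print(predict(forest, (1.0, -2.0, -2.0)))  # query resembling the toy neutral rows
```

A production random forest grows deeper trees and samples a feature subset at every split; the stumps here keep the example short while preserving the bootstrap-and-vote structure.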

Input:
Enter either a protein accession or a protein sequence pasted in FASTA format, then specify the substitutions to assess. To predict many variants across different genes, use the batch submission form. Results are emailed to the user.
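
Before submitting, it can help to check that each substitution matches the pasted sequence. The sketch below is a local sanity check only and does not call the CanPredict service. It assumes a single-record FASTA input and a substitution written as reference residue, 1-based position, new residue (for example K2E); the substitution notation the form actually expects is not stated here, so that format, the function names, and the toy sequence are all hypothetical.

```python
import re

# Assumed notation: one-letter reference residue, 1-based position, new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_fasta(text):
    """Return (header, sequence) from a single-record FASTA string."""
    header, chunks = None, []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                raise ValueError("expected a single FASTA record")
            header = line[1:].strip()
        else:
            chunks.append(line)
    if header is None:
        raise ValueError("missing FASTA header line")
    return header, "".join(chunks).upper()


def check_substitution(sequence, substitution):
    """Validate a substitution against the sequence; return (ref, pos, alt)."""
    match = SUBSTITUTION.match(substitution.strip().upper())
    if not match:
        raise ValueError(f"unrecognised substitution: {substitution!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    if sequence[pos - 1] != ref:
        raise ValueError(
            f"sequence has {sequence[pos - 1]} at position {pos}, not {ref}")
    return ref, pos, alt


# Toy sequence, invented for illustration.
header, sequence = parse_fasta(">toy_protein\nMKTAY\nIAKQR\n")
print(header, sequence)
print(check_substitution(sequence, "K2E"))
```

Checking the reference residue catches off-by-one positions and substitutions numbered against a different isoform before the job is submitted.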