Splice Site Tool Analysis


NGRL Manchester has published a report describing the assessment of six of the most common donor and acceptor prediction algorithms for their ability to predict the pathogenicity of splice site variants.


Splicing is a process that modifies RNA after transcription: introns are removed and exons are joined together to form mature mRNA, ready for translation into protein. Mutations in the sequence surrounding a splice site junction can readily disrupt the normal splicing of a gene, leading to altered splicing and thus altered protein products.


In silico splice site prediction tools can be used to predict the effect of a genetic variant on splicing. A large number of prediction tools are currently available, but only small-scale analyses of these algorithms have been carried out. The unclassified variant (UV) guidelines provided by the CMGS suggest several splice site prediction algorithms, but the performance of these algorithms has not previously been formally assessed, and they may give divergent results. The splice site tool analysis performed by NGRL Manchester aims to provide a reliable assessment of how well these algorithms predict splicing-related variant pathogenicity. It also assesses the scope of the splice site prediction tools so that they can be used in the most appropriate way, and the report shows scientists how to use them to predict pathogenicity with more confidence.


The report describes the assessment of six of the most common donor and acceptor prediction algorithms for their ability to predict the pathogenicity of splice site variants. SSFL, MaxEntScan, NNSplice and GeneSplicer were accessed through the Alamut interface. HSF and a second implementation of MaxEntScan were accessed through the HSF interface. NetGene2 was run through a standalone web interface. For each algorithm, the splice site signal given by the wild-type sequence was compared with the splice site signal given by a mutated sequence supplied by the user.
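The wild-type versus mutant comparison described above can be sketched in a few lines of Python. This is only an illustration, not the method used in the report: the function name `relative_change`, the percentage-change formula, and the example scores are all assumptions made here, and no threshold for pathogenicity is implied.

```python
def relative_change(wild_type_score, variant_score):
    """Percentage change in predicted splice site strength.

    A negative value means the variant sequence scores lower than the
    wild-type sequence. The absolute value in the denominator keeps the
    sign meaningful for tools whose scores can be negative.
    """
    if wild_type_score == 0:
        raise ValueError("wild-type score is zero; relative change is undefined")
    return 100.0 * (variant_score - wild_type_score) / abs(wild_type_score)


# Hypothetical (wild-type, variant) scores for a single variant,
# keyed by algorithm name. These numbers are made up for illustration.
scores = {
    "MaxEntScan": (8.0, 4.0),
    "NNSplice": (0.9, 0.9),
}

changes = {name: relative_change(wt, var) for name, (wt, var) in scores.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}% change in predicted signal")
```

Each tool reports scores on its own scale, so a relative change is one simple way to put the wild-type and mutant signals side by side. How large a change should be treated as significant is a question for the report itself.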

We conclude that the four algorithms accessed through Alamut have a high degree of accuracy, and users can be confident in the safe interpretation of their results. With the exception of SSFL, the algorithms can also be used as standalone web tools rather than through the Alamut interface. However, results obtained through alternative implementations may differ.

The range of splice site signal strength predictions given by the algorithms depended on the position of the variant. For variants located between +7 and -10 relative to the splice site junction, the algorithms predict a reduction in splicing, and it is in this range that they are likely to be the most useful.
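As a rough illustration of the positional range above, the following Python sketch flags variants whose offset from the splice site junction falls between -10 and +7. The function name, the inclusive bounds, and the example offsets are assumptions made here; the sign convention is taken as given, since this summary does not define it.

```python
# Bounds taken from the range quoted above; treating them as inclusive
# is an assumption of this sketch.
WINDOW_START = -10
WINDOW_END = 7


def in_informative_window(offset):
    """Return True if a signed offset from the splice site junction lies
    within the range where the algorithms are likely to be most useful."""
    return WINDOW_START <= offset <= WINDOW_END


# Hypothetical variant offsets relative to the junction.
offsets = [-25, -10, -2, 5, 7, 12]
flagged = [o for o in offsets if in_informative_window(o)]

print(flagged)
```

A check like this only indicates where the tools' predictions are likely to be informative. It says nothing about whether a given variant is pathogenic.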

Download the report >>