In:
Human Heredity, S. Karger AG, Vol. 73, No. 1 (2012), p. 47-51
Abstract:
Aims: Next-generation sequencing has opened the possibility of large-scale sequence-based disease association studies. A major challenge in interpreting whole-exome data is predicting which of the discovered variants are deleterious and which are neutral. To address this question in silico, we have developed a score called Combined Annotation scoRing toOL (CAROL), which combines information from two bioinformatics tools, PolyPhen-2 and SIFT, to improve the prediction of the effect of non-synonymous coding variants. Methods: We used a weighted Z method that combines the probabilistic scores of PolyPhen-2 and SIFT. We defined 2 dataset pairs to train and test CAROL using information from the dbSNP: ‘HGMD-PUBLIC’ and 1000 Genomes Project databases. The training pair comprises a total of 980 positive control (disease-causing) and 4,845 negative control (non-disease-causing) variants. The test pair consists of 1,959 positive and 9,691 negative controls. Results: CAROL has higher predictive power and accuracy for the effect of non-synonymous variants than either individual annotation tool (PolyPhen-2 or SIFT) and benefits from higher coverage. Conclusion: Combining annotation tools can help improve automated prediction of the functional consequences of whole-genome/exome non-synonymous variants.
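The weighted Z method named in the Methods can be illustrated with a minimal sketch. This is a generic weighted Z (Stouffer-style) combination, not the published CAROL formula: the abstract does not give the weighting scheme, so the function names, the equal default weights, and the clamping constant `eps` below are all assumptions made for illustration. The sketch also assumes that a PolyPhen-2 score near 1 indicates a damaging variant while a SIFT score near 0 does, so the SIFT score is inverted before combining.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def weighted_z(probabilities, weights, eps=1e-6):
    """Combine per-tool probabilities of deleteriousness with a weighted Z method.

    Each probability is mapped to a z-score through the inverse normal CDF,
    the z-scores are combined as sum(w*z) / sqrt(sum(w^2)), and the result is
    mapped back to a probability. `eps` is a hypothetical clamp that keeps
    inputs strictly inside (0, 1), where the inverse CDF is defined.
    """
    if len(probabilities) != len(weights):
        raise ValueError("probabilities and weights must have the same length")
    z_scores = []
    for p in probabilities:
        p = min(max(p, eps), 1.0 - eps)
        z_scores.append(_STD_NORMAL.inv_cdf(p))
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    return _STD_NORMAL.cdf(numerator / denominator)


def combine_polyphen_sift(polyphen2, sift, w_polyphen2=1.0, w_sift=1.0):
    """Illustrative combination of one PolyPhen-2 score and one SIFT score.

    SIFT is inverted (1 - sift) so that both inputs point the same way:
    higher means more likely deleterious. The weights are placeholders,
    not values taken from the paper.
    """
    return weighted_z([polyphen2, 1.0 - sift], [w_polyphen2, w_sift])
```

With equal weights, two tools that agree reinforce each other: calling `combine_polyphen_sift(0.9, 0.1)` gives a combined score above 0.9, while two uninformative inputs of 0.5 stay at 0.5.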
Type of Medium:
Online Resource
ISSN:
0001-5652, 1423-0062
Language:
English
Publisher:
S. Karger AG
Publication Date:
2012
ZDB-ID:
1482710-4
SSG:
12