Effective automation of mutation assessment in cancer

Machine learning models have successfully detected mutations in the genomes of cancer patients.

The training dataset consisted of 41 thousand manually reviewed variants from 440 patients with nine types of tumors. Initial tests relied on one third of the original dataset, and testing then proceeded to 212 thousand variants.
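Setting aside one third of a dataset for testing can be sketched as below. This is a minimal illustration only, not the authors' procedure: the function name `hold_out_split`, the fixed seed, and the integer stand-ins for variant records are all assumptions made for the example.

```python
import random


def hold_out_split(variants, test_fraction=1 / 3, seed=0):
    """Shuffle the records and reserve a fraction of them for testing.

    Hypothetical helper for illustration; not taken from DeepSVR.
    """
    shuffled = list(variants)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # First n_test shuffled records form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


# Integer IDs stand in for the 41 thousand reviewed variants (assumed placeholder data).
train, test = hold_out_split(range(41_000))
print(len(train), len(test))
```

The two returned lists never share a record, so performance measured on the test part is not inflated by variants the model has already seen.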

Of three models, based on logistic regression, a random forest, and deep learning, the latter two performed best. Predictions reached 89.3% agreement with human assessment.
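One simple way to read such an agreement figure is as the share of variants for which the model's call matches the manual reviewer's call. The sketch below assumes that reading; the paper may define the metric differently, and the function name `agreement` and the labels are hypothetical, made-up examples rather than data from the study.

```python
def agreement(model_calls, manual_calls):
    """Return the fraction of variants where model and reviewer give the same call."""
    if len(model_calls) != len(manual_calls):
        raise ValueError("both lists must cover the same variants")
    if not manual_calls:
        raise ValueError("at least one variant is required")
    matches = sum(m == h for m, h in zip(model_calls, manual_calls))
    return matches / len(manual_calls)


# Made-up calls for five variants (illustrative only).
manual = ["pass", "fail", "pass", "pass", "fail"]
model = ["pass", "fail", "fail", "pass", "fail"]
print(f"{agreement(model, manual):.1%}")
```

Here the model disagrees with the reviewer on one of five variants, so the printed agreement is 80.0%.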

The software is publicly available at https://github.com/griffithlab/DeepSVR.

More: “A deep learning approach to automate refinement of somatic variant calling from cancer sequencing data”, B. Ainscough et al., 2018, doi:10.1038/s41588-018-0257-y.