Modeling RNA tertiary structure is a fundamental problem in molecular biophysics, and it is essential for understanding the biological function of RNA and for designing new structures. RNA tertiary structure is largely determined by seven torsion angles of the backbone and side chain, so accurate prediction of these torsion angles is the basis for modeling RNA tertiary structure. At present, only a few methods use deep learning to predict RNA torsion angles, and their accuracy needs further improvement before the predictions can be used to model RNA tertiary structure. In this study, we develop a deep learning method, 1dRNA, to predict RNA backbone torsion angles and pseudotorsion angles. It comprises two deep learning models: a convolutional model (DRCNN), which considers the features of adjacent nucleotides, and a Hyper-long-short-term memory model (DHLSTM), which considers the features of all nucleotides. We show empirically that DRCNN and DHLSTM outperform existing state-of-the-art methods on the same datasets: the prediction accuracy of DRCNN improves by 5% to 28% for the β, δ, ζ, χ, η, and θ angles, and that of DHLSTM improves by 6% to 15% for the β, δ, ζ, χ, η, and θ angles. DRCNN gives better predictions than DHLSTM and the existing models for the δ, ζ, χ, η, and θ angles; DHLSTM gives better predictions than DRCNN and the existing models for the β and ε angles; and the existing models give better predictions than both DRCNN and DHLSTM for the α and γ angles. DRCNN and the existing models predict a more varied distribution of angles than DHLSTM. In terms of stability, DHLSTM is much more stable than DRCNN and the existing models, with fewer outliers.
The results also show that the α and γ angles are the most difficult to predict, that angles in loop regions are more difficult to predict than angles in helical regions, that the models are not sensitive to changes in target sequence length, and that the deviation between the model-predicted angles and the angles of decoys can be used to evaluate the quality of RNA tertiary structures.
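As a small illustration of the quantities discussed above, the following sketch computes a torsion angle from four atom positions and a periodic (wrap-around) error between predicted and reference angles. It is only a sketch under stated assumptions: the abstract does not specify the error metric, so the mean absolute angular difference used here is an assumption, and the function names `dihedral_deg`, `angular_diff_deg`, and `mean_angular_error` are hypothetical helpers, not part of 1dRNA.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees defined by four consecutive points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)          # normal of the plane through p1, p2, p3
    norm_b2 = math.sqrt(_dot(b2, b2))
    y = norm_b2 * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def angular_diff_deg(a, b):
    """Smallest absolute difference between two angles, respecting the 360-degree period."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def mean_angular_error(predicted, reference):
    """Assumed metric: mean of the periodic absolute differences, in degrees."""
    diffs = [angular_diff_deg(p, r) for p, r in zip(predicted, reference)]
    return sum(diffs) / len(diffs)
```

The wrap-around step matters because torsion angles are periodic: a prediction of 170° against a reference of -170° is off by 20°, not 340°.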
Keywords:
- RNA structure
- torsion angle prediction
- deep learning
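Datasets: number of sequences in each sequence-length interval, and secondary-structure composition (brackets, pseudoknot, unpaired).

| Dataset | 20-50 | 50-100 | 100-200 | 200-300 | 300-400 | 400-512 | Brackets | Pseudoknot | Unpaired |
|---|---|---|---|---|---|---|---|---|---|
| Training set | 50 | 179 | 46 | 1 | 7 | 1 | 55.10% | 5.63% | 39.36% |
| Validation set | 20 | 10 | 0 | 0 | 0 | 0 | 52.19% | 9.8% | 38.01% |
| Test set I | 11 | 41 | 10 | 0 | 0 | 0 | 57.58% | 2.81% | 39.61% |
| Test set II | 8 | 16 | 6 | 0 | 0 | 0 | 58.42% | 5.25% | 36.33% |
| Test set III | 40 | 13 | 1 | 0 | 0 | 0 | 65.02% | 2.67% | 32.31% |

Seven standard torsion angles (α, β, γ, δ, ε, ζ, χ) and pseudotorsion angles (η, θ), by model and dataset.

| Model | Dataset | α/(°) | β/(°) | γ/(°) | δ/(°) | ε/(°) | ζ/(°) | χ/(°) | η/(°) | θ/(°) |
|---|---|---|---|---|---|---|---|---|---|---|
| DHLSTM | Validation set | 47.91 | 20.22 | 37.18 | 16.57 | 18.23 | 35.02 | 19.85 | 28.09 | 32.85 |
| DHLSTM | Test set I | 48.20 | 20.66 | 37.13 | 13.08 | 18.82 | 30.27 | 17.33 | 25.74 | 29.22 |
| DHLSTM | Test set II | 47.95 | 19.89 | 35.30 | 15.19 | 17.87 | 30.99 | 17.67 | 27.20 | 31.49 |
| DHLSTM | Test set III | 45.45 | 22.30 | 40.80 | 13.51 | 21.43 | 30.69 | 16.96 | 23.87 | 29.84 |
| DRCNN | Validation set | 44.67 | 19.96 | 35.31 | 13.86 | 22.20 | 31.62 | 19.49 | 24.77 | 30.22 |
| DRCNN | Test set I | 44.84 | 20.74 | 36.27 | 10.51 | 21.48 | 27.53 | 16.39 | 23.12 | 26.34 |
| DRCNN | Test set II | 43.41 | 19.55 | 35.45 | 12.19 | 22.71 | 28.13 | 17.16 | 24.28 | 28.12 |
| DRCNN | Test set III | 27.14 | 15.81 | 25.20 | 9.73 | 14.51 | 17.98 | 11.58 | 13.67 | 17.77 |
| SPOT-RNA-1D[21] | Validation set | 45.18 | 20.58 | 33.88 | 17.99 | 20.72 | 37.50 | 23.01 | 33.55 | 37.02 |
| SPOT-RNA-1D[21] | Test set I | 43.94 | 21.94 | 32.98 | 14.61 | 20.69 | 33.27 | 19.59 | 30.25 | 32.91 |
| SPOT-RNA-1D[21] | Test set II | 39.50 | 18.92 | 29.47 | 16.01 | 17.46 | 28.91 | 18.20 | 28.14 | 30.25 |
| SPOT-RNA-1D[21] | Test set III | 37.89 | 21.04 | 34.68 | 13.83 | 22.32 | 27.87 | 17.01 | 25.31 | 27.22 |

Seven standard torsion angles (α, β, γ, δ, ε, ζ, χ) and pseudotorsion angles (η, θ), by model and pairing type.

| Model | Pairing type | α/(°) | β/(°) | γ/(°) | δ/(°) | ε/(°) | ζ/(°) | χ/(°) | η/(°) | θ/(°) |
|---|---|---|---|---|---|---|---|---|---|---|
| DHLSTM | Brackets | 34.08 | 16.48 | 30.21 | 9.76 | 17.98 | 21.38 | 11.23 | 18.03 | 21.91 |
| DHLSTM | Pseudoknot | 34.20 | 14.98 | 27.06 | 6.80 | 14.25 | 20.29 | 10.98 | 27.41 | 18.02 |
| DHLSTM | Loop region | 66.77 | 32.60 | 60.72 | 21.05 | 27.54 | 47.85 | 28.52 | 35.41 | 46.16 |
| DRCNN | Brackets | 19.43 | 11.40 | 18.54 | 6.65 | 11.84 | 12.0 | 8.30 | 10.90 | 12.94 |
| DRCNN | Pseudoknot | 20.42 | 14.25 | 16.75 | 6.73 | 12.86 | 13.54 | 10.25 | 16.14 | 13.52 |
| DRCNN | Loop region | 40.84 | 23.26 | 37.44 | 15.59 | 19.07 | 29.07 | 18.44 | 19.25 | 27.08 |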
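The percentage improvements quoted in the abstract appear to be consistent with the Test set I rows of the results table. The sketch below shows one way to check this, assuming the tabulated values are errors in degrees (lower is better) and that improvement is measured relative to SPOT-RNA-1D; both are assumptions, and `relative_improvement` is a hypothetical helper name.

```python
# Test set I values copied from the results table (degrees).
ANGLES = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta", "chi", "eta", "theta"]

TEST_I = {
    "DHLSTM":      [48.20, 20.66, 37.13, 13.08, 18.82, 30.27, 17.33, 25.74, 29.22],
    "DRCNN":       [44.84, 20.74, 36.27, 10.51, 21.48, 27.53, 16.39, 23.12, 26.34],
    "SPOT-RNA-1D": [43.94, 21.94, 32.98, 14.61, 20.69, 33.27, 19.59, 30.25, 32.91],
}

def relative_improvement(model, baseline="SPOT-RNA-1D"):
    """Percent reduction of each angle's value relative to the baseline.

    Assumes lower values are better; a negative result means the
    baseline has the lower value for that angle.
    """
    return {
        angle: 100.0 * (b - m) / b
        for angle, m, b in zip(ANGLES, TEST_I[model], TEST_I[baseline])
    }

if __name__ == "__main__":
    for model in ("DRCNN", "DHLSTM"):
        for angle, pct in relative_improvement(model).items():
            print(f"{model:7s} {angle:8s} {pct:6.1f}%")
```

Under these assumptions, the DRCNN values for β and δ come out near the 5% and 28% ends of the quoted range, while the α and γ values are negative, matching the statement that existing models do better on those two angles.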