An accurate description of the free energy landscape of complex molecular systems is the basis for understanding and manipulating their behavior, and for eventually industrializing molecular design and manufacture. The main challenge in characterizing a high-dimensional free energy landscape is that it usually has multiple levels across different time and spatial scales: each level may contain more than one metastable state, the states are separated by free energy barriers, and there may be more than one pathway across a barrier. Many systems also behave nonlinearly, so both analytical theory and brute-force molecular simulation face great difficulty. To address these challenges, many enhanced sampling methods have been developed over the years, but applying them usually involves many empirical choices and manual operations, which slows research progress and makes error control difficult. Although variational methods have been widely applied with great success in physics, statistics, and engineering, their application to complex molecular systems has only just begun, alongside the development of neural networks. This brief review summarizes the main directions, progress, and limitations of these exploratory studies and gives an outlook on possible future developments. We hope it will stimulate more attention and effort toward variational artificial intelligence algorithms for the free energy landscapes of molecular systems, and promote practical applications such as macromolecular drugs and molecular biological machines.
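| Variational method | Main objective | Collective-space problem class addressed | Features or main limitations |
| --- | --- | --- | --- |
| Spectral decomposition analysis: linear combination of basis functions | Solve for collective variables and the transition rates between substates, given a partition of conformational space into substates | Class 1, Class 2 | Limited by the Markov assumption and the linear basis; conformational substates must be partitioned manually |
| Spectral decomposition analysis: neural network implementation | Solve directly for the substate partition and the corresponding transition rates from given trajectories | Class 2 | Markov assumption; eigenfunctions have no analytical representation; the architecture must be adjusted manually to test different numbers of clusters |
| Time correlation function of the free energy barrier crossing probability: linear combination of basis functions | Solve for state-transition pathways, and the free energy barrier crossing probability along them, using linear combinations within a chosen basis space | Class 3 | Limited by the linear combination of basis functions; initial and final states must be defined |
| Time correlation function of the free energy barrier crossing probability: neural network implementation | Solve for state-transition pathways, and the free energy barrier crossing probability along them, in a space of neural network functions consistent with the given initial and final states | Class 3 | Initial and final states must be defined |
| Bias-potential-based variation: linear combination of basis functions | Use bias-potential enhanced sampling to quickly find the main free energy basins along given collective variables, in the space of linear combinations of basis functions | Class 2 | The functional is limited by the choice of basis |
| Bias-potential-based variation: neural network implementation | Use bias-potential enhanced sampling to quickly find the main free energy basins along given collective variables, in a space of neural network functions | Class 2 | The sampling needed to evaluate the functional derivative is tightly coupled to the accuracy of the bias potential (and of the corresponding free energy); convergence is limited by the asymmetry of the KL divergence |
| Lumpability and Decomposability | Optimize collective variables | Class 1 | Explicit error control; the variance depends on the dimension of the latent space; consistency of the two definitions requires a reversible process |
| Information bottleneck model | Solve for the collective-variable (CV) representation corresponding to the information bottleneck, and use a bias potential to accelerate sampling of the free energy surface | Class 2 | Limited by the assumption of a linear encoding process |
| Variational adaptive method | Combine coarse-grained information to accelerate sampling and solve for the free energy surface | Class 2 | The overall architecture is relatively complex |
| Variational autoencoder | Accelerate sampling through the collective-variable space to solve for the free energy surface and the transition pathways between clusters | Class 2, Class 3 | Particular focus on the latent space |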
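To make the first row of the table concrete, the following is a minimal sketch of spectral decomposition with a linear combination of basis functions. It is an illustration under stated assumptions, not the implementation used in any of the reviewed works: the function name `vac_linear`, the toy two-state Markov chain, and the indicator-function basis (one basis function per manually defined substate) are all hypothetical choices made here. The sketch builds instantaneous and time-lagged correlation matrices from a trajectory and solves the resulting generalized eigenvalue problem; the correlation matrices are symmetrized, which assumes reversible dynamics.

```python
import numpy as np

def vac_linear(features, lag):
    """Variational estimate of slow modes from a linear basis (hypothetical helper).

    features: (T, n) array of basis functions evaluated along one trajectory.
    Returns eigenvalues in descending order and coefficient vectors as columns.
    """
    x0, xt = features[:-lag], features[lag:]
    n_pairs = len(x0)
    # Instantaneous and time-lagged correlation matrices,
    # symmetrized under the assumption of reversible dynamics.
    c0 = (x0.T @ x0 + xt.T @ xt) / (2 * n_pairs)
    ct = (x0.T @ xt + xt.T @ x0) / (2 * n_pairs)
    # Solve ct v = lambda c0 v by whitening with c0^(-1/2),
    # dropping numerically empty directions of the basis.
    s, u = np.linalg.eigh(c0)
    keep = s > 1e-10 * s.max()
    w = u[:, keep] / np.sqrt(s[keep])
    lam, v = np.linalg.eigh(w.T @ ct @ w)
    order = np.argsort(lam)[::-1]
    return lam[order], (w @ v)[:, order]

# Toy data (assumption): a two-state Markov chain standing in for a
# trajectory that has already been assigned to two conformational substates.
rng = np.random.default_rng(0)
p = np.array([[0.95, 0.05],
              [0.10, 0.90]])
n_steps = 50_000
states = np.zeros(n_steps, dtype=int)
u_rand = rng.random(n_steps)
for t in range(1, n_steps):
    # Move to (or stay in) state 1 with probability p[current, 1].
    states[t] = int(u_rand[t] < p[states[t - 1], 1])

# Indicator basis: column i is 1 when the trajectory is in substate i.
features = np.eye(2)[states]
eigvals, coeffs = vac_linear(features, lag=1)

# Implied relaxation timescale of the slowest process, in trajectory steps.
timescale = -1.0 / np.log(eigvals[1])
```

With an indicator basis this reduces to estimating a Markov state model, which is why the limitation listed in the table applies: the quality of the result depends on how the substates were partitioned. For this toy chain the exact transition matrix has eigenvalues 1 and 0.85, so the second estimated eigenvalue should lie close to 0.85, up to sampling error.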