Academic Viewpoints
Title: A Comparative Study of Sequence Clustering Algorithms
Authors: Z. Ju et al.
Source: Big Data Mining and Analytics, vol. 8, no. 5, pp. 1011-1022.
Abstract: Sequence clustering software is essential in bioinformatics. However, selecting an appropriate tool can be challenging because the available tools differ in their algorithms and target applications. This paper analyzes and evaluates eight representative software tools (algorithms) in terms of precision, sensitivity, speed, scale of running time, and memory consumption. It also examines the effects of sequence count, sequence length, identity, thread count, and GPU use on these measures. Sequence length and identity significantly affect clustering efficiency (speed and memory consumption), with fluctuations exceeding an order of magnitude and non-monotonic effects observed. The evaluation results are analyzed and summarized in tables for users' reference.
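Editor's translation: Sequence clustering software is a key tool in bioinformatics research, but the variety of algorithms and intended applications makes it difficult to choose a suitable one. This paper selects eight representative software tools (algorithms) and systematically evaluates them on precision, sensitivity, speed, scale of running time, and memory consumption. It further examines how sequence count, sequence length, similarity threshold, thread count, and GPU acceleration affect these measures. The study shows that sequence length and similarity threshold significantly affect clustering efficiency (speed and memory consumption), with fluctuations exceeding an order of magnitude and non-monotonic behavior. The evaluation results are consolidated into tables for researchers' reference.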
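The abstract names precision and sensitivity as evaluation measures but does not say how they are computed. As a rough illustration only, the sketch below uses a common pairwise definition, which is an assumption and not necessarily the paper's: a pair of sequences counts as a true positive when both the predicted and the reference clustering place them together. The function name and the toy labels are hypothetical.

```python
from itertools import combinations


def pairwise_precision_sensitivity(predicted, reference):
    """Compare two clusterings given as {sequence_id: cluster_label} dicts.

    Assumed definition (not taken from the paper): every unordered pair of
    sequences is scored by whether each clustering puts the pair together.
    """
    ids = sorted(set(predicted) & set(reference))
    tp = fp = fn = 0
    for a, b in combinations(ids, 2):
        same_pred = predicted[a] == predicted[b]
        same_ref = reference[a] == reference[b]
        if same_pred and same_ref:
            tp += 1  # pair correctly clustered together
        elif same_pred:
            fp += 1  # pair merged although the reference separates it
        elif same_ref:
            fn += 1  # pair split although the reference joins it
    precision = tp / (tp + fp) if tp + fp else 1.0
    sensitivity = tp / (tp + fn) if tp + fn else 1.0
    return precision, sensitivity


# Toy data: five sequences, two reference families, one sequence misplaced.
reference = {"s1": "famA", "s2": "famA", "s3": "famA", "s4": "famB", "s5": "famB"}
predicted = {"s1": 0, "s2": 0, "s3": 1, "s4": 1, "s5": 1}
precision, sensitivity = pairwise_precision_sensitivity(predicted, reference)
print(precision, sensitivity)
```

Pair counting grows quadratically with the number of sequences, so a sketch like this only suits small checks; at the scales the paper evaluates, per-cluster contingency counts would be the usual substitute.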
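Title: Innovative Development of the Supercomputing Center of the Chinese Academy of Sciences (中国科学院超级计算中心创新发展)
Authors: 钱芳, 柴芳姣, 赵芸卿, 田原, 白一頔, 姜金荣
Source: 数据与计算发展前沿 (Frontiers of Data & Computing), 2025, 7(3): 15-29.
Abstract: Supercomputing bears on national development and is a strategic high ground that countries around the world are competing to occupy. With the support of the Chinese Academy of Sciences (CAS), the CAS Supercomputing Center draws on the academy's institutional strengths and, guided by the needs of scientific computing applications, has invested actively and steadily in supercomputing environments, foundational software, and application software. It has achieved substantial results at the intersection of computing and scientific research, advancing independent innovation in China's high-performance computing technology and strengthening its international competitiveness.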
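Title: Development History and Reflections on the Informatization of Science Communication at the Chinese Academy of Sciences (中国科学院科学传播信息化的发展历程与思考)
Authors: 林磊, 杜义华, 王英, 张文韬, 何洪波, 陈雄, 王闰强
Source: 数据与计算发展前沿 (Frontiers of Data & Computing), 2025, 7(3): 40-47.
Abstract: This paper describes the Chinese Academy of Sciences' (CAS) exploration of informatized science communication and analyzes its development, main achievements, and challenges. It traces that development by reviewing the historical background, key milestones, and principal results. It then analyzes how the internet, the mobile internet, big data, and artificial intelligence have affected science communication, and how CAS has adapted to and used these technologies to drive innovation in the field. CAS has advanced science communication in each era: when the internet emerged, it built networked platforms to integrate and disseminate popular-science resources efficiently; in the mobile internet era, it built a new-media matrix for diversified, targeted communication; in the big data era, it used data technology to improve the precision of communication; and in the AI era, it is exploring AIGC technology for innovative applications in science communication. CAS has achieved notable results in this area and has contributed significantly to improving public scientific literacy and promoting interaction between science and society. As AI technology continues to develop, CAS will keep exploring new technologies to drive innovation in science communication and to build a more intelligent, precise, and efficient science communication system.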
Title: Biological Knowledge Graph-Enhanced Cancer State Prediction Network with Adjustable Connections
Authors: Y. Liu, S. Yi, X. Chen, W. Guo, and L. He
Source: Big Data Mining and Analytics, vol. 8, no. 5, pp. 1174-1188.
Abstract: With recent advances in oncology research, effectively utilizing patient genomic data to predict cancer status has become a significant challenge. Although previous deep learning methods based on biological knowledge have made progress, they still face limitations: the biological functional datasets they use cover biological knowledge narrowly and imprecisely, and the methods adapt poorly to data variations. To address these challenges, this paper proposes BKGNet, a biological knowledge graph-enhanced cancer state prediction network with adjustable connections, which is a deep learning model built on a biological knowledge graph and an adjustable connection mechanism. Specifically, we construct a predictive neural network that leverages the richer and more precise structured biological information from the knowledge graph, providing the model with more comprehensive and accurate biological background support and thereby enhancing its ability to model complex biological relationships. We also introduce an adjustable connection mechanism that, while ensuring the rationality of biological relationships, transforms fixed biological connections into learnable connection strengths. This allows the model to flexibly adjust the interactions between biological entities. Experimental results demonstrate that BKGNet outperforms traditional machine learning and deep learning baseline models in terms of prediction accuracy, highlighting the advantages of its network architecture. Further ablation experiments validate the effectiveness of both the knowledge graph and the adjustable connection mechanism.
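Editor's translation: As oncology research advances, how to use patient genomic data effectively to predict cancer status accurately has become a major challenge. Although earlier deep learning methods based on biological knowledge have made some progress, they remain limited by the narrow scope and insufficient precision of the knowledge covered by biological functional datasets, and they adapt poorly to heterogeneous data. To address these problems, this paper proposes a new deep learning neural network model that combines a biological knowledge graph with an adjustable connection mechanism: the biological knowledge graph-enhanced cancer state prediction network with adjustable connections. Specifically, the model draws on the more complete and precise structured biological information in the knowledge graph, giving the network more comprehensive and accurate biological background support and strengthening its ability to model complex biological relationships. It also introduces an adjustable connection mechanism that, while keeping biological relationships plausible, turns fixed connections into learnable connection strengths, so that the model can flexibly adjust the interactions between biological entities. Experimental results show that the network significantly outperforms traditional machine learning and deep learning baseline models in prediction accuracy.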
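The abstract describes the adjustable connection mechanism only at a high level. One plausible reading, sketched below as an assumption and not as BKGNet's actual architecture, is a masked layer: a binary mask derived from knowledge-graph edges decides which connections may exist, and a learnable strength scales each permitted connection. The mask, the gene-to-pathway interpretation, and all function names are hypothetical.

```python
import math

# Hypothetical prior knowledge: rows are genes, columns are pathways;
# 1 means the knowledge graph links that gene to that pathway.
MASK = [
    [1, 0],
    [1, 1],
    [0, 1],
]


def effective_weights(strengths, mask=MASK):
    """Learnable strengths count only where the mask permits a connection."""
    return [[s * m for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(strengths, mask)]


def forward(x, strengths, mask=MASK):
    """Masked linear layer followed by tanh."""
    w = effective_weights(strengths, mask)
    return [math.tanh(sum(x[i] * w[i][j] for i in range(len(x))))
            for j in range(len(mask[0]))]


def sgd_step(x, strengths, grad_out, lr=0.1, mask=MASK):
    """One gradient step on the strengths.

    The gradient through a masked-out edge is zero, so training can rescale
    permitted connections but never creates a forbidden one.
    """
    out = forward(x, strengths, mask)
    return [
        [strengths[i][j]
         - lr * grad_out[j] * (1 - out[j] ** 2) * x[i] * mask[i][j]
         for j in range(len(mask[0]))]
        for i in range(len(mask))
    ]


x = [1.0, 0.5, -2.0]                                 # toy expression values
strengths = [[0.3, 0.9], [-0.2, 0.4], [0.7, 0.1]]    # arbitrary initial values
updated = sgd_step(x, strengths, grad_out=[1.0, 1.0])
print(effective_weights(updated))
```

With a fixed-connection design, the permitted weights would all be held at a preset value; here they start from arbitrary values and move during training, while entries outside the mask stay inert.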
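Title: A Framework for Identifying New Digital Divides in the Context of Generative Artificial Intelligence (GAI) (生成式人工智能(GAI)背景下的新型数字鸿沟识别框架研究)
Authors: 孙榕, 李白杨
Source: 图书情报知识 (Documentation, Information & Knowledge), 2025, 42(3): 19-30.
Abstract: This study aims to identify the manifestations and core content of the new digital divide in the context of generative AI, and to offer insights for resolving the cognitive and behavioral difficulties of digitally disadvantaged groups. Taking a "cognition-behavior" perspective, it builds its framework through a large-scale literature survey. The resulting "cognition-access-use-evaluation" framework identifies and analyzes three-tier indicators of the new digital divide in generative AI environments. Based on the analysis, the study proposes targeted countermeasures and outlines directions for future research. The framework provides a reference for concept clarification, form identification, and bridging pathways related to the new digital divide.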
