Version 0.8 (2025-06-27)
1、Overview
We are excited to announce the official release of KAG version 0.8. This update continues to improve the consistency, rigor, and accuracy of reasoning and question answering by large models over knowledge bases. It also introduces several significant new features and capabilities.
First, we have upgraded the capabilities of the KAG knowledge base. We have expanded support for two modes: private domain knowledge bases (including structured and unstructured data) and public domain knowledge bases. This includes the ability to integrate public web data sources such as LBS and WebSearch via the MCP protocol. Additionally, we have improved the management of private domain knowledge base indexing, incorporating multiple foundational index types such as Outline, Summary, KnowledgeUnit, AtomicQuery, Chunk, and Table. This supports developers in customizing indexes and synchronizing them with product interfaces. Users can select the most appropriate index type based on their specific scenarios, achieving a balance between construction costs and business outcomes.
Second, we have optimized the product experience and refined the system integration interfaces. On the product front, we have decoupled the knowledge base from applications. The knowledge base now manages private domain data (both structured and unstructured) and public domain data, while applications can link to multiple knowledge bases. Based on the index types used during knowledge base construction, the system automatically adapts the corresponding retrieval engine to recall data. Both applications and knowledge bases independently manage model dependencies and task scheduling, enhancing flexibility.
In terms of system integration, we have refined the KAG recall and Q&A interfaces and added support for embedding the reasoning and Q&A page within business applications. We have also fully embraced the MCP protocol, so KAG reasoning and Q&A can be integrated into agent workflows.
Third, KAG has been adapted to the KAG-Thinker model. Optimizations include breadth-wise decomposition and depth-wise solving of complex problems, knowledge boundary determination, and noise resistance for retrieval results. Guided by a multi-round iterative thinking paradigm, these optimizations improve the stability of the KAG framework's reasoning paradigm and the rigor of its reasoning logic. Further details on the release and usage of the KAG-Thinker model will be provided in our upcoming announcements.
In addition to the framework and product optimizations above, we have improved Q&A efficiency and the stability of streaming output. For this release, we used Qwen2.5-72B as the base model and ran aligned comparisons against several RAG frameworks and on selected KG datasets. The overall benchmark results are shown in Figures 1-3, with detailed metrics in the OpenBenchMark section.
Figure 1 Performance of KAG V0.8 and baselines on Multi-hop QA benchmarks
Figure 2 Performance of KAG V0.8 and baselines (from OpenKG OneEval) on Knowledge based QA benchmarks
Figure 3 Performance of KAG V0.8 on NetOperatorQA benchmarks
2、Framework Enhancements
2.1、Configurable Index Management
Figure 4 KAG Framework Optimization in V0.8
Index extraction and retrieval are core capabilities of knowledge base applications. KAG v0.8 has upgraded its architecture to support configurable management of index construction and retrieval. Each index type is equipped with its own extractor (Extractor) and retriever (Retriever), managed through the IndexManager.
KAG comes with built-in foundational index types such as KnowledgeUnit (an enhanced version of graph triples), Outline, Summary, Chunk, AtomicQuery, and Table, along with their corresponding extractors and retrievers. During the knowledge base construction phase, the platform invokes the appropriate extractor based on the index types selected by the user to perform index extraction. In the application phase, based on the knowledge bases associated with the application, the platform automatically calls the corresponding retriever to complete the recall of graph/chunk/doc data and integrates with the KAG-Solver pipeline to enable reasoning and question-answering.
Developers can extend the IndexManager to implement or combine custom extractors and retrievers. After packaging KAG with these custom components and replacing the corresponding installation package in the openspg-server image, users can select the custom index types on the product page during knowledge base construction. This enables seamless integration of custom index types with the product.
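The pairing of one extractor and one retriever per index type can be pictured with a small registry. The sketch below is only an illustration of that idea: `IndexType`, `IndexRegistry`, `chunk_extractor`, and `keyword_retriever` are hypothetical names invented here, not KAG's actual IndexManager API, and the extraction and retrieval logic is deliberately trivial.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration; these are not KAG's real classes.
Extractor = Callable[[str], List[str]]             # document text -> index entries
Retriever = Callable[[str, List[str]], List[str]]  # (query, entries) -> hits


@dataclass
class IndexType:
    name: str
    extractor: Extractor
    retriever: Retriever


class IndexRegistry:
    """Toy stand-in for the role the release notes assign to IndexManager."""

    def __init__(self) -> None:
        self._types: Dict[str, IndexType] = {}

    def register(self, index_type: IndexType) -> None:
        self._types[index_type.name] = index_type

    def build(self, selected: List[str], document: str) -> Dict[str, List[str]]:
        # Construction phase: run only the extractors the user selected.
        return {name: self._types[name].extractor(document) for name in selected}

    def recall(self, query: str, built: Dict[str, List[str]]) -> List[str]:
        # Application phase: choose retrievers from the index types that were built.
        hits: List[str] = []
        for name, entries in built.items():
            hits.extend(self._types[name].retriever(query, entries))
        return hits


def chunk_extractor(document: str) -> List[str]:
    # Naive sentence split, standing in for real chunking.
    return [s.strip() for s in document.split(".") if s.strip()]


def keyword_retriever(query: str, entries: List[str]) -> List[str]:
    # Keep entries that share at least one lowercase word with the query.
    words = set(query.lower().split())
    return [e for e in entries if words & set(e.lower().split())]


registry = IndexRegistry()
registry.register(IndexType("Chunk", chunk_extractor, keyword_retriever))

built = registry.build(["Chunk"], "KAG builds indexes. Retrievers recall data.")
hits = registry.recall("which retrievers recall", built)
```

In this toy version, a custom index type would be one more `register` call with its own extractor and retriever. The real extension points and packaging steps are those described in the paragraph above.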
3、OpenBenchMark
3.1、Multi-hop QA Dataset
3.1.1、benchMark
- musique
Method | em | f1 | llm_accuracy |
---|---|---|---|
Naive Gen | 0.033 | 0.074 | 0.083 |
Naive RAG | 0.248 | 0.357 | 0.384 |
HippoRAGV2 | 0.289 | 0.404 | 0.452 |
PIKE-RAG | 0.383 | 0.498 | 0.565 |
KAG-V0.6.1 | 0.363 | 0.481 | 0.547 |
KAG-V0.7LC | 0.379 | 0.513 | 0.560 |
KAG-V0.7 | 0.385 | 0.520 | 0.579 |
KAG-V0.8 | 0.432 | 0.569 | 0.626 |
- hotpotqa
Method | em | f1 | llm_accuracy |
---|---|---|---|
Naive Gen | 0.223 | 0.313 | 0.342 |
Naive RAG | 0.566 | 0.704 | 0.762 |
HippoRAGV2 | 0.557 | 0.694 | 0.807 |
PIKE-RAG | 0.558 | 0.686 | 0.787 |
KAG-V0.6.1 | 0.599 | 0.745 | 0.841 |
KAG-V0.7LC | 0.600 | 0.744 | 0.828 |
KAG-V0.7 | 0.603 | 0.748 | 0.844 |
KAG-V0.8 | 0.625 | 0.772 | 0.857 |
- twowiki
Method | em | f1 | llm_accuracy |
---|---|---|---|
Naive Gen | 0.199 | 0.310 | 0.382 |
Naive RAG | 0.448 | 0.512 | 0.573 |
HippoRAGV2 | 0.542 | 0.618 | 0.684 |
PIKE-RAG | 0.63 | 0.72 | 0.81 |
KAG-V0.6.1 | 0.666 | 0.755 | 0.811 |
KAG-V0.7LC | 0.683 | 0.769 | 0.826 |
KAG-V0.7 | 0.684 | 0.770 | 0.836 |
KAG-V0.8 | 0.692 | 0.784 | 0.847 |
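The release notes do not state how `em` and `f1` are computed. As a reading aid, the sketch below assumes the SQuAD-style definitions commonly used for multi-hop QA: exact match after answer normalization, and token-level F1 between prediction and gold answer. The function names are illustrative, and the actual KAG evaluation scripts may differ. `llm_accuracy` is judged by an LLM and is not sketched here.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    # Lowercase, drop punctuation and English articles, collapse whitespace.
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    # 1.0 only when the normalized strings are identical.
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumed definitions, a dataset-level score such as those in the tables above would be the mean of the per-question values.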
3.1.2、Parameters for each method
Method | dataset | LLM(Build/Reason) | embed | param |
---|---|---|---|---|
Naive Gen | 10k docs and 1k questions provided by the HippoRAG paper | qwen2.5-72B | bge-m3 | - |
Naive RAG | same as above | qwen2.5-72B | bge-m3 | num_docs: 10 |
HippoRAGV2 | same as above | qwen2.5-72B | bge-m3 | retrieval_top_k=200 linking_top_k=5 max_qa_steps=3 qa_top_k=5 graph_type=facts_and_sim_passage_node_unidirectional embedding_batch_size=8 |
PIKE-RAG | same as above | qwen2.5-72B | bge-m3 | tagging_llm_temperature: 0.7 qa_llm_temperature: 0.0 chunk_retrieve_k: 8 chunk_retrieve_score_threshold: 0.5 atom_retrieve_k: 16 atomic_retrieve_score_threshold: 0.2 max_num_question: 5 num_parallel: 5 |
KAG-V0.6.1 | same as above | qwen2.5-72B | bge-m3 | refer to the kag_config.yaml files in each subdirectory under https://github.com/OpenSPG/KAG/tree/v0.6/kag/examples. |
KAG-V0.7LC | same as above | builder:qwen2.5-7B solver:qwen2.5-72B | bge-m3 | refer to the kag_config.yaml files in each subdirectory of open_benchmarks under https://github.com/OpenSPG/KAG. |
KAG-V0.7 | same as above | qwen2.5-72B | bge-m3 | refer to the kag_config.yaml files in each subdirectory of open_benchmarks under https://github.com/OpenSPG/KAG. |
KAG-V0.8 | same as above | qwen2.5-72B | bge-m3 | refer to the kag_config.yaml files in each subdirectory of open_benchmarks under https://github.com/OpenSPG/KAG. |
3.2、NetOperatorQA
Method | em | f1 | llm_accuracy | Methodology | Metric Sources |
---|---|---|---|---|---|
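Naive_RAG | - | - | 78.6% | Vector-Based Knowledge Base Retrieval | Ant Group KAG Team |
KAG-V0.8 | - | - | 93.9% | Custom NetOperatorQA Pipeline Based on KAG Framework | Ant Group KAG Team; see https://github.com/OpenSPG/KAG/kag/examples/NetOperatorQA |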
3.3、Structured Datasets
- PeopleRelQA
Method | em | f1 | llm_accuracy | Methodology | Metric Sources |
---|---|---|---|---|---|
deepseek-v3(OpenKG oneEval) | - | 2.60% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
qwen2.5-72B(OpenKG oneEval) | - | 2.50% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
GPT-4o(OpenKG oneEval) | - | 3.20% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
QWQ-32B(OpenKG oneEval) | - | 3.00% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
Grok 3(OpenKG oneEval) | - | 4.70% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
KAG-V0.7 | 45.5% | 86.6% | 84.8% | Custom PRQA Pipeline with Cypher Solver Based on KAG Framework | Ant Group KAG Team |
KAG-V0.8 | 47.6% | 89.3% | 91.0% | Custom PRQA Pipeline with Cypher Solver Based on KAG Framework | Ant Group KAG Team |
- AffairQA
Method | em | f1 | llm_accuracy | Methodology | Metric Sources |
---|---|---|---|---|---|
deepseek-v3 | - | 42.50% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
qwen2.5-72B | - | 45.00% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
GPT-4o | - | 41.00% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
QWQ-32B | - | 45.00% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
Grok 3 | - | 45.50% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
KAG-V0.7 | 77.5% | 83.1% | 88.2% | Custom AffairQA Pipeline Based on KAG Framework | Ant Group KAG Team |
KAG-V0.8 | 78.4% | 84.7% | 97.5% | Custom AffairQA Pipeline Based on KAG Framework | Ant Group KAG Team |
4、Product and platform optimization
4.1、System Integration
This release offers three methods to integrate KAG into business systems: HttpAPI, MCP Protocol, and Frontend Page Embedding.
The KAG HttpAPI provides two types of interfaces: recall and reasoning & Q&A. Developers can choose to use KAG as a retrieval source or as a complete reasoning and Q&A capability.
The KAG MCP Protocol offers interfaces for reasoning and Q&A, which can be embedded into agent applications like Cursor.
In this release, the reasoning and Q&A functionality has been upgraded to an independent page, allowing developers to embed this page into their business systems.
4.2、User Experience Optimization
This release addresses several community concerns, including response latency, streaming output stability, model configuration management, and task scheduling reliability. We greatly appreciate the community's support and patience with KAG.
5、Future Plans
In upcoming versions, we will continue to focus on enhancing large models' ability to leverage external knowledge bases. Our goal is to achieve bidirectional enhancement and seamless integration between large models and symbolic knowledge, improving the factuality, rigor, and consistency of reasoning and Q&A in professional scenarios. We will also keep releasing updates to push the boundaries of capability and drive adoption in vertical domains.
6、Acknowledgments
We extend our heartfelt gratitude to the following experts and colleagues for their invaluable support during this framework upgrade:
- Open Source Community Contributors: Novelrui, thundax-lyp, thesteganos, Like0x, unrealise, J4ckycjl, luzizhuo, hy89
- Special Thanks: Senior developer Li Yunpeng, for developing plugins for IDEA, PyCharm, and VSCode, which significantly improved the experience of writing KAG schemas.
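1、Overview
We have officially released KAG version 0.8. This update aims to keep improving the consistency, rigor, and accuracy of reasoning and question answering by large models over knowledge bases, and it introduces several important new features.
First, we upgraded the capabilities of the KAG knowledge base. It now supports two modes, private-domain knowledge bases (structured and unstructured data) and public-network knowledge bases, and public data sources such as LBS and WebSearch can be brought in through the MCP protocol. We also upgraded index management for private-domain knowledge bases: basic index types such as Outline, Summary, KnowledgeUnit, AtomicQuery, Chunk, and Table are built in, and developers can define custom indexes that are linked with the product interface. Users can choose the index types that suit their scenario and balance construction cost against business results.
Second, we optimized the product experience and refined the system integration interfaces. On the product side, the knowledge base is decoupled from applications. The knowledge base manages private-domain data (structured and unstructured) and public-network data. An application can link to multiple knowledge bases, and the matching retriever is selected automatically from the index types used when each knowledge base was built. Applications and knowledge bases manage model dependencies and task scheduling independently, which improves flexibility. On the system integration side, we refined the KAG recall and Q&A interfaces and added support for embedding the reasoning and Q&A page in business applications. We have fully embraced MCP, so KAG reasoning and Q&A can be connected to agent workflows through the MCP protocol.
Third, KAG has been adapted to the KAG-Thinker model. Through optimizations such as breadth-wise decomposition and depth-wise solving of complex problems, knowledge boundary determination, and noise resistance for retrieval results, and guided by a multi-round iterative thinking paradigm, the stability of the KAG framework's reasoning paradigm and the rigor of its reasoning logic have improved. For the release and usage of the KAG-Thinker model, please watch for our upcoming announcements.
In addition to the framework and product optimizations above, we improved Q&A efficiency and the stability of streaming output. For this release, we used Qwen2.5-72B as the base model and ran aligned comparisons against several RAG frameworks and on selected KG datasets. The overall benchmark results are shown in Figures 1-3, with details in the OpenBenchMark section.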
Figure 1 Performance of KAG V0.8 and baselines on Multi-hop QA benchmarks
Figure 2 Performance of KAG V0.8 and baselines (from OpenKG OneEval) on Knowledge based QA benchmarks
Figure 3 Performance of KAG V0.8 on NetOperatorQA benchmarks
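2、Framework Enhancements
2.1、Configurable Index Management
Figure 4 KAG Framework Optimization in V0.8
Index extraction and index retrieval are core capabilities of knowledge base applications. KAG v0.8 upgrades its architecture to support configurable management of index construction and retrieval. Each index type has its own extractor (Extractor) and retriever (Retriever), managed through the IndexManager.
KAG has built-in basic index types such as KnowledgeUnit (an upgraded version of graph triples), Outline, Summary, Chunk, AtomicQuery, and Table, together with the corresponding extractors and retrievers. In the knowledge base construction phase, the platform calls the matching extractor for each index type the user selected. In the application phase, the platform uses the knowledge bases linked to the application to call the matching retrievers, recalls graph/chunk/doc data, and works with the KAG-Solver pipeline to perform reasoning and question answering.
Developers can extend the IndexManager to implement or combine their own extractors and retrievers. After packaging KAG and replacing the corresponding installation package in the openspg-server image, the custom index types can be selected on the knowledge base construction page, which links custom index types with the product.
2.2、KAG-Thinker Engineering
Please watch for our upcoming KAG-Thinker model announcement.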
3、OpenBenchMark
3.1、Multi-hop Factual QA
3.1.1、benchMark
- musique
Method | em | f1 | llm_accuracy |
---|---|---|---|
Naive Gen | 0.033 | 0.074 | 0.083 |
Naive RAG | 0.248 | 0.357 | 0.384 |
HippoRAGV2 | 0.289 | 0.404 | 0.452 |
PIKE-RAG | 0.383 | 0.498 | 0.565 |
KAG-V0.6.1 | 0.363 | 0.481 | 0.547 |
KAG-V0.7LC | 0.379 | 0.513 | 0.560 |
KAG-V0.7 | 0.385 | 0.520 | 0.579 |
KAG-V0.8 | 0.432 | 0.569 | 0.626 |
- hotpotqa
Method | em | f1 | llm_accuracy |
---|---|---|---|
Naive Gen | 0.223 | 0.313 | 0.342 |
Naive RAG | 0.566 | 0.704 | 0.762 |
HippoRAGV2 | 0.557 | 0.694 | 0.807 |
PIKE-RAG | 0.558 | 0.686 | 0.787 |
KAG-V0.6.1 | 0.599 | 0.745 | 0.841 |
KAG-V0.7LC | 0.600 | 0.744 | 0.828 |
KAG-V0.7 | 0.603 | 0.748 | 0.844 |
KAG-V0.8 | 0.625 | 0.772 | 0.857 |
- twowiki
Method | em | f1 | llm_accuracy |
---|---|---|---|
Naive Gen | 0.199 | 0.310 | 0.382 |
Naive RAG | 0.448 | 0.512 | 0.573 |
HippoRAGV2 | 0.542 | 0.618 | 0.684 |
PIKE-RAG | 0.63 | 0.72 | 0.81 |
KAG-V0.6.1 | 0.666 | 0.755 | 0.811 |
KAG-V0.7LC | 0.683 | 0.769 | 0.826 |
KAG-V0.7 | 0.684 | 0.770 | 0.836 |
KAG-V0.8 | 0.692 | 0.784 | 0.847 |
3.1.2、Parameters for each method
Method | dataset | LLM(Build/Reason) | embed | param |
---|---|---|---|---|
Naive Gen | 10k docs and 1k questions provided by the HippoRAG paper | qwen2.5-72B | bge-m3 | - |
Naive RAG | same as above | qwen2.5-72B | bge-m3 | num_docs: 10 |
HippoRAGV2 | same as above | qwen2.5-72B | bge-m3 | retrieval_top_k=200 linking_top_k=5 max_qa_steps=3 qa_top_k=5 graph_type=facts_and_sim_passage_node_unidirectional embedding_batch_size=8 |
PIKE-RAG | same as above | qwen2.5-72B | bge-m3 | tagging_llm_temperature: 0.7 qa_llm_temperature: 0.0 chunk_retrieve_k: 8 chunk_retrieve_score_threshold: 0.5 atom_retrieve_k: 16 atomic_retrieve_score_threshold: 0.2 max_num_question: 5 num_parallel: 5 |
KAG-V0.6.1 | same as above | qwen2.5-72B | bge-m3 | refer to the kag_config.yaml files in each subdirectory of examples under https://github.com/OpenSPG/KAG/tree/v0.6 |
KAG-V0.7LC | same as above | builder:qwen2.5-7B solver:qwen2.5-72B | bge-m3 | refer to the kag_config.yaml files in each subdirectory of open_benchmarks under https://github.com/OpenSPG/KAG |
KAG-V0.7 | same as above | qwen2.5-72B | bge-m3 | refer to the kag_config.yaml files in each subdirectory of open_benchmarks under https://github.com/OpenSPG/KAG |
KAG-V0.8 | same as above | qwen2.5-72B | bge-m3 | refer to the kag_config.yaml files in each subdirectory of open_benchmarks under https://github.com/OpenSPG/KAG |
3.2、NetOperatorQA (Network Operator Knowledge Base QA)
Method | em | f1 | llm_accuracy | Methodology | Metric Sources |
---|---|---|---|---|---|
Naive_RAG | - | - | 78.6% | Vector-Based Knowledge Base Retrieval | Ant Group KAG Team |
KAG-V0.8 | - | - | 93.9% | Custom NetOperatorQA Pipeline Based on KAG Framework | Ant Group KAG Team; see https://github.com/OpenSPG/KAG/kag/examples/NetOperatorQA |
3.3、Structured Datasets
- PRQA (PeopleRelQA, people relationship QA)
Method | em | f1 | llm_accuracy | Methodology | Metric Sources |
---|---|---|---|---|---|
deepseek-v3(OpenKG oneEval) | - | 2.60% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
qwen2.5-72B(OpenKG oneEval) | - | 2.50% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
GPT-4o(OpenKG oneEval) | - | 3.20% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
QWQ-32B(OpenKG oneEval) | - | 3.00% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
Grok 3(OpenKG oneEval) | - | 4.70% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
KAG-V0.7 | 45.5% | 86.6% | 84.8% | Custom PRQA Pipeline with Cypher Solver Based on KAG Framework | Ant Group KAG Team |
KAG-V0.8 | 47.6% | 89.3% | 91.0% | Custom PRQA Pipeline with Cypher Solver Based on KAG Framework | Ant Group KAG Team |
- AffairQA (government affairs information QA)
Method | em | f1 | llm_accuracy | Methodology | Metric Sources |
---|---|---|---|---|---|
deepseek-v3 | - | 42.50% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
qwen2.5-72B | - | 45.00% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
GPT-4o | - | 41.00% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
QWQ-32B | - | 45.00% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
Grok 3 | - | 45.50% | - | Dense Retrieval + LLM Generation | OpenKG WeChat |
KAG-V0.7 | 77.5% | 83.1% | 88.2% | Custom AffairQA Pipeline Based on KAG Framework | Ant Group KAG Team |
KAG-V0.8 | 78.4% | 84.7% | 97.5% | Custom AffairQA Pipeline Based on KAG Framework | Ant Group KAG Team |
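4、Product Optimization
4.1、System Integration
This release provides three ways to integrate KAG into business systems: HttpAPI, MCP Protocol, and frontend page embedding.
The KAG HttpAPI provides two kinds of interfaces, recall and reasoning Q&A, so developers can use KAG either as one retrieval source or as a complete reasoning and Q&A capability. The KAG MCP Protocol provides a reasoning and Q&A interface that can be embedded in agent applications such as Cursor. In this release, reasoning and Q&A has been upgraded to a standalone page, which developers can embed in their business systems.
4.2、User Experience Optimization
This release includes many optimizations for issues raised by community users, including Q&A latency, streaming output stability, model configuration management, and task scheduling stability. We thank the community for its patience with KAG.
5、Future Plans
In upcoming releases, we will keep working to improve large models' ability to use external knowledge bases, to achieve bidirectional enhancement and close integration between large models and symbolic knowledge, and to keep improving the factuality, rigor, and consistency of reasoning and Q&A in professional scenarios. We will also keep releasing updates that raise the ceiling of capability and advance adoption in vertical domains.
6、Acknowledgments
This framework upgrade received strong support from the following experts and colleagues, to whom we are deeply grateful:
- Open Source Community Contributors: Novelrui, thundax-lyp, thesteganos, Like0x, unrealise, J4ckycjl, luzizhuo, hy89
- Special Thanks: Senior developer Li Yunpeng, for providing plugins for IDEA, PyCharm, and VSCode, which greatly improved the experience of writing KAG schemas.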