Version 0.8 (2025-06-27)

1、Overview

We are excited to announce the official release of KAG version 0.8. This update continues to improve the consistency, rigor, and accuracy of knowledge-base-driven reasoning and question answering with large models, and it introduces several significant new features.

First, we have upgraded the KAG knowledge base. It now supports two modes: private domain knowledge bases (covering structured and unstructured data) and public domain knowledge bases, with public web data sources such as LBS and WebSearch integrated via the MCP protocol. We have also improved index management for private domain knowledge bases. Several foundational index types are built in (Outline, Summary, KnowledgeUnit, AtomicQuery, Chunk, and Table), and developers can define custom indexes that surface in the product interface. Users can choose the index types that best fit their scenario, balancing construction cost against business outcomes.

Second, we have optimized the product experience and refined the system integration interfaces. On the product side, knowledge bases are now decoupled from applications. A knowledge base manages private domain data (structured and unstructured) and public domain data, and an application can link to multiple knowledge bases. Based on the index types chosen when a knowledge base was built, the system automatically selects the matching retrievers to recall data. Applications and knowledge bases manage their model dependencies and task scheduling independently, which improves flexibility.

In terms of system integration, we have refined the KAG recall and Q&A interfaces and added support for embedding the reasoning and Q&A page within business applications. We have also fully embraced the MCP protocol, so KAG reasoning and Q&A can be called from agent workflows over MCP.

Third, KAG has been adapted to the KAG-Thinker model. With optimizations such as breadth-wise decomposition and depth-wise solving of complex problems, knowledge boundary determination, and noise resistance against retrieval results, and guided by a multi-round iterative thinking paradigm, the stability of the KAG framework's reasoning paradigm and the rigor of its reasoning logic have improved. Details on the release and usage of the KAG-Thinker model will follow in upcoming announcements.

In addition to the framework and product optimizations above, we have improved Q&A efficiency and the stability of streaming output. This release is based on the Qwen2.5-72B foundation model, with which we aligned results against various RAG frameworks and on selected KG datasets. Overall benchmark results are shown in Figures 1 to 3, with detailed metrics in the OpenBenchMark section.

Figure 1: Performance of KAG V0.8 and baselines on multi-hop QA benchmarks

Figure 2: Performance of KAG V0.8 and baselines (from OpenKG OneEval) on knowledge-based QA benchmarks

Figure 3: Performance of KAG V0.8 on the NetOperatorQA benchmark

2、Framework Enhancements

2.1、Configurable Index Management

Figure 4: KAG framework optimization in V0.8

Index extraction and retrieval are core capabilities of knowledge base applications. KAG v0.8 has upgraded its architecture to support configurable management of index construction and retrieval. Each index type is equipped with its own extractor (Extractor) and retriever (Retriever), managed through the IndexManager.

KAG comes with built-in foundational index types such as KnowledgeUnit (an enhanced version of graph triples), Outline, Summary, Chunk, AtomicQuery, and Table, along with their corresponding extractors and retrievers. During the knowledge base construction phase, the platform invokes the appropriate extractor based on the index types selected by the user to perform index extraction. In the application phase, based on the knowledge bases associated with the application, the platform automatically calls the corresponding retriever to complete the recall of graph/chunk/doc data and integrates with the KAG-Solver pipeline to enable reasoning and question-answering.

Developers can extend the IndexManager to implement or combine custom extractors and retrievers. After packaging KAG with these custom components and replacing the corresponding installation package in the openspg-server image, users can select the custom index types on the product page during knowledge base construction. This enables seamless integration of custom index types with the product.
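
The pairing of index types with extractors and retrievers described above can be pictured with a small registry. This is only a minimal sketch under stated assumptions: `SketchIndexManager`, `IndexType`, `chunk_extractor`, `keyword_retriever`, and the `FirstLine` index type are hypothetical names invented for illustration and are not KAG's actual IndexManager API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical signatures, for illustration only (not KAG's real interfaces).
Extractor = Callable[[str], List[str]]             # document text -> index entries
Retriever = Callable[[str, List[str]], List[str]]  # (query, entries) -> hits


@dataclass
class IndexType:
    name: str
    extractor: Extractor
    retriever: Retriever


class SketchIndexManager:
    """Toy registry pairing each index type with its extractor and retriever."""

    def __init__(self) -> None:
        self._types: Dict[str, IndexType] = {}

    def register(self, index_type: IndexType) -> None:
        self._types[index_type.name] = index_type

    def build(self, doc: str, selected: List[str]) -> Dict[str, List[str]]:
        # Construction phase: run only the extractors the user selected.
        return {name: self._types[name].extractor(doc) for name in selected}

    def recall(self, query: str, index: Dict[str, List[str]]) -> List[str]:
        # Application phase: the index types present in the knowledge base
        # decide which retrievers run.
        hits: List[str] = []
        for name, entries in index.items():
            hits.extend(self._types[name].retriever(query, entries))
        return hits


def chunk_extractor(doc: str) -> List[str]:
    # Split on blank lines; a stand-in for a real chunker.
    return [p.strip() for p in doc.split("\n\n") if p.strip()]


def keyword_retriever(query: str, entries: List[str]) -> List[str]:
    # Keep entries that share at least one lowercase word with the query.
    words = set(query.lower().split())
    return [e for e in entries if words & set(e.lower().split())]


manager = SketchIndexManager()
manager.register(IndexType("Chunk", chunk_extractor, keyword_retriever))
# A custom index type plugs in the same way as a built-in one.
manager.register(IndexType("FirstLine", lambda d: d.splitlines()[:1], keyword_retriever))

doc = "KAG builds indexes.\n\nRetrievers recall chunks."
index = manager.build(doc, ["Chunk"])
hits = manager.recall("which retrievers recall data", index)
```

Because only `Chunk` was selected at build time, only its retriever runs at recall time, which mirrors the automatic retriever adaptation described in this section.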

3、OpenBenchMark

3.1、Multi-hop QA Dataset

3.1.1、Benchmark

musique

Method	em	f1	llm_accuracy
Naive Gen	0.033	0.074	0.083
Naive RAG	0.248	0.357	0.384
HippoRAGV2	0.289	0.404	0.452
PIKE-RAG	0.383	0.498	0.565
KAG-V0.6.1	0.363	0.481	0.547
KAG-V0.7LC	0.379	0.513	0.560
KAG-V0.7	0.385	0.520	0.579
KAG-V0.8	0.432	0.569	0.626

hotpotqa

Method	em	f1	llm_accuracy
Naive Gen	0.223	0.313	0.342
Naive RAG	0.566	0.704	0.762
HippoRAGV2	0.557	0.694	0.807
PIKE-RAG	0.558	0.686	0.787
KAG-V0.6.1	0.599	0.745	0.841
KAG-V0.7LC	0.600	0.744	0.828
KAG-V0.7	0.603	0.748	0.844
KAG-V0.8	0.625	0.772	0.857

twowiki

Method	em	f1	llm_accuracy
Naive Gen	0.199	0.310	0.382
Naive RAG	0.448	0.512	0.573
HippoRAGV2	0.542	0.618	0.684
PIKE-RAG	0.63	0.72	0.81
KAG-V0.6.1	0.666	0.755	0.811
KAG-V0.7LC	0.683	0.769	0.826
KAG-V0.7	0.684	0.770	0.836
KAG-V0.8	0.692	0.784	0.847
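
These release notes do not spell out how `em` and `f1` are computed. The sketch below assumes the conventional SQuAD-style definitions (normalized exact match and token-level F1), which may differ in detail from the evaluation scripts actually used; `llm_accuracy` relies on an LLM judge and is not reproduced here. The function names are illustrative.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    # 1.0 only when the normalized strings are identical.
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these definitions, a verbose but correct answer scores 0 on `em` while still earning partial `f1`, which is one reason the `f1` column is consistently higher than `em` in the tables above.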

3.1.2、Parameters for Each Method

Method	Dataset	LLM (build/reasoning)	Embedding model	Parameters
Naive Gen	10k docs and 1k questions provided by HippoRAG	qwen2.5-72B	bge-m3	-
Naive RAG	same as above	qwen2.5-72B	bge-m3	num_docs: 10
HippoRAGV2	same as above	qwen2.5-72B	bge-m3	retrieval_top_k=200 linking_top_k=5 max_qa_steps=3 qa_top_k=5 graph_type=facts_and_sim_passage_node_unidirectional embedding_batch_size=8
PIKE-RAG	same as above	qwen2.5-72B	bge-m3	tagging_llm_temperature: 0.7 qa_llm_temperature: 0.0 chunk_retrieve_k: 8 chunk_retrieve_score_threshold: 0.5 atom_retrieve_k: 16 atomic_retrieve_score_threshold: 0.2 max_num_question: 5 num_parallel: 5
KAG-V0.6.1	same as above	qwen2.5-72B	bge-m3	refer to the `kag_config.yaml` files in each subdirectory under https://github.com/OpenSPG/KAG/tree/v0.6/kag/examples.
KAG-V0.7LC	same as above	builder: qwen2.5-7B; solver: qwen2.5-72B	bge-m3	refer to the `kag_config.yaml` files in each subdirectory of open_benchmarks in https://github.com/OpenSPG/KAG.
KAG-V0.7	same as above	qwen2.5-72B	bge-m3	refer to the `kag_config.yaml` files in each subdirectory of open_benchmarks in https://github.com/OpenSPG/KAG.
KAG-V0.8	same as above	qwen2.5-72B	bge-m3	refer to the `kag_config.yaml` files in each subdirectory of open_benchmarks in https://github.com/OpenSPG/KAG.

3.2、NetOperatorQA

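Method	em	f1	llm_accuracy	Methodology	Metric Sources
Naive RAG	-	-	78.6%	Vector-based knowledge base retrieval	Ant Group KAG Team
KAG-V0.8	-	-	93.9%	Custom NetOperatorQA Pipeline Based on KAG Framework	Ant Group KAG Team; see https://github.com/OpenSPG/KAG/kag/examples/NetOperatorQA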

3.3、Structured Datasets

PeopleRelQA

Method	em	f1	llm_accuracy	Methodology	Metric Sources
deepseek-v3(OpenKG oneEval)	-	2.60%	-	Dense Retrieval + LLM Generation	OpenKG WeChat
qwen2.5-72B(OpenKG oneEval)	-	2.50%	-	Dense Retrieval + LLM Generation	OpenKG WeChat
GPT-4o(OpenKG oneEval)	-	3.20%	-	Dense Retrieval + LLM Generation	OpenKG WeChat
QWQ-32B(OpenKG oneEval)	-	3.00%	-	Dense Retrieval + LLM Generation	OpenKG WeChat
Grok 3(OpenKG oneEval)	-	4.70%	-	Dense Retrieval + LLM Generation	OpenKG WeChat
KAG-V0.7	45.5%	86.6%	84.8%	Custom PRQA Pipeline with Cypher Solver Based on KAG Framework	Ant Group KAG Team
KAG-V0.8	47.6%	89.3%	91.0%	Custom PRQA Pipeline with Cypher Solver Based on KAG Framework	Ant Group KAG Team

AffairQA

Method	em	f1	llm_accuracy	Methodology	Metric Sources
deepseek-v3	-	42.50%	-	Dense Retrieval + LLM Generation	OpenKG WeChat
qwen2.5-72B	-	45.00%	-	Dense Retrieval + LLM Generation	OpenKG WeChat
GPT-4o	-	41.00%	-	Dense Retrieval + LLM Generation	OpenKG WeChat
QWQ-32B	-	45.00%	-	Dense Retrieval + LLM Generation	OpenKG WeChat
Grok 3	-	45.50%	-	Dense Retrieval + LLM Generation	OpenKG WeChat
KAG-V0.7	77.5%	83.1%	88.2%	Custom AffairQA Pipeline Based on KAG Framework	Ant Group KAG Team
KAG-V0.8	78.4%	84.7%	97.5%	Custom AffairQA Pipeline Based on KAG Framework	Ant Group KAG Team

4、Product and Platform Optimization

4.1、System Integration

This release offers three methods to integrate KAG into business systems: HttpAPI, MCP Protocol, and Frontend Page Embedding.

The KAG HttpAPI provides two types of interfaces: recall, and reasoning and Q&A. Developers can use KAG either as one retrieval source among others or as a complete reasoning and Q&A capability.

Through the MCP protocol, KAG exposes reasoning and Q&A interfaces that can be plugged into agent applications such as Cursor.
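
For orientation, an MCP client invokes a server-side tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a message; the tool name `kag_qa` and the `question` argument are placeholders chosen for illustration, not names confirmed by KAG, so check the tool list your KAG MCP server actually advertises.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool_name: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message, ensure_ascii=False)


# "kag_qa" and "question" are hypothetical placeholders, not KAG's real names.
payload = build_tool_call(
    1, "kag_qa", {"question": "Which index types does the knowledge base use?"}
)
decoded = json.loads(payload)
```

In practice an MCP client library or an agent host such as Cursor constructs and sends this message for you; the sketch only shows what travels over the wire.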

In this release, the reasoning and Q&A functionality has been upgraded to an independent page, allowing developers to embed this page into their business systems.

4.2、User Experience Optimization

This release addresses several community concerns, including response latency, streaming output stability, model configuration management, and task scheduling reliability. We greatly appreciate the community's support and patience with KAG.

5、Future Plans

In upcoming versions, we will continue to focus on enhancing large models' ability to leverage external knowledge bases. Our goal is to achieve bidirectional enhancement and seamless integration between large models and symbolic knowledge, improving the factuality, rigor, and consistency of reasoning and Q&A in professional scenarios. We will also keep releasing updates to push the boundaries of capability and drive adoption in vertical domains.

6、Acknowledgments

We extend our heartfelt gratitude to the following experts and colleagues for their invaluable support during this framework upgrade:

Open Source Community Contributors: Novelrui, thundax-lyp, thesteganos, Like0x, unrealise, J4ckycjl, luzizhuo, hy89
Special Thanks: Senior developer Li Yunpeng, for developing plugins for IDEA, PyCharm, and VSCode, which significantly improved the experience of writing KAG schemas.
