Version 0.6 (2025-01-07)

On January 7, 2025, OpenSPG released version 0.6. The release adds domain knowledge mounting, vertical-domain schema management, visual knowledge exploration, and support for summary generation tasks. For user experience, it adds the ability to resume knowledge base construction tasks from the point of failure, introduces a user login and permission system, and optimizes scheduling of construction tasks. In developer mode, it supports configuring different models for different stages and enables a schema-constraint mode for extraction. Together, these changes improve the system's flexibility, usability, performance, and security, and let the platform adapt to a wider range of application scenarios.


🌟 New Features

  1. Support for Summary Generation Tasks

    • Native support for abstractive summarization tasks without sacrificing multi-hop factual reasoning accuracy. On the CSQA dataset, the comprehensiveness, diversity, and empowerment metrics are slightly lower than LightRAG's (-1.2/10), while the factual accuracy metric is slightly higher (+0.1/10). On multi-hop question answering datasets such as HotpotQA, TwoWiki, and MuSiQue, LightRAG and GraphRAG do not provide a factual QA evaluation entry point, and the EM (exact match) metric obtained with their default entry point is close to 0. For quantitative evaluation results, see examples/csqa/README.md in the KAG code repository and follow the steps there to reproduce them.
  2. Domain Schema Management

    • The product provides SPG schema management capabilities, allowing users to optimize knowledge base construction and inference Q&A performance by customizing schemas.
  3. Knowledge Exploration

    • Added a knowledge exploration feature for visual querying and analysis of knowledge base data, along with an HTTP API for integration with other systems.
  4. Support for Mounting Domain Knowledge in KAG-Builder (Developer Mode)

    • In developer mode, the system supports injecting domain knowledge (domain vocabulary, relationships between terms) into the knowledge base, which can significantly improve knowledge base construction and inference Q&A performance (with a 10%+ improvement in the medical domain).
  5. Adding a Knowledge Alignment Component to the KAG-Builder Pipeline

    • KAG-Builder provides a default knowledge alignment component that filters out invalid data and links similar entities, which improves the structure and data quality of the graph.
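
As a rough illustration of what a knowledge alignment step involves, the sketch below drops empty entity names and links near-duplicate names to one canonical form. It is not KAG-Builder's implementation: the function names, the use of Python's standard `difflib` string similarity, and the 0.85 threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def align_entities(names, threshold=0.85):
    """Map each valid raw name to a canonical name (hypothetical helper).

    Names that are None or blank after normalization are treated as
    invalid and skipped. A name is linked to the first canonical name
    whose similarity ratio reaches the threshold; otherwise it becomes
    a new canonical name itself.
    """
    canonical = []  # canonical names in order of first appearance
    mapping = {}
    for raw in names:
        if raw is None:
            continue
        norm = normalize(raw)
        if not norm:
            continue  # filter out invalid (empty) entries
        match = next(
            (c for c in canonical
             if SequenceMatcher(None, norm, c).ratio() >= threshold),
            None,
        )
        if match is None:
            canonical.append(norm)
            match = norm
        mapping[raw] = match
    return mapping


# Example input with a duplicate, a misspelling, and invalid entries.
linked = align_entities(["Aspirin", "aspirin ", "Asprin", "", None, "Ibuprofen"])
```

A production component would more likely compare embeddings or use schema-aware rules rather than raw string similarity. The point here is only the two-step shape: filter first, then link.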

⚙️ User Experience Optimizations

  1. Resumable Tasks

    • Provide the ability to resume knowledge base construction tasks at the file level and chunk level in both product mode and developer mode, reducing the time and token consumption of full re-runs after task failures.
  2. User Login & Permission System

    • Implement a user login and permission system to prevent unauthorized access and operations on the knowledge base data.
  3. Optimized Knowledge Base Construction Task Scheduling

    • Provide database-backed scheduling of knowledge base construction tasks to avoid task anomalies or interruptions after container restarts.
  4. Support for Configuring Different Models at Different Stages (Developer Mode)

    • The system provides a component management mechanism based on a registry, allowing users to instantiate component objects via configuration files. This supports users in developing and embedding custom components into the KAG-Builder and KAG-Solver workflows. Additionally, it enables the configuration of different-sized models at different stages of the workflow, thereby enhancing the overall reasoning and question-answering performance.
  5. Optimization of Layout Analysis for Markdown, PDF, and Word Files

    • For Markdown, PDF, and Word files, the system prioritizes dividing the content into chunks based on the file's sections. This ensures that the content within each chunk is more cohesive.
  6. Global Configuration and Knowledge Base Configuration

    • Provide global configuration for the knowledge base, allowing unified settings for storage engines, generation models, and representation model access information.
  7. Support for Schema-Constrained Extraction and Linking (Developer Mode)

    • Provide a schema-constraint mode that strictly adheres to schema definitions during the knowledge base construction phase, enabling finer-grained and more complex knowledge extraction.
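
The registry-based component mechanism described under "Support for Configuring Different Models at Different Stages" can be pictured with the minimal sketch below. It shows a generic registry pattern, not KAG's actual API: `register`, `build`, `StubLLM`, the `type` key, and the model names are all hypothetical and chosen only for illustration.

```python
from typing import Dict

# Hypothetical registry mapping a type name to a component class.
_REGISTRY: Dict[str, type] = {}


def register(name: str):
    """Class decorator that records a component class under `name`."""
    def decorator(cls):
        _REGISTRY[name] = cls
        return cls
    return decorator


def build(config: dict):
    """Instantiate a registered component from a config dictionary.

    The `type` key selects the class; all other keys are passed to the
    constructor as keyword arguments.
    """
    cfg = dict(config)  # copy so the caller's config is not mutated
    type_name = cfg.pop("type")
    if type_name not in _REGISTRY:
        raise ValueError(f"unknown component type: {type_name!r}")
    return _REGISTRY[type_name](**cfg)


@register("stub_llm")
class StubLLM:
    """Stand-in for a model client; it only stores the model name."""

    def __init__(self, model: str):
        self.model = model


# Assumed config shape: one component entry per workflow stage, so each
# stage can be given a model of a different size.
pipeline_config = {
    "extractor": {"type": "stub_llm", "model": "small-model"},
    "answerer": {"type": "stub_llm", "model": "large-model"},
}
stages = {stage: build(cfg) for stage, cfg in pipeline_config.items()}
```

With this shape, a custom component becomes available to configuration files as soon as it is decorated with `register`, and swapping the model for one stage is a config change rather than a code change.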