
KAG product-side integration issue: index build fails with an error #727

@lpdswing

Description

  • Following the official documentation, I tried adding the extractor for schema-constrained extraction to kag_index_manager.py:
@KAGIndexManager.register("spo_graph_index")
class KAGConstraintSPOIndexManager(KAGIndexManager):
    @property
    def name(self):
        return "Graph schema-constrained index manager"

    @property
    def description(self) -> str:
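        return "It first extracts entities and relations from the text to build a graph, then links text chunks to graph nodes. At retrieval time it combines the graph's structured query capability (CS/FR Retriever) with vector retrieval over the text, giving more precise retrieval with stronger reasoning. It is particularly suited to complex question-answering tasks."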

    @property
    def schema(self) -> str:
        return """
Chunk(Text chunk): IndexType
     properties:
        content(Content): Text
          index: TextAndVector

KnowledgeUnit(Knowledge unit): IndexType
     properties:
        structedContent(Structured text): Text
          index: TextAndVector
        ontology(Ontology): Text
        desc(Description): Text
            index: TextAndVector
        relatedQuery(Related query): AtomicQuery
        extendedKnowledge(Related extended knowledge unit):Text
        content(Content): Text
            index: TextAndVector
        knowledgeType(Knowledge type): Text

AtomicQuery(Atomic query): IndexType
  properties:
    title(Title): Text
      index: TextAndVector
  relations:
    sourceChunk(Source chunk): Chunk
    similar(Similar question): AtomicQuery
    relatedTo(Related to): KnowledgeUnit
        """

    @property
    def index_cost(self) -> str:
        msg = """
        Index construction cost:

        Unknown
        """
        return msg

    @property
    def applicable_scenarios(self) -> str:
        return """
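        **Applicable scenarios**: Suited to complex questions that require deep reasoning and association analysis. It understands the structure of the question and runs joint queries across the knowledge graph and text chunks, handling scenarios that need cross-domain knowledge or multi-hop reasoning.

        **Retrieval flow (multi-path, parallel)**:
        - **Path 1 (FR)**: `kg_fr_retriever(query)` -> free-text retrieval that recalls relevant text chunks.
        - **Path 2 (CS)**: `kg_cs_retriever(logic_form)` -> exact retrieval in the graph, based on the logical structure of the question.
        - **Path 3 (RC)**: `vector_chunk_retriever(query)` -> pure vector retrieval, used as supplementary recall.

        Finally, the results from all paths are merged to provide the most complete basis for the answer.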
        """

    @property
    def retrieval_method(self) -> str:
        return "Builds associations between chunks and the graph so that both can be retrieved; typically used to retrieve chunks related to the graph."

    @classmethod
    def build_extractor_config(
        cls, llm_config: Dict, vectorize_model_config: Dict, **kwargs
    ):
        kb_task_project_id = kwargs.get(KAGConstants.KAG_QA_TASK_CONFIG_KEY, None)
        return [
            {
                "type": "schema_constraint_extractor",
                "llm": llm_config,
                "ner_prompt": {"type": "spg_entity"},
                "std_prompt": {"type": "default_std"},
                "relation_prompt": {"type": "spg_relation"},
                "event_prompt": {"type": "spg_event"},
                "kag_qa_task_config_key": kb_task_project_id,
            }
        ]

    @classmethod
    def build_retriever_config(
        cls, llm_config: Dict, vectorize_model_config: Dict, **kwargs
    ):
        kb_task_project_id = kwargs.get(KAGConstants.KAG_QA_TASK_CONFIG_KEY, None)
        return [
            {
                "type": "kg_cs_open_spg",
                "path_select": {
                    "type": "exact_one_hop_select",
                    "vectorize_model": vectorize_model_config,
                    "search_api": {
                        "type": "openspg_search_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "graph_api": {
                        "type": "openspg_graph_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "kag_qa_task_config_key": kb_task_project_id,
                },
                "entity_linking": {
                    "type": "entity_linking",
                    "recognition_threshold": 0.9,
                    "exclude_types": ["Chunk"],
                    "vectorize_model": vectorize_model_config,
                    "search_api": {
                        "type": "openspg_search_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "graph_api": {
                        "type": "openspg_graph_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "kag_qa_task_config_key": kb_task_project_id,
                },
                "std_schema": {
                    "type": "default_std_schema",
                    "vectorize_model": vectorize_model_config,
                    "search_api": {
                        "type": "openspg_search_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "kag_qa_task_config_key": kb_task_project_id,
                },
                "llm": llm_config,
                "kag_qa_task_config_key": kb_task_project_id,
            },
            {
                "type": "kg_fr_knowledge_unit",
                "top_k": 20,
                "search_api": {
                    "type": "openspg_search_api",
                    "kag_qa_task_config_key": kb_task_project_id,
                },
                "graph_api": {
                    "type": "openspg_graph_api",
                    "kag_qa_task_config_key": kb_task_project_id,
                },
                "path_select": {
                    "type": "fuzzy_one_hop_select",
                    "llm_client": llm_config,
                    "vectorize_model": vectorize_model_config,
                    "search_api": {
                        "type": "openspg_search_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "graph_api": {
                        "type": "openspg_graph_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "kag_qa_task_config_key": kb_task_project_id,
                },
                "ppr_chunk_retriever_tool": {
                    "type": "ppr_chunk_retriever",
                    "llm_client": llm_config,
                    "vectorize_model": vectorize_model_config,
                    "search_api": {
                        "type": "openspg_search_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "graph_api": {
                        "type": "openspg_graph_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "kag_qa_task_config_key": kb_task_project_id,
                    "ner": {
                        "type": "ner",
                        "kag_qa_task_config_key": kb_task_project_id,
                        "ner_prompt": {
                            "type": "default_question_ner",
                            "kag_qa_task_config_key": kb_task_project_id,
                        },
                        "std_prompt": {"type": "default_std"},
                        "llm_module": llm_config,
                    },
                },
                "entity_linking": {
                    "type": "entity_linking",
                    "recognition_threshold": 0.8,
                    "exclude_types": ["Chunk"],
                    "vectorize_model": vectorize_model_config,
                    "search_api": {
                        "type": "openspg_search_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "graph_api": {
                        "type": "openspg_graph_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "kag_qa_task_config_key": kb_task_project_id,
                },
                "std_schema": {
                    "type": "default_std_schema",
                    "vectorize_model": vectorize_model_config,
                    "search_api": {
                        "type": "openspg_search_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "kag_qa_task_config_key": kb_task_project_id,
                },
                "llm": llm_config,
                "kag_qa_task_config_key": kb_task_project_id,
            },
            {
                "type": "rc_open_spg",
                "search_api": {
                    "type": "openspg_search_api",
                    "kag_qa_task_config_key": kb_task_project_id,
                },
                "vector_chunk_retriever": {
                    "type": "vector_chunk_retriever",
                    "vectorize_model": vectorize_model_config,
                    "score_threshold": 0.65,
                    "search_api": {
                        "type": "openspg_search_api",
                        "kag_qa_task_config_key": kb_task_project_id,
                    },
                    "kag_qa_task_config_key": kb_task_project_id,
                },
                "vectorize_model": vectorize_model_config,
                "top_k": 20,
                "kag_qa_task_config_key": kb_task_project_id,
            },
        ]
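
The retriever configuration repeats the same nested keys many times, so a slip such as an empty-string key is easy to miss. A minimal sketch of a sanity check, assuming the configs are plain nested dicts and lists like the ones returned by `build_retriever_config`. `find_empty_keys` is a hypothetical helper, not part of KAG:

```python
from typing import Any, List


def find_empty_keys(config: Any, path: str = "") -> List[str]:
    """Return the paths of every dict that contains an empty-string key.

    Hypothetical helper for illustration; it only walks dicts and lists.
    """
    problems: List[str] = []
    if isinstance(config, dict):
        for key, value in config.items():
            if key == "":
                # Record where the offending dict lives in the config tree.
                problems.append(path or "<root>")
            child = f"{path}.{key}" if path else str(key)
            problems.extend(find_empty_keys(value, child))
    elif isinstance(config, list):
        for index, item in enumerate(config):
            problems.extend(find_empty_keys(item, f"{path}[{index}]"))
    return problems
```

Running it on the list returned by `build_retriever_config(...)` before registering the index manager would report the path of any sub-config with a blank key.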

I created a new project on the web page, modified the custom schema, and then got the following error when uploading a document.

Create Index

2025-11-15 16:28:09(10.199.0.5): Task scheduling completed. cost:38 ms !
2025-11-15 16:28:09(10.199.0.5): Lock released successfully!
2025-11-15 16:28:09(10.199.0.5): Scheduler execute failed with error:com.antgroup.openspg.core.schema.model.SchemaException: property: eventTime is defined by system, no need to define or alter
        at com.antgroup.openspg.core.schema.model.SchemaException.alterError(SchemaException.java:34)
        at com.antgroup.openspg.server.biz.schema.impl.SchemaManagerImpl.alterSchema(SchemaManagerImpl.java:81)
        at com.antgroup.openspg.server.biz.schema.impl.SchemaManagerImpl$$FastClassBySpringCGLIB$$d616d11c.invoke(<generated>)
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:793)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
        at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:123)
        at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:388)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:119)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:708)
        at com.antgroup.openspg.server.biz.schema.impl.SchemaManagerImpl$$EnhancerBySpringCGLIB$$3c0fdfcf.alterSchema(<generated>)
        at com.antgroup.openspg.server.core.scheduler.service.task.sync.builder.RetrievalSyncTask.submit(RetrievalSyncTask.java:91)
        at com.antgroup.openspg.server.core.scheduler.service.task.sync.SyncTaskExecuteTemplate.execute(SyncTaskExecuteTemplate.java:25)
        at com.antgroup.openspg.server.core.scheduler.service.task.TaskExecuteTemplate.executeEntry(TaskExecuteTemplate.java:53)
        at com.antgroup.openspg.server.core.scheduler.service.engine.impl.SchedulerExecuteServiceImpl.executeTask(SchedulerExecuteServiceImpl.java:161)
        at com.antgroup.openspg.server.core.scheduler.service.engine.impl.SchedulerExecuteServiceImpl.lambda$executeInstance$2(SchedulerExecuteServiceImpl.java:144)
        at java.util.ArrayList.forEach(ArrayList.java:1259)
        at com.antgroup.openspg.server.core.scheduler.service.engine.impl.SchedulerExecuteServiceImpl.executeInstance(SchedulerExecuteServiceImpl.java:144)
        at com.antgroup.openspg.server.core.scheduler.service.engine.impl.SchedulerExecuteServiceImpl.executeInstance(SchedulerExecuteServiceImpl.java:132)
        at com.antgroup.openspg.server.core.scheduler.service.api.impl.SchedulerServiceImpl.lambda$executeJob$0(SchedulerServiceImpl.java:124)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.IllegalArgumentException: property: eventTime is defined by system, no need to define or alter
        at com.antgroup.openspg.server.core.schema.service.alter.check.PropertyChecker.checkBuiltInProperty(PropertyChecker.java:205)
        at com.antgroup.openspg.server.core.schema.service.alter.check.PropertyChecker.check(PropertyChecker.java:68)
        at com.antgroup.openspg.server.core.schema.service.alter.check.BaseSpgTypeChecker.check(BaseSpgTypeChecker.java:62)
        at com.antgroup.openspg.server.core.schema.service.alter.check.SchemaAlterChecker.check(SchemaAlterChecker.java:47)
        at com.antgroup.openspg.server.core.schema.service.alter.stage.PreProcessStage.execute(PreProcessStage.java:65)
        at com.antgroup.openspg.server.core.schema.service.alter.SchemaAlterPipeline.run(SchemaAlterPipeline.java:40)
        at com.antgroup.openspg.server.biz.schema.impl.SchemaManagerImpl.alterSchema(SchemaManagerImpl.java:79)
        ... 24 more

2025-11-15 16:28:09(10.199.0.5): update index(spo_graph_index) schema
2025-11-15 16:28:09(10.199.0.5): update index schema index_ids:[7]
2025-11-15 16:28:09(10.199.0.5): Lock preempted successfully!
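
The exception message names `eventTime` as a property the system already defines, so the schema being submitted presumably declares it explicitly. A minimal sketch for locating such declarations before uploading, assuming schema lines take the `name(label): Type` shape used in the schema text above. `find_system_defined_properties` and `SYSTEM_DEFINED_PROPERTIES` are hypothetical names, and only `eventTime` is confirmed by the log:

```python
import re
from typing import List, Tuple

# Assumption: only `eventTime` is confirmed as system-defined by the log above.
# Extend this set if the server reports other property names.
SYSTEM_DEFINED_PROPERTIES = {"eventTime"}

# Matches declarations shaped like `name(label): Type`.
_DECLARATION = re.compile(r"^\s*([A-Za-z_]\w*)\s*\(.*?\)\s*:")


def find_system_defined_properties(schema_text: str) -> List[Tuple[int, str]]:
    """Return (line number, property name) for each system-defined name declared."""
    hits: List[Tuple[int, str]] = []
    for line_no, line in enumerate(schema_text.splitlines(), start=1):
        match = _DECLARATION.match(line)
        if match and match.group(1) in SYSTEM_DEFINED_PROPERTIES:
            hits.append((line_no, match.group(1)))
    return hits
```

Any line it reports is a candidate for removal from the custom schema, since the server rejects attempts to define or alter that property.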
