EmpiriGraph-Psy: Advancing Empirical Relation Graph Extraction in Psychology via LLM Pipelines

Researchers have introduced EmpiriGraph-Psy, a dataset and LLM-driven pipeline that maps psychology abstracts into structured, variable-centered empirical graphs, targeting a gap in scientific relation extraction.

Addressing the Domain Gap in Relation Extraction

Current benchmarks for scientific relation extraction are heavily skewed toward domains like computer science, where the primary entities are typically tasks, methods, datasets, or performance metrics. However, this paradigm fails to capture the nuances of variable-oriented empirical fields, such as psychology. In these disciplines, scientific findings are not merely associations between methods and results, but complex relations among psychological constructs, measurements, interventions, and outcomes.

Introducing Variable-Centered Empirical Graph Extraction

To address this limitation, the authors propose the task of variable-centered empirical graph extraction. This approach focuses on transforming unstructured scientific abstracts into typed graphs. In these graphs, nodes represent normalized variables, allowing for a more precise representation of how different psychological factors interact and influence one another.
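To make the idea of a typed graph concrete, here is a minimal sketch of one possible representation. The node types are taken from the categories named earlier in this article (constructs, measurements, interventions, outcomes); the relation labels, class names, and example variables are illustrative assumptions and not the schema used by EmpiriGraph-Psy.

```python
from dataclasses import dataclass, field

# Node types follow the categories named in the article.
NODE_TYPES = {"construct", "measurement", "intervention", "outcome"}
# Relation labels are hypothetical placeholders, not the dataset's schema.
RELATION_TYPES = {"positive_association", "negative_association", "no_effect"}


@dataclass(frozen=True)
class Variable:
    """A normalized variable, used as a graph node."""
    name: str
    node_type: str

    def __post_init__(self):
        if self.node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {self.node_type}")


@dataclass(frozen=True)
class Relation:
    """A typed, directed edge between two variables."""
    source: Variable
    target: Variable
    relation_type: str

    def __post_init__(self):
        if self.relation_type not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.relation_type}")


@dataclass
class EmpiricalGraph:
    """A typed graph holding the findings extracted from one abstract."""
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_relation(self, source, target, relation_type):
        # Adding an edge also registers both endpoints as nodes.
        self.nodes.update({source, target})
        self.edges.append(Relation(source, target, relation_type))


# Hypothetical finding: an intervention is linked to lower anxiety.
graph = EmpiricalGraph()
therapy = Variable("cognitive behavioral therapy", "intervention")
anxiety = Variable("anxiety", "outcome")
graph.add_relation(therapy, anxiety, "negative_association")
```

Keeping the type on both nodes and edges lets downstream code filter by either one, for example to select only relations whose source is an intervention.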

The EmpiriGraph-Psy Framework

The EmpiriGraph-Psy project provides both a specialized dataset and an LLM pipeline tailored to this extraction task. By normalizing variables, the pipeline aims to map synonymous constructs to a single node, so that the resulting graph represents the empirical findings of each abstract coherently.
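The effect of variable normalization can be illustrated with a small sketch. The article does not describe how the pipeline performs this step, so the example below substitutes a hand-written alias table for whatever the LLM does; the table entries, function names, and example triples are all assumptions made for illustration.

```python
import re

# Hypothetical alias table: surface form -> canonical variable name.
# A real pipeline would presumably derive this mapping with an LLM.
ALIASES = {
    "depressive symptoms": "depression",
    "depression symptoms": "depression",
    "mbsr": "mindfulness-based stress reduction",
}


def normalize(mention: str) -> str:
    """Lowercase, collapse whitespace, then look up a canonical name."""
    key = re.sub(r"\s+", " ", mention.strip().lower())
    return ALIASES.get(key, key)


def merge_triples(triples):
    """Normalize both endpoints so synonymous mentions share one node."""
    merged = set()
    for source, relation, target in triples:
        merged.add((normalize(source), relation, normalize(target)))
    return merged


# Two differently worded mentions of the same (made-up) finding.
raw = [
    ("MBSR", "reduces", "Depressive  symptoms"),
    ("mindfulness-based stress reduction", "reduces", "depression"),
]
merged = merge_triples(raw)
```

After merging, both raw triples collapse into a single edge between two canonical nodes, which is the behavior the normalization step is meant to provide.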

Note: The source text does not specify the LLM architecture, the exact size of the dataset, or quantitative performance metrics for the pipeline.
