Datalab Introduces lift: A 9B Open-Weights Vision Model for Schema-Constrained PDF Extraction

Datalab has released lift, a 9-billion-parameter open-weights vision model designed to close the reliability gap in structured data extraction. It enforces a JSON schema at the decoder level, so its output is schema-valid by construction.

Moving Beyond Prompt-Based Extraction

Traditional structured extraction workflows typically prompt a general-purpose Large Language Model (LLM) to return JSON. Because nothing guarantees that the model's output adheres to a specific schema, these workflows often need "retry loops" and post-processing scripts to handle formatting errors. Datalab's new model, lift, shifts this paradigm by moving the structural guarantee into the decoding step itself.
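The retry pattern described above can be sketched as follows. This is a minimal illustration of a prompt-based workflow, not code from Datalab: `call_model`, `make_flaky_model`, and the field names are hypothetical stand-ins, and the "model" is a scripted fake that returns malformed output twice before succeeding.

```python
import json


def extract_with_retries(call_model, prompt, required_keys, max_attempts=3):
    """Prompt, parse, check, and re-prompt until the output passes or attempts run out."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        if last_error is None:
            raw = call_model(prompt)
        else:
            # Corrective prompting: feed the previous failure back to the model.
            raw = call_model(f"{prompt}\nPrevious output was invalid: {last_error}")
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = f"not valid JSON ({exc.msg})"
            continue
        if not isinstance(data, dict):
            last_error = "top-level value is not an object"
            continue
        missing = [key for key in required_keys if key not in data]
        if missing:
            last_error = f"missing keys: {missing}"
            continue
        return data, attempt
    raise ValueError(f"no valid output after {max_attempts} attempts: {last_error}")


def make_flaky_model():
    """Hypothetical stand-in for an LLM call: two bad replies, then a good one."""
    replies = iter([
        'Sure! {"total": 12',                            # chatter plus truncated JSON
        '{"total": 1250}',                               # valid JSON, missing a key
        '{"invoice_number": "INV-42", "total": 1250}',   # passes both checks
    ])
    return lambda prompt: next(replies)


data, attempts = extract_with_retries(
    make_flaky_model(),
    "Extract the invoice fields as JSON.",
    ["invoice_number", "total"],
)
print(attempts, data)
```

Every failed attempt costs another model call, and the loop can still exhaust its budget without producing usable output, which is the overhead that schema-constrained decoding is meant to remove.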

Technical Architecture and Capabilities

Unlike general-purpose LLMs that are prompted to return JSON, lift is a 9B vision model that decodes directly against a provided JSON schema. The output is therefore valid by construction, which removes the need for external validation loops or corrective prompting to repair malformed output. By enforcing the schema at the decoder level, the model ensures that the resulting JSON is syntactically correct and aligned with the user's defined structure.
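One way to picture decoder-level enforcement is as a mask applied at every generation step: tokens the schema does not permit are excluded before the model's preference is consulted. The sketch below is a toy, character-level illustration under stated assumptions, not lift's actual implementation. The schema fields, the `END` token, `allowed_tokens`, `constrained_decode`, and the scripted `fake_model` are all hypothetical, and only flat objects with string and integer fields are handled.

```python
import json

# Toy schema in a simplified JSON Schema shape; the field names are illustrative.
SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total_cents": {"type": "integer"},
    },
    "required": ["invoice_number", "total_cents"],
}

END = "<end>"  # hypothetical end-of-value token
DIGITS = set("0123456789")
STRING_CHARS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-")  # no quotes or escapes


def allowed_tokens(field_type, value):
    """Tokens the schema permits next, given the value text emitted so far."""
    if field_type == "integer":
        if value == "0":
            return {END}  # JSON forbids leading zeros
        return DIGITS | ({END} if value else set())  # at least one digit required
    if field_type == "string":
        return STRING_CHARS | {END}
    raise ValueError(f"unsupported type: {field_type}")


def constrained_decode(schema, score_tokens, max_len=16):
    """Emit JSON whose keys come from the schema and whose values are masked model picks."""
    parts = []
    for name, spec in schema["properties"].items():
        value = ""
        while True:
            allowed = allowed_tokens(spec["type"], value)
            if len(value) >= max_len:
                token = END  # cap the length; END is legal once the value is non-empty
            else:
                scores = score_tokens(name, value)
                # Mask: only allowed tokens compete, whatever the model prefers.
                token = max(sorted(allowed), key=lambda t: scores.get(t, float("-inf")))
            if token == END:
                break
            value += token
        rendered = json.dumps(value) if spec["type"] == "string" else value
        parts.append(f"{json.dumps(name)}: {rendered}")
    return "{" + ", ".join(parts) + "}"


def fake_model(field, value):
    """Scripted stand-in for a model: always ranks an illegal '$' token highest."""
    target = {"invoice_number": "INV-42", "total_cents": "1250"}[field]
    next_token = target[len(value)] if len(value) < len(target) else END
    return {"$": 0.9, next_token: 0.5}


result = constrained_decode(SCHEMA, fake_model)
print(json.loads(result))
```

Although the fake model's top choice at every step is a character that would break the output, the mask discards it and the result always parses. The guarantee covers structure and types only; whether the extracted values are correct still depends on the model.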

Multi-Page Document Processing

A notable capability is processing multi-page documents in a single pass. This allows lift to extract values that span page breaks, a common failure point for models that process documents as a series of isolated images or chunks.
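The page-break failure mode is easy to reproduce with a text-only analogy. In the sketch below, which is an illustration and not how lift works internally, a label sits at the bottom of one page and its value at the top of the next. The page contents, the pattern, and both function names are invented for the example.

```python
import re

# Two "pages" of already-extracted text; the total is split across the break.
PAGES = [
    "Invoice INV-42\nItem A  100.00\nTotal due:",
    "1,250.00 USD\nThank you for your business.",
]

PATTERN = re.compile(r"Total due:\s*([\d,]+\.\d{2})")


def extract_per_page(pages):
    """Search each page in isolation, as a chunked pipeline would."""
    for page in pages:
        match = PATTERN.search(page)
        if match:
            return match.group(1)
    return None


def extract_single_pass(pages):
    """Search the whole document at once, so context crosses the page break."""
    match = PATTERN.search("\n".join(pages))
    return match.group(1) if match else None


print(extract_per_page(PAGES))
print(extract_single_pass(PAGES))
```

The per-page search finds nothing because neither page contains both the label and the amount, while the single-pass search recovers the value. A vision model faces the same problem with page images rather than text, which is why seeing all pages together matters.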

Key Technical Advantages

  • Deterministic Structure: Decoder-level constraints ensure strict adherence to JSON schemas.
  • Vision-Native: Capable of parsing complex PDF layouts without relying solely on fragile OCR-to-text pipelines.
  • Open-Weights: Offers greater transparency and flexibility for deployment and fine-tuning.
  • Contextual Continuity: Single-pass processing for cross-page value extraction.

Note: Specific benchmark results and training dataset details were not provided in the source material.
