Liquid AI Introduces LFM2.5-230M: A Compact Edge AI Model for On-Device Inference

Liquid AI has released LFM2.5-230M, a 230M-parameter open-weight model designed for on-device inference. It leverages the LFM2 architecture and integrates support for multiple frameworks, enabling efficient execution of agent loops directly on mobile devices.

Model Architecture and Key Features

The LFM2.5-230M model is built on the LFM2 architecture, which incorporates 8 double-gated LIV convolution blocks and 6 GQA (Grouped Query Attention) layers. This design balances model size with computational efficiency, making it suitable for resource-constrained environments like smartphones. The model was pre-trained on 19 trillion tokens, followed by post-training via distillation from the larger LFM2.5-350M variant, optimizing its performance for specific tasks.

On-Device Inference Focus

Unlike traditional edge AI solutions that rely on cloud-based models quantized for local use, LFM2.5-230M is explicitly engineered to run the "agent loop" directly on the device. This approach minimizes latency and bandwidth usage, addressing a critical gap in current edge AI implementations where cloud dependencies often hinder real-time responsiveness.

Technical Implementation and Frameworks

Liquid AI has integrated LFM2.5-230M with several modern machine learning frameworks, including llama.cpp, MLX, vLLM,