I built a local MCP server that gives AI agents on-device Vision OCR: no cloud, no API keys
[Demo of how it works](https://i.redd.it/y1izxx6xfu2h1.gif) I got tired of sending documents and images to cloud APIs just to extract text, so I built [VisionMCP](https://github.com/br3akzero/vision.
Qwen3.6 27B Pure Quant: 40 tok/s on 16 GB VRAM
Hello everyone! I want to share the result of my experiment to make **Qwen3.6 27B** **Q4\_K\_M** fit into my RTX 5060 Ti 16 GB. Inspired by u/Due-Project-7507's work on [Ununnilium/Qwen3.6-27B-IQ4\
Qwen3.6-35B-A3B Q4 262k context on 8GB 3070 Ti = +30tps
..and on 8GB VRAM I can even push the context to 320K, 400K, 512K, and yes.. 1M. But it does start to slow down noticeably beyond 150k so I'd only do this if I ever really want the larger context. Th
ByteShape Qwen3.6-35B-A3B: 30% faster than Unsloth IQ on 6GB VRAM laptop
A few days ago I posted about my experiments with MTP on a 6GB VRAM laptop. That didn't work so well; CPU offload hurts MTP performance badly. But now I've tried out the [new ByteShape quants](https:/
Scientific Proof Why AGI Cannot Be Achieved by OpenAI, Anthropic or Google
Big Tech is spending $700 billion a year chasing Artificial General Intelligence. The physics says they will never get there, not because…
perspective-dev/perspective
A data visualization and analytics component, especially well-suited for large and/or streaming datasets.
ai-dynamo/dynamo
A Datacenter Scale Distributed Inference Serving Framework
Free Google search MCP for local LLMs (no API key, no SerpAPI, runs Playwright on your box)
local models can't search. paid options want API keys (SerpAPI free tier is tiny), and the 6 free Google search MCPs I tested all failed. so I wrote one. drives a warm Chrome profile via Playwright.
Experimental "Preserve Thinking" Jinja Template for Gemma4 31B in llama.cpp
[https://huggingface.co/stevelikesrhino/gemma-4-31B-it-nvfp4-GGUF/blob/main/gemma4-improved.jinja](https://huggingface.co/stevelikesrhino/gemma-4-31B-it-nvfp4-GGUF/blob/main/gemma4-improved.jinja) Ya
meituan-longcat/LongCat-Video-Avatar-1.5 · Hugging Face
# Model Introduction We are excited to announce the release of LongCat-Video-Avatar 1.5, an upgraded open-source framework that prioritizes extreme empirical optimization and production-readiness f