r/localllm
Got tired of OOM errors on my 4GB GPU. Wrote a custom Rust bare-metal engine and hit 66.8 TPS with a 4B model (BitNet 1.58b on RTX 3050).
u/CommissionOdd3082 · 2026-05-26 · 13:06 UTC