Applied AI Research
We research cognitive training for small local models, specialising in on-device reasoning and context optimisation for embedded systems.
SOTA LLM Inference Guide (April 2026)
Curated recommendations by VRAM tier for local inference (Q4 Quantisation).
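The tier placements roughly track a back-of-envelope rule: at Q4, weights cost a little over half a byte per parameter, so parameter count largely decides which VRAM tier a model lands in. A minimal sketch of that arithmetic in Python, assuming an illustrative 4.8 bits per weight (real GGUF quants vary, roughly 4.5 to 5 bits):

def q4_weight_footprint_gb(params_billion: float, bits_per_weight: float = 4.8) -> float:
    """Approximate in-VRAM size of the quantised weights alone (no KV cache)."""
    # bits_per_weight = 4.8 is an illustrative assumption, not a property of
    # any specific model in this guide.
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for params in (7, 14, 32):
    print(f"{params}B -> ~{q4_weight_footprint_gb(params):.1f} GB")
# 7B  -> ~4.2 GB  (8 GB tier)
# 14B -> ~8.4 GB  (16 GB tier)
# 32B -> ~19.2 GB (32 GB tier)

The headroom left in each tier goes to the KV cache and runtime buffers, which is why a ~19 GB model sits in the 32 GB tier rather than being squeezed into 16 GB.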
ENTERPRISE (64GB+ VRAM)
Massive scale: system-wide refactoring & agentic swarms
deepseek-v4-pro ~42 GB 93.5% LCB. Best for system-wide refactoring.
qwen3.6-plus ~42 GB 1M Context. Optimized for agentic swarms.
llama4-maverick ~40 GB Meta's flagship for multilingual & image tasks.
LARGE (32GB VRAM)
Large: gold standard logic & engineering
gemma4:31b ~18 GB 80% LCB; 256K context. The local logic gold standard.
qwen3.5-coder:32b ~19 GB Deep specialized engineering logic.
deepseek-v4-flash ~19 GB 1M context; high-efficiency MoE.
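The long windows quoted in this tier (256K and 1M tokens) cost VRAM on top of the weights, because the KV cache grows linearly with context length. A rough sizing sketch; the layer count, GQA head count, head dimension, and fp16 cache precision are illustrative assumptions, not the specs of any listed model:

def kv_cache_gb(context_len: int,
                n_layers: int = 64,
                n_kv_heads: int = 8,
                head_dim: int = 128,
                bytes_per_elem: int = 2) -> float:
    """KV cache size: keys and values stored for every layer, KV head, and token."""
    per_token_bytes = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return context_len * per_token_bytes / 1e9

print(f"{kv_cache_gb(32_768):.1f} GB")   # ~8.6 GB for 32K of context
print(f"{kv_cache_gb(262_144):.1f} GB")  # ~68.7 GB for a full 256K window

In practice, filling an advertised window on local hardware usually relies on KV-cache quantisation or a reduced context setting, so budget for context as well as weights when picking a tier.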
MID-RANGE (16GB VRAM)
Mid-range: balanced intelligence & context
gemma4-moe:26b ~15 GB 4B active parameters; high-speed with 30B+ intelligence.
phi4:14b ~9 GB SOTA reasoning-per-parameter for complex math/code.
mistral-small:24b ~14 GB Best balanced generalist with a 128K context.
EFFICIENT (8GB VRAM)
Efficient: real-time local intelligence
qwen3.5-coder:7b ~4 GB Best for real-time local IDE completions (see the sketch after this tier).
gemma4-e4b ~6 GB Native Audio/Vision for visual debugging.
deepseek-r1:8b ~5 GB Best for logical debugging and 'thinking' tasks.
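The real-time completion use case above typically runs against a local HTTP endpoint rather than a cloud API. A minimal sketch using only the Python standard library; it assumes an Ollama-compatible server on the default port 11434 and assumes the qwen3.5-coder:7b tag has already been pulled locally, both of which are assumptions about your setup rather than guarantees:

import json
import urllib.request

# Assumed local endpoint and model tag; adjust both for your own setup.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "qwen3.5-coder:7b"

def complete(prompt: str, max_tokens: int = 64) -> str:
    """Request a short, low-temperature completion suitable for inline suggestions."""
    payload = {
        "model": MODEL_TAG,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": 0.2, "num_predict": max_tokens},
    }
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(complete("def fibonacci(n):"))

Keeping the temperature low and the token budget small is what keeps latency inside the interactive range that inline completions need.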
LOW-POWER (4GB VRAM)
Small: edge instruction following
llama4-scout:3b ~2 GB SOTA sub-5GB instruction following.
qwen3.5-coder:3b ~2 GB Local 'Ghost Text' and shell scripts.
gemma3:4b ~2.5 GB Efficient summarization and baseline assistant.
EDGE/MICRO (500MB - 2GB VRAM)
Ultra-small: specialized agentic tasks
gemma4-e2b ~1.5 GB Agent-ready; native tool-use and vision.
qwen3.6:0.6b ~0.5 GB SOTA sub-1GB logic for boilerplate and regex.
deepseek-r1:0.5b ~0.5 GB 'Thinking' minis for CLI and automation.
smollm3:500m ~0.5 GB Best for data cleaning, formatting, and JSON tasks.
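For the data cleaning and JSON tasks called out for the smallest models, constraining the output format does much of the work. A sketch against the same assumed local endpoint as above, additionally assuming the smollm3:500m tag is available and that the server supports a JSON output mode such as Ollama's "format": "json"; the messy record is invented for illustration:

import json
import urllib.request

# Assumed local endpoint and model tag; the record below is a made-up example.
OLLAMA_URL = "http://localhost:11434/api/generate"

messy_record = "name: Ada Lovelace | e-mail ada[at]example.org | joined 2024-03-12"

payload = {
    "model": "smollm3:500m",
    "prompt": (
        "Convert this record to JSON with keys name, email, joined. "
        f"Record: {messy_record}"
    ),
    "stream": False,
    "format": "json",              # ask the server to constrain output to valid JSON
    "options": {"temperature": 0},
}
req = urllib.request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    cleaned = json.loads(json.loads(resp.read())["response"])
print(cleaned)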