MINILOADER BLOG: RESEARCH & DEVELOPMENT

Backporting FP8 to the RTX 3090 (No H100 Required)

NVIDIA's FP8 story usually requires an H100; Adhitya disagrees. By treating FP8 as a storage format rather than a compute format, he backports the numerics to consumer Ampere hardware. Weights are stored as 1-byte FP8 bit patterns in VRAM, decoded on the fly via a 256-entry LUT, and fed into INT8 tensor cores (IMMA) for the matmul. The result: up to a 2× smaller VRAM footprint for weights compared with 16-bit storage, with no next-gen silicon required.
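To make the "256-entry LUT" idea concrete, here is a minimal sketch of decoding 1-byte FP8 patterns through a lookup table. The teaser does not say which FP8 variant is used, so this assumes the common E4M3 layout (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7, no infinities). The names `decode_e4m3`, `FP8_LUT`, and `decode_weights` are hypothetical, and the step that turns decoded values into INT8 operands for IMMA is not described in the teaser, so it is left out.

```python
import math


def decode_e4m3(byte):
    """Decode one FP8 bit pattern to a float, assuming the E4M3 layout."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    if exponent == 0x0F and mantissa == 0x07:
        return math.nan  # E4M3 reserves this pattern for NaN
    if exponent == 0:
        # Subnormal: no implicit leading 1, fixed exponent of 2**-6
        return sign * (mantissa / 8.0) * 2.0 ** -6
    # Normal: implicit leading 1, exponent bias of 7
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


# One byte has only 256 possible values, so every decode can be precomputed.
FP8_LUT = [decode_e4m3(b) for b in range(256)]


def decode_weights(raw):
    """Decode a bytes object of FP8 patterns with one table lookup per weight."""
    return [FP8_LUT[b] for b in raw]


weights = bytes([0x38, 0xC0, 0x7E, 0x01])  # stored at 1 byte per weight
decoded = decode_weights(weights)
print(decoded)
```

Under the E4M3 assumption, the four sample bytes decode to 1.0, -2.0, 448.0 (the largest finite value), and 0.001953125 (the smallest subnormal). The table costs 256 entries once, and every weight after that is a single indexed load.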

Adhitya Mohan · 2026

amohan.dev ↗

Pinned·Research·External

Reverse-Engineering the RK3588 NPU: Hacking Memory Limits to Run Vision Transformers

The RK3588 promises 6 TOPS, but the standard SDK chokes on Vision Transformers, leaving a $100 board running SmolVLM on the CPU at 30s per inference. Adhitya reverse-engineered the NPU, discovered a hard 32KB L1 scratchpad limit, and built a nano-tiling algorithm to slice Attention matrices to fit. A "poison pill" dummy op defeats the compiler's fusion heuristics, and a custom user-space scheduler fires 26 shards across the chip's 3 NPU cores. Result: 30s → 1.8s, 0.999 accuracy match.
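The teaser does not spell out the nano-tiling algorithm, so the following is only a rough sketch of the general idea: pick the largest tile whose working set fits in the 32KB scratchpad, then deal the resulting shards out across the 3 NPU cores. The function names, the 2-byte element size, the head dimension of 64, the 196-token sequence, the choice of what counts as the working set, and the round-robin assignment are all assumptions for illustration, not details of Adhitya's implementation.

```python
SCRATCHPAD_BYTES = 32 * 1024  # L1 scratchpad limit quoted in the teaser
NUM_CORES = 3                 # NPU core count quoted in the teaser


def max_tile_rows(head_dim, bytes_per_elem=2, budget=SCRATCHPAD_BYTES):
    """Largest t where a t x d Q tile, a t x d K tile, and a t x t
    score tile fit in the budget together (assumed working set)."""
    t = 0
    while (2 * (t + 1) * head_dim + (t + 1) ** 2) * bytes_per_elem <= budget:
        t += 1
    return t


def plan_shards(seq_len, head_dim, bytes_per_elem=2):
    """Slice the seq_len x seq_len score matrix into tiles that fit the
    scratchpad and assign each tile to a core in round-robin order."""
    t = max_tile_rows(head_dim, bytes_per_elem)
    if t == 0:
        raise ValueError("even a single row does not fit in the scratchpad")
    plan = []
    for q0 in range(0, seq_len, t):
        for k0 in range(0, seq_len, t):
            shard = (q0, min(q0 + t, seq_len), k0, min(k0 + t, seq_len))
            plan.append((shard, len(plan) % NUM_CORES))
    return plan


tile = max_tile_rows(head_dim=64)
plan = plan_shards(seq_len=196, head_dim=64)
print(f"tile edge: {tile} rows, shards: {len(plan)}")
for (q0, q1, k0, k1), core in plan:
    print(f"core {core}: Q rows {q0}:{q1} x K rows {k0}:{k1}")
```

With these assumed sizes the tile edge comes out at 79 rows and the plan holds 9 shards. The workload in the post ends up with 26 shards, which depends on the real model dimensions and on constraints this sketch does not model.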

Adhitya Mohan · December 12, 2025

amohan.dev ↗

Development·3 min read

Why can't you see your agent? Can your agent see you?

Agents can't see what we can unless we share the same space. Embodied AI, agentic avatars, AI NPCs: whatever you call it, it is the next frontier. At Miniloader, we're actively breaking those boundaries down.

May 26, 2026

Research·10 min read

The Miniloader Model Family: Benchmarks, Capabilities, and Why We Chose These Models

Seven models from three families, tested on a consumer AMD GPU under the Vulkan backend. Full throughput numbers, vision and reasoning results, and the reasoning behind the Miniloader local LLM catalogue.

Apr 18, 2026

Research·4 min read

OpenClaw Just Turned a Niche Category Into a Real Market

OpenClaw is sitting at 20.1 trillion tokens on OpenRouter — more than the next three services combined. Here is what that number actually means for the local AI market.

Apr 5, 2026

Update·3 min read

From the Team

How Miniloader went from idea to live beta on a compressed timeline—and why AI-assisted execution may be the next wave of individual leverage in computing.

Apr 1, 2026

Development·12 min read

Miniloader at Full Power: Four Real-World Use Cases

From a local RAG research desk to Postgres analytics, bots over Ngrok, and a secured remote command center: how one rack replaces cloud bills and lock-in.

Mar 26, 2026