-
Qwen 3.6-Plus vs Claude Opus 4.6: 3x the speed, 1/17th the price, and the benchmarks are uncomfortably close
Alibaba dropped Qwen 3.6-Plus on April 2nd, and the numbers are hard to ignore. On SWE-bench Verified — the benchmark that actually matters for coding — it scores 78.8%. Claude Opus 4.6 scores 80.9%. That’s a 2.1-point gap. On Terminal-Bench 2.0, Qwen 3.6-Plus flips the script entirely: 61.6% vs Claude’s 59.3%. And the pricing? Input… Continue reading
-
Baton charges $49 to orchestrate your AI coding agents — in a market where every competitor is free
Running one Claude Code agent is fun. Running four in parallel across different terminal windows is a mess. You’re constantly switching tabs, losing track of which agent is doing what, and praying nobody pushes to the same branch at the same time. This isn’t a hypothetical problem. It’s the exact pain point that spawned an… Continue reading
-
Microsoft Agent Governance Toolkit scores 10/10 on OWASP agentic risks — at 0.1ms per check
Everyone’s shipping AI agents. Nobody’s governing them. Microsoft is betting that’s about to become a very expensive problem. On April 2, Microsoft open-sourced the Agent Governance Toolkit — a seven-package system that sits between your agent framework and the actions agents actually take. The pitch: deterministic policy enforcement with sub-millisecond latency, covering all 10 risks… Continue reading
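The "sits between your framework and the actions" idea can be sketched as a deterministic gate that every tool call must clear. This is a toy illustration only; the rule shapes, tool names, and function names here are assumptions, not the toolkit's actual API.

```python
# A deterministic policy gate between an agent framework and the
# actions it wants to execute. Rules are pure predicates: no model
# in the loop, so each check is a few function calls (hence fast).

POLICIES = [
    # Block destructive shell commands.
    lambda a: a["tool"] != "shell" or "rm -rf" not in a["args"],
    # Only allow HTTPS network calls.
    lambda a: a["tool"] != "http" or a["args"].startswith("https://"),
]

def allowed(action):
    """Every policy must pass before the action is executed."""
    return all(rule(action) for rule in POLICIES)

print(allowed({"tool": "http", "args": "https://api.example.com"}))  # True
print(allowed({"tool": "shell", "args": "rm -rf /"}))                # False
```

Because the rules are plain predicates rather than LLM judgments, enforcement is both deterministic and cheap — the property the sub-millisecond claim depends on.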
-
Caveman scores 333 HN points for making Claude talk like a caveman — does it actually save 75% of tokens?
“Why use many token when few token do trick.” That’s the tagline of Caveman, a Claude Code skill by Julius Brussee that went viral over the weekend. The idea is absurdly simple: make Claude drop articles, prepositions, and all the conversational fluff it loves so much. Instead of “I’ll execute the web search tool to… Continue reading
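The compression idea is easy to demo outside of Claude. A toy post-processor that strips articles, prepositions, and filler shows the flavor — though the real skill works as a prompt instruction inside Claude Code, not as a post-processor, and this word list is our own assumption:

```python
# Toy "caveman" compression: drop articles, prepositions, and
# conversational filler. Word list is illustrative, not the skill's.
FILLER = {"the", "a", "an", "to", "of", "for", "i", "i'll",
          "will", "that", "this", "so", "is"}

def caveman(text):
    words = [w for w in text.split() if w.lower().strip(".,") not in FILLER]
    return " ".join(words)

before = "I'll execute the web search tool to find the answer for you"
after = caveman(before)
print(after)  # "execute web search tool find answer you"
print(round(1 - len(after) / len(before), 2))  # ~0.34 character savings here
```

Even this crude version cuts about a third of the characters; the 75% figure in the headline covers Claude's full conversational fluff, not just stopwords.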
-
Embarrassingly Simple Self-Distillation (SSD) Boosts Qwen3-30B Code Scores by 30% — No Teachers, No RL, No Tricks
Apple researchers just published a paper that made Hacker News lose its mind. 596 points, 180 comments, top AI post of the day. The title alone tells you why: “Embarrassingly Simple Self-Distillation Improves Code Generation.” The pitch is almost too good to believe. Take a model. Have it generate its own code solutions. Filter out… Continue reading
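The generate-then-filter loop can be sketched in a few lines. This is a minimal illustration of the recipe as the teaser describes it, with a stubbed-in sampler standing in for the model and a unit-test filter as the (assumed) filtering criterion:

```python
# Self-distillation sketch: generate candidates, filter by tests,
# keep survivors as fine-tuning data for the same model.

def generate_solutions(prompt, n=4):
    # Stub: in the real setup this samples n completions from the model.
    return [
        "def add(a, b): return a + b",       # correct
        "def add(a, b): return a - b",       # wrong -> filtered out
        "def add(a, b): return a + b + 0",   # correct
        "def add(a, b): return b",           # wrong -> filtered out
    ]

def passes_tests(code, tests):
    """Execute a candidate and check it against unit tests."""
    scope = {}
    try:
        exec(code, scope)
        return all(scope["add"](*args) == want for args, want in tests)
    except Exception:
        return False

def self_distill(prompt, tests):
    """Keep only self-generated solutions that pass; these (prompt,
    solution) pairs become the fine-tuning set."""
    return [(prompt, c) for c in generate_solutions(prompt)
            if passes_tests(c, tests)]

data = self_distill("Write add(a, b)", tests=[((1, 2), 3), ((0, 0), 0)])
print(len(data))  # 2 survivors out of 4 candidates
```

No teacher model, no reward model — the only supervision signal is whether the model's own outputs pass the filter.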
-
Memvid packs AI agent memory into a single file — and outperforms SOTA RAG by 35%
The standard way to give an AI agent memory in 2026: spin up a vector database, build a RAG pipeline, manage embeddings, figure out chunking strategies, handle scaling. It works, but it’s a lot of infrastructure for what is fundamentally a simple problem — “what did we talk about last Tuesday?” Memvid throws all of… Continue reading
-
One GPU, Ten Developers, $10/Month Each: Inside sllm’s Shared Inference Gamble
Running open-source models on cloud GPUs costs real money. A single H100 on-demand runs $2-7/hour depending on the provider. Dedicate one to serving DeepSeek V3 (685B parameters) and you’re looking at roughly $14,000/month. Even smaller setups for 70B-class models land in the $500-2,000/month range. For individual developers and small teams experimenting with local AI, that’s… Continue reading
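The back-of-envelope behind those figures is simple hourly math. The eight-GPU node size for a 685B-parameter model and the $2.40/hr rate are assumptions chosen to match the source's numbers; the teaser itself only gives the $2–7/hour per-H100 range:

```python
# Monthly cost of a dedicated inference deployment, billed hourly.
HOURS_PER_MONTH = 730  # ~24 * 365 / 12

def monthly_cost(gpus, rate_per_gpu_hour):
    return gpus * rate_per_gpu_hour * HOURS_PER_MONTH

# Assumed: 8x H100 node for a 685B model, low-end on-demand pricing.
deepseek_v3 = monthly_cost(gpus=8, rate_per_gpu_hour=2.40)  # ~$14,016
model_70b = monthly_cost(gpus=1, rate_per_gpu_hour=2.00)    # ~$1,460
print(round(deepseek_v3), round(model_70b))
```

Split that single node across ten developers and the per-seat math starts to approach sllm's $10/month pitch — if utilization cooperates.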
-
TinyGPU: George Hotz Got Apple to Sign an NVIDIA GPU Driver for Mac
Apple hasn’t allowed third-party GPU drivers on Apple Silicon. Not NVIDIA, not AMD, not anyone. When Apple dropped Intel chips in late 2020, eGPU support died with them. Six years of silence. Then on March 31, 2026, George Hotz’s Tiny Corp announced that Apple officially signed and approved their TinyGPU driver extension. NVIDIA RTX 30/40/50… Continue reading
-
Anthropic Paid $400M for Coefficient Bio — 10 People, 8 Months Old, No Shipped Product
Eight months in existence. Fewer than ten employees. No product on the market. And Anthropic just handed over more than $400 million in stock. Coefficient Bio is the kind of deal that makes traditional VCs either furious or jealous, depending on which side of the cap table they sit on. Founded last August by two… Continue reading
-
Claude Code Found 5 Linux Kernel Vulnerabilities — One Was Hidden for 23 Years
Nicholas Carlini wrote a simple script. He pointed Claude Code at the Linux kernel source, one file at a time, with a prompt that basically said “find vulnerabilities, treat this like a CTF challenge.” No fancy tooling, no custom pipeline, no months of fine-tuning. Just a loop, an LLM, and the entire Linux kernel. What… Continue reading
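The one-file-at-a-time loop is easy to picture. Here is a sketch of that shape — `audit` stands in for a call to a coding agent (e.g. Claude Code in print mode), and the prompt wording and structure are our assumptions, not Carlini's actual script:

```python
# Per-file vulnerability sweep: run the agent once per source file,
# collect any non-empty reports.
from pathlib import Path

PROMPT = ("Find vulnerabilities in this source file. "
          "Treat it like a CTF challenge. Report nothing if clean.\n\n{code}")

def scan_tree(root, audit, pattern="*.c"):
    """`audit` is any callable taking a prompt and returning a report."""
    findings = {}
    for path in sorted(Path(root).rglob(pattern)):
        source = path.read_text(errors="ignore")
        report = audit(PROMPT.format(code=source))
        if report.strip():
            findings[str(path)] = report
    return findings
```

Swap in a real agent call for `audit` and point `root` at a kernel checkout, and you have the whole pipeline: a loop, an LLM, and a lot of C files.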
