Three open-weight releases in the last 60 days redefine what 'good enough self-hosted' means. Hunyuan 3D from Tencent produces text-to-3D meshes competitive with Meshy for stylised characters and game-ready props. Wan 2.5 Video from Alibaba generates eight-second 720p clips that hold up against Pika and approach Kling in motion quality. DeepSeek R3 narrowed the reasoning gap with Claude 4.5 Sonnet to within a few percentage points on most public benchmarks, while running on a single H100 at meaningful throughput.
The practical implication: a founder building an AI-native product who hits $10,000 a month in model API spend now has a viable migration target. Self-host the workhorse model, keep the frontier API for hard cases. Two years ago this was a maybe; in May 2026 it is a routine cost-engineering exercise.
The caveat is operational complexity. Running an H100 cluster on Lambda Cloud or Together AI is materially easier than a year ago, but it is still an infrastructure problem you did not previously have. The cost crossover sits at roughly $5,000 to $8,000 a month in API spend; below that, closed-source wins on total cost once you include engineering time.
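That crossover band can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the GPU hourly rate, monthly engineering hours, and loaded engineering cost are all assumptions, not quotes from Lambda Cloud, Together AI, or any other provider, and your real numbers will shift the break-even point.

```python
# Back-of-envelope break-even for self-hosting an open-weight model
# versus paying for a closed-source API. Every figure here is an
# assumption for illustration; plug in your own rates.

def monthly_self_host_cost(
    gpu_hourly_rate: float = 2.50,   # assumed on-demand H100 rate, $/hr
    gpus: int = 1,                   # GPUs kept running
    hours_per_month: float = 730,    # always-on for a full month
    eng_hours: float = 20,           # assumed ongoing ops time per month
    eng_hourly_rate: float = 100.0,  # assumed loaded engineering cost, $/hr
) -> float:
    """Rough total monthly cost of running the model yourself."""
    compute = gpu_hourly_rate * gpus * hours_per_month
    engineering = eng_hours * eng_hourly_rate
    return compute + engineering

# Single GPU, lean ops: compute $1,825 + engineering $2,000 = $3,825/month.
print(f"1x H100: ${monthly_self_host_cost():,.0f}/month")

# Add a second GPU for redundancy and the total lands inside the
# $5,000-$8,000 crossover band the article describes.
print(f"2x H100: ${monthly_self_host_cost(gpus=2):,.0f}/month")
```

Under these assumptions a single always-on H100 plus modest ops time costs under $4,000 a month, and a redundant two-GPU setup lands in the mid-$5,000s, which is why the crossover is a band rather than a single number: redundancy, staging environments, and how much engineering time you actually burn all move it.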
If you cross $5,000 a month in AI API spend, run a 60-day proof of concept moving your highest-volume workload to a self-hosted open-weight model. Below that, keep closed-source.