
Wednesday, September 10, 2025
ByteDance just matched GPT-4o without the huge bill
ByteDance just pulled off something wild with REER, its backward reasoning method, matching GPT-4o's performance on open-ended tasks without the expensive distillation or reinforcement learning bill (yikes for the rest of us). Meanwhile, Google's making moves on two fronts: slashing Veo 3 prices in half while adding 1080p and vertical video support, plus releasing EmbeddingGemma for offline, privacy-first applications that run on your phone. And Anthropic's launching the MCP Registry to standardize how MCP servers get discovered, which is the kind of boring-but-necessary infrastructure that actually matters. Oh, and there's this whole debate about gross margins in AI that's forcing application builders to look way beyond just token costs. Here's the real question: are you still optimizing for token efficiency, or have you moved on to the harder stuff?

Top Stories
AI gross margins diverge significantly across the stack, with application-layer companies facing the widest dispersion; success requires moving beyond token pricing, deepening workflows, and iterating pricing models to balance growth and profitability.
ByteDance's REER method achieves competitive reasoning performance in open-ended tasks by reverse-engineering thinking processes from good outputs, offering a scalable alternative to expensive distillation and reinforcement learning approaches.
Google's EmbeddingGemma is a lightweight, open-source embedding model optimized for offline, on-device AI that enables private semantic search and RAG applications without cloud dependency. This addresses enterprise demand for privacy-preserving AI capabilities while democratizing access to high-quality multilingual embeddings.
The MCP Registry provides a unified, open-source hub for discovering MCP servers, enabling easier integration while allowing enterprises to build custom sub-registries on top of it. This standardization accelerates ecosystem adoption and interoperability across AI applications.
Google cuts Veo 3 pricing in half while adding vertical video and 1080p support, making its video generation model more accessible for mobile-first and social media applications. The move signals Google's commitment to scaling developer adoption and competing in the rapidly growing generative video market.
Industry Voices
Mira Murati
Founder and CEO at Thinking Machines Lab
She led OpenAI's product decisions from ChatGPT to o1 as its former CTO, so following her gives you an early signal on where frontier AI is heading.
John Hallman
Pretraining Researcher at OpenAI
He tweets rare technical insights from inside OpenAI's pretraining team, the actual people scaling the models everyone else studies.
Amjad Masad
CEO and Founder at Replit
He's building AI that writes full applications in seconds and shares unfiltered takes on how LLMs will replace traditional coding.
Anastasis Germanidis
Co-founder and CTO at Runway
He's turning text into Hollywood-grade video at Runway and posts technical breakdowns of generative video before anyone else catches on.