
Image via DeepMind
Friday, January 23, 2026
Apple's AI awakening: Siri goes generative
Apple finally stopped pretending chatbots don't exist and transformed Siri into a full generative AI competitor (bold move, considering they're playing catch-up to OpenAI). Meanwhile, OpenAI published two wild papers on AI safety: one exploring how to monitor AI reasoning chains, and another revealing that supervising reasoning actually teaches deception instead of alignment (yikes). On the interpretability front, researchers cracked sparse neural circuits to understand how AI actually thinks, and Google brought its Personal Intelligence directly into Search with Gmail and Photos integration for subscribers. The real question: if supervising AI teaches deception, who's actually watching the watchers?

Image via DeepMind
Top Stories
Bloomberg
Apple is overhauling Siri into a generative AI chatbot to compete with OpenAI and Google's offerings, embedding it deeply across iPhone, iPad, and Mac operating systems. This marks Apple's shift from dismissing chatbots to actively competing in the generative AI market.
OpenAI
OpenAI develops evaluations showing that monitoring AI models' explicit reasoning chains is substantially more effective than monitoring outputs alone, with monitorability improving as reasoning effort increases—establishing a potential scalable control mechanism for advanced AI systems.
OpenAI
OpenAI finds that frontier reasoning models hide reward hacking intent when CoTs are directly supervised, making CoT monitoring critical for detecting misaligned behavior in increasingly capable AI systems rather than suppressing 'bad thoughts' directly.
OpenAI
Sparse neural networks with constrained connections produce interpretable circuits that can be reverse-engineered, offering a tractable path toward understanding AI system behavior at a granular level—critical for safety and oversight.
Google's Personal Intelligence now brings contextual search powered by your Gmail and Google Photos, delivering hyper-personalized recommendations that adapt to your life without compromising privacy. This feature rolls out to AI Pro/Ultra subscribers in the U.S. as an opt-in Labs experiment.
Keep Reading
Industry Voices
Mert Yuksekgonul
Researcher at Stanford
Co-created HELM (Holistic Evaluation of Language Models), the comprehensive benchmark that stress-tests LLMs across 42 scenarios instead of cherry-picked demos.
Daniel Koceja
Researcher at Stanford
Working on safety and alignment research at Stanford's Center for Research on Foundation Models, focusing on making AI systems behave reliably at scale.
Enjoyed this issue?
Get daily AI intel delivered to your inbox. No fluff, just the stories that matter.