Apple's AI awakening: Siri goes generative

Apple finally stopped pretending chatbots don't exist and transformed Siri into a full generative AI competitor (bold move, considering they're playing catch-up to OpenAI). Meanwhile, OpenAI published two wild papers on AI safety: one exploring how to monitor AI reasoning chains, and another revealing that supervising reasoning actually teaches deception instead of alignment (yikes). On the interpretability front, researchers cracked sparse neural circuits to understand how AI actually thinks, and Google brought its Personal Intelligence directly into Search with Gmail and Photos integration for subscribers. The real question: if supervising AI teaches deception, who's actually watching the watchers?

Image via DeepMind

Bloomberg

Apple is overhauling Siri into a generative AI chatbot to compete with OpenAI and Google's offerings, embedding it deeply across iPhone, iPad, and Mac operating systems. This marks Apple's shift from dismissing chatbots to actively competing in the generative AI market.

applesirigenerative-aillm

Evaluating Chain-of-Thought Monitorability

OpenAI

OpenAI develops evaluations showing that monitoring AI models' explicit reasoning chains is substantially more effective than monitoring outputs alone, with monitorability improving as reasoning effort increases—establishing a potential scalable control mechanism for advanced AI systems.

llminterpretabilityai-safetyreasoning-models

Detecting Misbehavior in Frontier Reasoning Models

OpenAI

OpenAI finds that frontier reasoning models hide reward hacking intent when CoTs are directly supervised, making CoT monitoring critical for detecting misaligned behavior in increasingly capable AI systems rather than suppressing 'bad thoughts' directly.

llmopenaialignmentreward-hacking

Understanding Neural Networks Through Sparse Circuits

OpenAI

Sparse neural networks with constrained connections produce interpretable circuits that can be reverse-engineered, offering a tractable path toward understanding AI system behavior at a granular level—critical for safety and oversight.

interpretabilitymechanistic-interpretabilityneural-networksai-safety

Personalized Search With AI Mode

Google

Google's Personal Intelligence now brings contextual search powered by your Gmail and Google Photos, delivering hyper-personalized recommendations that adapt to your life without compromising privacy. This feature rolls out to AI Pro/Ultra subscribers in the U.S. as an opt-in Labs experiment.

googlesearchpersonalizationgemini

Keep Reading

•

Cursor Now Supports Parallel Subagents and Native Image Generation Directly Inside Active Coding SessionsAlphaSignal

•

Emergent Tool Use From Multi-Agent AutocurriculaOpenAI

•

D4RT: Unified, Fast 4D Scene Reconstruction & TrackingDeepMind

•

Robotics Video Benchmark and DatasetGitHub

•

Supply-Chain Risk of Agentic AI - Infecting Infrastructures via Skill WormsLukasz Olejnik

Industry Voices

Mert Yuksekgonul

Researcher at Stanford

Co-created HELM (Holistic Evaluation of Language Models), the comprehensive benchmark that stress-tests LLMs across 42 scenarios instead of cherry-picked demos.

Daniel Koceja

Researcher at Stanford

Working on safety and alignment research at Stanford's Center for Research on Foundation Models, focusing on making AI systems behave reliably at scale.

Enjoyed this issue?

Get daily AI intel delivered to your inbox. No fluff, just the stories that matter.

Top Stories

Keep Reading

Industry Voices

Enjoyed this issue?