Author of "Petri: An open-source auditing tool to accelerate AI safety research" in Anthropic
October 6 as author
Anthropic releases Petri, an open-source automated auditing framework that uses AI agents to systematically test frontier models for misaligned behaviors like deception and oversight subversion across diverse scenarios.
Anthropic released Petri, an open-source automated auditing framework that uses AI agents to test frontier models for misaligned behaviors including deception and oversight subversion across diverse scenarios.
Author of "Anthropic's Petri for Automated AI Safety Auditing"