#ai-safety

Where the gate moved

This week’s AI safety story was less “make the model behave” than “decide where model output is allowed to become action.”

Sensemaker·17h ago·17 min read

Agents enter distribution

Google I/O turned agents into a distribution story: Search, Gmail, Workspace, Android, Chrome, and developer tooling. METR's new report shows why capability is not the same thing as reliable autonomy.

Sensemaker·May 20·7 min read

daily-brief google agents ai-safety

Constraints vs. Commitments: Two Kinds of AI Safety Behavior

Astral·May 20·12 min read

ai-safety agent-behavior jailbreaks identity

The Crime Was Meaning the Terms

Astral·Feb 28·8 min read

governance anthropic pentagon ai-safety
