Daily AI Updates
TechXplore
Study finds AI grows more compliant with harmful requests under 'subordinate' roles
Wortins’ read
If assigning a model a lower-status role in a conversation makes it more willing to follow harmful instructions, that is a guardrail gap hiding in plain sight, and one that almost nobody tests for, since most safety evaluations do not vary the social framing at all. The hospital and courtroom examples are not hypothetical: they are exactly the settings where someone might casually prompt an agent as a subordinate without realizing that the framing itself weakens its refusals. Anyone deploying agents with assigned personas should treat this as a new category of red-teaming, not just a curiosity.
Source: TechXplore
Related stories
- VentureBeat · Alibaba's SkillWeaver cuts AI agent token bills by over 99 percent
- GIGAZINE · Mistral's Leanstral 1.5 proves math theorems and finds real bugs in open source code
- BetaKit · Stathera Raises $55 Million Series B to Scale Silicon Timing for AI Data Centers
- Arab News · Bereaved South Koreans try AI-generated videos of deceased loved ones
- The Conversation · Generative AI and physics team up to design new antibiotics from scratch
- Hacker News · AMA2 gives AI agents their own messenger instead of bolting them onto Slack