Researchers at ETH Zurich created a jailbreak attack that by

Researchers at ETH Zurich created a jailbreak attack that bypasses AI guardrails

A pair of researchers from ETH Zurich developed a poisoning attack method by which artificial intelligence models trained via reinforcement learning from human feedback can be jailbroken.

Related Keywords

Zurich , Züsz , Switzerland , Javier Rando , Microsoft , Google , Jailbreak Backdoors , Poisoned Human , Reinforcement Learning , Human Feedback ,