vimarsana.com

Researchers at ETH Zurich created a jailbreak attack that bypasses AI guardrails

A pair of researchers from ETH Zurich developed a poisoning attack that can jailbreak artificial intelligence models trained via reinforcement learning from human feedback (RLHF).

Related Keywords

Zurich, Züsz, Switzerland, Javier Rando, Microsoft, Google, Jailbreak Backdoors, Poisoned Human, Reinforcement Learning, Human Feedback
