vimarsana.com
Home
Live Updates
Red Teaming Language Models with Language Models : vimarsana
Red Teaming Language Models with Language Models : vimarsana
Red Teaming Language Models with Language Models
In our recent paper, we show that it is possible to automatically find inputs that elicit harmful text from language models by generating inputs using language models themselves.
Related Keywords
,
Microsoft ,
Red Teaming Language Models ,
Stay Twitter ,
Information Generation ,