vimarsana.com
Harmful Task Performance - Breaking News
LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B — LessWrong
Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort, under the mentorship of Jeffrey Ladish. …