vimarsana.com
Beware of Unreliable Data in Model Evaluation: A LLM Prompt Selection case study with Flan-T5
You may choose suboptimal prompts for your LLM (or make other suboptimal choices via model evaluation) unless you clean your test data.
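To make the claim concrete, here is a minimal sketch of how a few mislabeled test examples can flip which prompt looks best. Everything in it is a hypothetical illustration, not data or code from the case study: the ten toy labels, the two prediction lists (`preds_a`, `preds_b`), and the helper names `accuracy` and `pick_best_prompt` are all assumptions made for this example.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the given labels."""
    matches = sum(p == y for p, y in zip(predictions, labels))
    return matches / len(labels)


def pick_best_prompt(predictions_by_prompt, labels):
    """Return the prompt name whose predictions score highest on `labels`."""
    return max(
        predictions_by_prompt,
        key=lambda name: accuracy(predictions_by_prompt[name], labels),
    )


# Hypothetical toy test set: 1 = polite, 0 = impolite.
clean_labels = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
# Same test set as actually observed: indices 0 and 3 are mislabeled.
observed_labels = [0, 1, 1, 1, 0, 0, 1, 0, 1, 0]

# Hypothetical model outputs under two candidate prompts.
# Prompt A is truly wrong only at indices 5 and 6.
preds_a = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]
# Prompt B happens to reproduce both label errors and is also wrong at index 7.
preds_b = [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]

predictions_by_prompt = {"prompt_a": preds_a, "prompt_b": preds_b}

for name, preds in predictions_by_prompt.items():
    print(
        name,
        "observed accuracy:", accuracy(preds, observed_labels),
        "clean accuracy:", accuracy(preds, clean_labels),
    )

print("chosen on observed labels:", pick_best_prompt(predictions_by_prompt, observed_labels))
print("chosen on clean labels:", pick_best_prompt(predictions_by_prompt, clean_labels))
```

In this toy setup, prompt B wins on the observed (noisy) labels while prompt A wins on the clean labels, so evaluating on the uncleaned test set would select the worse prompt.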
Related Keywords
Jonas Mueller
Chris Mauck
Community Slack
Google Research
Linkedin
Twitter
Unreliable Data
Model Evaluation
Stanford Politeness Dataset
Observed Test
Clean Test
Clean Test Accuracy
Observed Test Accuracy
Noisy Evaluation
Large Language Model
Test Accuracy
Available Test Data
More Reliable
Cleanlab Studio
Cleanlab Test