1. A comparative evaluation tested five publicly available LLMs on 2044 oncology questions spanning a comprehensive range of topics in the field. The models' responses were compared to a human benchmark.

2. Only one of the five models performed above the 50th percentile. Performance was worse in clinical oncology subcategories and on questions about female-predominant malignancies.

Evidence Rating Level: 2