Putting DeepSeek Models to the Test
hoch³
2025/05/15
In the latest issue of hoch³, the university magazine of TU Darmstadt, Irina Bigoulaeva and Prof. Iryna Gurevych from the UKP Lab examine the performance of DeepSeek-R1 and R1-Zero, two recently released models known for their reasoning capabilities.
Even cutting-edge models struggle with atypical question formats that deviate from standard training data, which highlights ongoing challenges in robustness and generalization for generative AI.
hoch³ 2/2025, Deepseek-Modelle auf dem Prüfstand, May 15, 2025
