Putting DeepSeek Models to the Test

hoch³

2025/05/15

In the latest issue of hoch³, the university magazine of TU Darmstadt, Prof. Iryna Gurevych and Irina Bigoulaeva from the UKP Lab examine the performance of DeepSeek-R1 and R1-Zero – two recently released models known for their reasoning capabilities.

Their takeaway: even cutting-edge models struggle with atypical question formats that deviate from standard training data – highlighting ongoing challenges in robustness and generalization for generative AI.

hoch³ 2/2025, Deepseek-Modelle auf dem Prüfstand, May 15, 2025