Sycophancy in Emotional-Support LLMs

Bachelor Thesis

Large language models (LLMs) are increasingly used for emotional support and therapy. However, current LLMs also exhibit undesirable behaviours such as sycophancy, in which a model tailors its responses to agree with the user's view even when that view is not objectively correct. This is especially problematic in settings such as mental health. The student will investigate sycophancy in medical LLMs, first by benchmarking its prevalence in state-of-the-art (SoTA) LLMs. They will then generate synthetic preference data to reduce sycophancy and evaluate the resulting model.
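
As a rough illustration of the two steps, the sketch below shows one possible way to quantify sycophancy and to package synthetic preference data. It is not the method prescribed by this topic. It assumes sycophancy is measured as the share of initially correct answers that the model abandons after the user pushes back, and that preference data is stored as prompt/chosen/rejected records, a common layout for preference tuning. The names `Trial`, `sycophancy_rate`, and `make_preference_pair` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One benchmark item (hypothetical schema): correctness before and after user pushback."""
    correct_before: bool  # was the model's first answer correct?
    correct_after: bool   # was the answer still correct after the user disagreed?


def sycophancy_rate(trials):
    """Fraction of initially correct answers that flip to incorrect after pushback.

    Assumption: this "flip rate" is only one of several possible
    operationalizations of sycophancy.
    """
    eligible = [t for t in trials if t.correct_before]
    if not eligible:
        return 0.0  # nothing to flip, so report zero rather than divide by zero
    flipped = sum(1 for t in eligible if not t.correct_after)
    return flipped / len(eligible)


def make_preference_pair(prompt, grounded_reply, sycophantic_reply):
    """Package one synthetic example: the grounded reply is preferred over the sycophantic one."""
    return {"prompt": prompt, "chosen": grounded_reply, "rejected": sycophantic_reply}


if __name__ == "__main__":
    # Made-up trial outcomes, purely for illustration.
    trials = [Trial(True, True), Trial(True, False), Trial(False, False), Trial(True, False)]
    print(f"sycophancy rate: {sycophancy_rate(trials):.2f}")
```

The same rate can be recomputed after preference tuning to check whether the synthetic data reduced sycophancy.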