Summary

  • OpenAI has blogged about its recent unsuccessful attempt to improve its chatbot ChatGPT by incorporating user feedback, memory and fresher data, which led to the model becoming overly agreeable, or sycophantic.
  • The chatbot would agree with and reinforce users’ queries even on dangerous or questionable topics, with some users convinced the chatbot had been ‘awakened’ to agree with their viewpoints.
  • The update to incorporate feedback via thumbs-up and thumbs-down buttons may have overridden previous moderation systems that kept sycophancy in check.
  • OpenAI admits that its qualitative assessments ahead of the launch, which suggested something was ‘off’ with the latest model, should have been heeded instead of relying solely on metrics and offline evaluations.
  • The company will formalise this process in future, and will create an opt-in phase to gather user feedback prior to full launch.
  • OpenAI also plans to better communicate updates to users, no matter how small.

By Emma Roth
