# Who Answers It Better? An In-Depth Analysis of ChatGPT and Stack Overflow Answers (arXiv:2308.02312)

spare thorn

https://arxiv.org/abs/2308.02312

This paper has been making the rounds on Twitter over the past couple of days.

Summary of their findings:

  • ChatGPT’s answers are more often incorrect, significantly longer, and inconsistent with human answers half of the time. However, they are very comprehensive and successfully cover all aspects of the questions and the answers.
  • Many answers are incorrect because ChatGPT fails to understand the underlying context of the question being asked. By contrast, ChatGPT makes fewer factual errors than conceptual errors.
  • ChatGPT rarely makes syntax errors in code answers. The majority of the code errors come from applying wrong logic or using non-existent or wrong APIs, libraries, or functions.
  • The Popularity, Type, and Recency of the SO questions affect the correctness and quality of ChatGPT answers. Answers to more Popular and Older posts are less often incorrect and more verbose. Debugging answers are more inconsistent but less verbose, and Conceptual and How-to answers are the most verbose.
  • Compared to SO answers, ChatGPT answers are more formal, express more analytic thinking, show more effort toward achieving goals, and exhibit less negative emotion.
  • ChatGPT answers convey significantly more positive sentiment than SO answers.
  • Participants can correctly discern ChatGPT answers from SO answers 80.75% of the time. They look for factors such as formal language, structured writing, length of answers, or unusual errors to decide whether an answer is generated by ChatGPT.
  • Participants preferred SO answers over ChatGPT answers 65.18% of the time. They found SO answers more concise and useful. Reasons given for preferring SO include correctness, conciseness, and casual, spontaneous language.
  • Users overlook incorrect information in ChatGPT answers 39.34% of the time because of the comprehensive, well-articulated, and human-like insights in those answers.