A recent study published in PNAS Nexus (citation below) has revealed the profound impact that large language models (LLMs) like ChatGPT are having on public knowledge-sharing platforms.
The research highlights a 25% reduction in activity on Stack Overflow, a popular Q&A site for programmers, within six months of ChatGPT’s release. The decline is measured relative to similar platforms whose users have limited access to ChatGPT, which suggests that the drop reflects a shift in how users seek information rather than a general trend.
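To make that kind of comparison concrete, here is a minimal sketch of one common approach: a difference-in-differences calculation on log activity. The function name and all post counts are hypothetical and invented purely for illustration; they are not taken from the study, and the study's actual statistical model may differ.

```python
import math


def relative_change_did(treated_before, treated_after,
                        control_before, control_after):
    """Return the treated platform's change in activity relative to a control.

    Works on log activity, so the result is a relative (percentage) change
    after removing whatever trend the control platform also experienced.
    """
    treated_shift = math.log(treated_after) - math.log(treated_before)
    control_shift = math.log(control_after) - math.log(control_before)
    return math.exp(treated_shift - control_shift) - 1.0


# Hypothetical weekly post counts (illustrative only, not study data).
change = relative_change_did(
    treated_before=1000, treated_after=720,  # platform exposed to ChatGPT
    control_before=500, control_after=480,   # comparison platform
)
print(f"Estimated relative change: {change:.1%}")
```

With these made-up numbers, the exposed platform keeps 72% of its activity while the comparison platform keeps 96%, so the estimated relative change is about -25%. Dividing out the control platform's trend is what separates a ChatGPT-specific effect from a decline that would have happened anyway.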
A Decline in Public Knowledge Sharing
Maria del Rio-Chanona, the study’s lead author and associate faculty member at the Complexity Science Hub (CSH), notes that the popularity of LLMs may have significant consequences. “LLMs are so powerful, have such a high value, and make a huge impact on the world. One begins to wonder about their future,” she explains.
She adds that, instead of posting questions on public platforms like Stack Overflow, users are turning to ChatGPT for answers. This reduces the public data available for learning, which raises concerns because these AI models rely on the very same public data for their training.
Implications for Future AI Models
The study’s co-authors, Nadzeya Laurentsyeva from Ludwig Maximilian University of Munich and Johannes Wachs from CSH and Corvinus University in Budapest, echo these concerns.
Wachs emphasizes the importance of Stack Overflow as a global resource, noting that it is crucial not only for human users but also for training AI models. With fewer human-generated posts, training future AI systems could become more challenging.
The study warns that training AI models with data generated by other AI systems might degrade performance, a process similar to “making a photocopy of a photocopy.”
Shift from Public to Private Knowledge Domains
The research also points to a broader shift from public knowledge-sharing platforms to private AI systems like ChatGPT.
As users increasingly seek answers from LLMs, valuable data is being transferred from public repositories to private entities, concentrating knowledge and economic power in the hands of a few.
This shift could further enhance the competitive advantage of early AI developers.
Impact Across All User Levels
The study revealed that the drop in Stack Overflow activity affected users of all levels, from beginners to experts.
Interestingly, while fewer questions were being asked, the overall quality of the remaining posts did not decline.
Additionally, programming languages like Python and JavaScript saw a sharper decline in user activity, suggesting that these commonly used languages are being increasingly discussed through ChatGPT rather than public forums.
About LLMs
A Large Language Model (LLM) is a type of artificial intelligence that can understand and generate human-like text. It learns from vast amounts of data, such as books and websites, to help answer questions, write, or assist with tasks using natural language.
Ten widely used LLMs as of September 2024 are:
- GPT-4 (OpenAI) – Globally popular for its versatility.
- Claude 3 (Anthropic) – Known for its focus on ethics and safety.
- PaLM 2 / Gemini (Google) – Powers Google’s AI products, including the Gemini chatbot (formerly Bard).
- ERNIE 3.0 Titan (Baidu, China) – A dominant model in China, designed for large-scale tasks.
- LLaMA 3 (Meta) – Popular for being open-source and widely used by developers.
- Qwen-1.5 (Alibaba, China) – Known for excelling in language and vision tasks.
- Falcon 180B (Technology Innovation Institute) – Strong performance in multiple benchmarks.
- BLOOM (BigScience) – Notable for its multilingual capabilities and open-source nature.
- BERT (Google) – Continues to influence search and natural language processing.
- Yi 34B (01.AI, China) – Performs well across both English and Chinese tasks.
Final Thoughts
The rise of ChatGPT and similar LLMs is reshaping how we share and access knowledge. While these AI tools offer immense value, their growing influence may pose long-term challenges for maintaining public knowledge platforms and ensuring that future AI systems are well-trained.
The study raises critical questions about the future of public knowledge-sharing and the role AI will play in shaping it.
Citation
R Maria del Rio-Chanona, Nadzeya Laurentsyeva, Johannes Wachs, Large language models reduce public knowledge sharing on online Q&A platforms, PNAS Nexus, Volume 3, Issue 9, September 2024, pgae400, https://doi.org/10.1093/pnasnexus/pgae400