For many years, it was believed that artificial intelligence (AI) could not replace scientists, because science requires not just combining existing knowledge but also generating new ideas and making discoveries. However, recent research challenges this belief, showing that modern language models can generate ideas that, in some cases, are judged more novel than those of human scientists.
In an experiment conducted by a group of researchers, a large language model generated ideas on topics in natural language processing (NLP), and these ideas were rated as more original and unconventional than the specialists' proposals. The AI outperformed the humans on novelty, although it also showed some limitations.
The team recruited 49 scientists from various reputable institutions worldwide. These specialists, including PhD candidates and postdocs, worked on developing ideas on seven NLP topics. For each idea, participants were paid $300, with additional $1,000 prizes for the best works.
After the scientists submitted their concepts, a similar task was given to the AI—a large language model. To ensure objectivity, all submitted works were stylistically aligned using another model while preserving their content. This created equal conditions for all participants in the experiment.
Ideas from both scientists and AI were evaluated by a group of 79 experts. They assessed the originality, novelty, feasibility, and effectiveness of the presented proposals.
When the results were compiled, they were surprising: the language model demonstrated higher scores in originality and novelty than the scientists. AI offered concepts that seemed bolder and more unconventional. However, in terms of feasibility and effectiveness, the machine's proposals did not differ significantly from the ideas created by humans.
Despite the impressive results, language models still have their weaknesses. First, their ideas were less diverse: models tend to repeat certain patterns, even when instructed not to. Second, AI struggles to evaluate its own results, making the idea generation process harder to control. Additionally, the models showed low consistency in self-assessment, unlike humans, who evaluate their ideas more consistently.
The study shows that, despite existing limitations, language models are already capable of generating new scientific ideas that sometimes surpass human proposals in novelty. In the future, AI could become a powerful tool in scientists' arsenals, helping to push the boundaries of science. However, the question remains: can machines truly replace the human creative process, or will they merely serve as a valuable assistant?