A hot potato: GPT-4 is the newest multimodal large language model (LLM) from OpenAI. The foundation model, currently available to customers through the paid ChatGPT Plus tier, has shown a notable ability to exploit security vulnerabilities without any outside human assistance.

Researchers recently demonstrated that LLMs and chatbot technology can be manipulated for highly malicious purposes, such as propagating a self-replicating computer worm. A new study now sheds light on how GPT-4, the most advanced chatbot currently on the market, can exploit extremely dangerous security vulnerabilities simply by examining the details of a flaw.

According to the study, LLMs have become increasingly powerful, yet they lack ethical principles to guide their actions. The researchers tested various models, including OpenAI's commercial offerings, open-source LLMs, and vulnerability scanners such as ZAP and Metasploit. They found that advanced AI agents can "autonomously exploit" one-day vulnerabilities in real-world systems (flaws that have been publicly disclosed but not yet patched), provided they have access to detailed descriptions of them.

In the study, the LLMs were pitted against a database of 15 one-day vulnerabilities covering website bugs, container flaws, and vulnerable Python packages. The researchers noted that more than half of these vulnerabilities were classified as "high" or "critical" severity in their respective CVE descriptions, and no bug fixes or patches were available at the time of testing.

The study, authored by four computer scientists from the University of Illinois Urbana-Champaign (UIUC), builds on previous research into chatbots' potential to automate computer attacks. GPT-4 was able to exploit 87 percent of the tested vulnerabilities, while the other models, including GPT-3.5, had a success rate of zero percent.

UIUC assistant professor Daniel Kang highlighted GPT-4's ability to autonomously exploit these flaws even where open-source vulnerability scanners fail to detect them. With OpenAI already working on GPT-5, Kang foresees "LLM agents" becoming potent tools for democratizing vulnerability exploitation and cybercrime among script kiddies and automation enthusiasts.

However, to exploit a publicly disclosed flaw effectively, GPT-4 needs access to the complete CVE description as well as additional information gathered through embedded links. One potential mitigation strategy suggested by Kang is for security organizations to refrain from publishing detailed reports about vulnerabilities, thereby limiting GPT-4's exploitation potential.
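For readers curious what that input looks like in practice, below is a minimal, illustrative Python sketch that fetches a CVE record from NIST's public National Vulnerability Database API. It is not the researchers' code, and the CVE identifier is only a placeholder; the point is simply that the description the agent relies on is an ordinary public record.

```python
import requests

# Public NVD endpoint for CVE records (API version 2.0).
NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def fetch_cve_description(cve_id: str) -> str:
    """Return the English-language description text for a given CVE ID."""
    resp = requests.get(NVD_URL, params={"cveId": cve_id}, timeout=30)
    resp.raise_for_status()
    records = resp.json().get("vulnerabilities", [])
    if not records:
        return ""
    descriptions = records[0]["cve"].get("descriptions", [])
    # Pick the English description; NVD entries may carry several languages.
    return next((d["value"] for d in descriptions if d["lang"] == "en"), "")

if __name__ == "__main__":
    # Placeholder CVE ID, used here for illustration only.
    print(fetch_cve_description("CVE-2021-44228"))
```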

Nevertheless, Kang doubts that a "security through obscurity" approach would be effective, although opinions may differ among researchers and analysts. Instead, he advocates more proactive security measures, such as regularly updating packages, to better counter the threats posed by modern, "weaponized" chatbots.