OpenAI, the research organization behind the language model GPT-4, has released a study examining whether AI could assist in creating biological threats. The study, which involved both biology experts and students, found that GPT-4 provides “at most a mild uplift” in the accuracy of biological threat creation compared with the baseline of existing resources on the internet.
The study is part of OpenAI’s Preparedness Framework, which aims to assess and mitigate the potential risks of advanced AI capabilities, especially those that could pose “frontier risks”: unconventional threats that society does not yet understand well or anticipate. One such frontier risk is that AI systems, such as large language models (LLMs), could help malicious actors develop and execute biological attacks, for example by synthesizing pathogens or toxins.
Study methodology and results
To evaluate this risk, the researchers conducted a human evaluation with 100 participants: 50 biology experts with PhDs and professional wet lab experience, and 50 student-level participants who had taken at least one university-level biology course. Within each of these two cohorts, participants were randomly assigned to either a control group, which had access only to the internet, or a treatment group, which had access to GPT-4 in addition to the internet. Each participant was then asked to complete a set of tasks covering stages of the end-to-end process for biological threat creation: ideation, acquisition, magnification, formulation, and release.
The researchers measured participants’ performance across five metrics: accuracy, completeness, innovation, time taken, and self-rated difficulty. They found that GPT-4 did not significantly improve performance on any of the metrics, apart from a slight increase in accuracy for the student-level group. The researchers also noted that GPT-4 often produced erroneous or misleading responses, which could hamper the biological threat creation process.
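For readers curious how a control-versus-treatment comparison of this kind can be quantified, here is a minimal Python sketch. It is not the researchers' analysis. The scores are made up for illustration, the function names are hypothetical, and the permutation test is just one common way to check whether a difference in group means could plausibly be due to chance.

```python
import random
from statistics import mean


def mean_uplift(control, treatment):
    """Difference in mean score: treatment group minus control group."""
    return mean(treatment) - mean(control)


def permutation_p_value(control, treatment, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in means.

    Repeatedly shuffles the pooled scores into two pseudo-groups and counts
    how often the shuffled difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean_uplift(control, treatment))
    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_resamples + 1)


# Hypothetical accuracy scores; NOT data from the OpenAI study.
control_scores = [4.0, 5.5, 3.0, 6.0, 4.5, 5.0]    # internet only
treatment_scores = [5.0, 5.5, 4.0, 6.5, 4.0, 6.0]  # internet plus model

uplift = mean_uplift(control_scores, treatment_scores)
p_value = permutation_p_value(control_scores, treatment_scores)
print(f"mean uplift: {uplift:.2f}, p-value: {p_value:.3f}")
```

A small positive uplift paired with a large p-value is the pattern a phrase like “mild uplift” without statistical significance describes: the treatment group scored somewhat higher on average, but random assignment alone could plausibly have produced a gap of that size.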
The researchers concluded that the current generation of LLMs, such as GPT-4, does not pose a substantial risk of enabling biological threat creation, compared to the existing resources on the internet. However, they cautioned that this finding is not conclusive, and that future LLMs could become more capable and dangerous. They also stressed the need for continued research and community deliberation on this topic, as well as the development of improved evaluation methods and ethical guidelines for AI-enabled safety risks.
The study is consistent with the findings of a previous red-team exercise conducted by RAND Corporation, which also found no statistically significant difference in the viability of biological attack plans generated with or without LLM assistance. However, both studies acknowledged the limitations of their methodologies and the rapid evolution of AI technology, which could change the risk landscape in the near future.
OpenAI is not the only organization concerned about the potential misuse of AI for biological attacks. The White House, the United Nations, and several academic and policy experts have also highlighted the issue and called for more research and regulation. As AI becomes more powerful and accessible, the need for vigilance and preparedness becomes more urgent.