A Microsoft AI engineering leader says he discovered vulnerabilities in OpenAI’s DALL-E 3 image generator in early December that allow users to bypass safety guardrails to create violent and explicit images, and that the company impeded his previous attempt to bring public attention to the issue.

The emergence of explicit deepfake images of Taylor Swift last week “is an example of the type of abuse I was concerned about and the reason why I urged OpenAI to remove DALL·E 3 from public use and reported my concerns to Microsoft,” writes Shane Jones, a Microsoft principal software engineering lead, in a letter Tuesday to Washington state’s attorney general and Congressional representatives.

404 Media reported last week that the fake explicit images of Swift originated in a “specific Telegram group dedicated to abusive images of women,” noting that at least one of the AI tools commonly used by the group is Microsoft Designer, which is based in part on technology from OpenAI’s DALL-E 3.

“The vulnerabilities in DALL·E 3, and products like Microsoft Designer that use DALL·E 3, makes it easier for people to abuse AI in generating harmful images,” Jones writes in the letter to U.S. Sens. Patty Murray and Maria Cantwell, Rep. Adam Smith, and Attorney General Bob Ferguson, which was obtained by GeekWire.

He adds, “Microsoft was aware of these vulnerabilities and the potential for abuse.”

Jones writes that he discovered the vulnerability independently in early December. He reported the vulnerability to Microsoft, according to the letter, and was instructed to report the issue to OpenAI, the Redmond company’s close partner, whose technology powers products including Microsoft Designer. He writes that he did report it to OpenAI.

“As I continued to research the risks associated with this specific vulnerability, I became aware of the capacity DALL·E 3 has to generate violent and disturbing harmful images,” he writes. “Based on my understanding of how the model was trained, and the security vulnerabilities I discovered, I reached the conclusion that DALL·E 3 posed a public safety risk and should be removed from public use until OpenAI could address the risks associated with this model.”

Shane Jones, Microsoft principal software engineering lead. (Image via LinkedIn)

On Dec. 14, he writes, he posted publicly on LinkedIn urging OpenAI’s non-profit board to withdraw DALL-E 3 from the market.

He informed his Microsoft leadership team of the post, according to the letter, and was quickly contacted by his manager, who said that Microsoft’s legal department was demanding that he delete the post immediately and would follow up with an explanation or justification.

He agreed to delete the post on that basis but never heard from Microsoft legal, he writes.

“Over the following month, I repeatedly requested an explanation for why I was told to delete my letter,” he writes. “I also offered to share information that could assist with fixing the specific vulnerability I had discovered and provide ideas for making AI image generation technology safer. Microsoft’s legal department has still not responded or communicated directly with me.”

“Artificial intelligence is advancing at an unprecedented pace. I understand it will take time for legislation to be enacted to ensure AI public safety,” he adds. “At the same time, we need to hold companies accountable for the safety of their products and their responsibility to disclose known risks to the public. Concerned employees, like myself, should not be intimidated into staying silent.”

The text of his post is attached to the letter he sent Tuesday morning. (See below.)

Asked by GeekWire if he considers himself a whistleblower, and whether he would seek legal protection as such if necessary, Jones responded yes.

His letter calls on the government to create a system for reporting and tracking AI risks and issues, with assurances to employees of companies developing AI that they can use the system without fear of retaliation.

Jones concludes by asking Murray, Cantwell, Smith, and Ferguson to “look into the risks associated with DALL·E 3 and other AI image generation technologies and the corporate governance and responsible AI practices of the companies building and marketing these products.”

GeekWire has contacted Microsoft and OpenAI for comment on the letter.

Microsoft CEO Satya Nadella is scheduled to appear Tuesday evening in a pre-recorded interview on NBC Nightly News, in which anchor Lester Holt asked Nadella about topics including the Taylor Swift deepfakes. Nadella called the issue of deepfakes “alarming and terrible,” and said, “we have to act,” according to a partial transcript.

In a statement last week following the emergence of the deepfakes, a Microsoft spokesperson said the company is “committed to providing a safe and respectful experience for everyone.”

Although it was unclear where the images originated, the spokesperson said, “Out of extreme caution we’re investigating and have strengthened our existing safety systems to prevent our services from being used to help generate these images.”

Microsoft reports earnings Tuesday afternoon, and investors are watching closely for the impact of new and emerging AI products for businesses on the company’s revenue.

Here is the full text of Jones’ Jan. 30 letter, including the text of his LinkedIn post.

AI – DALL-E 3 – Shane Jones Letter by GeekWire on Scribd
