So you want your company to begin using artificial intelligence. Before rushing to adopt AI, consider the potential risks, including legal issues around data protection, intellectual property, and liability. Through a strategic risk-management framework, businesses can mitigate major compliance risks and uphold customer trust while taking advantage of recent AI advancements.
Check your training data
First, assess whether the data used to train your AI model complies with applicable laws such as India’s Digital Personal Data Protection Act, 2023, and the European Union’s General Data Protection Regulation, which address data ownership, consent, and compliance. A timely legal review that determines whether collected data may be used lawfully for machine-learning purposes can prevent regulatory and legal headaches later.
That legal assessment involves a deep dive into your company’s existing terms of service, privacy policy statements, and other customer-facing contractual terms to determine what permissions, if any, have been obtained from a customer or user. The next step is to determine whether such permissions will suffice for training an AI model. If not, additional customer notification or consent likely will be required.
Different types of data bring different issues of consent and liability. For example, consider whether your data is personally identifiable information, synthetic content (typically generated by another AI system), or someone else’s intellectual property. Data minimization—using only what you need—is a good principle to apply at this stage.
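In practice, data minimization means dropping every field the model doesn’t need before records ever reach the training pipeline. The short Python sketch below illustrates the idea; the column names and the pandas-based approach are assumptions for illustration, not a prescription.

```python
# Minimal data-minimization sketch (column names are hypothetical).
# Keep only the fields the model actually needs; direct identifiers
# never reach the training set.
import pandas as pd

FEATURES_NEEDED = ["purchase_amount", "product_category", "region"]  # assumed model inputs

def minimize(records: pd.DataFrame) -> pd.DataFrame:
    """Return a copy containing only the columns required for training."""
    kept = [c for c in FEATURES_NEEDED if c in records.columns]
    return records[kept].copy()

raw = pd.DataFrame([
    {"name": "A. User", "email": "a@example.com", "ip_address": "203.0.113.7",
     "purchase_amount": 42.0, "product_category": "books", "region": "EU"},
])
print(minimize(raw))  # name, email, and IP address are discarded before training
```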
Pay careful attention to how you obtained the data. OpenAI has been sued for scraping personal data to train its algorithms. And, as explained below, data scraping can raise questions of copyright infringement. Scraping can also violate a website’s terms of service, exposing the scraper to civil claims in the United States. U.S. security-focused laws such as the Computer Fraud and Abuse Act arguably could be applied extraterritorially to prosecute foreign entities that have allegedly stolen data from secure systems.
Watch for intellectual property issues
The New York Times recently sued OpenAI for using the newspaper’s content for training purposes, basing its arguments on claims of copyright infringement and trademark dilution. The lawsuit holds an important lesson for all companies dealing in AI development: Be careful about using copyrighted content for training models, particularly when it’s feasible to license such content from the owner. Apple and other companies have considered licensing options, which likely will emerge as the best way to mitigate potential copyright infringement claims.
To reduce concerns about copyright, Microsoft has offered to stand behind the outputs of its AI assistants, promising to defend customers against any potential copyright infringement claims. Such intellectual property protections could become the industry standard.
Companies also need to consider the potential for inadvertent leakage of confidential and trade-secret information by an AI product. Companies that allow employees to use tools such as ChatGPT (for text) and GitHub Copilot (for code generation) internally should note that such generative AI services often take user prompts and outputs as training data to further improve their models. Fortunately, generative AI companies typically offer more secure services and the ability to opt out of model training.
Look out for hallucinations
Copyright infringement claims and data-protection issues also emerge when generative AI models spit out training data as their outputs.
That is often a result of “overfitting” models, essentially a training flaw whereby the model memorizes specific training data instead of learning general rules about how to respond to prompts. The memorization can cause the AI model to regurgitate training data as output—which could be a disaster from a copyright or data-protection perspective.
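One crude but common safeguard is to screen model outputs for long verbatim overlaps with the training corpus before release. The sketch below assumes the training text is available as plain strings and flags any output that shares a 12-word sequence with a training document; the threshold and the in-memory index are illustrative assumptions.

```python
# Illustrative check for verbatim regurgitation of training data.
# Flags an output that shares a long word n-gram with any training document.
# The 12-word threshold and in-memory index are assumptions for this sketch.

def ngrams(text: str, n: int = 12) -> set[tuple[str, ...]]:
    words = text.split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def build_index(training_docs: list[str], n: int = 12) -> set[tuple[str, ...]]:
    index: set[tuple[str, ...]] = set()
    for doc in training_docs:
        index |= ngrams(doc, n)
    return index

def looks_memorized(output: str, index: set[tuple[str, ...]], n: int = 12) -> bool:
    """True if the output reproduces a long span of some training document."""
    return bool(ngrams(output, n) & index)

corpus = ["the quick brown fox jumps over the lazy dog near the quiet river bank at dawn"]
idx = build_index(corpus)
print(looks_memorized("fox jumps over the lazy dog near the quiet river bank at dawn today", idx))
```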
Memorization also can lead to inaccuracies in the output, sometimes referred to as “hallucinations.” In one notable case, a New York Times reporter was experimenting with Microsoft’s Bing chatbot, which identified itself as Sydney, when the bot professed its love for the reporter. The viral incident prompted a discussion about the need to monitor how such tools are deployed, especially by younger users, who are more likely to attribute human characteristics to AI.
Hallucinations also have caused problems in professional domains. Two lawyers were sanctioned, for example, after submitting a legal brief written by ChatGPT that cited nonexistent case law.
Such hallucinations demonstrate why companies need to test and validate AI products to avoid not only legal risks but also reputational harm. Many companies have devoted engineering resources to developing content filters that improve accuracy and reduce the likelihood of output that’s offensive, abusive, inappropriate, or defamatory.
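A content filter is, in essence, a post-processing layer that screens generated text before it reaches users. Production systems typically rely on trained moderation classifiers; the keyword-based sketch below is only meant to show where such a check sits in the pipeline, and its blocklist and categories are placeholders.

```python
# Minimal output-filter sketch: screen generated text before it is shown to users.
# Real deployments usually use trained moderation classifiers; this regex
# blocklist and its categories are placeholders for illustration only.
import re

BLOCKLIST = {
    "abusive": re.compile(r"\b(placeholder_slur|placeholder_threat)\b", re.IGNORECASE),
    "unverified_claim": re.compile(r"\b(was convicted of|is guilty of)\b", re.IGNORECASE),
}

def screen(output: str) -> tuple[bool, list[str]]:
    """Return (allowed, triggered categories) for a single model output."""
    hits = [name for name, pattern in BLOCKLIST.items() if pattern.search(output)]
    return (not hits, hits)

allowed, reasons = screen("The executive was convicted of fraud, the model claims.")
print(allowed, reasons)  # False ['unverified_claim'] -> route to human review
```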
Keep track of your data
If you have access to personally identifiable user data, it’s vital that you handle the data securely. You also must guarantee that you can delete the data and prevent its use for machine-learning purposes in response to user requests or instructions from regulators or courts. Maintaining data provenance and ensuring robust infrastructure are paramount for all AI engineering teams.
Those technical requirements are connected to legal risk. In the United States, regulators including the Federal Trade Commission have relied on algorithmic disgorgement, a punitive measure under which a company that has run afoul of applicable laws while collecting training data must delete not only the data but also the models trained on the tainted data. Keeping accurate records of which datasets were used to train which models is advisable.
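A machine-readable record of which datasets fed which models makes both deletion requests and a disgorgement order tractable, because you can immediately identify every model touched by a tainted dataset. The sketch below shows one possible shape for such a lineage record; the field names and in-memory store are assumptions.

```python
# Illustrative dataset-to-model lineage record. Field names and the in-memory
# store are assumptions; the point is being able to answer "which models were
# trained on this dataset?" when a deletion or disgorgement order arrives.
from dataclasses import dataclass

@dataclass
class TrainingRun:
    model_id: str
    dataset_ids: list[str]
    consent_basis: str          # e.g. "contract", "consent", "licensed"
    trained_on: str             # ISO date, kept for audit purposes

runs: list[TrainingRun] = [
    TrainingRun("recommender-v3", ["clickstream-2023", "catalog-2024"], "contract", "2024-02-01"),
    TrainingRun("support-bot-v1", ["tickets-2023"], "consent", "2024-03-15"),
]

def models_affected_by(dataset_id: str) -> list[str]:
    """Models that must be retrained or deleted if this dataset is tainted."""
    return [r.model_id for r in runs if dataset_id in r.dataset_ids]

print(models_affected_by("clickstream-2023"))  # ['recommender-v3']
```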
Beware of bias in AI algorithms
One major AI challenge is the potential for harmful bias, which can be ingrained within algorithms. When biases are not mitigated before launching the product, applications can perpetuate or even worsen existing discrimination.
Predictive policing algorithms employed by U.S. law enforcement, for example, have been shown to reinforce prevailing biases. Black and Latino communities wind up disproportionately targeted.
When used for loan approvals or job recruitment, biased algorithms can lead to discriminatory outcomes.
Experts and policymakers say it’s important that companies strive for fairness in AI. Algorithmic bias can have a tangible, problematic impact on civil liberties and human rights.
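One widely used screening metric for this kind of bias is the disparate impact ratio: a group’s selection rate divided by the most favored group’s rate, with values below roughly 0.8 (the U.S. “four-fifths” rule of thumb) treated as a red flag. A minimal sketch with made-up numbers:

```python
# Disparate impact ratio sketch with made-up numbers. A ratio below ~0.8
# (the U.S. "four-fifths" rule of thumb) is commonly treated as a warning sign
# that a model's approvals may be skewed against a group.
approvals = {          # hypothetical loan-approval outcomes per group
    "group_a": {"approved": 90, "total": 200},
    "group_b": {"approved": 54, "total": 200},
}

rates = {g: v["approved"] / v["total"] for g, v in approvals.items()}
reference = max(rates.values())

for group, rate in rates.items():
    ratio = rate / reference
    flag = " <- investigate" if ratio < 0.8 else ""
    print(f"{group}: selection rate {rate:.2f}, impact ratio {ratio:.2f}{flag}")
```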
Be transparent
Many companies have established ethics review boards to ensure their business practices are aligned with principles of transparency and accountability. Best practices include being transparent about data use and being accurate in your statements to customers about the abilities of AI products.
U.S. regulators frown on companies that overpromise AI capabilities in their marketing materials. Regulators also have warned companies against quietly and unilaterally changing the data-licensing terms in their contracts as a way to expand the scope of their access to customer data.
Take a global, risk-based approach
Many experts on AI governance recommend taking a risk-based approach to AI development. The strategy involves mapping the AI projects at your company, scoring them on a risk scale, and implementing mitigation actions. Many companies incorporate risk assessments into existing processes that measure privacy-based impacts of proposed features.
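Mapping projects and scoring them on a risk scale amounts to keeping a risk register. The sketch below shows one way a team might do that; the scoring dimensions, weights, and 1-to-5 scale are illustrative assumptions rather than any standard.

```python
# Illustrative AI risk register: score each project on a few dimensions and
# rank them so mitigation effort goes to the riskiest first. Dimensions,
# weights, and the 1-5 scale are assumptions for this sketch.
projects = {
    "support-chatbot":    {"data_sensitivity": 4, "user_impact": 3, "autonomy": 2},
    "resume-screener":    {"data_sensitivity": 5, "user_impact": 5, "autonomy": 4},
    "warehouse-forecast": {"data_sensitivity": 1, "user_impact": 2, "autonomy": 1},
}

WEIGHTS = {"data_sensitivity": 0.4, "user_impact": 0.4, "autonomy": 0.2}

def risk_score(dimensions: dict[str, int]) -> float:
    """Weighted sum of the project's scores on each risk dimension."""
    return sum(WEIGHTS[d] * v for d, v in dimensions.items())

for name, dims in sorted(projects.items(), key=lambda kv: -risk_score(kv[1])):
    print(f"{name}: {risk_score(dims):.1f}")  # riskiest projects print first
```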
When establishing AI policies, it’s important to ensure the rules and guidelines you’re considering will be adequate to mitigate risk in a global manner, taking into account the latest international laws.
A regionalized approach to AI governance might be expensive and error-prone. The European Union’s recently passed Artificial Intelligence Act includes a detailed set of requirements for companies developing and using AI, and similar laws are likely to emerge soon in Asia.
Keep up the legal and ethical reviews
Legal and ethical reviews are important throughout the life cycle of an AI product—training a model, testing and developing it, launching it, and even afterward. Companies should proactively think about how to implement AI to remove inefficiencies while also preserving the confidentiality of business and customer data.
For many people, AI is new terrain. Companies should invest in training programs to help their workforce understand how best to benefit from the new tools and to use them to propel their business.