Frances M. Green, Of Counsel in the Employment, Labor & Workforce Management practice, in the firm’s New York office, co-authored an article titled “What’s in Your AI?” with Raymond K. Sheh, PhD, and Karen Geappen, MCSSD, in the Society of Actuaries newsletter, Actuarial Intelligence Bulletin.
Following is an excerpt:
When you buy a box of cookies, the label tells you what’s inside: flour, sugar, butter, and eggs. There may be a “nut-free” assurance. If contamination is found, sources can be traced and batches recalled. When an AI system generates a summary of an important email or flags a fraudulent transaction, what “ingredients” went into that output? If something goes wrong, can the source be traced? As AI systems are increasingly integrated, explicitly or implicitly, into decision-making, we face a growing challenge in assessing, weighing, and managing the risks they entail.
The Challenges of Hidden Complexity

An actuary might use an AI system to draft regulatory reports, transforming tables of reserve calculations and assumption changes into clear explanations for state insurance departments. On a more sophisticated level, an actuary could deploy an AI agent to continuously monitor emerging mortality data across multiple databases, automatically flagging significant deviations from expected trends and assembling preliminary impact analyses on life insurance reserves. How can the risk that something important is omitted or misrepresented be managed?
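The monitoring agent described above might, at its core, compare observed mortality against expected mortality and flag material deviations. The following sketch is purely illustrative: the cohort names, counts, and the 10% threshold are assumptions for the example, not figures from the article.

```python
# Hypothetical sketch: flagging cohorts whose actual-to-expected (A/E)
# mortality ratio deviates from 1.0 by more than a chosen threshold.
# All names, counts, and the threshold are illustrative assumptions.

def flag_deviations(observed, expected, threshold=0.10):
    """Return (cohort, A/E ratio) pairs whose ratio deviates from 1.0
    by more than `threshold`."""
    flagged = []
    for cohort, actual in observed.items():
        exp = expected.get(cohort)
        if not exp:
            continue  # no expected basis for this cohort; skip
        ae_ratio = actual / exp
        if abs(ae_ratio - 1.0) > threshold:
            flagged.append((cohort, round(ae_ratio, 3)))
    return flagged

observed = {"male_60_69": 520, "female_60_69": 410, "male_70_79": 980}
expected = {"male_60_69": 500, "female_60_69": 350, "male_70_79": 1000}
print(flag_deviations(observed, expected))  # → [('female_60_69', 1.171)]
```

A real agent would wrap logic like this with credibility adjustments and statistical tests, but the core decision, a ratio against a tolerance, is where an omitted or misstated input would silently change the answer.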
AI systems have an intricate, dynamic web of dependencies beyond those of traditional software. The “ingredients” contributing to each output may span dozens of entities across disparate industries and contexts. Even if the AI system was trained on high-quality data sources such as historic reports and regulatory texts, the foundational models that enable it to understand language may be trained on web-scale datasets that include historically biased data, jokes, sarcasm, and incorrect homework answers posted to public forums.
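The cookie-label analogy suggests keeping a machine-readable “ingredient list” for each AI output, so that a questionable result can be traced back through its layered dependencies. This is a minimal sketch of one way such a record could look; the field names and example entries are assumptions for illustration, not a standard from the article.

```python
# Hypothetical sketch: an "ingredient label" recording the dependency
# chain behind one AI output. Field names and values are illustrative.

from dataclasses import dataclass, field

@dataclass
class Ingredient:
    name: str     # e.g. "historic filings"
    source: str   # who supplied it
    kind: str     # "dataset", "foundation_model", "fine_tune", ...

@dataclass
class AIOutputLabel:
    output_id: str
    ingredients: list = field(default_factory=list)

    def add(self, ingredient: Ingredient) -> None:
        self.ingredients.append(ingredient)

    def trace(self, kind: str) -> list:
        """List the sources behind one kind of ingredient."""
        return [i.source for i in self.ingredients if i.kind == kind]

label = AIOutputLabel("reserve-report-2024-Q1")
label.add(Ingredient("historic filings", "internal archive", "dataset"))
label.add(Ingredient("base language model", "web-scale corpus vendor",
                     "foundation_model"))
print(label.trace("dataset"))  # → ['internal archive']
```

Even a simple record like this makes the key point concrete: the curated, high-quality inputs and the opaque web-scale inputs sit side by side in the same dependency chain.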
Beyond data quality and ethics concerns, these web-scale datasets are often just uncurated links to data hosts, such as websites and social media platforms, making them susceptible to “data poisoning” attacks. In one such attack, an adversary registers an expired domain or social media account and replaces the previously valid data with their own before AI systems retrain on the dataset.
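One common mitigation for the link-swapping attack described above is to pin a cryptographic hash of each linked resource at curation time and refuse content that no longer matches. This is a minimal sketch of that idea; the URL and payloads are illustrative assumptions.

```python
# Hypothetical sketch: detecting silently replaced training data by
# pinning content hashes at curation time. URL and data are illustrative.

import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def find_tampered(fetched: dict, pinned: dict) -> list:
    """Return URLs whose freshly fetched content no longer matches the
    hash recorded when the dataset link was first curated."""
    return [url for url, data in fetched.items()
            if pinned.get(url) != sha256_hex(data)]

original = b"valid mortality tables"
pinned = {"https://example.org/data": sha256_hex(original)}

# An attacker re-registers the expired domain and swaps the content:
fetched = {"https://example.org/data": b"poisoned content"}
print(find_tampered(fetched, pinned))  # → ['https://example.org/data']
```

Hash pinning only detects changes; it cannot tell a malicious swap from a legitimate update, so flagged sources still need human review before retraining.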