
Testing ChatGPT’s understanding against my expertise.
By Cody Webb, FCAS CPCU (Actuary)
March 2023
Introduction:
Since its launch last November, OpenAI’s ChatGPT has mesmerized the internet community. Quickly becoming the fastest-growing web application of all time[1], the viral chatbot evokes visions of a future that may be as horrifying for some as it is inspiring to others. ChatGPT has demonstrated the technology’s potential to swiftly grant some users superhuman capabilities, while just as rapidly rendering many professionals’ years of training and education obsolete.
I have been fascinated with ChatGPT and have found a wide array of practical use cases, ranging from slogan suggestions for my actuarial consulting practice[2] to automatic instructions for homemade tamales, allowing me to avoid the tedium of scrolling through the advertisements that clutter the typical recipe website.
I have no doubt that AI tools can be extremely helpful additions to existing workflows. But, like many users, I’ve become aware of their limitations and caveats. We must avoid being seduced by ChatGPT’s dazzling speed and flawless grammar into an illusion of comprehension and discerning synthesis.
In my work as an actuary[3], many assignments begin with a review of existing studies, articles, or data sources. This step is often very useful, alerting me to potential pitfalls or providing valuable ideas for my own work, but it can be quite time-consuming to thoroughly review literature in search of a few kernels of insight. As a result, I often want a quick summary to determine whether an additional time investment in a source would be worthwhile. Since authors do not always provide abstracts, it is tempting to use AI to generate them for me, allowing me to quickly grasp the main thrust of each article and then decide whether to invest time in reading it.
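For readers who might want to automate this step rather than paste text into the chat window, the short sketch below shows one way it could be scripted against OpenAI’s Python library. It is a minimal illustration only: the model name, prompt wording, and file name are my own assumptions, and the summaries discussed in this article were produced through the ChatGPT web interface, not through code.

# Minimal sketch: asking an OpenAI chat model for an abstract of an article.
# Assumptions: the "openai" Python package (pre-1.0 interface) is installed,
# an API key is available, and the article text has been saved to a file.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; supplied by the reader

def summarize(article_text: str) -> str:
    """Request a short abstract of the supplied article text."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # illustrative choice of chat model
        messages=[
            {"role": "system",
             "content": "You write concise abstracts of insurance articles."},
            {"role": "user",
             "content": "Summarize the key questions and conclusions of this article:\n\n"
                        + article_text},
        ],
    )
    # The reply text lives in the first choice's message content
    return response["choices"][0]["message"]["content"]

# Hypothetical usage:
# with open("wildfire_article.txt") as f:
#     print(summarize(f.read()))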
Clearly, the usefulness of this exercise depends on the chatbot’s reading comprehension. Can it understand an article well enough to summarize its key questions and conclusions? Or, does it give responses that are vague, generalized, or confused? To test ChatGPT’s capability in this regard, I tasked it with summarizing a series of articles in which I consider myself an expert, because I was one of the authors.
Question 1: Can ChatGPT Be Used to Summarize Articles?
My first request was intended to gauge ChatGPT’s confidence in its own summarization and synthesis abilities. Here is the Q&A:

Question 1 Evaluation:
As described in the first paragraph, ChatGPT represents that it can perform this task, given its “deep understanding of the meaning behind words and sentences.” Given its confidence, it is my opinion that the evaluation that follows is fair game.
However, to the chatbot’s credit, the two paragraphs that follow provide careful and advisable caveats, so I cannot claim that it makes any inaccurate or unwarranted claims. Nevertheless, the question isn’t whether it offers sufficient caveats; the question is whether ChatGPT’s reading comprehension is good enough to make it a useful summarization tool.
To proceed with my evaluation, I asked ChatGPT to summarize three articles that I co-authored during my time as a consultant at the global actuarial consulting firm Milliman, Inc. All three articles focus on one of my areas of expertise – the relationship between natural catastrophes and insurance. All three are publicly available and were published before ChatGPT’s training data cutoff date of December 31, 2021.

Question 2: On Insurance Regulation for Wildfires
In 2018, my colleague Eric Xu and I wrote an article entitled “The California Wildfire Conundrum”[4], discussing how recent devastating wildfires in California, combined with an outdated regulatory methodology for insurance pricing, had prompted a withdrawal of insurance companies from the market, leaving many homeowners without coverage or struggling to obtain new coverage. ChatGPT’s summary of this article appears below:
Question 2 Evaluation:
While the individual elements of ChatGPT’s summary are generally not incorrect, the synthesis misses the main point of the article.
First, the article focuses on the regulatory aspects of wildfire risk and how rules unique to California have exacerbated an already challenging situation. ChatGPT’s summary does not capture this point at all. The only mention of government is a recommendation for a collaborative approach. Instead of government and regulatory challenges, the ChatGPT summary focuses on insurer challenges, which are part of the article but not its main point.
Second, the article discusses the challenges that Proposition 103, a California law, creates for insurance companies operating in the state. While ChatGPT’s summary does note the “challenges that insurance companies face”, it does not provide detail on the specific challenges identified in the article. Similarly, the main recommendation of the article is that the state consider allowing insurers to use simulation models to determine prices, but this point does not appear in ChatGPT’s summary.
Finally, ChatGPT’s summary states that “the authors note that wildfires have become more frequent and severe in recent years.” This claim, which may be true and appears in many other articles, does not appear in this article. A more accurate description of this article’s claim would be that “wildfires more severe than those that have historically occurred have struck repeatedly in recent years.” There is a difference between something that has occurred with higher-than-expected frequency and something that has become more frequent.
While this distinction is subtle, it demonstrates the key problem with this functionality: the user is unable to determine whether ChatGPT extracted the claims from the article itself or whether it injected generalities gleaned from elsewhere on the web.
Question 3: On Earthquake Insurance
In 2019, Ridgecrest, California was struck by a moderately sized earthquake. The quake was severe enough to be uncommon, but far from devastating. My Milliman colleagues David D. Evans and Eric Xu joined me to write an article[5] describing the dynamics of the earthquake insurance market. In particular, we lamented the low rate of earthquake insurance purchase in high-risk areas and speculated on the potential consequences of a bigger quake striking a more populated area.
Question 3 Evaluation:
Unlike ChatGPT’s previous summary, this one is largely incorrect. Not only does it miss the main point of the article, it also injects claims that do not appear in the article. In particular, the article describes how the 1994 Northridge earthquake caused disruption in the market and did not suggest that the 2019 event could cause any such disruption. The article postulates that many homeowners don’t buy insurance because of the currently high cost of premiums, but expresses no concern about increased premiums or decreased coverage resulting from the 2019 event. Finally, the summary mentions retrofitting and building codes, although the article being summarized contains no mention of these aspects of earthquake risk management.
Ultimately, this summary suffers from the same deficiencies as the prior one. While ChatGPT’s response appears coherent and contains many claims that may be true, those claims were not retrieved from the article it was asked to summarize; they were surely gleaned from unknown sources elsewhere in its vast training set. Thus, a reader of ChatGPT’s synthesis may be given the illusion of an accurate summary, but the summary itself is so vague and generalized that it could have been written about any article on the subject, rather than the specific article on which it was meant to focus.
Question 4: On Volcano Insurance
In the third article, John W. Rollins joined me to discuss volcano insurance following an unexpected series of lava flows in Hawaii[6]. Our main purpose was to explain which insurance policies do or don’t provide coverage for volcanoes and to detail some of the challenges and opportunities society faces in managing this risk.

Question 4 Evaluation:
Just as before, while the ChatGPT summary does contain elements that appear in the article, it misses the main points, especially the discussions of insurance policy coverages, risk modeling, and potential extinction-level events. The ChatGPT responses, while plausibly true, seem to be the AI’s answer to the question posed by the article’s title (“How does volcano risk differ from other natural disasters?”), rather than a summary of the article itself. As with the prior summaries, the ChatGPT response is so general that it could serve as a similarly mediocre summary of almost any article on the subject, rather than distilling the main points of this specific article. Finally, the summary contains some disappointing similarities with the other summaries, again noting the need for “a comprehensive and collaborative approach”. While risk management approaches should almost always be “comprehensive and collaborative”, this article does not emphasize that point. Instead, this appears to be another example where ChatGPT’s vague responses serve as filler, lending the response an appearance of completeness that does not actually exist.

Conclusion:
ChatGPT is an impressive tool that has gained widespread adoption in record time and offers many powerful use cases. However, it’s important to approach artificial intelligence tools with rational expectations, and to be discerning about which capabilities to rely on. In this article, I tested ChatGPT’s reading comprehension and summarization capabilities by asking it to distill the main points of three articles of which I was a co-author. In all three cases, ChatGPT correctly identified certain points of the articles, and the individual elements of each synopsis were usually more correct than incorrect. However, in all three cases, the summaries missed the main points of the articles and were so generalized as to render them indistinguishable from many similar pieces on the same subjects. Thus, if I were a reader, rather than a writer, of the articles, I would have gained very little value from the ChatGPT summaries compared with reading the articles myself.
Ultimately, in my opinion, ChatGPT’s reading comprehension is not useful or accurate enough to be relied upon. For now, I’ll continue to do my own literature reviews.
[1] The Guardian: ChatGPT reaches 100 million users two months after launch. https://www.theguardian.com/technology/2023/feb/02/chatgpt-100-million-users-open-ai-fastest-growing-app. Accessed February 25, 2023.
[2] See www.cuantoconsulting.com for a description of services and details on my background.
[3] For those unfamiliar with the actuarial profession: “Actuaries use mathematics, statistics, and financial theory to analyze the economic costs of risk and uncertainty.” Definition from BLS.gov.
[4] Milliman, Inc.: The California Wildfire Conundrum. By Cody Webb and Eric J. Xu. https://www.milliman.com/en/insight/the-california-wildfire-conundrum. Accessed February 25, 2023.
[5] Milliman, Inc.: The Ridgecrest earthquake: Will recent quakes shake up the California insurance market? By David D. Evans, Cody Webb, and Eric J. Xu. https://www.milliman.com/en/insight/the-ridgecrest-earthquake-will-recent-quakes-shake-up-the-california-insurance-market. Accessed February 25, 2023.
[6] Milliman, Inc.: How does volcano risk differ from other natural disasters? By John W. Rollins and Cody Webb. https://www.milliman.com/en/insight/how-does-volcano-risk-differ-from-other-natural-disasters. Accessed February 25, 2023.
Cody Webb is an experienced actuary and insurance specialist, holding Fellow of the Casualty Actuarial Society (FCAS) and Chartered Property Casualty Underwriter (CPCU) designations. His areas of expertise include pricing homeowners and auto insurance, property catastrophe research, predictive analytics, and the intersection of insurance with public policy. Webb has authored research reports for esteemed organizations like the Society of Actuaries, the Casualty Actuarial Society, and the American Property Casualty Insurance Association on subjects ranging from wildfire risk mitigation to catastrophe litigation. In 2022, he founded Cuanto Consulting after serving as a pricing and research actuary at AAA Insurance and a Principal and Consulting Actuary at Milliman, Inc.
Cody Webb is an expert in:
-Property & Casualty Actuarial Consulting
-Personal Lines (Homeowners and Auto) Ratemaking and Product Development
-Rate Reviews/Regulatory Filings
-California and Florida
-Competitive Intelligence/Market Analysis
-Predictive Analytics/Pricing Strategy
-Catastrophe Risk: Wildfire, Earthquake, Hurricane
-Research Reports/Analytical White Papers
https://www.cuantoconsulting.com/