Using third-party generative AI services requires transmitting user inputs to those providers for processing. When your vendors use such services, the AI providers become fourth parties handling your data, which puts them squarely within the scope of your organization's Vendor Risk Management program. In other words, when third parties share your data with fourth-party service providers, you need to know the data is handled in accordance with your governance standards. The current state of AI regulation and adoption makes it all the more urgent to understand when those fourth-party vendors are providing AI services.
Risk appetites for AI-enabled services vary widely between companies. We have heard from organizations that require no exposure to AI services and others that want to ensure they are realizing AI-enabled productivity gains. These divergent risk appetites mirror the high variance in how regulations apply by geography and industry.
The EU's AI Act has been in force since August 2024. The US has a patchwork of existing and emerging state laws, with no federal regulation in sight. Australia's policy approach is still taking shape. And whereas existing data privacy laws focus on data types, AI regulation focuses on decision types, bringing regulatory scrutiny to industries that previously saw little of it.
As if a rapidly evolving risk environment weren’t challenging enough, AI adoption is also proliferating rapidly. As of the end of 2024, ChatGPT claimed over 300 million weekly active users and 1.3 million developer accounts.
In this report, we show that at least 30% of companies are using AI services to process user data. Hence the urgency: regulatory requirements for AI are coming to a head just as AI usage in the digital supply chain is surging, creating a ticking time bomb for those who ignore it.
This report provides specific examples of how AI can be detected in the supply chain, the frequency of different types of use, and how to incorporate it into your Vendor Risk Management program.
Collecting evidence of AI usage
The UpGuard platform collects evidence by scanning public websites for technical information and by gathering security documentation from vendors. This report uses data collected by scanning websites for third-party code and crawling websites for public data subprocessor pages. We used data collected for the 250 most commonly monitored vendors in the UpGuard platform to best represent the impacts these issues can have on the real-world supply chain.
AI vendors visible via external scanning
Websites reveal part of their software supply chain by using scripts hosted on third-party vendor domains. Those domains can be associated with vendors, which can, in turn, be classified as AI vendors based on their services.
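To make that concrete, here is a minimal Python sketch of the approach: fetch a page, collect the domains of externally hosted scripts, and match them against a list of known AI vendor domains. This is not UpGuard's scanner; the `AI_VENDOR_DOMAINS` mapping and the domains in it are illustrative stand-ins, and a production classifier would rely on a much larger curated vendor database.

```python
from urllib.parse import urlparse

import requests
from bs4 import BeautifulSoup

# Illustrative mapping of script-hosting domains to vendors with AI-powered
# products; a real classifier would use a curated vendor database.
AI_VENDOR_DOMAINS = {
    "widget.intercom.io": "Intercom",
    "js.driftt.com": "Drift",
}

def find_ai_scripts(url: str) -> list[tuple[str, str]]:
    """Return (script domain, vendor) pairs for third-party AI scripts on a page."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    hits = []
    for tag in soup.find_all("script", src=True):
        # Relative (first-party) script paths produce an empty netloc and
        # never match the vendor mapping.
        domain = urlparse(tag["src"]).netloc.lower()
        if domain in AI_VENDOR_DOMAINS:
            hits.append((domain, AI_VENDOR_DOMAINS[domain]))
    return hits

print(find_ai_scripts("https://example.com"))
```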
Across the 250 most used vendors, 36 (14%) had websites configured to run code from a third-party vendor providing AI services.
Embedding scripts on a website has long been a quick way to deploy marketing analytics tools. Many of the AI vendors detected this way are indeed analytics tools, but the hallmark capability of generative AI is right there in the name of ChatGPT: chat.
Chat agents are the most common type of AI capability deployed via third-party code. The function of a chat agent also indicates the kind of data likely being passed to the AI fourth party: sales and support inquiries.

AI vendors detected using subprocessor disclosures
When your company's third-party providers share personal data with fourth parties, those fourth parties become what GDPR calls "data subprocessors." Cloud hosting is a typical example: you give your payroll vendor your employees' information, the vendor stores it in a database hosted on AWS, and AWS is now a subprocessor of your data. GDPR-compliant companies must disclose their data subprocessors to their customers.
Many companies voluntarily make their data subprocessor lists public. Out of the 250 companies in this survey, UpGuard researchers identified public subprocessor pages for 147. The other companies almost certainly have data subprocessors but elect to disclose that information only on demand.
There is no standard structure for data subprocessor pages, creating challenges for automated data collection. The most typical implementation of a subprocessor page is an HTML table, though the number and labeling of columns in that table vary between companies.
The information can also be published in PDFs or other embedded documents that, again, have arbitrary structures, and large companies might maintain different subprocessor lists for different products and regions. These factors make confident automated analysis of subprocessor pages possible for most, but not all, instances: of the 147 pages, 119 could be analyzed programmatically.
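As an illustration of what that automation can look like, the sketch below pulls every HTML table from a hypothetical subprocessor page and flattens the cell text for downstream matching. The URL is a placeholder, and this is a simplification rather than UpGuard's actual pipeline; pages that embed PDFs or use non-tabular layouts would need separate handling.

```python
import pandas as pd

# Placeholder URL for a vendor's subprocessor disclosure page.
URL = "https://vendor.example.com/legal/subprocessors"

# read_html returns one DataFrame per <table> element on the page
# (an HTML parser such as lxml must be installed).
tables = pd.read_html(URL)

# Flatten every cell into a single string for keyword matching.
page_text = " ".join(
    str(cell) for table in tables for cell in table.to_numpy().ravel()
)
```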
Because "OpenAI" and "Anthropic" are distinctive vendor names, subprocessor pages could be confidently searched to determine whether those companies were listed as subprocessors. "Gemini" and "Vertex" were used to identify Google AI services.
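Continuing the sketch above, that search can be as simple as a case-insensitive keyword match over the extracted page text; the keyword lists below restate the vendor names described in this section.

```python
# Keywords per AI vendor, mirroring the search terms described above.
AI_SUBPROCESSOR_KEYWORDS = {
    "OpenAI": ["openai"],
    "Anthropic": ["anthropic"],
    "Google AI": ["gemini", "vertex"],
}

def detect_ai_subprocessors(page_text: str) -> set[str]:
    """Return the AI vendors whose names appear in the page text."""
    text = page_text.lower()
    return {
        vendor
        for vendor, keywords in AI_SUBPROCESSOR_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    }

# Example: a flattened subprocessor table naming AWS, OpenAI, and Datadog.
print(detect_ai_subprocessors("AWS; OpenAI, L.L.C.; Datadog"))  # {'OpenAI'}
```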
Manual verification of the automated results surfaced no false positives. We omitted Microsoft AI services because they appeared relatively rarely in our exploration and were often identified as OpenAI models delivered through Azure.
Of the 147 companies with subprocessor pages, 36% listed OpenAI as a data subprocessor, 10% listed Google Gemini or Vertex, and 9% listed Anthropic.

Whereas the chatbots embedded in internet-facing webpages are distributed amongst many small companies, the AI model services used by backend systems to process production data are highly concentrated in a handful of vendors, most notably OpenAI.
Over one-third of the companies analyzed (53 out of 147) are processing personal data with OpenAI. That is a conservative measure of OpenAI usage; more companies are likely using OpenAI in ways that do not require disclosing it as a data subprocessor.
Interestingly, OpenAI also allows users to publish custom GPTs in the GPT Store. To prevent brand impersonation, companies must add a DNS record to their domain to prove they own it. DNS records are another public information source that UpGuard scans, making it easy to determine which companies have OpenAI domain verification records.
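As a sketch of that check, the function below queries a domain's TXT records with dnspython. The token format shown ("openai-domain-verification=...") is an assumption based on the format OpenAI has used for publisher domain verification; confirm the current format against OpenAI's documentation before depending on it.

```python
import dns.exception
import dns.resolver

def has_openai_verification(domain: str) -> bool:
    """Check a domain's TXT records for an OpenAI domain-verification token."""
    try:
        answers = dns.resolver.resolve(domain, "TXT")
    except dns.exception.DNSException:
        # Covers NXDOMAIN, missing TXT records, timeouts, etc.
        return False
    for rdata in answers:
        # TXT record data arrives as one or more byte strings.
        value = b"".join(rdata.strings).decode("utf-8", errors="replace")
        # Assumed token prefix; verify against current OpenAI docs.
        if value.startswith("openai-domain-verification="):
            return True
    return False

print(has_openai_verification("example.com"))
```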
There is no meaningful correlation between the two kinds of OpenAI usage: 53 companies use OpenAI as a subprocessor, 36 have verified their domain for publishing custom GPTs, and only 10 companies have done both.

How to use this in your VRM program
Hearing that your vendors are most likely passing your company's personal data to OpenAI sounds alarming, and if you don't incorporate that information into your Vendor Risk Management program, it should be. But the point of data subprocessor disclosures is that knowing who processes your data lets you accurately assess whether they pose a risk.
In the case of OpenAI, which UpGuard itself uses, the terms of service for the Enterprise and Platform plans ensure that data submitted to OpenAI is not used to train models. Add in OpenAI's other documented security measures for keeping that data confidential, and the risk is comparable to that of cloud hosting providers.
By documenting OpenAI as a data subprocessor and collecting evidence from OpenAI that it does not use customer data for training, we can effectively treat the risk it poses as a data subprocessor. You should be able to follow that process for all your vendors: they should disclose their AI third parties and whether those services train on input data.
Ultimately, documented data subprocessors should be a layup for vendor management. If a fourth party like OpenAI is listed as a subprocessor, everyone involved—you, your vendor, and OpenAI—knows their privacy policies are being assessed.
This information can be collected by visiting each of your vendors' websites and locating their subprocessor pages, or by using a vendor risk automation platform like UpGuard that centralizes evidence collection. Shadow AI requires multiple detection strategies, but automated detection of fourth parties via website code scanning provides an easy place to start.