UK spy agency: Don't feed LLMs with sensitive corporate data

The UK government's spy agency is warning corporations of the risks of feeding sensitive data into public large language models (LLMs), including ChatGPT, saying they could be opening themselves up to a world of pain unless the risks are correctly managed.

Google, Microsoft and others are currently shoe-horning LLMs - the latest craze in tech - into their enterprise products, and Meta's LLaMA recently leaked online. The models are impressive, but their responses can be flawed, and now Government Communications Headquarters (GCHQ) wants to highlight the security angle.

Authors David C, a technical director for Platform Research, and Paul J, a technical director for Data Science Research, ask: "Do loose prompts sink ships?" In some cases, they conclude, yes.

The common worry is that an LLM may "learn" from users' prompts and serve that information up to others querying it on similar matters.

"There is some cause for concern here, but not for the reason many consider. Currently, LLMs are trained, and then the resulting model is queried. An LLM does not (as of writing) automatically add information from queries to its model for others to query. That is, including information in a query will not result in that data being incorporated into the LLM."

The query will be visible to the LLM provider (OpenAI, in the case of ChatGPT), and will be stored and "almost certainly be used for developing the LLM service or model at some point. This could mean that the LLM provider (or its partners/contractors) are able to read queries, and may incorporate them in some way into future versions. As such, the terms of use and privacy policy need to be thoroughly understood before asking sensitive questions," the GCHQ duo write.

Examples of sensitive data - quite apt in the current climate - could include a CEO found to be asking "how best to lay off an employee" or a person asking specific health or relationship questions, the agency says. We at The Reg would be worried - on many levels - if an exec was asking an LLM about redundancies.

The pair add: "Another risk, which increases as more organizations produce LLMs, is that queries stored online may be hacked, leaked, or more likely accidentally made publicly accessible. This could include potentially user-identifiable information. A further risk is that the operator of the LLM is later acquired by an organization with a different approach to privacy than was true when data was entered by users."

GCHQ is far from the first to highlight the potential for a security foul-up. Internal Slack messages from a senior general counsel at Amazon, seen by Insider, warned staff not to share corporate information with LLMs, saying there were instances of ChatGPT responses that appear similar to Amazon's own internal data.

"This is important because your inputs may be used as training data for a further iteration of ChatGPT, and we wouldn't want its output to include or resemble our confidential information," she said, adding it already had.

Research by Cyberhaven Labs this month indicates sensitive data accounts for 11 percent of the information employees enter into ChatGPT. It analyzed ChatGPT usage for 1.6 million workers at companies that use its data security service, and found 5.6 percent had tried the chatbot at least once at work and 11 percent had input sensitive data.

JP Morgan, Microsoft and Walmart are among other corporations to warn their employees of the potential perils.

Back at GCHQ, Messrs David C and Paul J advise businesses not to input data they wouldn't want made public, to thoroughly understand the terms of use and privacy policies of any cloud-provided LLM before using it, or to use a self-hosted LLM instead.
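That "don't input what you wouldn't want public" advice can be partially automated before a prompt ever leaves the building. As a minimal sketch - the `scrub` helper and its regex patterns are our own illustration, not anything GCHQ or any vendor prescribes, and real deployments would need far broader coverage - a pre-submission filter might look like this:

```python
import re

# Illustrative patterns only: production PII detection needs many more
# rules (names, addresses, internal project codenames, and so on).
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "uk_phone": re.compile(r"\b0\d{2,4} ?\d{3,4} ?\d{3,4}\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def scrub(prompt: str) -> str:
    """Replace likely-sensitive substrings with placeholder tokens
    before the prompt is sent to a third-party LLM service."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED-{label.upper()}]", prompt)
    return prompt

print(scrub("Email jane.doe@example.com about layoffs, call 020 7946 0958"))
```

Regex scrubbing is a blunt instrument - it can't catch a CEO typing "how best to lay off an employee" in plain prose - which is why the GCHQ advice leans on policy and self-hosting rather than filtering alone.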

We have asked Microsoft, Google and OpenAI to comment. ®
