The open secret of open washing - why companies pretend to be open source

Opinion If you believe Mark Zuckerberg, Meta's AI large language model (LLM) Llama 3 is open source.

It's not, despite what he says. The Open Source Initiative (OSI) spells it out in the Open Source Definition, and Llama 3's license - with clauses on litigation and branding - flunks it on several grounds.

Meta, unfortunately, is far from unique in wanting to claim that some of its software and models are open source. Indeed, the concept has its own name: open washing.

This is a deceptive practice in which companies or organizations present their products, services, or processes as "open" when they are not truly open in the spirit of transparency, access to information, participation, and knowledge sharing. This term is modeled after "greenwashing" and was coined by Michelle Thorne, an internet and climate policy scholar, in 2009.

With the rise of AI, open washing has become commonplace, as shown in a recent study. Andreas Liesenfeld and Mark Dingemanse of Radboud University's Center for Language Studies surveyed 45 text and text-to-image models that claim to be open. The pair found that while a handful of lesser-known LLMs, such as AllenAI's OLMo and BloomZ from BigScience Workshop and HuggingFace, could be considered open, most are not. Would it surprise you to know that, according to the study, the big-name ones from Google, Meta, and Microsoft aren't? I didn't think so.

But why do companies do this? Once upon a time, companies avoided open source like the plague. Steve Ballmer famously proclaimed in 2001 that "Linux is a cancer," because: "The way the license is written, if you use any open source software, you have to make the rest of your software open source." But that was a long time ago. Today, open source is seen as a good thing. Open washing enables companies to capitalize on the positive perception of open source and open practices without actually committing to them. This can help improve their public image and appeal to consumers who value transparency and openness.

Some corporations use open washing to shield their models and practices from scientific and regulatory scrutiny while benefiting from the "open" label.

Another major factor is that the EU AI Act provides special exemptions for "open source" models. This creates a powerful incentive for open washing: if companies' models count as open, they face far less restrictive requirements. That, in turn, means they'll need less money to meet regulatory requirements or to clean their datasets of copyright and other intellectual property (IP) issues.

However, the EU still doesn't have a clear definition of open source AI. In all fairness, no one does yet. The OSI will release its open source AI definition in the next few days. That said, the current crop of open washing licenses fails by anyone's definition - other than their creators'.

That's not to say all the big-name AI companies are lying about their open source street cred. For example, IBM's Granite 3.0 LLMs really are open source under the Apache 2 license.

Why is this important? Why do people like me insist that we properly use the term open source? It's not like, after all, the OSI is a government or regulatory organization. It's not. It's just a nonprofit that has created some very useful guidelines.

As Dan Lorenc, CEO of security company Chainguard, said in his keynote speech at the Secure Open Source Software (SOSS) Fusion Conference in Atlanta this week, no one can "force you to use the OSI's definitions." But "fortunately, many people, particularly lawyers, believe in this definition. They trust the work that the OSI does, and they trust and understand the protections that companies are granted when they use these licenses when they meet the open source criteria. That's why we see it showing up in procurement contracts of big companies all over the world."

Open source isn't just a legal and business matter. Open source gives developers the freedom to operate the way they do. Without it, they'll "lose the benefits that we've all grown accustomed to of being able to freely use code without having to know about or care about all the different terms in these licenses."

If we need to check every license for every bit of code, "developers are going to go to legal reviews every time you want to use a new library. Companies are going to be scared to publish things on the internet if they're not clear about the liabilities they're encountering when that source code becomes public."

Lorenc continued: "You might think this is only a big company problem, but it's not. It's a shared problem. Everybody who uses open source is going to be affected by this. It could cause entire projects to stop working. Security bugs aren't going to get fixed. Maintenance is going to get a lot harder. We must act together to preserve and defend the definition of open source. Otherwise, the lawyers are going to have to come back. No one wants the lawyers to come back."

I must add that I know a lot of IP lawyers. They do not need or want these headaches. Real open source licenses make life easier for everyone: businesses, programmers, and lawyers. Introducing "open except for someone who might compete with us" or "open except for someone who might deploy the code on a cloud" is just asking for trouble.

In the end, open washing will dirty the legal, business, and development work for everyone. Including, ironically, the shortsighted companies now supporting this approach. After all, almost all their work, especially in AI, is ultimately based on open source. ®
