Rethinking open source generative AI: open washing and the EU AI Act

Salamander@mander.xyz · 1 year ago


TootSweet@lemmy.world · 1 year ago

Thank you for bringing more awareness of this. I’m what you might call an “AI skeptic” and don’t really care what happens in the AI space as long as it doesn’t screw up things I care about.

But I care deeply about FOSS and AI is screwing it up. I don’t want to have to explain why XYZ thing absolutely is not Open Source and that “Open Source” has a specific meaning beyond “you can look at (at least some of) the source code.”

(Compare it to the term “hacker”, which, at least among a lot of muggles, has taken on the exclusive meaning of committing some kind of fraud with computers. Originally it meant something very different. And it’s unfortunate the world has forgotten the old meaning.)

Another project that is diluting the term “Open Source” is Grayjay, a video streaming app that is a FUTO project (and FUTO is a Louis Rossmann thing). Rossmann has called it Open Source in YouTube videos, but it’s not Open Source. (The license is here and forbids things like “commercial use” (selling the software or derivative works) and removing facilities for paying the FUTO project from derivative works. That is a lot less restrictive than the license was last time I checked it; previously it didn’t allow redistribution or derivative works at all. But it’s not Open Source even now.)

Salamander@mander.xyz · 1 year ago

I did not know of the term “open washing” before reading this article. Unfortunately it does seem like the pending EU legislation on AI has created a strong incentive for companies to do their best to dilute the term and benefit from the regulations.

There are some paragraphs in the article that illustrate the point nicely:

In 2024, the AI landscape will be shaken up by the EU’s AI Act, the world’s first comprehensive AI law, with a projected impact on science and society comparable to GDPR. Fostering open source driven innovation is one of the aims of this legislation. This means it will be putting legal weight on the term “open source”, creating only stronger incentives for lobbying operations driven by corporate interests to water down its definition.

[…] Under the latest version of the Act, providers of AI models “under a free and open licence” are exempted from the requirement to “draw up and keep up-to-date the technical documentation of the model, including its training and testing process and the results of its evaluation, which shall contain, at a minimum, the elements set out in Annex IXa” (Article 52c:1a). Instead, they would face a much vaguer requirement to “draw up and make publicly available a sufficiently detailed summary about the content used for training of the general-purpose AI model according to a template provided by the AI Office” (Article 52c:1d).

If this exemption or one like it stays in place, it will have two important effects: (i) attaining open source status becomes highly attractive to any generative AI provider, as it provides a way to escape some of the most onerous requirements of technical documentation and the attendant scientific and legal scrutiny; (ii) an as-yet unspecified template (and the AI Office managing it) will become the focus of intense lobbying efforts from multiple stakeholders (e.g., [12]). Figuring out what constitutes a “sufficiently detailed summary” will literally become a million dollar question.

Thank you for pointing out Grayjay, I had not heard of it. I will look into it.

chicken@lemmy.dbzer0.com · edit-2 1 year ago

As long as they aren’t putting ridiculous terms on model usage, like SD3 does, and the weights are provided, I’m happy with it.

lily33@lemm.ee · edit-2 1 year ago

A bunch of these columns are outright absurd TBH, to the extent that I’m not sure the author really knows what FOSS is about. What’s open API access even supposed to be? API access is closed by definition.

Also there has never been a requirement that open source software needs to be documented - and for good reason - so I’m not a fan of the documentation column either.

Norah (pup/it/she)@lemmy.blahaj.zone · edit-2 1 year ago

“and for good reason”

I’d love to hear that reasoning. Personally, I will avoid using a FOSS product if the documentation is terrible or non-existent. Obviously I have grace for new or bleeding-edge projects. But I’ve avoided using some FOSS stalwarts simply because I don’t have the time to dedicate to trial and error learning.

lily33@lemm.ee · 1 year ago

Because FOSS shouldn’t add burdens. You publish your work and let everyone else use it. That shouldn’t add extra obligations on you. Usually, you’d also write some docs - after all, without them nobody will know how to use your program, so why bother publishing - but it shouldn’t be an obligation. Make it easy for people to open up their code without attaching strings.

Documentation is nice, but it’s kind of a different thing than open source: a program can be open and undocumented, or closed but well documented - and I don’t see why we’d want it different for models.

Norah (pup/it/she)@lemmy.blahaj.zone · 1 year ago

That’s fair, thank you for explaining. I was going to say, but forgot: this is assessing specifically for “openness”, not “open source-ness”, though.

lily33@lemm.ee · 1 year ago

“[…] upcoming EU AI Act that regulates open source systems differently, creating an urgent need for practical openness assessment”

So when they say “openness” they do put it in the context of open source rather than accessibility.