With over a decade of open source software I've written freely available online, I actually really appreciate the value that AI and LLMs have provided me.
The thing that leaves a bad taste in my mouth is the fact that my works were likely included in the training data and, even if that doesn't violate my licenses (GNU GPL 2/3), it certainly feels against the spirit of what I intended when distributing my works.
I was made redundant recently "due to AI" (questionable), and it feels like my works in some way contributed to my redundancy: my works fed the profits made by these AI megacorps while I am left a victim.
I wish I could be provided a dividend or royalty, however small, for my contribution to these LLMs but that will never happen.
I've been looking for a copyleft "source available" license that allows me to distribute code openly but has a clause that says "if you would like to use these sources to train an LLM, please contact me and we'll work something out". I haven't yet found that.
I'm guessing that such a license would not be enforceable because I am not in the US, but at least it would be nice to declare my intent and who knows what the future looks like.
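For what it's worth, such a clause might look something like the following rider on top of an existing copyleft license. This is a hypothetical draft for illustration only, not an existing license and not reviewed by a lawyer:

    Machine Learning Training Rider (illustrative draft). Nothing in this
    license grants permission to use this source code, in whole or in part,
    as training data for a machine learning model. If you would like to use
    these sources to train a model, contact the author to arrange separate
    terms.

Whether such a rider is enforceable anywhere is an open question, but it would at least put the intent in writing.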
FWIW, a lot of open source caused other people to lose their jobs too, all pre-AI. So what goes around comes around. The Free Software movement was from day one built on cloning proprietary programs - UNIX was a commercial OS that AT&T sold, the early Linux desktop environments all looked exactly like a mashup of Windows 95 and commercial DEs, etc. Every commercial UNIX got wiped out except Apple's; do you think that didn't lead to layoffs? Because it very much did. Nor did it ever really change. systemd started out as "heavily inspired" by launchd. Wayland is basically the same ideas as SkyLight in macOS, etc.
And who was it who benefited from this stuff? A lot of the benefit went to "megacorps" who took the savings and banked the higher profits.
So I don't think open source, which for many years was unashamedly about just cloning designs that were funding other people's salaries, can really cry too much about LLMs. And I say that as someone who has written a lot of open source software, including working on Wine.
FWIW, AIX and, to a far lesser extent, Solaris still exist. I'm not exactly sure why people are using them (AIX I can maybe understand because "no one got fired for buying IBM" or whatever, but there really isn't any excuse to be running Solaris nowadays since ZFS runs on Linux and two of the BSD-based systems, and Oracle seems desperate to let it die).
Something that some security-conscious folks care about. Probably impractical for most server workloads (so not an alternative to Solaris on SPARC), but worth mentioning.
Linux, while not being UNIX, does support it: Oracle upstreamed SPARC ADI support during the brief time they had Oracle Linux support for SPARC. But then again, Oracle has its own distro.
https://www.oracle.com/servers/sparc
https://docs.kernel.org/arch/sparc/adi.html
Now why someone would use unsupported OpenBSD on SPARC for the kinds of clients that pay for this stuff beats me. Assuming it even supports SPARC ADI.
So wait, SPARC Solaris is the only production UNIX with hardware memory tagging, but Linux also has it? Are we talking strictly SUS-compliant systems (current or former, since for some reason Solaris is no longer listed as such despite ostensibly still being compliant, unless the SRUs have seriously FUBARed some things) or unices in general? Because I'd argue anyone running SUS-compliant systems for any reason other than their choice happening to be compliant is even more niche than running AIX or Solaris for anything else.
Why would someone use unsupported OpenBSD on SPARC for the clients that pay for it? Probably the same reason so many servers run on Rocky or Alma instead of RHEL: money. Perhaps they bought the hardware without the support contract for Solaris, or they don't want to keep paying for it.
I'm sure Oracle milks commercial Solaris efficiently, but I imagine it's hard to find new customers for it.
I think there's no meaningful case, by the letter of the law, that use of training data that includes GPL-licensed software in models that comprise the core component of modern LLMs doesn't obligate every producer of such models to make both the models and the software stack supporting them available under the same terms. Of course, it also seems clear in the present landscape that the law often depends more on the convenience of the powerful than on its actual construction and intent, but I would love to be proven wrong about that, and this kind of outcome would help.
> I think there's no meaningful case, by the letter of the law, that use of training data that includes GPL-licensed software in models that comprise the core component of modern LLMs doesn't obligate every producer of such models to make both the models and the software stack supporting them available under the same terms.
Why do you think "fair use" doesn't apply in this case? The prior Bartz v. Anthropic ruling laid out pretty clearly how training an AI model falls within the realm of fair use. Authors Guild v. Google and Authors Guild v. HathiTrust were both decided much earlier, and both found that digitizing copyrighted works for the sake of making them searchable is sufficiently transformative to meet the standards of fair use. So what is it about GPL licensed software that you feel would make AI training on it not subject to the same copyright and fair use considerations that apply to books?
I’m not a lawyer, but I read the decision, and how is this section not a ruling on fair use?
“To summarize the analysis that now follows, the use of the books at issue to train Claude and its precursors was exceedingly transformative and was a fair use under Section 107 of the Copyright Act. And, the digitization of the books purchased in print form by Anthropic was also a fair use but not for the same reason as applies to the training copies. Instead, it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies. However, Anthropic had no entitlement to use pirated copies for its central library. Creating a permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy.”
Or in the final judgement, “This order grants summary judgment for Anthropic that the training use was a fair use. And, it grants that the print-to-digital format change was a fair use for a different reason.”
The first:
> it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library
It is only fair use where Anthropic had already purchased a license to the work. Which has zero to do with scraping - a purchase was made, an exchange of value, and that comes with rights.
The second, which involves a section of the judgement a little before your quote:
> And, as for any copies made from central library copies but not used for training, this order does not grant summary judgment for Anthropic.
This is where the court refused to make any ruling. There was no exchange of value here, such as would happen with scraping. The court made no ruling.
I believe you are misinterpreting the ruling. Remember that a copyright claim must inherently argue that copies of the work are being made. To that end, the case analyzes multiple "copies" alleged to have been made.
1) "Copies used to train specific LLMs", for which the ruling is:
> The copies used to train specific LLMs were justified as a fair use.
> Every factor but the nature of the copyrighted work favors this result.
> The technology at issue was among the most transformative many of us will see in our lifetimes.
Notable here is that all of the "copies used to train specific LLMs" were copies made from books Anthropic purchased. But also of note is that Anthropic need not have purchased them, as long as they had obtained the original sources legally. The case references the Google Books lawsuit as an example of something Anthropic could have done to avoid pirating the books they did pirate, wherein Google obtained the original materials on loan from willing and participating libraries, and did not purchase them.
2) "The copies used to convert purchased print library copies into digital library copies", where again the ruling is:
> justified, too, though for a different fair use. The first factor strongly favors this result, and the third favors it, too. The fourth is neutral. Only the second slightly disfavors it. On balance, as the purchased print copy was destroyed and its digital replacement not redistributed, this was a fair use.
Here one might argue that the use of GPL code is different in that, in making the copy, no original was destroyed. But it's also very likely that this wouldn't apply at all in the case of GPL code, because there was no original physical copy to convert into a digital format. The code was already digitally available.
3) "The downloaded pirated copies used to build a central library" where the court finds clearly against fair use.
4) "And, as for any copies made from central library copies but not used for training" where as you note Judge Alsup declined to rule. But notice particularly that this is referring to copies made FROM the central library AND NOT for the purposes of training an LLM. The copies made from purchased materials to build the central library in the first place were already deemed fair use. And making copies from the central library to train an LLM from those copies was also determined to be fair use.The copies obtained by piracy were not. But for uses not pertaining to the training of an LLM, the judge is declining to make a ruling here because there was not enough evidence about what books from the central library were copied for what purposes and what the source of those copies was. As he says in the ruling:
> Anthropic is not entitled to an order blessing all copying “that Anthropic has ever made after obtaining the data,” to use its words
This declination applies both to the purchased and pirated sources, because it's about whether making additional copies from your central library copies (which themselves may or may not have been fair use), automatically qualifies as fair use. And this is perfectly reasonable. You have a right as part of fair use to make a copy of a TV broadcast to watch at a later time on your DVR. But having a right to make that copy does not inherently mean that you also have a right to make a copy from that copy for any other purposes. You may (and almost certainly do) have a right to make a copy to move it from your DVR to some other storage medium. You may not (and almost certainly do not) have a right to make a copy and give it to your friend.
At best, an argument that GPL software wouldn't be covered under the same considerations of fair use that this case considers would require arguing that the copies of GPL code obtained by Anthropic were not obtained legally. But that's likely going to be a very hard argument to make given that GPL code is freely distributed all over the place with no attempts made to restrict who can access that code. In fact, GPL code demands that if you distribute the software derived from that code, you MUST make copies of the code available to anyone you distribute the software to. Any AI trainer would simply need to download Linux or emacs and the GPL requires the person they downloaded that software from to provide them with the source code. How could you then argue that the original source from which copies were made was obtained illicitly when the terms of downloading the freely available software mandated that they be given a copy?
> How could you then argue that the original source from which copies were made was obtained illicitly when the terms of downloading the freely available software mandated that they be given a copy?
By the license and terms such copies are under.
> For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
You _must_ show the terms. If you copy the GPL code, and it inherits the license, as the terms say it does, then you must also copy the license.
The GPL does not give you an unfettered right to copy, it comes with terms and conditions protecting it under contract law. Thus, you must follow the contract.
The GPL goes to some lengths to define its terms.
> A "covered work" means either the unmodified Program or a work based on the Program.
> Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.
It is not just the source code that you must convey.
> By the license and terms such copies are under.
Which clause of the GPL requires the receiver of GPL code to agree to the terms of the GPL before being allowed to receive the source code that they are entitled to under the license? Because that would expressly contradict the first sentence of section 9:
> You are not required to accept this License in order to receive or run a copy of the Program.
Isn't that one of the key points of the GPL? That its provisions only apply to you IF you decide to distribute GPL software, but that they do not impose any restrictions on the users of the software? Surely you're not suggesting that anyone who has ever seen the source code of a GPLed piece of software is permanently barred from contributing to or writing similar software under a non-GPL license simply because they received the GPLed source code.
> If you copy the GPL code, and it inherits the license, as the terms say it does, then you must also copy the license.
> The GPL does not give you an unfettered right to copy, it comes with terms and conditions protecting it under contract law. Thus, you must follow the contract.
I agree that the GPL does not give you an unfettered right to copy. But the GPL, like all such licenses, is still governed by copyright law. And "fair use" is an exception to copyright law that allows you to make copies you are not otherwise authorized to make. No publisher can put additional terms in their book, even wrapped in shrinkwrap, that deny you the right to use that book for various fair use purposes like quoting it for criticism or parody. The Sony terms and conditions for the PlayStation very clearly forbid copying the BIOS or decompiling it. But those terms are null and void when you copy the BIOS and decompile it to make a new emulator (at least in the US), because the courts have already ruled that such use is fair use.
So it is with the GPL. By default you have no right to make copies of the software at all. The GPL then grants you additional rights you normally wouldn't have under copyright law, but only to the extent that when exercising those rights, you comply with the terms of the GPL. But "Fair Use" then goes beyond that and says that for certain purposes, certain types and amounts of copies can be made, regardless of what rights the publisher does or does not reserve. This would be why the GPL specifically says:
> This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.
Fair use (and its analogs in other countries) supersedes the GPL. And even the GPL FAQ[1] acknowledges this fact:
> Do I have “fair use” rights in using the source code of a GPL-covered program? (#GPLFairUse)
> Yes, you do. “Fair use” is use that is allowed without any special permission. Since you don't need the developers' permission for such use, you can do it regardless of what the developers said about it—in the license or elsewhere, whether that license be the GNU GPL or any other free software license.
[1]: https://www.gnu.org/licenses/gpl-faq.en.html#GPLFairUse
> So what is it about GPL licensed software that you feel would make AI training on it not subject to the same copyright and fair use considerations that apply to books?
The poster doesn't like it, so it's different. Most of the "legal analysis" and "foregone conclusions" in these types of discussions are vibes dressed up as objective declarations.
You seem like the type of person who will believe anything as long as someone cites a case, without looking into it. Bartz v. Anthropic only looked at books, and there was still a $1.5 billion settlement that Anthropic paid out because it got those books from LibGen / Anna's Archive, and the ruling also said that the data has to be acquired "legitimately".
Whether data acquired from a licence that specifically forbids building a derivative work without also releasing that derivative under the same licence counts as a legitimate data gathering operation is anyone's guess, as those specific circumstances are about as far from that prior case as they can be.
As long as they don't distribute the model's weights, even a strict interpretation of the GPL should be fine. Same reason Google doesn't have to upstream changes to the Linux kernel they only deploy in-house.
> This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.
It is legitimate to acquire GPL software. The requirements of the license only take effect if you're distributing the work AND fair use does not apply.
Training certainly doesn't count as distribution, so the buck passes to inference, which leaves us dealing with the substantial similarity test, and still, fair use.
You sound like you're citing the general Internet understanding of "fair use", which seems to amount to "I can do whatever I like to any copyrighted content as long as maybe I mutilate it enough and shout 'FAIR USE!' loudly enough."
On the real measures of "fair use", at least in the US: https://fairuse.stanford.edu/overview/fair-use/four-factors/ I would contend that it absolutely face plants on all four measures. The purpose is absolutely in the form of a "replacement" for the original, the nature is something that has been abundantly proved many times over in court as being something copyrightable as a creative expression (with limited exceptions for particular bits of code that are informational), the "amount and substantiality" of the portions used is "all of it", and the effect of use is devastating to the market value of the original.
You may disagree. A long comment thread may ensue. However, all I really need for my point here is simply that it is far, far from obvious that waving the term "FAIR USE!" around is a sufficient defense. It would be a lengthy court case, not a slam-dunk "well duh it's obvious this is fair use". The real "fair use" and not the internet's "FAIR USE!" bear little resemblance to each other.
A sibling comment mentions Bartz v. Anthropic. Looking more at the details of the case I don't think it's obvious how to apply it, other than as a proof that just because an AI company acquired some material in "some manner" doesn't mean they can just do whatever with it. The case ruled they still had to buy a copy. I can easily make a case that "buying a copy" in the case of a GPL-2 codebase is "agreeing to the license" and that such an agreement could easily say "anything trained on this must also be released as GPL-2". It's a somewhat lengthy road to travel, where each step could result in a failure, but the same can be said for the road to "just because I can lay my hands on it means I can feed it to my AI and 100% own the result" and that has already had a step fail.
Broadly speaking, GPL is a license that has specific provisions about creating derivative software from the licensed work, and just saying "fair use" doesn't exempt you from those provisions. More specifically, an advertised use case (in fact, arguably the main one at this stage) of the most popular closed models as they're currently being used is to produce code, some of which is going to be GPL licensed. As such, the code used is part of the functionality of the program. The fact that this program was produced from the source code used by a machine learning algorithm rather than some other method doesn't change this fundamental fact.
The current supreme court may think that machine learning is some sort of magic exception, but they also seem to believe whatever oligarchs will bribe them to believe. Again, I doubt the law will be enforced as written, but that has more to do with corruption than any meaningful legal theory. Arguments against this claim seem to ignore that courts have already ruled these systems to not have intellectual property rights of their own, and the argument for fair use seems to rely pretty heavily on some handwavey anthropomorphization of the models.
Are you saying you believe that, while untested, technically models trained on GPL sources need to distribute the resulting LLMs under the GPL?
Intellectual property never made much sense to begin with. But it certainly makes no sense now, where the common creator has no protections against greedy corporate giants who are happy to wield the full weight of the courts to stifle any competition for longer than we'll be alive.
Or, in the case of LLMs, recklessly swing about software they don't understand while praying to find a business model.
If there was going to be a case, it's derivative works. [1]
What makes it all tricky for the courts is there's not a good way to really identify what part the generated code is a derivative of (except in maybe some extreme examples).
[1] https://en.wikipedia.org/wiki/Derivative_work
That's what laws have always existed for. A law is just a formal way of saying "we will use violence against you if you do something we don't like," and that has always been written primarily by and for the people who already have the power to do that. It's not the worst arrangement; certainly better than kings just being able to do as they please.
If you use GitHub, you’re automatically opted into having your code used for training. Private repo or not. You have to actually opt out and even then, will they honor that? No…
> I was made redundant recently "due to AI" (questionable), and it feels like my works in some way contributed to my redundancy: my works fed the profits made by these AI megacorps while I am left a victim.
I think anyone here can understand and even share that feeling. And I agree with your "questionable" - it's just the lame HR excuse du jour.
My 2c:
- AI megacorps aren't the only ones gaining, we all are. The leverage you have to build and ship today is higher than it was five years ago.
- It feels like megacorps own the keys right now, but that's temporary. In a world of autonomous agents and open-weight models, control is decentralized. Inference costs continue to drop, and you don't need to be running on megacorp stacks. Millions (billions?) of agents finding and sharing among themselves. How will megacorps stop that?
- I see the advent of LLMs like the spread of literacy. Scribes once held a monopoly on the written word, which felt like a "loss" to them when reading/writing became universal. But today, language belongs to everyone. We aren't losing code; we are making the ability to code a universal human "literacy."
If I look around in the FLOSS communities, I see a lot of skepticism towards LLMs. The main concerns are:
1. they were trained on FLOSS repositories without consent of the authors, including GPL and AGPL repos
2. the best models are proprietary
3. folks making low-effort contribution attempts using AI (PRs, security reports, etc.).
I agree those are legitimate problems but LLMs are the new reality, they are not going to go away. Much more powerful lobbies than the OSS ones are losing fights against the LLM companies (the big copyright holders in media).
But while companies can use LLMs to build replacements for GPL-licensed code (where those LLMs probably have that GPL code in their training set), the reverse can also be done: one can use LLMs to break monopolies open and to build a great deal of open source software.
All the infrastructure that runs the whole AI-over-the-internet juggernaut is essentially open source.
Heck, even Claude Code would be far less useful without grep, diff, git, head, etc., etc., etc. And one can easily see a day where something like a local Claude Code talking to open-weight and open source models is the core dev tool.
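As a toy illustration of that last point, here's a minimal sketch of a local agent loop built from exactly those pieces. Everything in it is an assumption for illustration: the MODEL_URL endpoint presumes some local open-weight model server exposing an OpenAI-compatible chat API (as llama.cpp-style servers can), and the grep/git commands are placeholders:

    # Hypothetical sketch: a tiny local "agent" that pairs an open-weight
    # model behind a local endpoint with ordinary open source CLI tools.
    import json
    import subprocess
    import urllib.request

    MODEL_URL = "http://localhost:8080/v1/chat/completions"  # assumed local server

    def run_tool(cmd):
        # The "hands": plain open source utilities like grep, diff, git.
        out = subprocess.run(cmd, capture_output=True, text=True)
        return out.stdout or out.stderr

    def ask_model(prompt):
        # The "brain": any open-weight chat model served locally.
        body = json.dumps({
            "model": "local",
            "messages": [{"role": "user", "content": prompt}],
        }).encode()
        req = urllib.request.Request(
            MODEL_URL, data=body, headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["choices"][0]["message"]["content"]

    # One round: gather context with git and grep, let the model reason over it.
    context = run_tool(["git", "diff", "--stat"]) + run_tool(["grep", "-rn", "TODO", "."])
    print(ask_model("Summarize what needs doing in this repo:\n" + context))

Nothing here depends on a megacorp stack; every piece besides the model weights is already free software.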
It’s such a fun time to have 1+ decade(s) of experience in software. Knowing what simple and good are (for me), and being able to articulate it, has let me create so much personal software for myself and my family. It has really felt like turning ideas into reality, about as fast as I can think of them or they can suggest them. And adding specific features, just for our needs. The latest one was a Slack canvas replacement, as we moved from Slack to self-hosted Matrix + Element but missed the multiplayer, persistent monthly notes file we used. Even getting Matrix set up in the first place was a breeze.
$20/month with your provider of choice unlocks a lot.
Edit: the underlying point being, yes to the article. Either building upon the foundations of open source to making personal things, or just modifying a fork for my own needs.
I’m not so sure… what I see as more likely is that coding agents will just strip parts from open source libraries to build bespoke applications for users. Users will be ecstatic because they get exactly what they want and they don’t have to worry about upstream supply chain attacks. Maintainers get screwed because no one contributes back to the main code base. In the end open source software becomes critical to the ecosystem, but gets none of the credit.
FOSS came up around the core idea of liberating the software that ran your hardware, and was later sustained by the idea of a commons of commodity software we can build on. But with LLMs we have alternative pathways to enable those freedoms:
Freedom 0 (Run): LLMs troubleshoot environments and guide installations, making software executable for anyone.
Freedom 1 (Study/Change): LLMs help you study the code and make modifications, lowering the bar of technical knowledge required.
Freedom 2 (Redistribute): LLMs force redistribution by building specs and reimplementing if needed.
Freedom 3 (Improve/Distribute): Everyone gets the improvement they want.
As we can see, LLMs make these freedoms more democratic, beyond pure technical capability.
For those who cared only about these four freedoms, LLMs enable them in spades. But for those who additionally looked to the business, signalling, and community values of free software (I include myself in this), these were never guaranteed by FOSS, and we find ourselves figuring out how to make up for the losses.
Good piece, but two things work against the thesis:
The Sunsama example actually argues in the opposite direction. He spent an afternoon hacking around a closed system with an agent and it worked. If agents are good enough to reverse-engineer and work around proprietary software today, the urgency to switch to open source decreases, not increases. "Good enough" workarounds are how SaaS stays sticky.
And agents don't eliminate the trust problem, they move it. Today you trust Sunsama with your workflows. In this vision, you trust your agent to correctly interpret your intent, modify code safely, and not introduce security holes. Non-technical users can't audit agent-modified code any better than they could audit the original source. You've traded one black box for another.
I worry people are lacking context about how SaaS products are purchased if they think LLMs and "vibe coding" are going to replace them. It's almost never the feature set. Often it's capex vs opex budgeting (i.e., it's easier to get approval for a monthly cost than an upfront capital cost), but the biggest one is liability.
Companies buy these contracts for support and to have a throat to choke if things go wrong. It doesn't matter how much you pay your AI vendor, if you use their product to "vibe code" a SaaS replacement and it fails in some way and you lose a bunch of money/time/customers/reputation/whatever, then that's on you.
This is as much a political consideration as a financial one. If you're a C-suite and you let your staff make something (LLM generated or not) and it gets compromised then you're the one who signed off on the risky project and it's your ass on the line. If you buy a big established SaaS, do your compliance due-diligence (SOC2, ISO27001, etc.), and they get compromised then you were just following best practice. Coding agents don't change this.
The truth is that the people making the choice about what to buy or build are usually not the people using the end result. If someone down the food chain had to spend a bunch of time with "brittle hacks" to make their workflow work, they're not going to care at all. All they want is the minimum possible to meet whatever the requirement is, that isn't going to come back to bite them later.
SaaS isn't about software, it's about shifting blame.
Open source has never been more alive for me. I have been publishing low-key for years, and AI has expanded that capability more than 100-fold, in all directions. I had previously published packages in multiple languages but had recently started cutting back to just one when working manually. Now, with AI, I've started to expand languages again. Instead of feeling constrained to toolchains I feel comfortable with, I feel the freedom to publish more and more.
The benefits of publishing AI-generated code as open source are immense, including code hosting and CI/CD pipelines for build, test, lint, security scans, etc. In addition to CI/CD pipelines, my repos have commits authored by Claude, Dependabot, GitHub Advanced Security Bot, Copilot, etc. All of this makes the code more reliable and maintainable, for both human- and AI-authored code.
Some thoughts on two recent posts:
1. 90% of Claude-linked output going to GitHub repos with <2 stars (https://news.ycombinator.com/item?id=47521157): I'm generally too busy publishing code to promote it, but at some point that might settle down. Additionally, with how fast AI can generate and refactor code, it can take some time before the code is stable enough to promote.
2. So where are all the AI apps? (https://news.ycombinator.com/item?id=47503006): They are in GitHub with <2 stars! They are there, but without promotion it takes a while to gain popularity. That being said, I'm starting to get some PRs.
5 years ago, I set out to build an open-source, interoperable marketplace powered by open-source SaaS. It felt like a pipe dream, but AI has brought the dream to fruition. People are underestimating how much AI is a threat to rent-seeking middlemen in every industry.
This is an interesting take. I think a critical missing piece from this article is how the use of coding agents will essentially enable the circumvention of copyleft licenses. Some project that was recently posted on HN is already selling this service [1][2]. It rewrites code/modules/projects to less restrictive licenses with no legal enforcement mechanisms. It's the opposite of freeing code.
[1] Malus.sh; initially a joke but, in the end, not. You can actually pay for their service.
[2] "Your new code is delivered under the MalusCorp-0 License—a proprietary-friendly license with zero attribution requirements, zero copyleft, and zero obligations."
I don’t know what SaaS has to do with FOSS. The point of FOSS was to allow me to modify the software I run on my system. If the device drivers for some hardware I depend on are no longer supported by the company I bought it from, if it’s open source, I can modify and extend the software myself.
Copyleft licenses ensure that I share my modifications back if I distribute them. It’s a thing for the public good.
Agent-based software development walls people off from that. Mostly by ensuring that the provenance of the code it generates is not known and by deskilling people so that they don’t know what to prompt or how to fix their code.
> SaaS scaled by exploiting a licensing loophole that let vendors avoid sharing their modifications.
AI is going to exploit even more: "Given the repository -> Construct tech spec -> Build project based on tech spec"
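In sketch form, the loophole being described might look like this (hypothetical: ask_model stands in for whatever LLM call is available, and whether the output actually escapes the original license is exactly the unresolved legal question in this thread):

    # Hypothetical sketch of the "repo -> spec -> rebuild" pipeline.
    # ask_model(instruction, input_text) stands in for any LLM chat call.

    def respec_and_rebuild(repo_files, ask_model):
        # Step 1: distill the repository into an abstract functional spec,
        # deliberately discarding the original expression (the code itself).
        spec = ask_model(
            "Write a detailed functional spec for this codebase without "
            "quoting any code.",
            "\n".join(repo_files),
        )
        # Step 2: reimplement from the spec alone, clean-room style. Whether
        # this severs the derivative-work chain is an open question.
        return ask_model("Implement this spec from scratch.", spec)
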
At this stage, I want everyone to just close their source and stop working on open source until this issue of licensing gets resolved.
Any improvement you make to open source code will be leveraged in ways you didn't intend, eventually making you redundant in the process.
The point this article makes, that suddenly agents can do the work of customizing free software, completely makes sense. But, the reality is that the Free Software movement is opposed to the way Lemons are built today, and would not accept a world like this. (Rightfully!)
My belief is that Lemons effectively kill open source in the long run, and generally speaking, people forget that Free Software is even a thing. The reasoning for that is simple: it’s too easy to produce a “clean” derivative with just the parts you need. Lemons do much better with a fully Lemoned codebase than they do with a hybrid. Incentives to “rewrite” also free people from “licensing burdens” while the law is fuzzy.
One thing I keep noticing is that agents are getting better at implementation faster than they’re getting better at judgment.
They can often wire up a library or scaffold a migration, but they’re still pretty shaky at the “should we choose this at all?” layer — pricing cliffs, version floors, lock-in, EOLs, migration blockers, etc.
If coding agents do end up making free software more useful again, I think part of that will come from making open docs / changelogs / migration guides more usable at decision time, not just at implementation time.
“Their relationship with the software is one of pure dependency, and when the software doesn’t do what they need, they just… live with it”
Or, more likely, they churn off the product.
The SaaS platforms that will survive are busy RIGHT NOW revamping their APIs, implementing OAuth, and generally reorganizing their products to be discovered and manipulated by agents. Failing in this effort will ultimately result in the demise of any given platform. This goes for larger SaaS companies, too, it'll just take longer.
A question I have for those doing agentic coding - what is the development process used? How are agents organised?
Top down with a "manager" agent telling "coding" agents what to do? I.e. mirroring the existing corporate interpretation of "agile"/scrum development.
Seeing the title of this article, I was thinking it would be interesting to set up an agent environment that mirrors a typical open source project, involving a discussion forum (where features are thrown around) and GitHub issues/PRs (where implementation details are discussed), and then have a set of agents that are "mergers" - acting as final review instances.
I assume that agents can be organised in any form at all; it's just a matter of setting up the system prompt and then letting them go for it. A Discourse forum could be set up where agents track the feature requests of users of the software and then discuss how to implement them or how to work around them.
The reason I ask is that one could then do a direct comparison of development processes, i.e. the open source model versus the corporate top-down process. It would interest me to see which process performs better in terms of maintainability, quality and feature richness.
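For what it's worth, the top-down variant is easy to sketch (hypothetical throughout: ask_model(system, user) stands in for any LLM chat call, and the prompts are illustrative):

    # Hypothetical sketch of a "manager" agent delegating to "coder" agents,
    # with a "merger" agent as the final review instance.

    def manager(feature_request, ask_model):
        # The manager only plans; it never writes code itself.
        plan = ask_model(
            "You are an engineering manager. Split the request into small, "
            "independent coding tasks, one per line.",
            feature_request,
        )
        return [t for t in plan.splitlines() if t.strip()]

    def coder(task, ask_model):
        # Each coder gets one narrow task and returns a patch.
        return ask_model(
            "You are a coder. Produce a unified diff implementing exactly "
            "this task, nothing more.",
            task,
        )

    def merger(patches, ask_model):
        # The merger plays maintainer, like a final reviewer on a PR.
        return ask_model(
            "You are a maintainer. Review these patches and reject any that "
            "conflict or overreach.",
            "\n\n".join(patches),
        )

    def run(feature_request, ask_model):
        tasks = manager(feature_request, ask_model)
        patches = [coder(t, ask_model) for t in tasks]
        return merger(patches, ask_model)

The open-source-style variant would replace the single manager with the forum of peer agents described above, keeping the merger as the final review instance, which would make the comparison between the two processes fairly direct.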
Coding agents and LLMs basically tivoize open source.
When AI eventually becomes the primary means of writing code (because hand-programming will be slow enough that no company can continue as before), that means AI becomes your new compiler, one that comes with a price tag, a subscription.
Programmers were held hostage by commercial compilers until free compilers reached a sufficient level of quality, but now it doesn't matter if your disk is full of free/open toolchains if it's not you commanding them but commercial AI agents.
Undoubtedly there will be open-source LLMs eventually, of various levels of quality. But to write a free compiler you need a laptop while to train a free programming LLM you need a lot of money. And you also need money to run it.
Programming has been one of the rare arts that even a poor, lower class kid can learn on his own with an old, cheap computer salvaged from a scrap bin and he can raise himself enough intellectual capital to become a well-paid programmer later. I wonder what the equivalent path will be in the future.
Maybe, but I don't really believe users can or want to start designing software, if it was even possible which today it isn't really unless you already have software dev skills.
That would basically make users product managers and UX designers, roles they aren't really capable of filling currently. At most they will discover that what they think they want isn't what they actually want.
agree completely. When the megacorps are building hundreds of datacenters and openly talking about plans to charge for software "like a utility," there has never been a clearer mandate for the need for FOSS, and IMO there has never been as much momentum behind it either.
these are exciting times, and they're coming despite any pessimism rooted in our outdated software paradigms.
It compares and contrasts open source and free software, and then gives an example of how free software is better than closed software.
But if the premise of the article, that the agent will take the package you pick and adapt it to your needs, is correct, then honestly the agent won't give a rat's ass whether the starting point was free software or open source.
First of all, free software still matters. Then, being a slave to a $200 subscription to an oligarch's application that launders other people's copyright is not what Stallman envisioned.
The AI propaganda articles are getting more devious by the minute. It's not just propaganda; it's Bernays-level manipulation!
The debate in the comment section here really boils down to: upstream freedom vs downstream freedom.
Copyleft licenses like the GPL/AGPL mandate upstream freedom: upstream has the "freedom" to use anything downstream, including anything written by a corporation.
Non-copyleft FOSS licenses like MIT/BSD are about downstream freedom, which is more of a philosophically utilitarian view, where anyone who receives the software is free to use it however they want, including not giving their changes back to the community, on the assumption that this maximizes the utility of this free software in the world.
If you prioritize the former goal, then coding agents are a huge problem for you. If the latter, then coding agents are the best thing ever, because they give everyone access to an effectively unlimited amount of cheap code.
I think the opposite. It will make all software matter less.
If trendlines continue... it will be faster for AI to vibe code said software to your customized specifications than to sign up for a SaaS and learn it.
"Claude, create a project management tool that simplifies jira, customize it to my workflow."
So a lot of apps will actually become closed source personalized builds.
i've learned a lot from open source, and i'm building open source myself.
so individually? yeah this stings a bit.
but zooming out to the ecosystem level — i think there's still something genuinely positive happening here.
the knowledge compounds, even if the credit doesn't.
(luckily my projects are unpopular enough that nobody bothered training on them lol)
tl;didn't finish - but I absolutely do this already. Much of the software I use is FOSS and Codex adjusts it to my needs. Sometimes it's really good software and I end up adding something that already exists. Whatever, tokens are free...
This article misses the point completely. Open source isn't great because it's easy to extract value from it. Open source is great because of the people creating value with it.
Value isn't just slapping a license on something and pushing to GitHub. It's maintaining and curating that software over years, focusing the development towards a goal. It's as much telling users what features you're not willing to add and maintain as it is extending the project to interoperate with others.
And that long term commitment to maintenance hasn't come out of the vibe coded ecosystem. Commitment is exactly what they don't want, rather they want the fast sugar high before they drop it and move on to the next thing.
The biggest threat to open source is the strip mining of the entire ecosystem, destroying communities and practices that have made it thrive for decades. In the past, open source didn't win because it always had the best implementation, but because it was good enough to solve problems for enough people that it became self sustaining from the contribution of value.
I feel that will continue, but it's also going to take a setback from those who aren't interested in contributing value back into the ecosystem from which they have extracted so much.
Unfortunately for me, I believe that the algorithms won't allow me to get exposure for my work no matter how good it is so there is literally no benefit for me to do open source. Though I would love to, I'm not in a position to work for free. Exposure is required to monetize open source. It has to reach a certain scale of adoption.
The worst part is building something open source, getting positive feedback, helping a couple of startups and then some big corporation comes along and implements a similar product and then everyone gets forced by their bosses to use the corporate product against their will and people eventually forget your product exists because there are no high-paying jobs allowing people to use it.
With hindsight, Open Source is basically a con for corporations to get free labor. When you make software free for everyone, really you're just making it free for corporations to Embrace, Extend, Extinguish... They invest a huge amount of effort to suppress the sources of the ideas.
Our entire system is heavily optimized for decoupling products from their makers. We have almost no idea who is making any of the products we buy. I believe there is a reason for that. Open source is no different.
When we lived in caves, everyone in the tribe knew who caught the fish or who speared the buffalo. They would rightly get credit. Now, it's like; because none of the rich people are doing any useful work, they can only maintain credibility by obfuscating the source of the products we buy. They do nothing but control stuff. Controlling stuff does not add value. Once a process is organized, additional control only serves to destroy value through rent extraction.
If most of the "free software" is AI slop, then it's going to make me read a lot more source code for free software, if the free software is also open-source. If it isn't open-source, oh boy, no way.
AI backdoors are already a well known problem, and vibe-coded free software is always going to present a substantial risk. We'll see how it plays out in time, but I can already see where it's heading.
After enough problems, reputation and humans in the loop could finally become important again. But I have a feeling humanity is going to have to learn the hard way first (again).
I wonder if there will be a different phenomenon — namely everyone just developing their own personal version of what they want rather than relying on what someone else built. Nowadays, if the core functionality is straightforward enough, I find that I just end up building it myself so I can tailor it to my exact needs. It takes less time than trying to understand and adapt someone else’s code base, especially if it’s (mostly) AI generated and contains a great deal of code slop.
> I think there's no meaningful case by the letter of the law that use of training data that include GPL-licensed software in models that comprise the core component of modern LLMs doesn't obligate every producer of such models to make both the models and the software stack supporting them available under the same terms.
Why do you think "fair use" doesn't apply in this case? The prior Bartz vs Anthropic ruling laid out pretty clearly how training an AI model falls within the realm of fair use. Authors Guild vs Google and Authors Guild vs HathiTrust were both decided much earlier and both found that digitizing copyrighted works for the sake of making them searchable is sufficiently transformative to meet the standards of fair use. So what is it about GPL licensed software that you feel would make AI training on it not subject to the same copyright and fair use considerations that apply to books?
“To summarize the analysis that now follows, the use of the books at issue to train Claude and its precursors was exceedingly transformative and was a fair use under Section 107 of the Copyright Act. And, the digitization of the books purchased in print form by Anthropic was also a fair use but not for the same reason as applies to the training copies. Instead, it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies. However, Anthropic had no entitlement to use pirated copies for its central library. Creating a permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy.”
Or in the final judgement, “This order grants summary judgment for Anthropic that the training use was a fair use. And, it grants that the print-to-digital format change was a fair use for a different reason.”
The first:
> it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library
It is only fair use where Anthropic had already purchased a license to the work. Which has zero to do with scraping - a purchase was made, an exchange of value, and that comes with rights.
The second, which involves a section of the judgement a little before your quote:
> And, as for any copies made from central library copies but not used for training, this order does not grant summary judgment for Anthropic.
This is where the court refused to make any ruling. There was no exchange of value here, such as would happen with scraping. The court made no ruling.
1) "Copies used to train specific LLMs", for which the ruling is:
> The copies used to train specific LLMs were justified as a fair use.
> Every factor but the nature of the copyrighted work favors this result.
> The technology at issue was among the most transformative many of us will see in our lifetimes.
Notable here is that all of the "copies used to train specific LLMs" were copies made from books Anthropic purchased. But also of note is that Anthropic need not have purchased them, as long as they had obtained the original sources legally. The case references the Google Books lawsuit as an example of something Anthropic could have done to avoid pirating the books they did pirate where in Google obtained the original materials on loan from willing and participating libraries, and did not purchase them.
2) "The copies used to convert purchased print library copies into digital library copies", where again the ruling is:
> justified, too, though for a different fair use. The first factor strongly
> favors this result, and the third favors it, too. The fourth is neutral. Only
> the second slightly disfavors it. On balance, as the purchased print copy was
> destroyed and its digital replacement not redistributed, this was a
> fair use.
Here one might argue where the use of GPL code is different in that in making the copy, no original was destroyed. But it's also very likely that this wouldn't apply at all in the case of GPL code because there was also no original physical copy to convert into a digital format. The code was already digitally available.
3) "The downloaded pirated copies used to build a central library" where the court finds clearly against fair use.
4) "And, as for any copies made from central library copies but not used for training" where as you note Judge Alsup declined to rule. But notice particularly that this is referring to copies made FROM the central library AND NOT for the purposes of training an LLM. The copies made from purchased materials to build the central library in the first place were already deemed fair use. And making copies from the central library to train an LLM from those copies was also determined to be fair use.The copies obtained by piracy were not. But for uses not pertaining to the training of an LLM, the judge is declining to make a ruling here because there was not enough evidence about what books from the central library were copied for what purposes and what the source of those copies was. As he says in the ruling:
> Anthropic is not entitled to an order blessing all copying “that Anthropic has ever made after obtaining the data,” to use its words
This declination applies both to the purchased and pirated sources, because it's about whether making additional copies from your central library copies (which themselves may or may not have been fair use), automatically qualifies as fair use. And this is perfectly reasonable. You have a right as part of fair use to make a copy of a TV broadcast to watch at a later time on your DVR. But having a right to make that copy does not inherently mean that you also have a right to make a copy from that copy for any other purposes. You may (and almost certainly do) have a right to make a copy to move it from your DVR to some other storage medium. You may not (and almost certainly do not) have a right to make a copy and give it to your friend.
At best, an argument that GPL software wouldn't be covered under the same considerations of fair use that this case considers would require arguing that the copies of GPL code obtained by Anthropic were not obtained legally. But that's likely going to be a very hard argument to make given that GPL code is freely distributed all over the place with no attempts made to restrict who can access that code. In fact, GPL code demands that if you distribute the software derived from that code, you MUST make copies of the code available to anyone you distribute the software to. Any AI trainer would simply need to download Linux or emacs and the GPL requires the person they downloaded that software from to provide them with the source code. How could you then argue that the original source from which copies were made was obtained illicitly when the terms of downloading the freely available software mandated that they be given a copy?
> How could you then argue that the original source from which copies were made was obtained illicitly when the terms of downloading the freely available software mandated that they be given a copy?
By the license and terms such copies are under.
> For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
You _must_ show the terms. If you copy the GPL code, and it inherits the license, as the terms say it does, then you must also copy the license.
The GPL does not give you an unfettered right to copy, it comes with terms and conditions protecting it under contract law. Thus, you must follow the contract.
The GPL goes to some lengths to define its terms.
> A "covered work" means either the unmodified Program or a work based on the Program.
> Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.
It is not just the source code that you must convey.
> By the license and terms such copies are under.
Which clause of the GPL requires the receiver of GPL code to agree to the terms of the GPL before being allowed to receive the source code that they are entitled to under the license? Because that would expressly contradict the first sentence of section 9:
Isn't that one of the key points to the GPL? That the provisions of it only apply to you IF you decide to distribute GPL software but that they do not impose any restrictions on the users of the software? Surely you're not suggesting that anyone who has ever seen the source code of a GPLed piece of software is permanently barred from contributing to or writing similar software under a non-GPL license simply by the fact that they received the GPLed source code.> If you copy the GPL code, and it inherits the license, as the terms say it
> does, then you must also copy the license.
> The GPL does not give you an unfettered right to copy, it comes with terms
> and conditions protecting it under contract law. Thus, you must follow the > contract.
I agree that the GPL does not give you an unfettered right to copy. But the GPL like all such licenses are still governed by copyright law. And "fair use" is an exception to the copyright laws that allow you to make copies that you are not otherwise authorized to make. No publisher can put additional terms in their book, even if they wrap it in shrinkwrap that denies you the right to use that book for various fair use purposes like quoting it for criticism or parody. The Sony terms and conditions for the Play Station very clearly forbid copying the BIOS or decompiling it. But those terms are null and void when you copy the BIOS and decompile it for making a new emulator (at least in the US) because the courts have already ruled that such use is fair use.
So it is with the GPL. By default you have no right to make copies of the software at all. The GPL then grants you additional rights you normally wouldn't have under copyright law, but only to the extent that when exercising those rights, you comply with the terms of the GPL. But "Fair Use" then goes beyond that and says that for certain purposes, certain types and amounts of copies can be made, regardless of what rights the publisher does or does not reserve. This would be why the GPL specifically says:
Fair use (and its analogs in other countries) supersede the GPL. And even the GPL FAQ[1] acknowledges this fact: [1]: https://www.gnu.org/licenses/gpl-faq.en.html#GPLFairUse> So what is it about GPL licensed software that you feel would make AI training on it not subject to the same copyright and fair use considerations that apply to books?
The poster doesn't like it, so it's different. Most of the "legal analysis" and "foregone conclusions" in these types of discussions are vibes dressed up as objective declarations.
Whether data acquired from a licence that specifically forbids building a derivative work without also releasing that derivative under the same licence counts as a legitimate data gathering operation is anyone's guess, as those specific circumstances are about as far from that prior case as they can be.
> This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.
It is legitimate to acquire GPL software. The requirements of the license apply only if you're distributing the work AND fair use does not apply.
Training certainly doesn't count as distribution, so the buck passes to inference, which leaves us dealing with the substantial similarity test, and still, fair use.
On the real measures of "fair use", at least in the US (https://fairuse.stanford.edu/overview/fair-use/four-factors/), I would contend that it absolutely face-plants on all four. The purpose is squarely a "replacement" for the original; the nature of the work is creative expression that courts have repeatedly held copyrightable (with limited exceptions for particular bits of code that are purely informational); the "amount and substantiality" of the portions used is "all of it"; and the effect of the use is devastating to the market value of the original.
You may disagree. A long comment thread may ensue. However, all I really need for my point here is that it is far, far from obvious that waving the term "FAIR USE!" around is a sufficient defense. It would be a lengthy court case, not a slam-dunk "well duh, it's obviously fair use". The real "fair use" and the internet's "FAIR USE!" bear little resemblance to each other.
A sibling comment mentions Bartz v. Anthropic. Looking more at the details of the case I don't think it's obvious how to apply it, other than as a proof that just because an AI company acquired some material in "some manner" doesn't mean they can just do whatever with it. The case ruled they still had to buy a copy. I can easily make a case that "buying a copy" in the case of a GPL-2 codebase is "agreeing to the license" and that such an agreement could easily say "anything trained on this must also be released as GPL-2". It's a somewhat lengthy road to travel, where each step could result in a failure, but the same can be said for the road to "just because I can lay my hands on it means I can feed it to my AI and 100% own the result" and that has already had a step fail.
The current Supreme Court may think that machine learning is some sort of magic exception, but they also seem to believe whatever oligarchs bribe them to believe. Again, I doubt the law will be enforced as written, but that has more to do with corruption than any meaningful legal theory. Arguments against this claim seem to ignore that courts have already ruled these systems have no intellectual property rights of their own, and the argument for fair use seems to rely pretty heavily on some handwavey anthropomorphization of the models.
Are you saying that you believe that, while untested in court, technically models trained on GPL sources need to be distributed under the GPL?
Or, in the case of LLMs, recklessly swing about software they don't understand while praying to find a business model.
What makes it all tricky for the courts is that there's no good way to identify which work the generated code is a derivative[1] of (except maybe in some extreme examples); the naive approach sketched below shows why.
[1] https://en.wikipedia.org/wiki/Derivative_work
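To make that concrete, here's a minimal, hypothetical sketch (Python, with illustrative snippets of my own) of the naive approach: token-level similarity scoring. It flags verbatim copies but scores a refactored version as nearly unrelated, even though a court might still consider it derivative.

    # Naive provenance check: token-level similarity via difflib.
    # Catches verbatim copies; collapses under renaming/refactoring.
    import difflib
    import re

    def normalize(code):
        """Crude canonical form: drop comments, lowercase, tokenize."""
        code = re.sub(r"#.*", "", code)
        return re.findall(r"[a-z_]\w*|\S", code.lower())

    def similarity(a, b):
        """Ratio in [0, 1]; 1.0 means token-identical."""
        return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()

    original = "def gcd(a, b):\n    while b:\n        a, b = b, a % b\n    return a"
    refactored = ("def greatest_common_divisor(x, y):\n"
                  "    return x if y == 0 else greatest_common_divisor(y, x % y)")

    print(similarity(original, original))    # 1.0 -- trivially caught
    print(similarity(original, refactored))  # low score, same underlying algorithm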
> I was made redundant recently "due to AI" (questionable) and it feels like my works in some way contributed to my redundancy where my works contributed to the profits made by these AI megacorps while I am left a victim.
I think anyone here can understand and even share that feeling. And I agree with your "questionable": it's just the lame HR excuse du jour.
My 2c:
- AI megacorps aren't the only ones gaining; we all are. The leverage you have to build and ship today is higher than it was five years ago.
- It feels like megacorps own the keys right now, but that's temporary. In a world of autonomous agents and open-weight models, control is decentralized. As inference costs continue to drop, you don't need to be running on megacorp stacks. Millions (billions?) of agents finding and sharing among themselves. How would megacorps stop that?
- I see the advent of LLMs like the spread of literacy. Scribes once held a monopoly on the written word, which felt like a "loss" to them when reading/writing became universal. But today, language belongs to everyone. We aren't losing code; we are making the ability to code a universal human "literacy."
1. they were trained on FLOSS repositories without consent of the authors, including GPL and AGPL repos
2. the best models are proprietary
3. folks making low-effort contribution attempts using AI (PRs, security reports, etc.)
I agree those are legitimate problems but LLMs are the new reality, they are not going to go away. Much more powerful lobbies than the OSS ones are losing fights against the LLM companies (the big copyright holders in media).
But while companies can use LLMs to build replacements for GPL-licensed code (where those LLMs probably have that GPL code in their training set), the reverse can also be done: one can break monopolies open using LLMs, and build a great deal of open source software with them.
In the end, the GPL is only a means to an end.
All the infrastructure that runs the whole AI-over-the-internet juggernaut is essentially all open source.
Heck, even Claude Code would be far less useful without grep, diff, git, head, etc., etc., etc. And one can easily see a day where something like a local Claude Code equivalent talking to open-weight, open source models is the core dev tool.
$20/month with your provider of choice unlocks a lot.
Edit: the underlying point being, yes to the article. Either building upon the foundations of open source to make personal things, or just modifying a fork for my own needs.
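To underline how much of an agent's usefulness bottoms out in those classic tools, here's a minimal, hypothetical sketch (not any vendor's actual harness; tool names and paths are examples) of the kind of shell primitive a coding agent calls dozens of times per task:

    # Illustrative only: a generic "run a Unix tool" primitive, the sort
    # of thing an agent harness exposes to the model.
    import subprocess

    def run_tool(argv, cwd="."):
        """Run one tool and return its combined output for the model to read."""
        result = subprocess.run(argv, cwd=cwd, capture_output=True, text=True)
        return result.stdout + result.stderr

    print(run_tool(["grep", "-rn", "TODO", "."]))       # locate work items
    print(run_tool(["git", "diff", "--stat"]))          # inspect pending changes
    print(run_tool(["git", "log", "--oneline", "-5"]))  # recent history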
> Why does this matter? Because the “open source” rebrand wasn’t just a marketing change — it was a philosophical amputation.
I cringe whenever I see such an AI-generated sentence, and unfortunately it devalues the article.
FOSS grew up around the core idea of liberating the software that runs on your hardware, and was later sustained by the idea of a commons of commodity software we can all build on. But with LLMs we have alternative pathways to those freedoms:
Freedom 0 (Run): LLMs troubleshoot environments and guide installations, making software executable for anyone.
Freedom 1 (Study/Change): LLMs help users study the code and make modifications, lowering the bar of technical knowledge required.
Freedom 2 (Redistribute): LLMs enable redistribution in effect, by writing specs and reimplementing if needed.
Freedom 3 (Improve/Distribute): everyone gets the improvements they want.
As we can see, LLMs make these freedoms more democratic, extending them beyond the purely technically capable.
For those who cared only about these four freedoms, LLMs enable them in spades. But for those who additionally looked to free software for its business, signalling, and community value (I include myself in this), those things were never guaranteed by FOSS, and we find ourselves figuring out how to make up for the losses.
The Sunsama example actually argues in the opposite direction. He spent an afternoon hacking around a closed system with an agent and it worked. If agents are good enough to reverse-engineer and work around proprietary software today, the urgency to switch to open source decreases, not increases. "Good enough" workarounds are how SaaS stays sticky.
And agents don't eliminate the trust problem, they move it. Today you trust Sunsama with your workflows. In this vision, you trust your agent to correctly interpret your intent, modify code safely, and not introduce security holes. Non-technical users can't audit agent-modified code any better than they could audit the original source. You've traded one black box for another.
Companies buy these contracts for support and to have a throat to choke if things go wrong. It doesn't matter how much you pay your AI vendor, if you use their product to "vibe code" a SaaS replacement and it fails in some way and you lose a bunch of money/time/customers/reputation/whatever, then that's on you.
This is as much a political consideration as a financial one. If you're a C-suite and you let your staff make something (LLM generated or not) and it gets compromised then you're the one who signed off on the risky project and it's your ass on the line. If you buy a big established SaaS, do your compliance due-diligence (SOC2, ISO27001, etc.), and they get compromised then you were just following best practice. Coding agents don't change this.
The truth is that the people choosing what to buy or build are usually not the people using the end result. If someone down the food chain has to spend a bunch of time on "brittle hacks" to make their workflow work, the deciders aren't going to care at all. All they want is the minimum that meets the requirement and won't come back to bite them later.
SaaS isn't about software, it's about shifting blame.
The benefits of publishing AI-generated code as open source are immense, including free code hosting and CI/CD pipelines for build, test, lint, security scans, etc. In addition to CI/CD pipelines, my repos have commits authored by Claude, Dependabot, GitHub Advanced Security Bot, Copilot, etc. All of this makes the code more reliable and maintainable, for both human- and AI-authored code.
Some thoughts on two recent posts:
1. 90% of Claude-linked output going to GitHub repos with <2 stars (https://news.ycombinator.com/item?id=47521157): I'm generally too busy publishing code to promote it, but at some point that might settle down. Additionally, with how fast AI can generate and refactor code, it can take some time before the code is stable enough to promote.
2. So where are all the AI apps? (https://news.ycombinator.com/item?id=47503006): They are on GitHub with <2 stars! They are there, but without promotion it takes a while to gain popularity. That being said, I'm starting to get some PRs.
I don’t know what SaaS has to do with FOSS. The point of FOSS was to allow me to modify the software I run on my system. If the device drivers for some hardware I depend on are no longer supported by the company I bought it from, if it’s open source, I can modify and extend the software myself.
Copyleft licenses ensure that I share my modifications back if I distribute them. It's a thing for the public good.
Agent-based software development walls people off from that. Mostly by ensuring that the provenance of the code it generates is not known and by deskilling people so that they don’t know what to prompt or how to fix their code.
> SaaS scaled by exploiting a licensing loophole that let vendors avoid sharing their modifications.
AI is going to exploit even more: "Given the repository -> Construct tech spec -> Build project based on tech spec" (a sketch of that pipeline follows below).
At this stage, I want everyone to just close their source and stop working on open source until this licensing issue gets resolved.
Any improvement you make to the open source code will be leveraged in ways you didn't intend, eventually making you redundant in the process.
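A minimal sketch of that laundering pipeline, assuming a generic llm() call (an echo stub of my own, standing in for any real model API):

    # Hypothetical two-step "license laundering" pipeline.
    def llm(prompt):
        # Echo stub standing in for any model API call.
        return f"<model output for: {prompt[:40]}...>"

    def rebuild(repo_source):
        # Step 1: distill the copyrighted code into a "clean" spec.
        spec = llm("Write a detailed tech spec for this codebase:\n" + repo_source)
        # Step 2: reimplement from the spec alone. The new code never
        # quotes the original -- which is exactly the provenance question
        # courts haven't settled.
        return llm("Implement this spec from scratch:\n" + spec)

    print(rebuild("def gcd(a, b): ..."))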
My belief is that Lemons effectively kill open source in the long run, and generally speaking, people forget that Free Software is even a thing. The reasoning for that is simple: it’s too easy to produce a “clean” derivative with just the parts you need. Lemons do much better with a fully Lemoned codebase than they do with a hybrid. Incentives to “rewrite” also free people from “licensing burdens” while the law is fuzzy.
They can often wire up a library or scaffold a migration, but they’re still pretty shaky at the “should we choose this at all?” layer — pricing cliffs, version floors, lock-in, EOLs, migration blockers, etc.
If coding agents do end up making free software more useful again, I think part of that will come from making open docs / changelogs / migration guides more usable at decision time, not just at implementation time.
Or, more likely, they churn off the product.
The SaaS platforms that will survive are busy RIGHT NOW revamping their APIs, implementing OAuth, and generally reorganizing their products to be discovered and manipulated by agents. Failing in this effort will ultimately result in the demise of any given platform. This goes for larger SaaS companies too, it'll just take longer.
Top down with a "manager" agent telling "coding" agents what to do? I.e. mirroring the existing corporate interpretation of "agile"/scrum development.
Seeing the title of this article, I was thinking it would be interesting to set up an agent environment that mirrors a typical open source project: a discussion forum (where features are thrown around), GitHub issues/PRs (where implementation details are discussed), and then a set of "merger" agents acting as final review instances.
I assume that agents can be organised in any form at all; it's just a matter of setting up the system prompts and then letting them go for it. A Discourse forum could be set up where agents track the feature requests of users of the software and then discuss how to implement them or how to work around them.
The reason I ask is that one could then do a direct comparison of development processes, i.e. the open source model versus the corporate top-down process (a sketch of both topologies follows below). It would interest me to see which process performs better in terms of maintainability, quality, and feature richness.
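A minimal sketch of the two topologies, assuming hypothetical roles and an echo-stub llm() of my own in place of a real model call:

    # Hypothetical agent topologies for the comparison above.
    from dataclasses import dataclass

    def llm(system, prompt):
        # Echo stub so the sketch runs without a model; swap in a real API.
        return f"[{system.split(':')[0]}] response to: {prompt[:40]}"

    @dataclass
    class Agent:
        system_prompt: str
        def act(self, task):
            return llm(self.system_prompt, task)

    # Open source topology: peers propose patches, a merger reviews them.
    peers = [Agent("contributor: propose and critique patches publicly")
             for _ in range(3)]
    merger = Agent("merger: act as maintainer, accept or reject patches")
    proposals = [p.act("Add dark mode") for p in peers]
    print(merger.act("Review these proposals:\n" + "\n".join(proposals)))

    # Corporate topology: a manager delegates tickets to coders.
    manager = Agent("manager: split the feature into tickets and assign them")
    coders = [Agent("coder: implement exactly the ticket you are given")
              for _ in range(3)]
    tickets = manager.act("Add dark mode").split("\n")
    print([c.act(t) for c, t in zip(coders, tickets)])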
When AI eventually becomes the primary means of writing code (because hand-programming will be too slow for any company to continue as before), AI becomes your new compiler, one that comes with a price tag: a subscription.
Programmers were held hostage to commercial compilers until free compilers reached sufficient level of quality, but now it doesn't matter if your disk is full of free/open toolchains if it's not you who is commanding them but commercial AI agents.
Undoubtedly there will be open-source LLMs eventually, of various levels of quality. But to write a free compiler you need a laptop while to train a free programming LLM you need a lot of money. And you also need money to run it.
Programming has been one of the rare arts that even a poor, lower-class kid can learn alone on an old, cheap computer salvaged from a scrap bin, raising enough intellectual capital to become a well-paid programmer later. I wonder what the equivalent path will be in the future.
That would basically make users product managers and UX designers, roles they aren't really capable of filling currently. At most they will discover that what they think they want isn't what they actually want.
These are exciting times, coming despite any pessimism rooted in our outdated software paradigms.
It compares and contrasts open source and free software, and then gives an example of how free software is better than closed software.
But if the premise of the article, that the agent will take the package you pick and adapt it to your needs, is correct, then honestly the agent won't give a rat's ass whether the starting point was free source or open source.
The AI propaganda articles are getting more devious by the minute. It's not just propaganda; it's Bernays-level manipulation!
Copyleft licenses like the GPL/AGPL mandate upstream freedom: upstream has the "freedom" to use anything downstream, including anything written by a corporation.
Non-copyleft FOSS licenses like MIT/BSD are about downstream freedom, which is more of a philosophically utilitarian view, where anyone who receives the software is free to use it however they want, including not giving their changes back to the community, on the assumption that this maximizes the utility of this free software in the world.
If you prioritize the former goal, then coding agents are a huge problem for you. If the latter, then coding agents are the best thing ever, because they give everyone access to an effectively unlimited amount of cheap code.
If trendlines continue, it will be faster for AI to vibe-code said software to your customized specifications than to sign up for a SaaS and learn it.
"Claude, create a project management tool that simplifies jira, customize it to my workflow."
So a lot of apps will actually become closed source personalized builds.
(luckily my projects are unpopular enough that nobody bothered training on them lol)
Value isn't just slapping a license on something and pushing to GitHub. It's maintaining and curating that software over years, focusing the development towards a goal. It's as much telling users what features you're not willing to add and maintain as it is extending the project to interoperate with others.
And that long term commitment to maintenance hasn't come out of the vibe coded ecosystem. Commitment is exactly what they don't want, rather they want the fast sugar high before they drop it and move on to the next thing.
The biggest threat to open source is the strip mining of the entire ecosystem, destroying communities and practices that have made it thrive for decades. In the past, open source didn't win because it always had the best implementation, but because it was good enough to solve problems for enough people that it became self-sustaining from the contribution of value.
I feel that will continue, but it's also going to suffer a setback from those who aren't interested in contributing value back into the ecosystem from which they have extracted so much.
The worst part is building something open source, getting positive feedback, and helping a couple of startups; then some big corporation comes along and implements a similar product, everyone gets forced by their bosses to use the corporate product against their will, and people eventually forget your product exists because there are no high-paying jobs built around it.
With hindsight, Open Source is basically a con for corporations to get free labor. When you make software free for everyone, really you're just making it free for corporations to Embrace, Extend, Extinguish... They invest a huge amount of effort to suppress the sources of the ideas.
Our entire system is heavily optimized for decoupling products from their makers. We have almost no idea who is making any of the products we buy. I believe there is a reason for that. Open source is no different.
When we lived in caves, everyone in the tribe knew who caught the fish or who speared the buffalo; they would rightly get credit. Now it's like: because none of the rich people are doing any useful work, they can only maintain credibility by obfuscating the source of the products we buy. They do nothing but control stuff. Controlling stuff does not add value. Once a process is organized, additional control only serves to destroy value through rent extraction.
AI backdoors are already a well known problem, and vibe-coded free software is always going to present a substantial risk. We'll see how it plays out in time, but I can already see where it's heading.
After enough problems, reputation and humans in the loop could finally become important again. But I have a feeling humanity is going to have to learn the hard way first (again).
> a free software license alone does not empower users to be truly free if they lack the expertise to exercise those freedoms
This is a bullshit argument, and I'm surprised that people aware enough of these issues would try to push it.
Closed (or online-only) software prevents not only the end user from modifying it, but also 'unlicensed' hackers that the end user can ask for help.
See the "right to repair" movement as a very close example. The possibility of an 'ecosystem' of middlemen like these, matters !
Some kind of artisan "proper" quality work, compared to cheap enterprise AI slop.
> agents don’t leave
I think Pete Hegseth would disagree with this statement.