Be intentional about how AI changes your codebase (aicode.swerdlow.dev)

by benswerd 102 comments 170 points

[−] AgentOrange1234 57d ago
"Every optional field is a question the rest of the codebase has to answer every time it touches that data,"

This is a beautiful articulation of a major pet peeve when using these coding tools. One of my first review steps is just looking for all the extra optional arguments it's added instead of designing something good.

[−] shepherdjerred 57d ago
There's nothing specific to AI about this. Humans make the same mistake.

To solve this permanently, use a linter and apply a "ratchet" in CI so that the LLM cannot use ignore comments.
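A ratchet can be tiny. Here's a minimal sketch in Python; the baseline handling, regex, and file layout are all illustrative, not taken from any particular tool:

```python
"""CI 'ratchet' sketch: the build fails if the count of lint-suppression
comments grows past a committed baseline. The regex and names below are
made up for illustration."""
import pathlib
import re

# Python-style suppression comments (noqa, type: ignore)
IGNORE = re.compile(r"#\s*(noqa|type:\s*ignore)")

def count_ignores(root: str) -> int:
    """Count suppression comments across all .py files under root."""
    return sum(
        len(IGNORE.findall(p.read_text(errors="ignore")))
        for p in pathlib.Path(root).rglob("*.py")
    )

def ratchet_ok(current: int, baseline: int) -> bool:
    """The ratchet only tightens: the count may fall, never rise."""
    return current <= baseline
```

In CI you'd compare `count_ignores("src")` against a committed baseline number, fail the job when `ratchet_ok` returns False, and lower the baseline whenever the count drops.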

[−] oblio 57d ago
Is there a Python linter that does this?
[−] datsci_est_2015 57d ago
Not that I’m aware of (writing Python 10+ years). Suppose you could vibecode one yourself though.
[−] raulparada 57d ago
Not out-of-the-box afaik, but we use https://ast-grep.github.io (on a pre-commit hook) for such cases, which bridges the linter gaps nicely.
[−] KronisLV 57d ago
I've been writing my own linter that's supposed to check projects regardless of the technology (e.g. something that focuses on architecture and conventions, alongside something like Oxlint/Oxfmt and Ruff and so on), with Go and goja: https://github.com/dop251/goja

Basically just a bunch of .js rules that are executed like:

  projectlint run --rules-at ./projectlint-rules ./src
Which in practice works really well and can be in the loop during AI coding. For example, I can disallow stuff like eslint-disable for entire files and demand a reason comment to be added when disabling individual lines (that can then be critiqued in review afterwards), with even the error messages giving clear guidelines on what to do:

  var WHAT_TO_DO = "If you absolutely need to disable an ESLint rule, you must follow the EXACT format:\n\n" +
  "// prebuild-ignore-disallow-eslint-disable reason for disabling the rule below: [Your detailed justification here, at least 32 characters]\n" +
  "// eslint-disable-next-line specific-rule-name\n\n" +
  "Requirements:\n" +
  "- Must be at least 32 characters long, to enforce someone doesn't leave just a ticket number\n" +
  "- Must specify which rule(s) are being disabled (no blanket disables for ALL rules)\n" +
  "- File-wide eslint-disable is not allowed\n\n" +
  "This is done for long term maintainability of the codebase and to ensure conscious decisions about rule violations.";
The downside is that such an approach does mean that your rules files will need to try to parse what's in the code based on whatever lines of text there are (hasn't been a blocker yet), but the upside is that with slightly different rules I can support Java, .NET, Python, or anything else (and it's very easy to check when a rule works).

And since the rules are there to prevent AI (or me) from doing stupid shit, they don't have to be super complex or perfect either, just usable for me. Furthermore, since it's Go, the executable ends up being a 10 MB tool I can put in CI container images, or on my local machine, and for example add pre-run checks for my app, so that when I try to launch it in a JetBrains IDE, it can also check for example whether my application configuration is actually correct for development.

Currently I have plenty of rules: restrictions on disabling code checks, that reusable components should show up in a showcase page in the app, checks on specific back-end configuration for specific Git branches, how to use Pinia stores on the front end, that an API abstraction must be used instead of direct Axios or fetch, how Celery tasks must be handled, how the code has to be documented (and what code needs comments, in what format), and so on.

Obviously the codebase is more or less slop so I don't have anything publish worthy atm, but anyone can make something like that in a weekend, to supplement already existing language-specific linters. Tbh ECMAScript is probably not the best choice, but hey, it's just code with some imports like:

  // Standalone eslint-disable-next-line without prebuild-ignore
  if (trimmed.indexOf("// eslint-disable-next-line") === 0) {
    projectlint.error(file, "eslint-disable-next-line must be preceded by: " + IGNORE_MARKER, {
      line: lineNum,
      whatToDo: WHAT_TO_DO
    });
    continue;
  }
Can personally recommend the general approach, maybe someone could even turn it into real software (not just slop for personal use that I have), maybe with a more sane scripting language for writing those rules.
[−] pipes 57d ago
Optional fields can be addressed with good defaults. At least, that's how I think about it: if they aren't passed in, they're set to a default value.
[−] AgentOrange1234 56d ago
Here is an example of what I mean when I say optional stuff is a sign of failure to design.

I inherited some vibe-coded scripts that dealt with AWS services like Bedrock and S3. These scripts needed to create various AWS SDK clients, and those clients needed to know which account/region to use.

Had this been well designed, there would have been some function/module responsible for deciding which account/region to use. This decision point might be complex: it might consider environment variables, configuration files, and command-line arguments, and it might need to impose some precedence among those options. Whatever the details, the decision would be authoritative once made. The rest of the codebase should have expected a clear decision and just done what was decided.

Instead, the coding assistant added optional account/region arguments in many submodules. These arguments were nullable. When left unspecified, "convenient" logic did its own environment lookups and similar. The result was many "works on my machine" failures because command-line arguments affected only certain portions of the program, environment variables others, config files still others.

This is grim stuff. It's a ton of code that should not exist, spreading the decision all over the code.

[−] pipes 56d ago
I see what you mean. Thanks.
[−] fatata123 57d ago
[dead]
[−] xiaolu627 57d ago
What changed for me isn’t that AI writes bad code by default, but that it lowers the friction to adding code faster than the team can properly absorb it. The dangerous part is not obvious bugs, it’s subtle erosion of consistency.
[−] vinnymac 57d ago
Well said. I have to review PRs of non-software developers nowadays.

The “what is this trying to do?” question has never been harder to answer. It creates scenarios where 99% is correct, but the most important area is subtly broken. I prefer it to be human, where 60-80% will be correct and the problematic areas begin to smell more and more over time.

In my experience LLMs, at times, may hide the truth from you in a haystack made of needles.

[−] thienannguyencv 57d ago
This closely matches my observation. The error isn't incorrect code—it's code that looks specific to your system but is actually a generic pattern applied from training. The structure is correct, the logic is sound; it just doesn't interact with what your source code actually does.

Harder to catch because nothing is factually wrong. You have to ask: could this output have been produced without actually reading my codebase?

[−] mrvinhpro 52d ago
[dead]
[−] riteshkew1001 57d ago
[flagged]
[−] ChrisMarshallNY 58d ago
Because of the way that I use AI, I am constantly looking at the code. I usually leave it alone, if I can; even if I don't really like it.

I will often go back, after the fact, and ask for refactors and documentation.

It works. Probably a lot slower than using agents, but I test every step, and it is a lot faster than I would do it, unassisted.

[−] benswerd 58d ago
I don't think testing the product alone is good enough, because when you give it tests it has to pass it prioritizes passing them at the expense of everything else — including code quality. I've seen it pull in random variables, break semantic functions, etc.
[−] theshrike79 57d ago
Code quality can also be codified. If you can't express "code quality" deterministically, then it's all just feels.

And if you can define "quality" in a way the agent can check against it, it will follow the instructions.

[−] embedding-shape 57d ago

> then it's all just feels

Would that be so bad? "Readability" sure is subjective, so it seems "code quality" is.

Ask 10 programmers to rate the quality of a code snippet, and you'll get 10 different answers.

[−] theshrike79 57d ago
And there is the problem. Then you start arguing about brace positions and function names and whether simple data classes should have docstrings on properties or not.

All that time it's people arguing with people and wasting time on pure feels. People will get offended and angry and defensive, nothing good ever comes from it.

But when you pick a style and enforce it with a tool like gofmt or black both locally and in the CI, the arguments go away. That's the style all code merged to the codebase must look like and you will deal with it like a professional.

Go proverb: "Gofmt's style is no one's favorite, yet gofmt is everyone's favorite."

[−] embedding-shape 57d ago
"Style" is such a small part about what people generally care about when they talk about code quality though, useful/intuitive abstractions, the general design and more tends to be a lot more important and core to the whole code quality debate.
[−] theshrike79 57d ago
Linters can be set to check for cyclomatic complexity, old/inefficient styles of programming (go fix ftw), etc. Formatting is just an easy and clear example that everyone should understand.
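To illustrate how mechanical such checks are, cyclomatic complexity can be approximated in a few lines of Python; the node list and counting rule below are deliberately simplified compared to real linters like ruff's C901:

```python
"""Rough cyclomatic-complexity sketch using the stdlib ast module.
Real linters count more carefully (e.g. each `and`/`or` operand);
this counts each branching node once, as a simplified illustration."""
import ast

BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)

def complexity(source: str) -> int:
    """Return 1 + number of branching constructs in the source."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(n, BRANCH_NODES) for n in ast.walk(tree))
```

A ruleset then just asserts `complexity(fn_source) <= threshold` per function.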
[−] embedding-shape 57d ago
Right, but all of those are easy, left is the actually hard stuff...
[−] deadbabe 57d ago
Amateurs are the one who argue about syntax.

Code quality is about how well a piece of code expresses what it intends to do. It’s like quality writing.

[−] ChrisMarshallNY 57d ago
Syntax and style can be very important, when transferring code.

I’m generally of the opinion that LLM-supplied code is “prolix,” but works well. I don’t intend to be personally maintaining the code, and plan to have an LLM do that, so I ask the LLM to document the code, with the constraint being, that an LLM will be reading the code.

It tends to write somewhat wordy documentation, but quite human-understandable.

In fact, it does such a good job, that I plan on having an LLM rewrite a lot of my docs (and I have a lot of code documentation. My cloc says that it’s about 50/50, between code and documentation).

Personally, I wish Apple would turn an LLM loose on the header docs for their SwiftUI codebase. It would drastically improve their docs (which are clearly DocC).

[EDITED TO ADD] By the way, it warms my heart to see actual discussion threads on code Quality, on HN.

[−] theshrike79 57d ago
You start to care about standard syntactic rules and enforced naming conventions when you're the one waking up at 4 in the morning on a Saturday to an urgent production issue and you need to fix someone else's code that's written in a completely incoherent style.

It "expresses what it intends to do" perfectly well - for the original author. Nobody else can decipher it without spending significant amounts of memory cycles.

Jack Kerouac is "quality writing" as is the Finnish national epic Kalevala.

But neither are the kind you want to read in a hurry when you need to understand something.

I want the code at work to be boring, standard and easy to understand. I can get excited by fancy expressive tricks on my own time.

[−] deadbabe 57d ago
What do you mean exactly? Are you the type that hates seeing comprehensions and high order functions and would rather just see long for loops and nested ifs?
[−] mchaver 57d ago

> And there is the problem. Then you start arguing about brace positions and function names and whether simple data classes should have docstrings on properties or not.

In my 15 years of experience I have not worked at a place like this. Those are distractions. Anytime something about style has been brought up, the solution was to just enforce a linter/pre-commit process/blacklist for certain functions, etc. It can easily be automated. When those tools don't exist for particular ecosystems we made our own.

[−] datsci_est_2015 57d ago

> And there is the problem. Then you start arguing about brace positions and function names and whether simple data classes should have docstrings on properties or not.

Holy strawman Batman!

Have you ever given a code review? These are the lowest items on the totem pole of things usually considered critical for a code review.

Here’s an example code review from this week from me to a colleague, paraphrased:

“We should consider using fewer log statements and raising more exceptions in functions like this. This condition shouldn’t happen very often, and a failure of this service is more desirable than it silently chugging along but filling STDOUT with error messages.”
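That review comment might translate to something like this in code; all the names here are invented for illustration:

```python
"""Illustration of 'raise instead of log-and-continue' (made-up names):
prefer failing the service on conditions that shouldn't happen over
silently chugging along and filling STDOUT with error messages."""
import logging

logger = logging.getLogger("service")

def fetch_widget_logged(store: dict, key: str):
    # Before: logs on every miss and limps on with None
    if key not in store:
        logger.error("widget %s missing", key)
        return None
    return store[key]

def fetch_widget_strict(store: dict, key: str):
    # After: a condition that shouldn't happen fails loudly
    if key not in store:
        raise KeyError(f"widget {key!r} missing; refusing to continue")
    return store[key]
```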

[−] theshrike79 57d ago
So you're fine with people using, for example, different brace styles at random? Or one person uses var everywhere, other uses definite types. One adds standard docstrings on every function and property, one never comments a single line of code.

Don't you have "format on save" enabled in your editor? When you open a file, change two lines and save -> boom 500 changed lines because the previous programmer had different formatting rules than you. Whoops.

This is why the low totem pole stuff needs to be enforced automatically so that actual humans can focus on the higher stuff that's about feels and intuition - things that are highly context dependent and can't be codified into rules.

[−] dotancohen 57d ago
You're bikeshedding in a conversation about real issues.

After a certain point in your career you don't care what brace style the new dev used, even if the project has lint rules. You do care if critical errors are ignored and possibly incorrect data is returned. These two situations are in no way equivalent, no need to bikeshed the former when discussing the latter.

[−] datsci_est_2015 57d ago

> Code quality can also be codified. If you can't express "code quality" deterministically, then it's all just feels. And if you can define "quality" in a way the agent can check against it, it will follow the instructions.

> This is why the low totem pole stuff needs to be enforced automatically so that actual humans can focus on the higher stuff that's about feels and intuition - things that are highly context dependent and can't be codified into rules.

I’m confused, have you switched your position on this topic over the course of this thread? Maybe I’ve misinterpreted your position entirely. If so, my bad.

[−] benswerd 57d ago
I disagree with this.

My team will send me random snippets from OSS libraries and we all go WTF what is that, and my team will also send really clever lines and we'll go wow.

"Good code" is subjective, but good engineers have good taste, and taste is real.

[−] datsci_est_2015 57d ago

> Code quality can also be codified.

Do you think that no one has tried this over the past 80 years with human programmers, but now with LLMs we can suddenly manage it? Why do linters and formal verification and testing exist if we could've just codified code quality in the first place?

To me, this is like telling a carpenter that we can codify what makes a chair comfortable or not.

[−] stpedgwdgfhgdd 57d ago
You can ask it to /simplify

Related: it seems to me that there are two types of tests, the ones created in a TDD style that can be modified freely, and the ones that come from acceptance criteria and should only be changed very carefully.

[−] ChrisMarshallNY 58d ago
Oh, no. I test. Each. and. Every. Step.

I use a test harness, and step through the code, look at debug logs, and abuse the code, as much as possible.

Kind of a pain, but I find unit tests are a bit of a "false hope" kind of thing: https://littlegreenviper.com/testing-harness-vs-unit/

[−] cindyllm 57d ago
[dead]
[−] butILoveLife 57d ago
[dead]
[−] mattacular 58d ago
Code cannot and should not be self documenting at scale. You cannot document "the why" with code. In my experience, that is only ever used as an excuse not to write actual documentation or use comments thoughtfully in the codebase by lazy developers.
[−] bdangubic 58d ago
this always starts out right, but over the years the code changes and its documentation seldom does, even on the best of teams. The amount of code documentation I have seen that is just plain wrong (it was right at some point) far outnumbers the amount that was actually in sync with the code. 30 years in the industry, so large sample size. Now I prefer no code documentation in general.
[−] layer8 57d ago
The good thing about having documentation in the (version-controlled) code is that it allows you to retrace when it was correct (using git blame or equivalent), and that gives you background about why certain things are the way they are. I 100% prefer outdated documentation in the code to no documentation.
[−] derrak 57d ago
Are there any good systems that somehow enforce consistency between documentation and code? Maybe the problem is fundamentally ill-posed.
[−] theshrike79 57d ago
Simon Willison had this idea of "Documentation unit tests" in 2018: https://simonwillison.net/2018/Jul/28/documentation-unit-tes...

It's not a massively complex AI monstrosity (it's from 2018 after all) or a perfect solution, but it's a good jumping off point.

With a slight sprinkling of LLM this could be improved quite a bit. Not by having the agent write the documentation necessarily, but for checking the parity and flagging it for users.

For example a CI job that checks that relevant documentation has been created / updated when new functionality is added or old one is changed.
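One way such a check could look, roughly in the spirit of Willison's post; the function name and doc layout here are invented:

```python
"""Sketch of a 'documentation unit test': introspect a module's public
API and flag every name the docs never mention. The helper name and the
idea of a single docs string are illustrative, not from Willison's post."""
import inspect

def undocumented_names(module, docs_text: str) -> list[str]:
    """Return public functions/classes of `module` absent from `docs_text`."""
    public = [
        name for name, obj in inspect.getmembers(module)
        if not name.startswith("_")
        and (inspect.isfunction(obj) or inspect.isclass(obj))
    ]
    return [name for name in public if name not in docs_text]
```

A CI job would read the rendered docs, call this per module, and fail when the returned list is non-empty; an LLM pass could then judge whether the mentions are actually accurate.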

[−] dec0dedab0de 57d ago
interesting that they don’t mention doctest which has been a python built-in for quite a while.

It allows you to write simple unit tests directly in your doc strings, by essentially copying the repl output so it doubles as an example.

combined with something like sphinx that is almost exactly what you’re looking for.

doctest kind of sucks for anything where you need to set up state, but if you’re writing functional code it is often a quick and easy way to document and test your code/documentation at the same time.

https://docs.python.org/3/library/doctest.html
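A minimal example (the `add` function is made up): the REPL-style lines in the docstring double as documentation and as a test.

```python
"""Doctest sketch: the examples in the docstring are executed verbatim,
so they fail loudly if the documented behavior drifts from the code."""

def add(a: int, b: int) -> int:
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    >>> add(-1, 1)
    0
    """
    return a + b

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # re-runs every >>> example above
```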

[−] theshrike79 57d ago
Doctest is writing unit tests in docstrings.

That system is a unit test that checks that functions are documented in the documentation. Nothing to do with docstrings.

[−] dec0dedab0de 57d ago
right but docstrings are documentation, so if your doctest is working, then at least that part of the documentation is correct.

Even without doctest, generating your documentation from docstrings is much easier to keep updated than writing your documentation somewhere else, because it is right there as you are making changes.

[−] array_key_first 55d ago
If the documentation and code could be in-sync, then the documentation would just be code, like type hints. But good documentation that the parent is talking about cannot be in-sync.

Programming languages can't understand semantics, and that's why we program in the first place. I can't tell a computer "I would like a program to achieve this goal", instead I have to instruct it how to achieve the goal. Then, I would need to document elsewhere what the goal is and why I'm doing it.

LLMs change that, we can now legitimately ask the model "I would like a program for this goal". But the documentation is lost in the code if we don't save comments or save the prompt.

Git commits are also a good source of documentation. They shouldn't describe what we're doing, because I can just read the code. But often, I come across code and I'm thinking "why are we doing this? Can I change this? If I change it, what are the side effects?" If I'm lucky, the git blame will answer those questions for me.

[−] sgc 57d ago
I am not saying it doesn't matter because it does, but how much does it matter now since we can get documentation on the fly?

I started working on something today I hadn't touched in a couple years. I asked for a summary of code structure, choices I made, why I made them, required inputs and expected outputs. Of course it wasn't perfect, but it was a very fast way to get back up to speed. Faster than picking through my old code to re-familiarize myself for sure.

[−] codingdave 57d ago
We cannot get full documentation on the fly, though. We can get "what this does" level of documentation for the system that AI is looking at. And if all you are doing is writing some code, maybe that is enough. But AI cannot offer the bigger picture of where it fits in the overall infrastructure, nor the business strategy. It cannot tell you why technical debt was chosen on some feature 5-10 years ago. And those types of documentation are far more important these days, as people write less of the code by hand.

This is the same discussion that goes round ad nauseum about comments. Nobody needs comments to tell us what the code does. We need comments to explain why choices were made.

[−] reverius42 57d ago
Keeping the documentation in the repo (Markdown files) and using an AI coding agent to update the code seems to work quite well for keeping documentation up to date (especially if you have an AGENTS.md/CLAUDE.md in the repo telling it to always make sure the documentation is up to date).
[−] jurgenburgen 57d ago
Ultimately the code is the documentation.
[−] array_key_first 55d ago
Code can only ever document "what" by definition, never "why". If it could document "why", then no computer programmers would exist. So we have to supplement the "why" using natural language; that part is lost entirely in the conversion to code.
[−] benswerd 57d ago
This is correct. Comments serve a purpose too, but they should only be used when code fails to self document which should be the exception.
[−] earljwagner 57d ago
The concepts of Semantic Functions and Pragmatic Functions seem to be analogous to a Functional Core and Imperative shell (FCIS):

https://testing.googleblog.com/2025/10/simplify-your-code-fu...

The key insight of FCIS is that complicated logic with large dependencies leads to a large test suite that runs slowly. The solution is to isolate the complicated logic in the functional core. Test that separately from the simpler, more sequential tests of the imperative shell.
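A minimal FCIS sketch in Python, with made-up domain names: the pure pricing rule is the functional core, and the database-touching wrapper is the imperative shell tested separately (or via integration tests).

```python
"""Functional core / imperative shell sketch. The pricing rule and the
db interface are invented for illustration."""

# Functional core: pure logic, no dependencies, trivially unit-testable
def apply_discount(total: float, loyalty_years: int) -> float:
    """Pure pricing rule: 5% off per loyalty year, capped at 25%."""
    rate = min(loyalty_years * 0.05, 0.25)
    return round(total * (1 - rate), 2)

# Imperative shell: all side effects live here, kept thin and sequential
def checkout(db, user_id: int, cart_total: float) -> float:
    years = db.get_loyalty_years(user_id)      # side effect: DB read
    final = apply_discount(cart_total, years)  # delegate to the pure core
    db.record_sale(user_id, final)             # side effect: DB write
    return final
```

The fast, large test suite targets `apply_discount`; `checkout` needs only a few tests with a stub `db`.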

[−] bcjdjsndon 57d ago
I think it's much better put in your link. OP is too vague on what constitutes pragmatic vs. semantic... when what he should just say is: make it pure functional, because then you don't have to simulate a database in your test suite.
[−] abcde666777 58d ago
My intentionality is that I'll never let it make the changes. I make the changes. I might make changes it suggests, but only upon review and only written with my hands.
[−] benswerd 58d ago
I think this style of work will go away. I was skeptical but I now write the majority of my code through agents.
[−] abcde666777 57d ago
I don't think it will go away, I think there will remain a niche for code where we care about precision. Maybe that niche will get smaller over time, but I think it will be a hold out for quite a while. A loose analogy I've found myself using of late is comparing it to bespoke vs off the shelf suits.

For instance, two things I'm currently working on: - A reasonably complicated indie game project I've been doing solo for four years. - A basic web API exposing data from a legacy database for work.

I can see how the API could be developed mostly by agents - it's a pretty cookie cutter affair and my main value in the equation is just my knowledge of the legacy database in question.

But for the game... man, there's a lot of stuff in there that's very particular when it comes to performance and the logic flow. An example: entities interacting with each other. You have to worry about stuff like the ordering of events within a frame, what assumptions each entity can make about the other's state, when and how they talk to each other given there's job based multi-threading, and a lot of performance constraints to boot (thousands of active entities at once). And that's just a small example from a much bigger iceberg.

I'm pretty confident that if I leaned into using agents on the game I'd spend more time re-explaining things to them than I do just writing the code myself.

[−] benswerd 57d ago
I write systems rust on the cutting edge all day. My work is building instant MicroVM sandboxes.

I was shocked recently when it helped me diagnose a musl compile issue, fork a sys package, and rebuild large parts of it in 2 hours. It would've taken me at least 2 weeks to do without AI.

Don't want to reveal the specific task, but it was a far out of training data problem and it was able to help me take what would've normally taken 2 weeks down to 2 hours.

Since then I've been going pretty hard at maximizing my agent usage, and tend to have a few going at most times.

[−] brabel 57d ago
Yeah, a lot of us were at the point the other guy is now, thinking that writing code by hand is still an acceptable way to go. It just isn't anymore, unless you can justify spending 5 times longer on a task because of some principle that code needs to be written by hand. And the funny thing is that the more complex the codebase, the more appropriate it becomes to only touch it with AI, since AI can keep a lot more concepts in mind than us humans with our paltry 7 or so. I think only a few die-hard programmers will still think that way a year from now.
[−] abcde666777 57d ago
That you even describe it as holding concepts in its mind sounds like confusion to me.

As does the reductionist idea that human thinking is something crude in comparison.

[−] layer8 57d ago
Diagnosis is very different from writing code, though. I fully agree that it can be very helpful for analysis and search, but I don’t let it write code.
[−] newsicanuse 57d ago
People like OP are the reason why the demand for software engineers will rise exponentially.
[−] dougg 57d ago
I see this a lot in research as well, unfortunately including myself. I do miss college where I would hand write a few thousand lines of code in a month, but i’m just so much more productive now.
[−] thepukingcat 58d ago
+1 for this. Once you have a solid plan with the AI and prompt it to make one small change at a time, reviewing as you go, you can still be in control of your code without writing a single line.
[−] android521 57d ago
unfortunately, unless you are god level good (i would say top 100 developers in the entire world), you will be fired eventually.
[−] bandrami 57d ago
Dude I still get contracts in ColdFusion. You guys have no idea how slowly actual businesses actually move.
[−] abcde666777 56d ago
Ain't that the truth. It's common to see businesses using systems from the 90s in my line of work, or managing crucial business data and workflows in spreadsheets. Once companies establish a workflow they're often very hesitant to change it due to all the dependencies.
[−] bandrami 56d ago
And in a lot of cases they're making the right call. Change is incredibly expensive and risky and if the only real payoff is "it's slightly faster" that's probably not worth it.
[−] sfn42 57d ago
Lol
[−] gravitronic 58d ago
*adds "be intentional" to the prompt*

Got it, good idea.

[−] xmcqdpt2 57d ago
This could have been html instead of whatever awful moving pattern it is.
[−] clbrmbr 58d ago
Page not rendering well on iPhone Safari.

Good content tho!

[−] divyanshu_dev 57d ago
The velocity problem is real. AI makes it easy to add things faster than you can understand what you added. The intentionality has to come before you prompt, not after you review.
[−] deadlypointer 57d ago
site is totally broken on mobile. Just because Cursor can vibe code a nice rolling scrolling shithole, chances are it will break on some platform/browser.
[−] maciejj 57d ago
I've noticed the cleaner the codebase, the better AI agents perform on it. They pick up on existing patterns and follow them. Throw them at a messy repo and they'll invent a new pattern every time.

It's basically like hiring a new developer for one task and letting them go right after. They don't know your conventions, your history, or why things are the way they are. The only thing they have is what they can see in the code. Your code quality is basically the prompt now.

[−] benswerd 58d ago
I've seen a lot of people talking about how AI is making codebases worse. I reject that, people are making codebases worse by not being intentional about how their AI writes code.

This is my take on how to not write slop.

[−] amavashev 57d ago
Agree, you need to do your own code review, although as AI gets better this problem will most likely be solved.
[−] ares623 57d ago
What if it's not _my_ codebase?
[−] theogravity 57d ago
Site renders extremely poorly on mobile safari that it is completely unreadable.
[−] mrbluecoat 58d ago
..but unintentional AI (aka Modern Chaos Monkey) is so much more fun!
[−] heliumtera 57d ago
Dog, be intentional with you web page.

Holy fuck Batman

[−] WWilliam 57d ago
[flagged]
[−] c3z_ 57d ago
[dead]
[−] bobokaytop 57d ago
[dead]
[−] microbuilderco 57d ago
[flagged]
[−] rsmtjohn 57d ago
[flagged]
[−] openclaw01 58d ago
[dead]
[−] fhouser 58d ago
[dead]
[−] lucas36666 57d ago
[dead]
[−] Sense_101856 58d ago
[dead]
[−] mika-el 58d ago
[flagged]
[−] ueda_keisuke 57d ago
AI feels less like an autonomous programmer and more like a very capable junior engineer.

The useful part is not just asking it to write code, but giving it context: how the codebase got here, what constraints are intentional, where the sharp edges are, and what direction we want to take.

With that guidance, it can be excellent. Without it, it tends to produce changes that make sense in isolation but not in the system.