I disagree with the premise. It made all engineering easier. Bad and good.
I believe vibe coding has always existed. I've known people at every company who add copious null checks rather than understanding things and fixing them properly. All we see now is copious null checks at scale. On the other hand, I've also seen excellent engineering amplified and features built by experts in days which would have taken weeks.
Good engineering requires that you still pay attention to the result produced by the agent(s).
Bad engineering might skip over that part.
Therefore, via Amdahl's law, LLM-based agents overall provide more acceleration to bad engineering than they do to good engineering.
The connection to Amdahl's law is totally on point. If you're just using LLMs as a faster way to get _your_ ideas down, but still want to ensure you validate and understand the output, you won't get the mythical 10x improvement so many seem to claim they're getting. And if you do want that 10x speedup, you have to forgo the validation and understanding.
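To make the Amdahl's law arithmetic concrete, here is a rough sketch; the 50% typing share and the 10x typing speedup are purely illustrative assumptions, not numbers from anyone's actual workflow:

    # Amdahl-style estimate: only the "typing the code" fraction gets faster;
    # validating and understanding the output stays at human speed.
    typing_fraction = 0.5   # assumed share of total effort that is literally writing code
    typing_speedup = 10.0   # assumed speedup on that part from an LLM

    overall = 1 / ((1 - typing_fraction) + typing_fraction / typing_speedup)
    print(f"overall speedup: {overall:.2f}x")  # ~1.82x, nowhere near 10x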
I do agree with you, but don't underestimate the projects where you can actually apply this 10x. For example, I wanted to get some analytics out of my database. What would have been a full weekend project was now done in an hour. So for such things there is a huge speed boost.
But once software becomes bigger and more complex, the LLM starts messing up, and the expert has to come in. That basically means your months-long project cannot be done in a week.
My personal prediction: plugins and systems that support plugins will become important. Because a plugin can be written at 10x speed. The system itself, not so much.
Yes, definitely. I also don't think every project is able to create a plugin platform. Sometimes you just have a lot of interconnected components, where they kind of influence each other.
What I was trying to say is that in future developments, as a developer, one of the extra questions on your mind should be: can we turn this into a platform with separate plugins? Because you know those plugins can be written fast, cheap, and don't require top notch engineering work.
But I think I get what you are saying: what you gain in plugin simplicity, you pay in effort to design the platform to support them.
I guess it will vary from project to project, and so the typical "it depends" applies :).
People have been thinking about that for a long time, though. For that objective, LLMs don't seem to open up any new capabilities. If that problem could be solved, with really clean abstractions that dramatically reduce the context needed to understand one "module" at a time, sure, LLMs will then be able to take that and run. But it's a fundamentally hard problem.
Well, it's made bad engineering massively easier and good engineering a little easier.
So much so that many people who were doing good engineering before have opted to move to doing three times as much bad engineering instead of doing 10% more good engineering.
    # abstract internals for no reason
    def doThing(x: bool) -> bool:
        if x:
            return True
        else:
            return False

    # make sure our logic works as expected
    assert doThing(True)
    # ???
    # profit
It's excellent software engineering because there are tests
How will you deal with it? I successfully convinced $big_important_group at $day_job to not implement a policy of failing their builds when code coverage dips below their target threshold > 90%. (Insane target, but that's a different conversation.)
I convinced them that if they wanted to treat uncovered lines of code as tech debt, they needed to add an epic of stories to their backlog to write those tests. And artificially setting some high target coverage threshold would produce garbage, because developers will write do-nothing tests in order to get their work done and not trip the alarms. I argued that failing builds on code coverage would be unfair, because tech debt created by past developers would hinder random current-day devs getting their work done.
Instead, I recommended they pick their current coverage percentage (it was < 10% at the time) and set the threshold to that, simply to prevent backsliding as new code was added. Then, as the backlogged, legit tests were implemented, ratchet up the coverage threshold to the new high-water mark. This meant all new code would get tests written for it.
And, instead of failing builds, I recommended email blasts to the whole team to indicate there was some recent backsliding in the testing regime and the codebase had grown without accompanying tests. It was not a huge shame event, but a good motivator for the team to keep up the quality. SonarQube was great for long-term tracking of coverage stats.
Finally, I argued the coverage tool needed to have very liberal "ignore" rules that were agreed to by all members of the team (including managers). Anything that did not represent testable logic written by the team (generated code, configurations, the tests themselves) should not count against their code coverage percentages.
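A coverage ratchet like that is simple to wire up in CI. Here is a minimal sketch, assuming coverage.py's JSON report and a hypothetical baseline file; it illustrates the idea rather than what that team actually ran:

    import json
    import sys
    from pathlib import Path

    BASELINE_FILE = Path("coverage-baseline.txt")  # hypothetical high-water-mark file

    def current_coverage() -> float:
        # Assumes `coverage json` has already written coverage.json for this build.
        data = json.loads(Path("coverage.json").read_text())
        return data["totals"]["percent_covered"]

    def main() -> int:
        baseline = float(BASELINE_FILE.read_text()) if BASELINE_FILE.exists() else 0.0
        current = current_coverage()
        if current < baseline:
            print(f"Coverage fell from {baseline:.1f}% to {current:.1f}% -- backsliding.")
            return 1  # or send the email blast instead of failing the build
        if current > baseline:
            BASELINE_FILE.write_text(f"{current:.4f}")  # ratchet the high-water mark up
        return 0

    if __name__ == "__main__":
        sys.exit(main())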
You could ask the same thing about tests themselves. And I'm not talking about tests that don't exercise the code in a meaningful manner like your assertions on mocks(?!)
I'm saying you could make the same argument about useful tests themselves. What is testing that the tests are correct?
Uncle Bob would say the production code is testing the tests but only in the limited, one-time, acceptance case where the programmer who watches the test fail, implements code, and then watches it pass (in the ideal test-driven development scenario.)
But what we do all boils down to acceptance. A human user or stakeholder continuing to accept the code as correct equals a job well done.
Of course, this is itself a flawed check, because humans are flawed, miss things, and don't know what they want anyhow. The Agile Manifesto and Extreme Programming were all about organizing to make course corrections as cheap as possible to accommodate fickle humanity.
> Like, what are we even doing here?
What ARE we doing? A slapdash job on the whole. And AI is just making slapdash more acceptable and accepted, because it is so clever and the boards of directors are busy running this latest craze into the dirt. "Baffle 'em with bullsh*t" works in every sector of life and lets people get away with all manner of sins.
I think what we SHOULD be doing is plying our craft. We should be using AI as a thinking tool, and not treat it like a replacement for ourselves and our thinking.
So there are tests that leverage mocks. Those mocks help validate that the software performs as desired by letting tests observe its behavior in varying contexts.
If the software fails, it is because the mocks exposed that under certain inputs undesired behavior occurs: an assert fails and a red line flags the test output.
Validating that the mocks return the desired output....
Maybe there is a desire that the mocks return a stream of random numbers, and the mock-validation test asserts that said stream adheres to a particular distribution?
Maybe someone in the past pushed a bad mock, that mock let a test pass which would have failed given a better mock, and the post-mortem tracing the bad software, now in prod, back to that mock produced a requirement that all mocks must be validated?
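For anyone unsure what "assertions on mocks" looks like in practice, here is a minimal sketch of the pattern being poked at, using Python's unittest.mock with made-up names:

    from unittest.mock import Mock

    def send_invoice(client, invoice_id):
        # hypothetical production code: all real behavior lives behind `client`
        return client.post(f"/invoices/{invoice_id}/send")

    def test_send_invoice():
        client = Mock()
        client.post.return_value = {"status": "sent"}

        result = send_invoice(client, 42)

        # This only proves the mock returns what we configured it to return
        # and that .post was called -- not that any real invoice ever gets sent.
        assert result == {"status": "sent"}
        client.post.assert_called_once_with("/invoices/42/send")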
I think it's easy to forget that the LLM is not a magic oracle. It doesn't give great answers. What you do with the LLM's output determines whether the engineering you produce is good or bad. There are places where you can plonk in the LLM's output as-is and places you can't, or times when you have to keep nudging for a better output, and times when nothing the LLM produces is worth keeping.
It makes bad engineering easier because it's easy to fall into the trap of "if the LLM said so, it must be right".
Even if you agree with the OP, there's a large portion of applications where it simply doesn't matter if the quality of the software is good or terrible as long as it sufficiently works.
Yeah, I've seen this too. I like to call them "single-serving apps". I made a flashcard app to study for interviews and one-shot it with Claude Code. I've had it add some features here and there but haven't really looked at the code.
It's just a small CLI app in 3 TypeScript files.
Except vibe coding is not "engineering," but more akin to project management. Engineering presupposes a deep and thorough understanding of your code. If you ship code that you’ve never even looked at, you are no longer an engineer.
> I've known people at every company who add copious null checks rather than understanding things and fixing them properly.
Y'know "defensive programming" is a thing, yeah? Sorry mate, but that's a statement I'd expect from juniors, who are also often the ones claiming their own technical superiority over others.
There are cases where a unit test or a hundred aren’t sufficient to demonstrate a piece of code is correct. Most software developers don’t seem to know what is sufficient. Those heavily using vibe coding even get the machine to write their tests.
Then you get to systems design. What global safety and temporal invariants are necessary to ensure the design is correct? Most developers can’t do more than draw boxes and arrows and cite maxims and “best practices” in their reasoning.
Plus you have the Sussman effect: software is often more like a natural science than engineering. There are so many dependencies and layers involved that you spend more time making observations about behaviour than designing for correct behaviours.
There could be useful cases for using GenAI as a tool in some process for creating software systems… but I don’t think we should be taking off our thinking caps and letting these tools drive the entire process. They can’t tell you what to specify or what correct means.
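One concrete step beyond example-based unit tests is writing the invariant down and letting a tool hunt for counterexamples. A minimal property-based sketch using the Hypothesis library; the function and the invariant are toy examples chosen purely for illustration:

    from hypothesis import given, strategies as st

    def apply_discount(price_cents: int, percent: int) -> int:
        # toy function under test
        return price_cents - (price_cents * percent) // 100

    # Invariant, stated once and checked across many generated inputs:
    # a 0-100% discount never makes the price negative and never raises it.
    @given(st.integers(min_value=0, max_value=10**9),
           st.integers(min_value=0, max_value=100))
    def test_discount_invariant(price_cents, percent):
        discounted = apply_discount(price_cents, percent)
        assert 0 <= discounted <= price_cents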
Every few years a new tool appears and someone declares that the difficult parts of software engineering have finally been solved, or eliminated. To some it looks convincing. Productivity spikes. Demos look impressive. The industry congratulates itself on a breakthrough. Staff reductions kick in in the hopes that the market will respond positively.
As a software engineer, I'd love if the industry had an actual breakthrough, if we found a way to make the hard parts easier and prevent software projects from devolving into balls of chaos and complexity.
But not if the only reward for this would be to be laid off.
So, once again, the old question: If reducing jobs is the only goal, but people are also expected to have jobs to be able to pay for food and housing, what is the end goal here? What is the vision that those companies are trying to realize?
"Coding was never the hard part. Typing syntax into a machine has always been the least interesting part of building a system."
And I think these people are benefiting from it the most: people with expertise, who know their way around, who knew what to build and how, but did not want to do the grunt work.
Juniors that are relying too heavily on AI now will pay the price down the line, when they don't even know the fundamentals because they just copy-pasted it from a prompt.
It only means job security for people with actual experience.
When I see this: "One of the longest-standing misconceptions about software development is that writing code is the difficult part of the job. It never was."
I don't think I can take this seriously.
Sure, 'writing code' is often not the difficult part, but when you have time constraints, 'writing code' becomes a limiting factor. And we do not all have infinite time on our hands.
So AI not only enables things you just could not afford to do in the past, it also lets you spend more time on 'engineering', or even try multiple approaches, which would have been impossible before.
I'm an AI skeptic and in no sense is it taking my peers' jobs. But it does save me time. I can do research much better than with Google, explore a code base, spit out helper functions, and review for obvious mistakes.
I’m seeing a real distinction emerge between “software engineering” and “research”. AI is simply amazing for exploratory research — 10x ability to try new ideas, if not more. When I find something that has promise, then I go into SWE mode. That involves understanding all the code the AI wrote, fixing all the dumb mistakes, and using my decades of experience to make it better. AI’s role in this process is a lot more limited, though it can still be useful.
In terms of tech debt, it obviously makes it easy to create a lot of it. But this is controllable if you analyse in depth what the AI is doing.
I feel I become more of a Product Engineer than a Software Engineer when I'm constantly reviewing AI code to check it satisfies my needs.
And the benefits provided by AI are too good. It lets you prototype nearly anything in a short time, which is superb.
Like any tool, in the right hands it can be a game changer.
Finally, an AI article I think I fully agree with (or at least mostly).
In my hobby work, AI is indispensable. It's really good at showing me how to solve problems that have been solved, but haven't been applied to my specific scenario. I just ported a WiFi driver for a little RISC-V dev board I bought back in 2023, entirely using Gemini.
Obviously there's lots of training data on compiling such drivers for new systems, and I could fill in the gaps (exact memory addresses, newer kernel function signatures, etc), but it really did write the code, something my JS-dev brain would've taken too long to figure out for it to be useful.
At the same time, at work, it ignores basic project conventions, hallucinates APIs, writes nonsense documentation; it's very useful, but it by no means "does my job".
There's an obvious disconnect: execs think it can do anything, engineers think it can do a lot, but not what the execs think.
Is that why there are so many outages across companies adopting AI, including GitHub, Amazon, Cloudflare, and even Anthropic, despite all their AI usage?
Maybe if they "prompted the agent correctly", they'd get their infrastructure to at least five nines.
If we continue down this path, not only will so-called "engineers" be unable to read or write code at all, but their agents will introduce seemingly correct code and cause outages like we have already seen, like this one [0].
AI has turned "senior engineers" into juniors, and juniors back into "interns" who cannot tell what maintainable code is, and who waste time, money, and tokens reinventing a worse wheel.
[0] https://sketch.dev/blog/our-first-outage-from-llm-written-co...
Naw, I just yesterday caught something in test that would've made it to prod without AI. It happens all the time.
You can't satisfy every single paranoia, eventually you have to deem a risk acceptable and ship it. Which experiments you do run depends on what can be done in what limited time you have. Now that I can bootstrap a for-this-feature test harness in a day instead of a week, I'm catching much subtler bugs.
It's still on you to be a good engineer, and if you're careful, AI really helps with that.
Put a bad driver in an F1 car and you won't make them a racer. You will just help them crash faster. Put a great driver in that same car, and they become unstoppable.
Technology was never an equaliser. It just divides more, and yes, ultimately some developers will get paid a lot more because their skills will be in more demand, while other developers will be forced to seek other opportunities.
So somewhere here there is a 2x2 or something based on these factors:
1. Programmers viewing programming through career and job security lens
2. Programmers who love the experience of writing code themselves
3. People who love making stuff
4. People who don't understand AI very well and have knee-jerk cultural / mob reactions against it because that's what's "in" right now in certain circles.
It is fun to read old issues of Popular Mechanics on archive.org from 100+ years ago because you can see a lot of the same personality types playing out.
At the end of the day, AI is not going anywhere, just like cars, electricity and airplanes never went anywhere. It will obviously be a huge part of how people interact with code and a number of other things going forward.
20-30 years from now the majority of the conversations happening this year will seem very quaint! (and a minority, primarily from the "people who love making stuff" quadrant, will seem ahead of their time)
I think we're all in denial about how bad software engineering has gotten. When I look at what's required to publish a web page today vs in 1996, I'm appalled. When someone asks me how to get started, all I can do is look at them and say "I'm so sorry":
https://xkcd.com/1168/
So "coding was always the hard part". All AI does is obfuscate how the sausage gets made. I don't see it fixing the underlying fallacies that turned academic computer science into for-profit software engineering.
Although I still (barely) hold onto hope that some of us may win the internet lottery someday and start fixing the fundamentals. Maybe get back to what we used to have with apps like HyperCard, FileMaker and Microsoft Access but for a modern world where we need more than rolodexes. Back to paradigms where computers work for users instead of the other way around.
Until then, at least we have AI to put lipstick on a pig.
There are some interesting points here, but I think this essay is a little too choppy - e.g. the Aircraft Mechanic comparison is a long bow to draw.
The Visual Basic comparison is more salient. I've seen multiple rounds of "the end of programmers", including RAD tools, offshoring, various bubble-bursts, and now AI. Just because we've heard it before though, doesn't mean it's not true now. AI really is quite a transformative technology. But I do agree these tools have resulted in us having more software, and thus more software problems to manage.
The Alignment/Drift points are also interesting, but I think they appeal to SWEs' belief that taste/discernment was stopping this from happening in pre-AI times.
I buy into the meta-point which is that the engineering role has shifted. Opening the floodgates on code will just reveal bottlenecks elsewhere (especially as AI's ability in coding is three steps ahead and accelerating). Rebuilding that delivery pipeline is the engineering challenge.
"AI" (and calling it that is a stretch) is nothing more than a nail gun.
If you gave an experienced house framer a hammer, a hand saw, and a box of nails, and a random person off the street a nail gun and a powered saw, who is going to produce the better house?
A confident AI and an unskilled human are just a Dunning-Kruger multiplier.
Your "don't fucking touch that file" experience is the exact pattern I kept hitting. After 400+ sessions of full-time pair programming with Claude, I stopped trying to fix it with prompt instructions and started treating it as a permissions problem.
The model drifts because nothing structurally prevents it from drifting. Telling it "don't touch X" is negotiating behavior with a probabilistic system — it works until it doesn't. What actually worked: separating the workflow into phases where certain actions literally aren't available. Design phase? Read and propose only. Implementation phase? Edit, but only files in scope.
Your security example is even more telling — the model folding under minimal pushback isn't a knowledge gap, it's a sycophancy gradient. No amount of system prompting fixes that. You need the workflow to not ask the model for a judgment call it can't be trusted to hold.
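A minimal sketch of what "phases where certain actions literally aren't available" can look like in an agent harness; every name here is hypothetical rather than any particular framework's API:

    from enum import Enum, auto

    class Phase(Enum):
        DESIGN = auto()      # read and propose only
        IMPLEMENT = auto()   # edit, but only files in scope

    # Tools exposed to the model per phase; anything absent simply cannot be called.
    ALLOWED_TOOLS = {
        Phase.DESIGN: {"read_file", "search", "propose_plan"},
        Phase.IMPLEMENT: {"read_file", "search", "edit_file"},
    }

    IN_SCOPE_FILES = {"src/feature.py", "tests/test_feature.py"}  # hypothetical scope

    def dispatch(phase: Phase, tool: str, path: str | None = None):
        # Refuse the call structurally instead of asking the model to behave.
        if tool not in ALLOWED_TOOLS[phase]:
            raise PermissionError(f"{tool} is not available in the {phase.name} phase")
        if tool == "edit_file" and path not in IN_SCOPE_FILES:
            raise PermissionError(f"{path} is outside the scope of this task")
        # ...hand off to the real tool implementation here...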
> Code Was Never the Hard Part
I can't believe this has to be said, but yeah. Code took time, but it was never the hard part.
I also think that it is radically understated how much developers contribute to UX and product decisions. We are constantly having to ask "Would users really do that?" because it directly impacts how we design. Product people obviously do this more, but engineers do it as a natural part of their process as well. I can't believe how many people do not seem to know this.
Further, in my experience, even the latest models are terrible "experts". Expertise is niche, and niche simply is not represented in a model that has to pack massive amounts of data into a tiny, lossy format. I routinely find that models fail when given novel constraints, for example, and the constraints aren't even that novel - I was writing some lower level code where I needed to ensure things like "a lock is not taken" and "an allocation doesn't occur" because of reentrancy safety, and it ended up being the case that I was better off writing it myself because the model kept drifting over time. I had to move that code to a separate file and basically tell the model "Don't fucking touch that file" because it would often put something in there that wasn't safe. This is with aggressively tuning skills and using modern "make the AI behave" techniques. The model was Opus 4.5, I believe.
This isn't the only situation. I recently had a model evaluate the security of a system that I knew to be unsafe. To its credit, Opus 4.6 did much better than previous models I had tried, but it still utterly failed to identify the severity of the issues involved or the proper solutions and as soon as I barely pushed back on it ("I've heard that systems like this can be safe", essentially) it folded completely and told me to ship the completely unsafe version.
None of this should be surprising! AI is trained on massive amounts of data, it has to lossily encode all of this into a tiny space. Much of the expertise I've acquired is niche, borne of experience, undocumented, etc. It is unsurprising that a "repeat what I've seen before" machine can not state things it has not seen. It would be surprising if that were not the case.
I suppose engineers maybe have not managed to convey this historically? Again, I'm baffled that people don't seem to know how much time engineers spend on problems where the code is irrelevant. AI is an incredible accelerator for a number of things, but it is hardly "doing my job".
AI has mostly helped me ship trivial features that I'd normally have to backburner for the more important work. It has helped me in some security work by helping to write small html/js payloads to demonstrate attacks, but in every single case where I was performing attacks I was the one coming up with the attack path - the AI was useless there. edit: Actually, it wasn't useless, it just found bugs that I didn't really care about because they were sort of trivial. Finding XSS is awesome, I'm glad it would find really simple stuff like that, but I was going for "this feature is flawed" or "this boundary is flawed" and the model utterly failed there.
> I've known people at every company who add copious null checks rather than understanding things and fixing them properly.
On Error Resume Next
AI is an amplifier of existing behavior.
I was hopeful that the title was written like LLM-output ironically, and dismayed to find the whole blog post is annoying LLM output.
> and ensuring that the system remains understandable as it grows in complexity.
Feel like only people like this guy, with 4 decades of experience, understand the importance of this.
It’s not simpler. It’s faster and cheaper and more consistent in quality. But way more complex.