When evaluating the complete bun install improvements, the end-to-end speed came out to about the same as the existing git usage, since networking is the big bottleneck time-wise, even though more cases were slightly faster with ziggit across multiple benchmarks. The difference is that it's done in 100% Zig, and those internal improvements pile up as projects include more git dependencies. All in all, it seems like a sensible upstream contribution.
So you have to maintain a completely separate git implementation and keep that up to date with upstream git, all for the benefit of being indistinguishable on benchmarks. Oh well!
The commit I linked shows that it didn't even read the user name and email from git's config file, but used a test name, which means it's woefully incomplete.
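For reference, the identity lookup a drop-in git replacement is expected to do is roughly the following. This is a simplified Python sketch that shells out to real git; the actual precedence rules have more layers, and none of this is ziggit's code:

    import os
    import subprocess

    def git_config(key: str) -> str | None:
        # `git config --get` already applies the system -> global -> repo precedence.
        result = subprocess.run(["git", "config", "--get", key],
                                capture_output=True, text=True)
        value = result.stdout.strip()
        return value or None

    def resolve_author() -> tuple[str, str]:
        # Environment variables take precedence over config when authoring a commit.
        name = os.environ.get("GIT_AUTHOR_NAME") or git_config("user.name")
        email = os.environ.get("GIT_AUTHOR_EMAIL") or git_config("user.email")
        if not name or not email:
            raise RuntimeError("please tell me who you are (user.name / user.email not set)")
        return name, email

    if __name__ == "__main__":
        print(resolve_author())

(The GIT_AUTHOR_* / GIT_COMMITTER_* variables override the config values, which is why a client can't just parse ~/.gitconfig and call it done.)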
Surely "the commits are attributed to the user who creates them" is a pretty basic feature of the git CLI, and not something that you can add in as a fix later after posting your project to Github and writing a blog post about how much faster than git it is.
It's very easy to be faster than git's CLI if you don't have to do any of the things that git's CLI does!
Now we just need that AI booster guy to join this thread and tell us that actually this is super impressive. He was doing that for that worthless “browser” that Cursor built.
These "AI rewrite" projects are beginning to grate on me.
Sure, if you have a complete test suite for a library or CLI tool, it is possible to prompt Claude Opus 4.6 such that it creates a 100% passing, "more performant", drop-in replacement. However, if the original package is in its training data, it's very likely to plagiarize the original source.
Also, who actually wants to use or maintain a large project that no one understands and that doesn't have a documented history of thoughtful architectural decisions and the context behind them? No matter how tightly you structure AI work, probabilistic LLM logorrhea cannot reliably adopt or make high-level decisions/principles, apply them, or update them as new data arrives. If you think otherwise, you're believing an illusion - truly.
A large software project's source code and documentation are the empirical ground-truth encoding of a ton of decisions made by many individuals and teams -- decisions that need to be remembered, understood, and reconsidered in light of new information. AI has no ability to consider these types of decisions and their accompanying context, whether they are past, present, or future -- and is not really able to coherently communicate them in a way that can be trusted to be accurate.
That's why I can't and won't trust fully AI-written software beyond small one-off-type tools until AI gains two fundamentally new capabilities:
(1) logical reasoning that can weigh tradeoffs and make accountable decisions in terms of ground-truth principles accurately applied to present circumstances, and
(2) ability to update those ground-truth principles coherently and accurately based on new, experiential information -- this is real "learning"
> Sure, if you have a complete test suite for a library or CLI tool
And this is a huge "if". Having 100% test coverage does not mean you've accounted for every possible edge or corner case. Additionally, there's no guarantee that every bugfix came with adequate test coverage to ensure the bug doesn't get reintroduced. Finally, there are plenty of poorly written tests out there that increase coverage without actually testing anything.
This is why any sort of big rewrite carries some level of risk. Tests certainly help mitigate this risk, but you can never be 100% sure that your big rewrite didn't introduce new problems. This is why code reviews are important, especially if the code was AI generated.
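One mitigation that works reasonably well for a claimed drop-in CLI replacement is differential testing: run the same commands through both binaries on the same repos and diff everything observable. A rough sketch of the idea, where the `ziggit` binary name, the `-C` flag support, and the command list are placeholder assumptions rather than the project's actual harness:

    import subprocess

    # Hypothetical harness: compare a rewrite ("ziggit" here as a placeholder binary
    # name) against reference git on identical commands and flag any divergence.
    COMMANDS = [
        ["status", "--porcelain"],
        ["rev-parse", "HEAD"],
        ["log", "--oneline", "-n", "20"],
    ]

    def run(binary: str, args: list[str], repo: str) -> tuple[int, str, str]:
        p = subprocess.run([binary, "-C", repo, *args], capture_output=True, text=True)
        return p.returncode, p.stdout, p.stderr

    def differential_check(repo: str) -> list[str]:
        # Any command whose (exit code, stdout, stderr) differs between the two
        # binaries is a behavioral divergence that a coverage number won't show.
        return [" ".join(args) for args in COMMANDS
                if run("git", args, repo) != run("ziggit", args, repo)]

    if __name__ == "__main__":
        print(differential_check("."))

Even that only exercises read paths; mutating commands need fixture repos plus a comparison of the resulting object stores, which is exactly where rewrites tend to quietly diverge.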
You raised very good points. However, what you typed cuts against the shell game (what "AI" companies are often really doing) and the partial pyramid scheme that goes with it.
People seem not to realize that AI companies can plagiarize not only someone's original source code, but also any source code that people connected to it are feeding and uploading to it. The shell game is taking Tom's code (with a few changes) and feeding it to Bill (based on the prompts given). Both Tom and Bill are paying fees to the AI company, yet don't realize their code (along with many others') can be spit back at them.
You, the humans, are doing a lot of the work, and many don't realize it, because Tom doesn't realize someone else has worked, or is working, on something similar. The AI company is connecting Tom and Bill together without either of them realizing it: if they type in the right prompt, the search feeds back that info. It's not the only thing going on or the only way things work, but it is part of it, and it's often not publicly acknowledged.
OpenAI definitely has used input tokens to further train its models, but Anthropic has emphatically stated they do no such thing. I have trusted them so far on that. Are you saying they're lying?
I'm not speaking to any explicit policy or promise to customers that a particular AI company might make, but rather to what is and can be happening that a lot of the public doesn't realize in general. A lot of what is attributed to AI can be the work of humans (including customers) who in various cases were, or arguably are being, ripped off. Speaking of which, there are lots of cases of companies claiming to use or have an AI product while instead just using humans for low pay (though that isn't what I was referring to earlier).
In the Tom and Bill shell game example given, where they are being used for their code and to correct code that is sold to other customers, it's not a "now" thing either. Meaning Tom, Bill, and the other customers don't have to be exchanging code in real time when that code is being uploaded, saved, and trained on by AI companies. Tom could have worked a month ago on some code that was slurped up from Susan. Tom fixed many of the errors in Susan's code, which is now fed to Bill when he inputs the right prompts. Bill thinks the AI is the "genius", but is unknowingly benefiting from Tom's and Susan's work, review, and corrections. Potentially more devastating to Bill is that what he may mistakenly think is private or secret to him alone is fed to other customers for profit.
AI models and their companies are also connecting people in that indirect, black-box way, where those people may not realize they are connected, being fed, and correcting each other's code. Yeah, some may not care where the code comes from or how, only that they can use it for their own purposes. Sure, that's not the whole story, and LLMs are doing some interesting and amazing things, but there is another part of the story that is not being more widely acknowledged. It's similar to what has angered so many artists and authors, who feel aggrieved and taken advantage of, as reflected in the many art, song, and book lawsuits.
> Sure, if you have a complete test suite for a library or CLI tool, it is possible to prompt Claude Opus 4.6 such that it creates a 100% passing, "more performant", drop-in replacement.
This was the "validation" used for determining how much progress had been made at a given point in time. Re training data concerns: this was done and shipped as open source (under GPLv2), so there's no abuse of open source work here, imo.
Re the tradeoffs you highlight - these are absolutely true and fair. I don't expect or want anyone to just use ziggit because it's new. The places where there are performance gains (i.e. internally with bun install, or as a better WASM binary alternative) are places where I do have interest or use for it myself.
_However_, if I could interest you in one thing: ziggit, when compiled as a release build on my ARM-based Mac, showed 4-10x faster performance than git's CLI for the core workflows I use in my git development.
I suppose "Project X has been used productively by Y developers for Z amount of time" is a decent-enough endorsement (in this case, ziggit used by you).
But after the massive one-off rewrite, what are the chances that (a) humans will want to do any personal effort on reading it, documenting it, understanding it, etc., or that (b) future work by either agents or humans is going to be consistently high-quality?
Beyond a certain level of complexity, "high-quality work" is not just about where a codebase is right now, it's where it's going and how much its maintainers can be trusted to keep it moving in the right direction - a trust that only people with a name, reputation, observable values/commitments, and track record can earn.
So, they implemented a git client in Zig that had some significant speedups for their use case. However:
> The git CLI test suite consists of 21,329 individual assertions for various git subcommands (that way we can be certain ziggit does suffice as a drop-in replacement for git).
> While we only got through part of the overall test suite, that's still the equivalent of a month's worth of straight developer work (again, without sleep or eating factored in).
I think we might be getting to the point where submissions for projects that are primarily written by AI and/or AI agents need to be tagged with [agent] in the title.
With the recent barrage of AI-slop 'speedup' posts, the first thing I always do to see if the post is worth a read is doing a Ctrl+F "benchmark" and seeing if the benchmark makes any fucking sense.
99% of the time (such as in this article), it doesn't. What do you mean 'cloneBare + findCommit + checkout: ~10x win'? Does that mean running those commands back to back results in a 10x win over the original? Does it mean there's a specific function that calls these 3 operations, and that's the improvement of the overall function? What's the baseline we're talking about, and is it relevant at all?
Those questions are partially answered on the much better benchmark page[1], but for some reason they're using the CLI instead of the gitlib for comparisons.
[1] https://github.com/hdresearch/ziggit/blob/5d3deb361f03d4aefe...
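For what it's worth, an interpretable baseline isn't hard to state. Something like the sketch below, which times the three named operations back to back against the stock git CLI, would at least make "~10x" mean something. The repo URL and ref are placeholders, and this is my sketch of a fair baseline, not the article's actual methodology:

    import subprocess
    import tempfile
    import time

    REPO = "https://github.com/git/git.git"  # placeholder repository
    REF = "v2.43.0"                          # placeholder ref to "find" and check out

    def timed(label: str, args: list[str], cwd: str | None = None) -> float:
        # Time one CLI invocation; check=True so a failure doesn't silently skew results.
        start = time.perf_counter()
        subprocess.run(args, cwd=cwd, check=True, capture_output=True)
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.2f}s")
        return elapsed

    with tempfile.TemporaryDirectory() as tmp:
        bare = f"{tmp}/repo.git"
        worktree = f"{tmp}/worktree"
        # "cloneBare + findCommit + checkout", spelled out as plain git CLI calls:
        total = timed("clone --bare", ["git", "clone", "--bare", REPO, bare])
        total += timed("find commit (rev-parse)", ["git", "rev-parse", REF], cwd=bare)
        total += timed("checkout (worktree add)",
                       ["git", "-C", bare, "worktree", "add", worktree, REF])
        print(f"total: {total:.2f}s")

Then the library numbers have something concrete to be 10x faster than, and readers can see how much of the wall time is just the network.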
The "What it cost" section doesn't actually say what it cost. I wonder what models they used. Napkin math assuming Opus for everything (probably not true) with no caching suggests $67,000.
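For anyone who wants to sanity-check that kind of napkin math, the arithmetic is just tokens times list price. The token counts below are made up purely to show how you'd land near that figure under the "all Opus, no caching" assumption; the real usage isn't published:

    # Hypothetical token counts; only the per-million prices are meant to be realistic
    # (Opus-class list pricing of roughly $15/M input and $75/M output, no caching).
    INPUT_PRICE_PER_M = 15.0
    OUTPUT_PRICE_PER_M = 75.0

    input_tokens = 3_000_000_000   # made-up: agent loops are overwhelmingly input-heavy
    output_tokens = 290_000_000    # made-up

    cost = (input_tokens / 1e6) * INPUT_PRICE_PER_M + (output_tokens / 1e6) * OUTPUT_PRICE_PER_M
    print(f"${cost:,.0f}")  # ~$66,750 with these placeholder numbers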
I'm pretty stoked about the LLM harness they're using, cause I wrote all the code that's not monopi code in that fork!
Despite its paucity of features, the changes I landed in it from my design notes have actually been so smooth in terms of comparative UX / LLM behavior that it's been my daily driver since I stood it up.
Previously, since early December, I've had to run a patch script on every update of Claude Code to make it stop undermining me. I didn't need a hilarious code leak to find the problematic strings in the minified JS ;)
I regard punkin-pi as a first stab at translating ideas I've had over the past 6 months for reliable LLM harnesses. I hit some walls in mono pi's architecture that limit how much more improvement I can get out of mono pi.
So I'm working on the next gen of agent harnesses! Stay tuned!
The title is obviously dishonest. I do not hesitate to call it a lie.
The post is also not about the speed increase, it's about how proud this team is of their agent orchestration scheme.
As I understand it, there is really no speed difference at all between Zig and C, just some cognitive overhead associated with doing things "right" in C. It's all machine code at bottom.
So why is this rewrite faster? Why did the authors choose Zig? How has the logic or memory management changed?
The authors give us absolutely no insight whatsoever into the Zig code. No indication that they know anything about Zig, or systems programming, at all. I wish this was an exaggeration.
And really. With all this agentic power at your fingertips, why wouldn't they just contribute these improvements to git itself? I can think of at least one reason: they don't want their changes to be rejected as unhelpful or low-quality.
Edit: And then I go to their repository and read commits like this https://github.com/hdresearch/ziggit/commit/31adc1da1693e402... which confirms it wasn't even looked over by a human.
Then there's stuff like this: https://github.com/hdresearch/ziggit/blob/master/src/cmd_bra...
It's just one giant function. Sometimes big functions are necessary. This one is clearly AI generated and not very readable for a human. This is just from a quick glance.
That was so the commit authors don't all appear like blank accounts on GitHub
> maintain a separate git implementation
If git were a rapidly evolving project then I'd think this'd be a stronger issue.
With git being more of an established protocol that projects from GitHub to jj can piggy-back off of, filling in a library in a new language seems like something that contributes.
Cool article though!
> it becomes possible to see upward of 100x speedups for some git operations.
They really stretch the limits of an honest title there.
Big if true, etc
https://news.ycombinator.com/item?id=47618895 to discuss the git implementation