Feedback from someone who is used to managing a large (>1500 packages) software stack in C / C++ / Fortran / Python / Rust / etc.:
- (1) Provide a way to compile without internet access and to specify the associated dependency paths manually. This is absolutely critical.
Most 'serious' multi-language package managers and integration systems build in a sandbox without internet access, for both security and reproducibility reasons.
If your build system does not allow building offline with manually specified dependencies, you will make the lives of integrators and package maintainers miserable, and they will avoid your project.
- (2) Never ever build with '-O3 -march=native' by default. This is always a red flag and a sign of immaturity. People expect code to be portable and shippable.
Good default options should be the CMake equivalent of "RelWithDebInfo" (meaning: -O2 -g -DNDEBUG).
-O3 can be argued about. -march=native is always, always a mistake.
- (3) Allow your build tool to be built by another build tool (e.g. CMake).
Anybody caring about reproducibility will want to start from sources, not from a pre-compiled binary. This also matters for cross-compilation.
- (4) Please offer compatibility with pkg-config (https://en.wikipedia.org/wiki/Pkg-config) and, if possible, CPS (https://cps-org.github.io/cps/overview.html), for both consumption and generation.
These are what will allow interoperability between your system and other build systems (see the pkg-config sketch after this list).
- (5) Last but not least: seriously consider the cross-compilation use case.
It is common in the world of embedded systems to cross compile. Any build system that does not support cross-compilation will be de facto banned from the embedded domain.
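To make (4) concrete, a minimal sketch of how a consumer would use pkg-config to locate a library your tool has installed (the library name mylib is hypothetical):

    # mylib.pc, installed next to the library, is all a consumer needs:
    cc $(pkg-config --cflags mylib) app.c $(pkg-config --libs mylib) -o app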
As someone who has also spent two decades wrangling C/C++ codebases, I wholeheartedly agree with every statement here.
I have an even stronger sentiment regarding cross compilation, though: in any build system, I think the distinction between “cross” and “non-cross” compilation is an anti-pattern.
Always design build systems assuming cross compilation. It hurts nothing if it just so happens that your host and target platform/architecture end up being the same, and saves you everything down the line if you need to also build binaries for something else.
Amen. It always baffled me that cross compiling was ever considered a special, weird, off-nominal thing. I’d love to understand the history of that better, because it seems like it should have been obvious from the start that building for the exact same computer you’re compiling from is a special case.
A few things come to mind, but I wasn't even alive then so what do I know XD.
On one hand, it seems rather strange, because back in the early days of C (and later C++) there were far more CPU architectures in play. Every big Unix hardware vendor had their own CPU architecture, whereas today we only have about six. (In my mind: x86, arm, mips, risc-v, ppc, and s390x)
But it might be that in the early days of C/C++, development involved connecting to large shared Unix environments where the machine you developed on was always the machine (or at least the same type of machine) the program would run on, and also that those vendors weren't exactly incentivized to make developing for competitors' architectures easy.
Also the problem isn't creating a cargo like tool for C and C++, that is the easy part, the problem is getting more userbase than vcpkg or conan for it to matter for those communities.
Shipping anything built with -march=native is a horrible idea. Even on homogeneous targets like one of the clouds, you never know if they'll e.g. switch CPU vendors.
The correct thing to do is use microarch levels (e.g. x86-64-v2) or build fully generic if the target architecture doesn't have MA levels. I am willing to hear arguments for other approaches.
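A hedged sketch of the difference in practice (the file name is illustrative; x86-64-v2 is one of the documented GCC/Clang micro-architecture levels):

    # Locked to the build machine's CPU features; may SIGILL on other CPUs:
    gcc -O2 -march=native -c hot_loop.c -o hot_loop.o

    # Portable baseline with a known feature floor (SSE4.2, POPCNT, ...):
    gcc -O2 -march=x86-64-v2 -c hot_loop.c -o hot_loop.o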
Not the OP, but: -march tells the compiler it can assume the features of that particular CPU architecture family (broken out by generation) are available. In the worst case the compiler could, in theory, generate code that does not run on older CPUs of the same family or from different vendors.
-mtune says "generate code that is optimised for this architecture" but it doesn't trigger arch specific features.
Whether these are right or not depends on what you are doing. If you are building Gentoo on your laptop, you should absolutely use -mtune=native and -march=native. That's the whole point: you get the most optimised code you can for your hardware.
If you are shipping code for a wide variety of architectures, and crucially the method of shipping is binary form, then you want to think more about what you want to support. You could do either: if you're shipping standard software, pick a reasonable baseline (check what your distribution uses in its cflags); if you're shipping compute-intensive software, perhaps load a shared object per CPU family, or build your engine in place for best performance. The Intel compiler quite famously optimised per family, included all the copies in the output, and selected the worst one on AMD ;) (https://medium.com/codex/fixing-intel-compilers-unfair-cpu-d...)
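One way to get the per-CPU-family code paths described above inside a single binary is GCC/Clang function multi-versioning; a hedged sketch (the function is illustrative, and compiler support varies):

    #include <stdio.h>

    /* The compiler emits one clone per listed target and picks the best
       available one at program load time: */
    __attribute__((target_clones("avx2", "default")))
    double dot(const double *a, const double *b, int n) {
        double s = 0.0;
        for (int i = 0; i < n; i++) s += a[i] * b[i];
        return s;
    }

    int main(void) {
        double a[4] = {1, 2, 3, 4}, b[4] = {4, 3, 2, 1};
        printf("%f\n", dot(a, b, 4));
        return 0;
    }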
Just popping in here because people seem to be surprised by
> I build on the exact hardware I intend to deploy my software to and ship it to another machine with the same specs as the one it was built on.
This is exactly the use case in HPC. We always build with -march=native and go to some trouble to enable all the appropriate vectorization flags (e.g., for PowerPC) that don't come along automatically with the -march=native setting.
Every HPC machine is a special snowflake, often with its own proprietary network stack, so you can forget about binaries being portable. Even on your own machine you'll be recompiling your binaries every time the machine goes down for a major maintenance.
It certainly has scale issues when you need to support larger deployments.
[P.S.: the way I understand the words, "shipping" means "passing it off to someone else, likely across org boundaries" whereas what you're doing I'd call "deploying"]
On every project I've worked on, the PC I've had has been much better than the minimum PC required. Just because I'm writing code that will run nicely enough on a slow PC, that doesn't mean I need to use that same slow PC to build it!
And then, the binary that the end user receives will actually have been built on one of the CI systems. I bet they don't all have quite the same spec. And the above argument applies anyway.
If you use a cloud provider and a remote development environment (VSCode Remote/JetBrains Gateway), then you're wrong: cloud providers swap out the CPUs without telling you, and can sell newer CPUs at older prices if there's less demand for the newer CPUs; you can't rely on that.
To take an old naming convention, even an E3-Xeon CPU is not equivalent to an E5 of the same generation. I’m willing to bet it mostly works but your claim “I build on the exact hardware I ship on” is much more strict.
The majority of people I know use either laptops or workstations with Xeon workstation or Threadripper CPUs— but when deployed it will be a Xeon scalable datacenter CPU or an Epyc.
Hell, I work in gamedev and we cross compile basically everything for consoles.
The only time I used -march=native was for a university assignment which was built and evaluated on the same server, and it allowed juicing an extra bit of performance. Using it basically means locking the program to the current CPU only.
However I'm not sure about -O3. I know it can make the binary larger, not sure about other downsides.
Anyone can make a tool that solves a tiny part of the problem. However, the reason no such tool has caught on is all the weird special cases you need to handle before it can be useful. Even if you limit your support to desktop (OS X and Windows), that problem is hard; adding various Linux flavors is even more difficult, not to mention BSD. Those are the common/mainstream choices; Haiku is going to be very different, and I've seen dozens of others over the years, some of them with a following in their niche. Then there are people building for embedded - QNX, VxWorks, or even no OS, just bare metal - each adding weirdness (and implying cross compiling, which makes everything harder because your assumptions are always wrong).
I'm sorry I have to be a downer, but the fact is if you can use the word "I" your package manager is obviously not powerful enough for the real world.
Thank you everyone for the feedback so far! I just wanted to say that I understand this is not a fully cohesive and functional project for every edge case. This is the first day of releasing it to the public and it is only the beginning of the journey. I do not expect to fully solve a problem of this scale on my own, Craft is open source and open to the community for development. I hope that as a community this can grow into a more advanced and widely adopted tool.
Having to work on a massive C++ software project daily, I wish you luck. We use Conan 2, and while it can be very challenging to use, I've yet to find something better that can handle incorporating as dependencies ancient projects that still use autoconf or even custom build tooling. It's also very good at detecting and enforcing ABI compatibility, although there are still some gaps. This problem space is incredibly hard, and improving it is a prime driver for the creation of many of the languages that came after C/C++.
Cmake is infamously not a build system. It is a build system generator.
This is now a build system generator generator. This is the wrong solution imho. The right solution is to just build a build system that doesn’t suck. Cmake sucks. Generating suck is the wrong angle imho.
CMake is a combination of a warthog of a specification language and mechanisms for handling a zillion idiosyncrasies and corner cases of everything.
I doubt that <10,000 lines of C code can cover much of that.
I am also doubtful that developers are able to express the exact relations and semantic nuances they want to, as opposed to some default that may make sense for many projects, but not all.
Still - if it helps people get started on simpler or more straightforward projects - that's neat :-)
FWIW: there is something fundamentally wrong with a meta-meta build system. I don't think you should bother generating or wrapping CMake, you should be replacing it.
> You describe your project in a simple craft.toml
I don't like it. Such a format is deliberately restricted (not Turing-complete), which doesn't allow doing anything non-trivial, for example choosing dependencies or compilation options based on non-trivial conditions. That's why CMake is basically a programming language, with variables, conditions, loops and even arithmetic.
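For contrast, the kind of conditional a CMake script can express and a static TOML manifest cannot (the target and libraries are illustrative):

    if(WIN32)
      target_link_libraries(app PRIVATE ws2_32)
    elseif(UNIX AND NOT APPLE)
      find_package(Threads REQUIRED)
      target_link_libraries(app PRIVATE Threads::Threads)
    endif()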
Uses CMake? Sorry, not for me. Call me old, but I prefer good old make or batch. Maybe it's because I can understand those tools; debugging CMake build problems made me hate it. Also, I code for embedded CPUs, and most of the time CMake is just overkill and does not play well with the compiler/binutils provided. The platform independence is just not happening in those environments.
Seems to solve a problem very similar to Conan or vcpkg, but without its own package archive or build scripts. In general, unlike Cargo/Rust, many C/C++ projects dynamically link libraries and often require complex Makefile/shell-script magic to discover and optionally build their dependencies.
How does craft handle the 'diamond' patterns, where two dependencies depend on different versions of the same library as transitive dependencies (whether for static or dynamic linking, or as header-only includes), without custom build scripts like the Conan approach?
This certainly seems less awful than the typical C building process.
What I've been doing to manage dependencies in a way that doesn't depress me much has been Nix flakes, which allows me a pretty straightforward nix build with the correct dependencies built in.
I'm just a bit curious, though: a lot of C libraries are system-wide and usually require the system package manager (e.g. libsdl2-dev). Does this have an elegant way to handle those?
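For reference, a minimal sketch of the flake approach being described, pulling SDL2 from nixpkgs instead of the system package manager (names are illustrative, and it assumes a default Makefile-driven stdenv build):

    {
      inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";
      outputs = { self, nixpkgs }:
        let pkgs = nixpkgs.legacyPackages.x86_64-linux; in {
          packages.x86_64-linux.default = pkgs.stdenv.mkDerivation {
            pname = "demo";
            version = "0.1";
            src = self;
            buildInputs = [ pkgs.SDL2 ];  # no libsdl2-dev needed
          };
        };
    }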
In the age of AI, tools like this are pointless, especially new ones, given the existence of make, cmake, premake and a bunch of others.
C++ build system, at the core, boils down to calling gcc foo.c -o foo.obj / link foo.obj foo.exe (please forgive me if I got the syntax wrong).
Sure, you have more .c files, and you pass some flags but that's the core.
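Spelled out, one plausible rendering of that core (file names are illustrative):

    # compile each translation unit, then link:
    gcc -c foo.c -o foo.o
    gcc -c bar.c -o bar.o
    gcc foo.o bar.o -o foo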
I've recently started a new C++ program from scratch.
What build system did I write?
I didn't. I told Claude:
"Write a bun typescript script build.ts that compiles the .cpp files with cl and creates foo.exe. Create release and debug builds, trigger release build with -release cmd-line flag".
And it did it in minutes and it worked. And I can expand it with similar instructions. I can ask for release build with all the sanitize flags and claude will add it.
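For flavor, a sketch of the kind of script such a prompt might produce - illustrative only, not the commenter's actual output, with typical MSVC flags assumed:

    // build.ts (run with: bun build.ts [-release])
    import { $ } from "bun";

    const release = process.argv.includes("-release");
    const flags = release ? ["/O2", "/DNDEBUG"] : ["/Zi", "/Od"];
    const sources = [...new Bun.Glob("*.cpp").scanSync(".")];

    // cl compiles and links in one step; /Fe names the executable
    await $`cl ${flags} ${sources} /Fe:foo.exe`;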
The particulars don't matter. I could have asked for a makefile, or cmake file or ninja or a script written in python or in ruby or in Go or in rust. I just like using bun for scripting.
The point is that in the past I tried to learn cmake and, good lord, it's days spent learning something that I'll spend 1 hr using.
It just doesn't make sense to learn any of those tools given that claude can give me a working build system, any build system, in minutes.
It makes even less sense to create new build tools. Even if you create the most amazing tool, I would still choose spending a minute asking claude than spending days learning arbitrary syntax of a new tool.
Project description is AI generated, even the HN post is AI generated, why should I spend any energy looking into your project when all you're doing is just slinging AI slop around and couldn't be bothered to put any effort in yourself?
But how does this tool figure out where the header files and build instructions are for the libraries that are included? Is there an expected layout or industry-wide consensus?
KDE already has a meta-build tool for C++ called Craft, which handles dependency management and cross-compilation for CMake-built applications and libraries.
If you think cmake isn't very good, the solution isn't to add more layers of crap around cmake, but to replace it. Cmake itself exists because a lot of humans haven't bothered to read the gnu make manual, and added more cruft to manage this. Please don't add to this problem. It's a disease
The tough truth is that there already is a cargo for C/C++: Conan2. I know, python, ick. I know, conanfile.py, ick. But despite its warts, Conan fundamentally CAN handle every part of the general problem. Nobody else can. Profiles to manage host vs. target configuration? Check. Sufficiently detailed modeling of ABI to allow pre-compiled binary caching, local and remote? Check, check, check. Offline vs. Online work modes? Check. Building any relevant project via any relevant build system, including Meson, without changes to the project itself? Check. Support for pulling build-side requirements? Check. Version ranges? Check. Lockfiles? Check. Closed-source, binary-only dependencies? Check.
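For instance, the host-vs-build profile split looks roughly like this (a hedged sketch; the settings values are illustrative):

    # profiles/rpi64 - host profile for an aarch64 target
    [settings]
    os=Linux
    arch=armv8
    compiler=gcc
    compiler.version=12
    compiler.libcxx=libstdc++11
    build_type=Release

    # then: conan install . -pr:h=profiles/rpi64 -pr:b=default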
Once you appreciate the vastness of the problem, you will see that having a vibrant ecosystem of different competing package managers sucks. This is a problem where ONE standard that can handle every situation is incalculably better than many different solutions which solve only slices of the problem. I don't care how terse craft's toml file is - if it can't cross compile, it's useless to me. So my project can never use your tool, which implies other projects will have the same problem, which implies you're not the one package manager / build system, which means you're part of the problem, not the solution. The Right Thing is to adopt one unilateral standard for all projects. If you're remotely interested in working on package managers, the best way to help the human race is to fix all of the outstanding things about Conan that prevent it from being the One Thing. It's the closest to being the One Thing, and yet there are still many hanging chads:
- its terribly written documentation
- its incomplete support for editable packages
- its only nascent support for "workspaces"
- its lack of NVIDIA recipes
If you really can't stand to work on Conan (I wouldn't blame you), another effort that could help is the common package specification format (CPS). Making that a thing would also be a huge improvement. In fact, if it succeeds, then you'd be free to compete with conan's "frontend" ergonomics without having to compete with the ecosystem.
> In any build system, I think the distinction between “cross” and “non-cross” compilation is an anti-pattern.
This is one of the huge wins of Zig. Any Zig host compiler can produce output for any supported target. Cross compiling becomes straightforward.
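A small illustration (target triples as documented by zig cc; file names are illustrative):

    # the same command works from any host OS/architecture:
    zig cc -target aarch64-linux-gnu  -O2 hello.c -o hello-arm64
    zig cc -target x86_64-windows-gnu -O2 hello.c -o hello.exe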
> Never ever build in '-O3 -march=native' by default. This is always a red flag and a sign of immaturity.
Perhaps you can see how there are some assumptions baked into that statement.
> the binary that the end user receives will actually have been built on one of the CI systems

Quite hard to build on the exact hardware for those scenarios.
I've never heard of anyone doing that.
I fully concur with that whole post as someone who also maintained a C++ codebase used in production.
> -march=native is always always a mistake

Gentoo user: hold my beer.
https://github.com/xmake-io/xmake
The reason why I like it (beyond ease-of-use) is that it can spit out CMakeLists.txt and compile_commands.json for IDE/LSP integration and also supports installing Conan/vcpkg libraries or even Git repos.
Then you use it like a normal xmake project; a minimal xmake.lua sketch (illustrative - the fmt dependency is an assumption):
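    -- xmake.lua: declare a dependency and a binary target
    add_requires("fmt")

    target("demo")
        set_kind("binary")
        add_files("src/*.cpp")
        add_packages("fmt")

    -- build and run: xmake && xmake run demo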
It's similar, but designed for an existing ecosystem. Cargo is designed for cargo, obviously. But pyproject.toml is designed for the existing tools to all eventually adopt. (As well as new tools, of course.)
Not sure how big your plans are.
My thoughts would be to start as a cmake generator but to eventually replace it. Maybe optionally.
And to integrate support for existing package managers like vcpkg.
At the same time, I'd want to remain modular enough that it's not all or nothing. I also don't like lock-in.
But right now package management and build system are decoupled completely. And they are not like that in other ecosystems.
For example, Cmake can use vcpkg to install a package but then I still have to write more cmake to actually find and use it.
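For concreteness, the boilerplate being described (a hedged sketch assuming vcpkg's CMake toolchain and the fmt package):

    # even after `vcpkg install fmt`, CMakeLists.txt still needs:
    find_package(fmt CONFIG REQUIRED)
    target_link_libraries(app PRIVATE fmt::fmt)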
curl | sh writing to the user's bashrc does not inspire confidence.
Here's my feeble attempt using Deno as base (it's extremely opinionated though and mostly for personal use in my hobby projects):
https://github.com/floooh/fibs
One interesting chicken-egg-problem I couldn't solve is how to figure out the C/C++ toolchain that's going to be used without running cmake on a 'dummy project file' first. For some toolchain/IDE combos (most notably Xcode and VStudio) cmake's toolchain detection takes a lot of time unfortunately.
What exactly is it you do/need that can't be reasonably solved using the FetchContent module?
https://cmake.org/cmake/help/latest/module/FetchContent.html
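For anyone unfamiliar, typical FetchContent usage looks like this (the nlohmann/json pin is just an example):

    include(FetchContent)
    FetchContent_Declare(
      json
      GIT_REPOSITORY https://github.com/nlohmann/json.git
      GIT_TAG v3.11.3
    )
    FetchContent_MakeAvailable(json)
    target_link_libraries(app PRIVATE nlohmann_json::nlohmann_json)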
https://cmkr.build/
cargo watch - that would be a killer feature!
PS is there a plan to include hermetic builds, from local sources / git submodules?