> This is not a bug report. [...] The goal is constructive, not a complaint.
Er, I appreciate trying to be constructive, but in what possible situation is it not a bug that a power cycle can lose the pool? And if it's not technically a "bug" because BTRFS officially specifies that it can fail like that, why is that not in big bold text at the start of any docs on it? 'Cuz that's kind of a big deal for users to know.
EDIT: From the longer write-up:
> Initial damage. A hard power cycle interrupted a commit at generation 18958 to 18959. Both DUP copies of several metadata blocks were written with inconsistent parent and child generations.
Did the author disable safety mechanisms for that to happen? I'm coming from being more familiar with ZFS, but I would have expected BTRFS to also use a CoW model where it wasn't possible to have multiple inconsistent metadata blocks in a way that didn't just revert you to the last fully-good commit. If it does that by default but there's a way to disable that protection in the name of improving performance, that would significantly change my view of this whole thing.
As far as I can see, no: the author didn't disable anything of the sort, at least nothing that he documented.
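For what it's worth, btrfs does keep a handful of older tree roots in the superblock, and the closest thing to "revert to the last fully-good commit" is asking the kernel to try them at mount time. A minimal sketch, with device and mount point as placeholders (not taken from the write-up):
mount -o usebackuproot,ro /dev/sdX /mnt
Whether that helps depends on the older tree blocks still being intact on disk, which is not guaranteed after the kind of interrupted commit the write-up describes.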
I suspect that the author's intent is less "I do not view this as a bug" and more "I do not think it's useful to get into angry debates over whether something is a bug". I do not know whether this is a common thing on btrfs discussions, but I have certainly seen debates to that effect elsewhere.
(My personal favorite remains "it's not a data loss bug if someone could technically theoretically write something to recover the data". Perhaps, technically, that's true, but if nobody is writing such a tool, nobody is going to care about the semantics there.)
> I suspect that the author's intent is less "I do not view this as a bug" and more "I do not think it's useful to get into angry debates over whether something is a bug".
Agreed, and I appreciate the attempt to channel things into a productive conversation.
As far as I understand, single device and RAID1 are solid, but as soon as you want to do RAID1+0 or RAID5/6 you’re entering dangerous territory with BTRFS.
Changing the metadata profile to at least raid1 (raid1, raid1c3, raid1c4) is a good idea, especially for anyone, against recommendations, using raid5 or raid6 for a btrfs array (raid1c3 is more appropriate for raid6). That would make it very difficult for metadata to get corrupted, which is the lion's share of the higher-impact problems with raid5/6 btrfs.
check:
btrfs fi df /mountpoint
convert metadata:
btrfs balance start -mconvert=raid1c3,soft /mountpoint
(make sure it's -mconvert — m is for metadata — not -dconvert, which would switch profiles for data, messing up your array)
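To make the before/after concrete (mount point and sizes are illustrative; also note raid1c3 needs at least three devices, so on a two-device array plain raid1 is the option):
btrfs fi df /mountpoint              # before: Metadata, DUP: total=20.00GiB, used=18.00GiB
btrfs balance status /mountpoint     # progress, from another shell; balance runs in the foreground unless started with --bg
btrfs fi df /mountpoint              # after: Metadata, RAID1C3: total=20.00GiB, used=18.00GiB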
> A hard power cycle on a 3 device pool (data single, metadata DUP, DM-SMR disks) left the extent tree and free space tree in a state that no native repair path could resolve.
As a ZFS wrangler by day:
People in this thread seem happy to shit on btrfs here, but this is very much not a sane, resilient configuration no matter the FS. Just something to keep in mind.
This is obviously LLM output, but perhaps LLM output that corresponds to a real scenario. It's plausible that Claude was able to autonomously recover a corrupted fs, but I would not trust its "insights" by default. I'd love to see a btrfs dev's take on this!
Btrfs allows migration from ext4 with a rather good rollback strategy...
Post-migration, a complete disk image of the original ext4 disk will exist within the new filesystem, using no additional disk space due to the magic of copy-on-write.
Why isn't the repair process the same? Fix the filesystem to get everything online asap, and leave a complete disk image of the old damaged filesystem so other recovery processes can be tried if necessary.
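For reference, the migration and rollback being described is btrfs-convert, and the rollback hinges on one saved subvolume staying untouched. Roughly, with the device path as a placeholder:
e2fsck -f /dev/sdX1         # convert wants a clean, unmounted ext4
btrfs-convert /dev/sdX1     # in-place conversion; the old filesystem is preserved as ext2_saved/image
btrfs-convert -r /dev/sdX1  # rollback to the original ext4, valid until ext2_saved is modified or deleted
mount /dev/sdX1 /mnt && btrfs subvolume delete /mnt/ext2_saved   # or, once satisfied, reclaim the space instead
The suggestion above amounts to wishing btrfs check --repair had the same escape hatch; as things stand, the equivalent is a manual dd image of each device taken before any repair attempt.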
This reads as something between clueless/malicious and genius. It's like running several red lights in a car, smashing the car, rebuilding the car with AI, and then telling people to run red lights.
He keeps repeating btrfs check --repair. This command is dangerous and flagged everywhere as a last resort: if you try to execute it you get a warning; the documentation has a warning; any guide you find on Google tells you not to run it unless all else fails; even chatgpt/lechat either don't mention it or note it as a last resort. So I'm not sure why he keeps repeating it without any such note.
> Use these tools ONLY if btrfs check --repair segfaults, enters an infinite loop, or leaves the filesystem in worse shape than before.
> Timeline of events ... First repair attempts. btrfs check --repair
The guy is recommending people brick their volumes permanently as a first resort, without any warning.
Between using a DUP profile and this, I would not be surprised if a btrfs dev just disregarded it all as slop.
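For contrast, the order the btrfs documentation and most recovery guides suggest is close to the reverse of repair-first: read-only access and copies first, destructive repair last. A rough sketch, with device and paths as placeholders:
mount -o rescue=all,ro /dev/sdX /mnt   # newer kernels: read-only, skip log replay, tolerate bad roots
btrfs restore /dev/sdX /backup/        # pull files out even when the filesystem will not mount at all
btrfs rescue super-recover /dev/sdX    # repair a damaged superblock from its other copies
btrfs rescue zero-log /dev/sdX         # only for crashes during log-tree replay
btrfs check --repair /dev/sdX          # genuine last resort, once the data is already copied off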
> Pool only mounts with rescue=all,ro, fails to mount RW
Also, this is important: the data was not lost, even if only accessible read-only.
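Which is the practical takeaway: with the pool mountable read-only, everything can still be copied to fresh storage before any risky repair is attempted (mount point and destination are illustrative):
rsync -aHAX /mnt/pool/ /backup/pool/   # a read-only source is fine; take the full copy before touching the filesystem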
I don't think I would run this code. Still, it would be interesting for a btrfs dev to have a look and comment on whether there is any value in the generated code, as it would definitely be interesting to be able to repair more issues in the pool safely in place.
Also, impressive work!
Welp. Guess I need to figure out another fs to use for a few drives in a nonraid pool I haven't gotten around to setting up yet. I forget why zfs seemed out. xfs?
To the author: did you continue using btrfs after this ordeal? An FS that will not eat (all) your data on a hard power cycle only at the cost of 14 custom C tools is a hard pass from me, no matter how many distros try to push it down my throat as 'production-ready'...
> Case study: recovery of a severely corrupted 12 TB multi-device pool, plus constructive gap analysis and reference tool set #1107
Please don't be btrfs please don't be btrfs please don't be btrfs...