Today I received an interesting and instructive lesson from QNAP on why a NAS is not a backup and why the NAS itself needs to be backed up. It could happen to you as well.
I recently completed RAID scrubbing on my volumes. I had never done it before. Due to an issue I was having, I submitted my dump logs with the ticket. The technician came back and told me that I had a number of ZFS file system errors that could not be corrected, and that I needed to reinitialize my NAS and start over from scratch!
This is not from bad drives (it’s showing up on my SSD drives and my main RAID HDs). He said this can happen even on good drives: sometimes data gets corrupted as it is being written, and there’s nothing you can do about it. He said to make sure all my data is backed up off the NAS, then reinitialize it and start over again! I could not believe it.
So, another warning: the NAS itself is not a backup. You may back up your PCs to your NAS, but back up your NAS elsewhere too! It’s not necessarily a bad disk that causes problems.
I’m so looking forward to doing this! At the very least, I’ll be going back to h5.2, since h5.3 isn’t really finished and is mainly only needed for High Availability applications…
This is why I have been leery of QuTS hero and all the stuff it does in the background. I had recently considered moving up to QuTS hero, but I will stick with QTS for now.
Correct: one NAS is never a sure way to have backups, and RAID isn’t backup either. Hence the 3-2-1 rule. It’s probably the #1 misconception about having a NAS and RAID. They keep recovery time minimal, but on their own they don’t save you from a fire or hardware failure.
I bought a brand new TS-464C2 in late 2024 and it destroyed my entire storage pool within 1 month. I switched to QuTS Hero and soon found checksum errors on all of my drives, at which point I realized it was an issue with the NAS itself, and a memtest confirmed my suspicion.
Even worse, I got another brand-new device as a replacement from the vendor, and it also has a memory issue…
I have daily backups in the cloud, so I only lost a few hours of data, but the recovery took me days.
For those running QuTS Hero, I would suggest regularly opening an SSH session and running this command:
zpool status -v
Here’s what this looks like:
[Jono@NA9D-NAS ~]$ zpool status -v
  pool: zpool1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0 days 00:15:53 with 2 errors on Sun Oct 19 00:15:58 2025
 prune: last pruned 409 entries, 2392 entries are pruned ever
        total pruning count #5, avg. pruning rate = 2986468 (entry/sec)
expand: none requested
 renew: none requested
config:

        NAME                                    STATE     READ WRITE CKSUM
        zpool1                                  ONLINE       0     0     2
          mirror-0                              ONLINE       0     0     4
            qzfs/enc_0/disk_0x1_24074767F6C0_3  ONLINE       0     0     4
            qzfs/enc_0/disk_0x2_24534D52401A_3  ONLINE       0     0     4

errors: Permanent errors have been detected in the following files:

        zpool1/$SHADOW:<0x1> (181:1:0:5046392)
        zpool1/$SHADOW:<0x1> (181:1:0:5046393)

  pool: zpool2
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 5 days 04:25:26 with 16 errors on Fri Oct 24 04:25:37 2025
 prune: last pruned 2637224 entries, 15879437 entries are pruned ever
        total pruning count #5, avg. pruning rate = 3490183 (entry/sec)
expand: none requested
 renew: none requested
config:

        NAME                                        STATE     READ WRITE CKSUM
        zpool2                                      ONLINE       0     0   364
          raidz1-0                                  ONLINE       0     0   728
            qzfs/enc_0/disk_0x3_5000CCA27EC5F5A5_3  ONLINE       0     0     0
            qzfs/enc_0/disk_0x4_5000CCA267CD00FE_3  ONLINE       0     0     0
            qzfs/enc_0/disk_0x5_5000CCA273F0B2D9_3  ONLINE       0     0     0
            qzfs/enc_0/disk_0x6_5000CCA27EC5A850_3  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        zpool2/$SHADOW:<0x1> (181:1:0:29628189)
        zpool2/$SHADOW:<0x1> (181:1:0:764458)
        zpool2/$SHADOW:<0x1> (181:1:0:12855854)
        zpool2/$SHADOW:<0x1> (181:1:0:784706)
        zpool2/$SHADOW:<0x1> (181:1:0:57489228)
        zpool2/$SHADOW:<0x1> (181:1:0:57232981)
        zpool2/$SHADOW:<0x1> (181:1:0:57499738)
        zpool2/$SHADOW:<0x1> (181:1:0:5362782)
        zpool2/$SHADOW:<0x1> (181:1:0:1036430)
        zpool2/$SHADOW:<0x1> (181:1:0:5493137)
        zpool2/$SHADOW:<0x1> (181:1:0:5355174)
        zpool2/$SHADOW:<0x1> (181:1:0:57241256)
        zpool2/$SHADOW:<0x1> (181:1:0:30161860)
        zpool2/$SHADOW:<0x1> (181:1:0:5499856)
        zpool2/$SHADOW:<0x1> (181:1:0:754404)
        zpool2/$SHADOW:<0x1> (181:1:0:13872636)
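To avoid eyeballing that output every time, the check can be wrapped in a small script. This is only a sketch, assuming QuTS Hero exposes the standard `zpool` CLI over SSH; the mail address and notification method below are my own placeholders, not anything QNAP documents:

```shell
#!/bin/sh
# Sketch: flag ZFS problems from `zpool status -v` output.

# Return success (0) if the status text contains the messages zpool
# prints on corruption. Matching the error lines (rather than the
# absence of "No known data errors") still works when one pool is
# clean and another is not.
has_pool_errors() {
    printf '%s\n' "$1" | grep -Eq 'Permanent errors have been detected|experienced an error'
}

# Intended use on the NAS itself, e.g. from a scheduled job
# (mail address is a placeholder):
#   status="$(zpool status -v 2>&1)"
#   if has_pool_errors "$status"; then
#       printf '%s\n' "$status" | mail -s "NAS: ZFS errors" you@example.com
#   fi
```

Run it on a clean system and `has_pool_errors` simply returns non-zero, so the alert never fires.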
I had those errors before, but that was a while ago. I don’t recall how it was fixed; it might have been destroying and recreating the RAID. I feel like I had those errors when my UPS was having issues with quick brownouts, where the power dropped for a second or two. After I replaced the UPS with a better model, I haven’t seen it happen anymore, and some of my disks had a runtime of 8 years lol. Some were newer, like 2-3 years old.
If you have power dips where the UPS takes a second to provide power to the NAS, that might be causing the errors.
I have been educating my customers about this for many years, because I once lost data when the RAID array on my Netgear NAS crashed, and then the same thing happened on my Synology NAS. Fortunately, it hasn’t happened yet with my QNAP NAS devices. However, nothing should be underestimated, and the 3-2-1 backup rule is important, which is why everything from my main NAS is backed up to a secondary NAS.
Btw, NA9D - I tried your command and it looks fine.
(QuTS hero 5.2.7) zpool1 and zpool2 result → errors: No known data errors
@marcoi - I haven’t had any power fluctuations or anything like that. Who knows. My other hero NAS is fine. As the QNAP tech told me, it’s unknown how it happened. It may be that something just got mangled during a write.
The important thing for everyone is that the NAS may appear to be perfectly fine. Drives in good shape, etc. Yet something can damage the file system itself, which can lead to other problems.
Okay, I went through my old emails from QNAP tickets. This happened to me back in 2021, and it seems they found an issue and fixed it via firmware in the 2022 time frame.
Below are excerpts from the tickets:
After review, the R&D team says that the errors we saw indicate a permanent pool error that cannot be fixed, unfortunately, so the only solution is to create a new pool.
If you already have the data elsewhere, then you should just copy the data back to the new pool from that original set of data. If you don’t, then you could copy the data off this pool to move over, and any files that have been corrupted should be skipped by the system.
After you create a new pool, the team suggests that you schedule a monthly Pool Scrubbing to prevent this issue from happening (instead of trying to fix after the fact).
Just to provide a brief update, the team doesn’t believe it’s a hardware issue, but they’re not entirely sure what’s causing the issue.
There is some sort of unknown corruption happening in the “SHADOW” layer of the file system, but they’re still trying to find out how this corruption is happening.
It looks like there will be a new h5.0.0 firmware release, scheduled for 3/31, that has a fix for the pool corruption issue.
I’m still trying to get a bit more information from them about this though, but at least for now it looks like the fix should be expected soon.
I apologize for the late reply, I was trying to clarify some information with the team.
I’ve been told that there should be a new h5.0.0 firmware update being released soon, sometime in the next few days, and this firmware should stop future pools from getting corrupted anymore.
There doesn’t seem to be a way to fix any corrupted data, if there is any, but there will be an h5.0.1 version in the future that will allow clearing out the errors in those existing pools so they can keep running properly.
Please keep your eye out for the upcoming release, but if you have any questions, let me know.
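On the monthly scrub the ticket recommends: QuTS Hero can schedule this from its storage management UI, but if you prefer the shell, here is a sketch. It assumes your pool is named zpool1 (check with `zpool list`) and that `zpool` lives in /sbin on your firmware; note also that crontab edits on QNAP devices may not survive reboots or firmware updates.

```shell
# Hypothetical crontab entry: scrub zpool1 at 02:00 on the 1st of each month.
# 0 2 1 * * /sbin/zpool scrub zpool1

# Or kick one off by hand and watch its progress:
# zpool scrub zpool1
# zpool status zpool1 | grep 'scan:'
```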
It can be. It just depends on your understanding of the term backup. If you think a single copy of your data, or a single device storing it, is sufficient backup security, well, Murphy will show you otherwise. The 3-2-1 principle is a bare minimum.