TS-664 RAID5 rebuild → LVM thin-pool btree corruption — 47TB unrecoverable via normal tools, need thin_repair help

Device & Software:

  • QNAP TS-664, QTS 5.2.9.3499

  • 4× 18TB WDC drives (slots 3–6), RAID 5

  • Storage pool using QNAP TIER thin-pool on top of DRBD + LVM

What happened:
After a RAID 5 rebuild completed, the storage pool went offline and will not come back. QTS shows the volume as inactive. I have not made any changes to the system.

Error in dmesg:

device-mapper: thin: tm_open_with_sm failed
device-mapper: table: thin-pool: Error creating metadata object

thin_check output:

examining superblock
  superblock is corrupt
    bad checksum in superblock
examining superblock backups
  6 valid superblock backups, last backup_id=5 blocknr=16711738

Situation:

  • 47.11TB of critical business data (design files, video production, accounting)

  • Pool will not mount — QTS just shows volume as inactive/damaged

  • I have a large external USB drive available for recovery if I can get read-only access

  • I have NOT reinitialised anything

Questions:

  1. Can thin_repair recover from the 6 valid backup superblocks that thin_check found?

  2. Has anyone done low-level thin-pool recovery on a QNAP TIER pool specifically?

  3. Is there a way to get QNAP support to assist at this level, or a recommended data recovery service familiar with QNAP internals?
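
To make question 1 concrete, here is a minimal sketch of a copy-only repair attempt. It assumes the metadata has already been imaged to a regular file, and that `thin_repair` and `thin_check` from thin-provisioning-tools (or QNAP's own builds) are on the PATH. Whether the stock tools understand the QNAP TIER format or its backup superblocks is not confirmed; that is the open question. The function name `repair_copy` and the `THIN_REPAIR` / `THIN_CHECK` override variables are hypothetical.

```shell
# Sketch: repair from a metadata image into a NEW file; the input is never written to.
# THIN_REPAIR / THIN_CHECK are hypothetical overrides, e.g. to point at QNAP's builds.
repair_copy() {
    in=$1
    out=$2
    repair=${THIN_REPAIR:-thin_repair}
    check=${THIN_CHECK:-thin_check}

    if [ ! -f "$in" ]; then
        echo "input image $in not found" >&2
        return 1
    fi
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi

    # thin_repair expects the output file to exist already and be large enough,
    # so create a sparse file the same size as the input.
    size=$(stat -c %s "$in") || return 1
    dd if=/dev/zero of="$out" bs=1 count=0 seek="$size" 2>/dev/null || return 1

    # Read from the image, write the rebuilt metadata to the new file.
    "$repair" -i "$in" -o "$out" || { echo "repair failed" >&2; return 2; }

    # Verify the rebuilt metadata before trusting it.
    "$check" "$out"
}
```

It would be called as `repair_copy tmeta.img tmeta_repaired.img`; nothing here touches the live pool, so a failed attempt costs nothing.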

Any help appreciated — I just need read-only access to copy the files off.

[/share] # dmsetup ls
vg1-tp1_tierdata_2 (252:3)
vg1-tp1_tierdata_1 (252:2)
vg1-tp1_tierdata_0 (252:1)
vg1-lv1312 (252:8)
vg1-lv544_tmeta (252:0)
[/share] # dmesg | grep -iE "thin|tier" | tail -20
[217229.555462] device-mapper: thin metadata: sb_check failed: csum 1162835701: wanted 1308032382
[217229.574124] device-mapper: thin metadata: __get_correct_block_manager: free old bm
[217229.582835] device-mapper: thin metadata: sb_check failed: csum 1292622111: wanted 1308032382
[217229.601499] device-mapper: thin metadata: could not create block manager
[217229.609080] device-mapper: tier: tier_ctx_dtr:3989, put device 252:1 !!
[217229.616568] device-mapper: tier: tier_ctx_dtr:3989, put device 252:2 !!
[217229.624056] device-mapper: tier: tier_ctx_dtr:3989, put device 252:3 !!
[217229.631538] device-mapper: table: 252:4: thin-pool: Error creating metadata object
[/share] #
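
Before any repair attempt, the metadata LV (`vg1-lv544_tmeta` in the `dmsetup ls` output above) could be imaged to the external drive so that every tool runs against a copy. A minimal sketch, assuming the device is readable under `/dev/mapper/` and the USB drive is mounted somewhere writable; the function name `image_metadata` and the destination path in the usage line are hypothetical.

```shell
# Sketch: copy a metadata device (or file) to an image without writing to the source.
image_metadata() {
    src=$1
    dest=$2

    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi

    # Plain sequential copy; dd stops with an error if a read fails,
    # so a completed image is a full copy of the source.
    dd if="$src" of="$dest" bs=1M 2>/dev/null || return 1

    # Make the image read-only and record a checksum so later copies can be verified.
    chmod a-w "$dest"
    sha256sum "$dest" > "$dest.sha256"
}
```

Usage would be something like `image_metadata /dev/mapper/vg1-lv544_tmeta /share/external/usb1/tmeta.img`, with the second path replaced by wherever the USB drive is actually mounted.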

So why did you need to do a RAID rebuild?

Do you have no backups of your critical data? RAID is not a backup scheme…