TS-469U - Disk change, rebuild stops at 29.9%

Hello! Two of my four drives (4x 3TB RAID 10) developed bad sectors and needed replacement, so I bought 2x 6TB drives, planning to buy two more next month, replace the remaining two good drives as well and, eventually, expand the storage capacity.
I hadn't read the article posted at Online RAID Capacity Upgrade | QNAP beforehand, my bad.
Oddly, the drives that needed replacement were Disk 1 and Disk 2, which are in the same mirror pair.

First I replaced Disk 2 (it had the fewer bad sectors of the two) by cold swap. After I powered the server back on, it immediately included the disk in the RAID and started rebuilding the array onto Disk 2, which was the expected behavior. After 11 hours the rebuild finished successfully and everything was back to normal.
Meanwhile, the NAS remained in normal use, so files were being accessed and uploaded to it.

I then replaced Disk 1 by cold swap as well. The server included the disk in the array and started the rebuild, which went smoothly until it reached 29.9%, where I noticed a significant drop in rebuild speed. Using Resource Monitor, however, I found that the rebuild had stopped completely; the speed shown in Storage Manager was simply an average, not the actual transfer rate.
I shut down and rebooted the server several times, and although the rebuild restarted from zero each time, it stopped at the same 29.9% every time, with no error or warning issued on the matter. The transfer just stops, yet Storage Manager still thinks the rebuild is in progress!
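
From the shell, a quick way to confirm whether the rebuild is actually moving, rather than trusting the averaged figure in Storage Manager, is to watch the md sysfs entries. A minimal sketch, assuming the data array is md1 as in the output further down:

cat /proc/mdstat                       # overall recovery line for md1
cat /sys/block/md1/md/sync_action      # reads "recover" while a rebuild is running, "idle" if it has stopped
cat /sys/block/md1/md/sync_completed   # sectors done / sectors total; if this stops changing, the rebuild is stalled
cat /sys/block/md1/md/sync_speed       # current speed in KB/s, not the long-term average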

In the output of [~] # cat /var/log/storage_lib.log | more
I found some entries, but I can't figure out whether they are related to my problem:

2025-10-13 15:21:14 [ 6156 disk_manage.cgi] Perform cmd "/sbin/lvs vg1/tp1 2>>/dev/null -o segtype --noheadings | /bin/grep 'tier-thin-pool' &>/dev/null 2>>/dev/null" failed, cmd_rsp=256, reason code:1.
2025-10-13 15:21:14 [ 6156 disk_manage.cgi] Perform cmd "/sbin/lvs vg1/tp1 -o lv_attr --noheadings 2>>/dev/null | /bin/sed s/^[[:space:]]*// 2>>/dev/null" OK, cmd_rsp=0, reason code:0.
2025-10-13 15:21:14 [ 6156 disk_manage.cgi] Perform cmd "/sbin/lvs vg1/tp1 -o lv_attr --noheadings 2>>/dev/null | /bin/sed s/^[[:space:]]*// 2>>/dev/null" OK, cmd_rsp=0, reason code:0.

… and more.

NAS info:
Model: TS-469U
CPU: Intel(R) Atom™ CPU D2701 @ 2.13GHz
Memory: 2.93 GB
Firmware: 4.3.4.2814 Build 20240618
Disks:
Disk 1: TOSHIBA HDWT860 (SATA) 5589.03 GB, FW: KQ0C1L
Disk 2: TOSHIBA HDWT860 (SATA) 5589.03 GB, FW: KQ0C1L
Disk 3: WD30EZRX-00SPEB0 (SATA) 2794.52 GB, FW: 80.00A80
Disk 4: WD30EFZX-68AWUN0 (SATA) 2794.52 GB, FW: 81.00B81

System info:

[~] # pvs
  PV         VG   Fmt  Attr PSize PFree
  /dev/md1   vg1  lvm2 a--  5.44t    0
[~] # mdadm -D /dev/md1
/dev/md1:
        Version : 1.0
  Creation Time : Wed Aug 12 13:54:11 2015
     Raid Level : raid10
     Array Size : 5840623232 (5570.05 GiB 5980.80 GB)
  Used Dev Size : 2920311616 (2785.03 GiB 2990.40 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Mon Oct 13 14:24:06 2025
          State : clean, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : near=2
     Chunk Size : 64K

 Rebuild Status : 29% complete

           Name : 1
           UUID : 49fe7ca6:8cabaaa6:b7d566cb:98e22f1a
         Events : 144869

 Number   Major   Minor   RaidDevice State
    6       8        3        0      spare rebuilding   /dev/sda3
    5       8       19        1      active sync set-B  /dev/sdb3
    2       8       35        2      active sync set-A  /dev/sdc3
    4       8       51        3      active sync set-B  /dev/sdd3
[~] # md_checker

Welcome to MD superblock checker (v1.4) - have a nice day~

Scanning system...

HAL firmware detected!
Scanning Enclosure 0...

RAID metadata found!
UUID:           49fe7ca6:8cabaaa6:b7d566cb:98e22f1a
Level:          raid10
Devices:        4
Name:           md1
Chunk Size:     64K
md Version:     1.0
Creation Time:  Aug 12 13:54:11 2015
Status:         ONLINE (md1) [_UUU]
=========================================================================
 Disk | Device | # | Status |   Last Update Time   | Events | Array State
=========================================================================
   1  /dev/sda3  0  Rebuild   Oct 13 14:28:55 2025   144877   AAAA
   2  /dev/sdb3  1   Active   Oct 13 14:28:55 2025   144877   AAAA
   3  /dev/sdc3  2   Active   Oct 13 14:28:55 2025   144877   AAAA
   4  /dev/sdd3  3   Active   Oct 13 14:28:55 2025   144877   AAAA
=========================================================================

Basically, I'm stuck; I don't know what to do to push it past the 29.9% barrier. Please help!
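
One more thing I plan to check is the kernel log while the rebuild is stalled, in case the stall is caused by a read error on one of the existing drives rather than by the new disk (just a guess on my part; the grep pattern below is only an example):

dmesg | tail -n 100                                # most recent kernel messages during the stall
dmesg | grep -i -E 'ata|i/o error|medium error'    # look for SATA link resets or unreadable sectors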

I also have the report generated with Patrick Wilson's script:

[~] # /tmp/nasreport
*********************
** QNAP NAS Report **
*********************

NAS Model:   TS-469U
Firmware:    4.3.4 Build 20240618
System Name: JIUL-NAS
Workgroup:   credit

Default Gateway Device: eth0

          inet addr:192.168.0.10  Bcast:192.168.0.127  Mask:255.255.255.128
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:48800 errors:0 dropped:2784 overruns:0 frame:0
          TX packets:62702 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:11613754 (11.0 MiB)  TX bytes:49596701 (47.2 MiB)
          Interrupt:16 Memory:c0100000-c0120000


DNS Nameserver(s):1.1.1.1
208.67.222.222


HDD Information:

 Model=TOSHIBA HDWT860                         , FwRev=KQ0C1L  , SerialNo=        Z4E2S1MCSRQL
 Model=TOSHIBA HDWT860                         , FwRev=KQ0C1L  , SerialNo=        Z4E2S1M9SRQL
 Model=WDC WD30EZRX-00SPEB0                    , FwRev=80.00A80, SerialNo=     WD-WCC4E2EC80Z7
 Model=WDC WD30EFZX-68AWUN0                    , FwRev=81.00B81, SerialNo=     WD-WX62D325CKDY
 Model=        , FwRev=, SerialNo=914EE000CA309FEE

Disk Space:

Filesystem                Size      Used Available Use% Mounted on
none                    200.0M    181.8M     18.2M  91% /
devtmpfs                  1.5G      8.0K      1.5G   0% /dev
tmpfs                    64.0M    332.0K     63.7M   1% /tmp
tmpfs                     1.5G      4.0K      1.5G   0% /dev/shm
tmpfs                    16.0M         0     16.0M   0% /share
/dev/md9                509.5M    139.9M    369.6M  27% /mnt/HDA_ROOT
cgroup_root               1.5G         0      1.5G   0% /sys/fs/cgroup
/dev/md13               371.0M    344.8M     26.2M  93% /mnt/ext
tmpfs                     1.0M         0      1.0M   0% /mnt/rf/nd
tmpfs                    64.0M      1.9M     62.1M   3% /samba
/dev/mapper/ce_cachedev1
                          5.4T      3.1T      2.2T  58% /share/CE_CACHEDEV1_DATA
tmpfs                    16.0M     36.0K     16.0M   0% /share/CE_CACHEDEV1_DATA/.samba/lock/msg.lock
tmpfs                    16.0M         0     16.0M   0% /mnt/ext/opt/samba/private/msg.sock

Mount Status:

none on /new_root type tmpfs (rw,mode=0755,size=204800k)
/proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw)
sysfs on /sys type sysfs (rw)
tmpfs on /tmp type tmpfs (rw,size=64M)
tmpfs on /dev/shm type tmpfs (rw)
tmpfs on /share type tmpfs (rw,size=16M)
none on /proc/bus/usb type usbfs (rw)
/dev/md9 on /mnt/HDA_ROOT type ext3 (rw,data=ordered)
cgroup_root on /sys/fs/cgroup type tmpfs (rw)
cpu on /sys/fs/cgroup/cpu type cgroup (rw,cpu)
/dev/md13 on /mnt/ext type ext4 (rw,data=ordered,barrier=1,nodelalloc)
none on /sys/kernel/config type configfs (rw)
tmpfs on /mnt/rf/nd type tmpfs (rw,size=1m)
nfsd on /proc/fs/nfsd type nfsd (rw)
tmpfs on /samba type tmpfs (rw,size=64M)
/dev/mapper/ce_cachedev1 on /share/CE_CACHEDEV1_DATA type ext4 (rw,usrjquota=aquota.user,jqfmt=vfsv0,user_xattr,data=ordered,data_err=abort,delalloc,nopriv,nodiscard,acl)
tmpfs on /share/CE_CACHEDEV1_DATA/.samba/lock/msg.lock type tmpfs (rw,size=16M)
tmpfs on /mnt/ext/opt/samba/private/msg.sock type tmpfs (rw,size=16M)

RAID Status:

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md1 : active raid10 sda3[6] sdd3[4] sdc3[2] sdb3[5]
                 5840623232 blocks super 1.0 64K chunks 2 near-copies [4/3] [_UUU]
                 [=====>...............]  recovery = 29.9% (874372532/2920311616) finish=46246.2min speed=737K/sec

md322 : active raid1 sdd5[3](S) sdc5[2](S) sdb5[1] sda5[0]
                 7235136 blocks super 1.0 [2/2] [UU]
                 bitmap: 0/1 pages [0KB], 65536KB chunk

md256 : active raid1 sdd2[3](S) sdc2[2](S) sdb2[1] sda2[0]
                 530112 blocks super 1.0 [2/2] [UU]
                 bitmap: 0/1 pages [0KB], 65536KB chunk

md13 : active raid1 sda4[26] sdd4[24] sdc4[2] sdb4[25]
                 458880 blocks super 1.0 [24/4] [UUUU____________________]
                 bitmap: 1/1 pages [4KB], 65536KB chunk

md9 : active raid1 sda1[26] sdd1[24] sdc1[2] sdb1[25]
                 530112 blocks super 1.0 [24/4] [UUUU____________________]
                 bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>

Memory Information:

MemTotal:        3072244 kB
MemFree:          216660 kB

NASReport completed on 2025-10-13 14:41:15 (-sh)

Also, some activity from manaRequest.cgi, which I've seen doing CPU work in Process Manager:

2025-10-13 15:21:06 [ 5813 manaRequest.cgi] md_get_status: /dev/md1 : status=6, progress=29.900000.
2025-10-13 15:21:06 [ 5813 manaRequest.cgi] Perform cmd "/bin/df -k /share/CE_CACHEDEV1_DATA 2>>/dev/null | /usr/bin/tail -n1 | /bin/awk -F ' ' '{print $(NF-3)}'" OK, cmd_rsp=0, reason code:0.
2025-10-13 15:21:06 [ 5813 manaRequest.cgi] Perform cmd "/bin/df -k /share/CE_CACHEDEV1_DATA 2>>/dev/null | /usr/bin/tail -n1 | /bin/awk -F ' ' '{print $(NF-2)}'" OK, cmd_rsp=0, reason code:0.
2025-10-13 15:21:06 [ 5813 manaRequest.cgi] md_get_status: /dev/md1 : status=6, progress=29.900000.
2025-10-13 15:21:09 [ 5939 manaRequest.cgi] Volume_Enumerate_Data_Volumes: got called with (0xea1800, 772).
2025-10-13 15:21:09 [ 5939 manaRequest.cgi] Blk_Dev_Generate_Mount_Point: mount point for "/dev/mapper/ce_cachedev1" is "/share/CE_CACHEDEV1_DATA", is_internal is 1.
2025-10-13 15:21:10 [ 6051 disk_manage.cgi] Blk_Dev_Generate_Mount_Point: mount point for "/dev/mapper/ce_cachedev1" is "/share/CE_CACHEDEV1_DATA", is_internal is 1.
2025-10-13 15:21:13 [ 6107 manaRequest.cgi] md_get_status: /dev/md1 : status=6, progress=29.900000.
2025-10-13 15:21:13 [ 6107 manaRequest.cgi] RAID_Get_Default_Parameters:raid level=10, readAhead=0, stripeCacheSize=0, speedLimitMax=0, speedLimitMin=0, ret=0.
2025-10-13 15:21:13 [ 6107 manaRequest.cgi] RAID_Get_Config_Parameters:raid_id=1, readAhead=0, stripeCacheSize=0, speedLimitMax=0, speedLimitMin=50000, ret=0.
2025-10-13 15:21:13 [ 6107 manaRequest.cgi] NAS_Secure_Erase_Get_Progress 189 enc_id=0 port_id=1
2025-10-13 15:21:13 [ 6107 manaRequest.cgi] NAS_Secure_Erase_Get_Progress 203 0 progress=0.00
2025-10-13 15:21:13 [ 6107 manaRequest.cgi] NAS_Secure_Erase_Get_Progress 189 enc_id=0 port_id=2
2025-10-13 15:21:13 [ 6107 manaRequest.cgi] NAS_Secure_Erase_Get_Progress 203 0 progress=0.00
2025-10-13 15:21:13 [ 6107 manaRequest.cgi] NAS_Secure_Erase_Get_Progress 189 enc_id=0 port_id=3
2025-10-13 15:21:13 [ 6107 manaRequest.cgi] NAS_Secure_Erase_Get_Progress 203 0 progress=0.00
2025-10-13 15:21:13 [ 6107 manaRequest.cgi] NAS_Secure_Erase_Get_Progress 189 enc_id=0 port_id=4
2025-10-13 15:21:13 [ 6107 manaRequest.cgi] NAS_Secure_Erase_Get_Progress 203 0 progress=0.00
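
For reference, the speedLimitMin=50000 in the lines above seems to correspond to the kernel's minimum rebuild speed setting; the live values can be read straight from procfs (as far as I understand, raising the floor only helps when the rebuild is throttled by normal I/O, not when it is stuck on a bad read):

cat /proc/sys/dev/raid/speed_limit_min   # minimum rebuild speed md tries to maintain, in KB/s per device
cat /proc/sys/dev/raid/speed_limit_max   # upper limit for rebuild speed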

Is it possible to upload logs to the forum?

Not sure why you cold swapped the disks, but you have two busted disks in the same mirror pair… Game over for the data: kill the array and rebuild it from backups.

I did a cold swap because I didn't think mid-range QNAP products had hot-swap support.
The data is fine, thanks for the concern.

All QNAP products with externally accessible bays are hot-swap capable.

If the data is fine, just start your volume from scratch then.

Thanks! Can you provide instructions on how to force the rebuild process on Disk 1?

Hi, based on our experience, the situation you’ve run into (getting stuck during a RAID rebuild) is usually caused by another hard drive (HDD) having an issue.

Please open a support ticket with us, and our Support Team will be happy to assist you. Thanks!
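
For what it's worth, one way to check whether another member drive is the problem is to look at the SMART counters on the remaining disks while the rebuild is stalled. A minimal sketch, assuming smartctl is available on the NAS and the device names match the md_checker output above:

smartctl -A /dev/sdb        # check each remaining member; watch Current_Pending_Sector and Reallocated_Sector_Ct
smartctl -A /dev/sdc
smartctl -A /dev/sdd
smartctl -l error /dev/sdb  # recent ATA error log entries, if any; repeat for sdc and sdd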
