QuTS hero h6.0 Beta Testing - System Stall Root Cause Identified — Qsirch

During my QuTS hero h6.0 Beta testing, I ran into a severe system stall that looked like a ZFS meltdown:

  • Load averages spiked into the hundreds

  • SSH commands froze (ps, find, getcfg)

  • Processes got stuck in D‑state

  • The system became partially unresponsive

  • ZFS pools appeared to hang intermittently

  • Kernel logs showed [DISK SLOW] warnings and ZFS assertions

Initially, it looked like a failing disk or a ZFS bug. After a full investigation, the root cause turned out to be much simpler:

Removing Qsirch immediately eliminated the issue.

What Happened

Qsirch began a full indexing cycle after reboot. On QuTS hero (ZFS), Qsirch’s indexing engine generates extremely heavy metadata I/O:

  • deep directory walks

  • small random reads

  • SQLite/MariaDB writes

  • thumbnail extraction

  • content scanning

This load overwhelmed one of my pools, causing:

  • multi‑second disk latency

  • ZFS transaction delays

  • kernel I/O wait buildup

  • processes stuck in D‑state

  • system‑wide stalls

Once Qsirch was removed, the system instantly stabilized:

  • ZFS pools returned to normal latency

  • No more [DISK SLOW] warnings

  • No more kernel stalls

  • No more frozen commands

  • Load average dropped back to normal

No other changes were required.
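
For anyone who wants to check the D-state symptom on their own system, here is a minimal sketch. It assumes a standard Linux /proc filesystem and that a Python 3 interpreter is available on the NAS, neither of which is confirmed in this thread. The function names are my own, for illustration only. It lists processes whose state field is "D" (uninterruptible sleep), so you can compare the count before and after disabling a suspect app.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the text of one /proc/<pid>/stat file."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' instead of on whitespace.
    left, _, right = line.rpartition(")")
    pid_text, _, comm = left.partition("(")
    state = right.split()[0]
    return int(pid_text), comm, state


def list_d_state(proc_root="/proc"):
    """Return [(pid, comm), ...] for processes currently in state 'D'."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were reading it
        if state == "D":
            stuck.append((pid, comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in list_d_state():
        print(pid, comm)
```

A handful of short-lived D-state entries is normal on a busy system. A list that keeps growing and never drains is what matches the stall described above.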

Key Takeaway

On QuTS hero h6.0 Beta, Qsirch can overwhelm ZFS pools, especially large or busy ones. If you experience:

  • unexplained system stalls

  • frozen processes

  • high load with low CPU usage

  • ZFS latency warnings

Try disabling or removing Qsirch first.

For my system, removing Qsirch completely resolved the issue.
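
If you want to check for the same kernel warnings, the sketch below filters the kernel log for the literal marker text [DISK SLOW] quoted earlier in this post. It assumes the dmesg command is available and readable over SSH, and it makes no assumption about the rest of the log line format, which is not shown in this thread. The function names are hypothetical.

```python
import subprocess

MARKER = "[DISK SLOW]"  # literal text from the warnings mentioned in this post


def find_marker_lines(log_text, marker=MARKER):
    """Return every line of log_text that contains the marker substring."""
    return [line for line in log_text.splitlines() if marker in line]


def read_kernel_log():
    """Return dmesg output, or an empty string if dmesg cannot be run."""
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    hits = find_marker_lines(read_kernel_log())
    print(f"{len(hits)} kernel log line(s) contain {MARKER}")
    for line in hits[-10:]:  # show only the most recent matches
        print(line)
```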

I forgot to mention earlier: one of the 2.5-inch SSDs failed immediately after the incident. It may be related to the underlying issue, but I’m still validating. Adding this for completeness.

Qsirch is extremely CPU intensive initially, as it will index every single file on your NAS. This is not a fault, just the nature of the beast. When first setting up a NAS, it is best to let everything get set up and settled down before enabling Qsirch. In addition, you can set folders to exclude from Qsirch’s indexing. For example, there is likely no need to include Container Station data or Virtual Machine drives in Qsirch.

Once Qsirch is done with the indexing, it takes minimal CPU and resource time. But it is exceptionally powerful in searching for stuff on the NAS.

Did this issue only start occurring after upgrading to version 6.0? Was everything working normally in the previous versions? Thank you!

I was using QTS before and had no issues.

Thanks for explaining that. I really appreciate you sharing the details about how Qsirch behaves during the initial indexing. I honestly wish I had known this earlier.

In my case, my NAS ended up running for about five days in an unreachable state, and eventually the SSD died because of bad sectors. Before that, I had a similar situation where the system was unreachable for three days until I reset and reconfigured everything again.

I just enabled Qsirch again, and the NAS immediately became unreachable. This is the same behavior I saw earlier—long periods of unresponsiveness, and in my previous case it even contributed to an SSD failing due to bad sectors after days of nonstop load.

As I said, if you have just done something significant, such as a major OS update like this one, it’s likely that there’s a LOT going on under the hood even if you don’t think there is. Qsirch just adds to all this until the NAS settles down.

If you want to see actual usage and see what is gumming things up, open an SSH session and run the “top” command. It will show you the system load average and the CPU consumption of each process. The load average is the critical number. It should be less than or equal to the number of cores/threads in your CPU. If it’s higher, then things will start to get sluggish.
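
The rule of thumb above can be sketched in a few lines. This assumes a Python 3 interpreter is available on the NAS (not confirmed in this thread). The helper names are made up for illustration. Note that on Linux the load average also counts tasks in uninterruptible (D-state) sleep, so a high load with low CPU usage usually points to I/O wait rather than CPU work.

```python
import os


def load_ratio(load_avg, cpu_count):
    """Load average divided by logical CPU count; above 1.0 means work is queuing."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_avg / cpu_count


def is_overloaded(load_avg, cpu_count):
    """Apply the rule of thumb: load should not exceed the number of cores/threads."""
    return load_ratio(load_avg, cpu_count) > 1.0


if __name__ == "__main__":
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() can return None
    print(f"load 1/5/15 min: {one_min:.2f} {five_min:.2f} {fifteen_min:.2f}")
    print(f"logical CPUs: {cores}")
    print("overloaded" if is_overloaded(one_min, cores) else "within capacity")
```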

Hi, we’ve tried reproducing this issue internally but haven’t been able to see the same results.

Just to clarify, Qsirch on QuTS hero 6.0.0 beta is optimized for better performance, so CPU usage might be slightly higher than previous versions. However, we haven’t encountered the specific situation you’re describing.

To help us investigate further, could you share some details about your data?

  • What is the total size of your files?

  • How many videos and photos do you have?

  • What is the total capacity currently being used?

Thanks!

Hi,

Thanks for the follow-up. I appreciate the clarification regarding Qsirch optimization in QuTS hero 6.0.0 beta.

To help with your investigation, here are the requested details:

  • Total size of files indexed by Qsirch via Multimedia Console: Approximately 23.13 TB, spanning two storage pools.

  • Photos: 893,365

  • Videos: 11,917

  • Music: 0 (indexed so far)

Please note: this only reflects the subset of files added to the Multimedia Console. My full dataset is nearly double that size, including a substantial music library that hasn’t yet been indexed.

  • Total capacity currently being used:

    • Storage Pool 1: 12.52 TB used out of 21.81 TB

    • Storage Pool 2: 10.61 TB used out of 14.13 TB
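
As a quick arithmetic check on the figures above, here is a small sketch that converts the used/total numbers into fill percentages. The dictionary and function names are just for illustration; the values are the ones listed in this post.

```python
# Used and total capacity in TB, copied from the figures above.
POOLS = {
    "Storage Pool 1": (12.52, 21.81),
    "Storage Pool 2": (10.61, 14.13),
}


def percent_used(used_tb, total_tb):
    """Return the fill level as a percentage, rounded to one decimal place."""
    return round(100 * used_tb / total_tb, 1)


def total_used(pools):
    """Return the combined used capacity in TB, rounded to two decimals."""
    return round(sum(used for used, _ in pools.values()), 2)


if __name__ == "__main__":
    for name, (used, total) in POOLS.items():
        print(f"{name}: {percent_used(used, total)}% used")
    print(f"Combined used: {total_used(POOLS)} TB")
```

The combined used capacity works out to 23.13 TB, which matches the indexed total reported above.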

Let me know if you’d like system specs, CPU usage graphs, or logs to help reproduce the load profile more accurately. Happy to assist further.

Best regards, Victor Lam

In addition, my QNAP AI Core is actively running multiple recognition workloads. AI Core is enabled, and I am using a QAI‑U100 accelerator.

  • Facial Recognition: Working — 52% (470,880 / 893,637) Last update: 2025/12/31 09:59:20

  • Object Recognition: Working — 55% (496,770 / 893,637) Last update: 2025/12/31 09:59:45

  • Similar Photo Recognition: Working — 54% (483,804 / 893,637) Last update: 2025/12/31 09:58:42

One more update regarding the issue:

My NAS became unresponsive again even without Qsirch installed. At the time of the failure, the only active process was Multimedia Console indexing photos and videos. The system eventually stopped responding completely, and I had to reset the NAS and physically remove all hard drives to restore the system.

After rebuilding, I am now allowing only photos to be indexed. Under this reduced workload, the NAS has been running stably for about two days.

Hi, thanks for providing the info! We’d like to try reproducing the issue on our end.

Could you let us know the file formats for your photos and videos (e.g., .jpg, .heic, .mp4)? Also, what is the average size of each photo?

I noticed you mentioned AI Core—do you have any AI features currently enabled? Thanks!