Does F2FS Freeze an HC620 SMR Drive? Linux SMR Disk Troubleshooting Guide

Why HC620 Host-managed SMR drives may show high I/O wait and system freezes under F2FS, plus practical mount options, scheduler tuning, GC limits, and filesystem alternatives.

When an HC620 helium-filled SMR drive is used with F2FS, symptoms such as system freezes, unresponsive applications, and sustained high iowait are usually not caused by one bad option. They are the result of device behavior colliding with filesystem policy.

The Western Digital Ultrastar DC HC620 is a host-managed SMR drive. It is better suited to sequential writes, zoned-aware workloads, and software stacks that understand the device constraints. F2FS is a log-structured filesystem designed for flash storage. Although it can turn many random writes into sequential writes, heavy garbage collection, metadata updates, or low free space can still push a mechanical SMR drive into long internal maintenance cycles.

Confirm the problem first

Start with these checks:

iostat -x 1
iotop -oPa
dmesg -T | grep -Ei "f2fs|blk|zoned|reset|timeout|I/O error"

If disk %util stays close to 100%, await is high, and many processes are stuck in D state, the bottleneck is probably block I/O.
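To confirm the D-state symptom directly, the blocked processes can be listed with standard procps tools (the column widths here are just a readable example):

```shell
# List processes in uninterruptible sleep (D state) - the usual sign of
# tasks stuck waiting on block I/O. NR==1 keeps the header row.
ps -eo pid,stat,wchan:32,cmd | awk 'NR==1 || $2 ~ /^D/'
```

If the same processes stay in D state across several samples while %util is pinned, the stall is in the block layer or the device, not in the applications themselves.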

Then check whether the drive is exposed as a zoned device:

lsblk -o NAME,MODEL,SIZE,ROTA,ZONED,SCHED,MOUNTPOINTS
cat /sys/block/sdX/queue/zoned

If it is Host-managed SMR, ordinary filesystems and random-write workloads may perform poorly. Unlike consumer drive-managed SMR disks, this class depends more on host software understanding the write rules.
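If the kernel does report the device as zoned, util-linux's blkzone tool can show the zone layout. This is a read-only sketch; sdX is a placeholder for the real device name:

```shell
# Inspect the first few zones of a zoned block device (requires util-linux).
sudo blkzone report /dev/sdX | head -5

# Host-managed drives expose mostly sequential-write-required zones:
sudo blkzone report /dev/sdX | grep -c "SEQ_WRITE_REQUIRED"
```

A drive that is almost entirely sequential-write-required zones will not tolerate a random-write workload without host-side help.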

Why F2FS can amplify the stall

SMR drives cannot overwrite arbitrary locations as freely as CMR disks can. Shingled tracks overlap to increase areal density, so a write to one track can disturb its neighbors. When writes become random, overwrites are frequent, or the drive's internal cache is exhausted, the drive must perform additional data movement and cleanup.

F2FS was built for NAND flash. It uses log-structured writes and reclaims space through segment cleaning and garbage collection. On SSDs this is natural because there is no mechanical seek. On mechanical disks, especially SMR disks, GC-related reads and writes can turn into severe tail latency.

When F2FS background GC, foreground writes, checkpoints, metadata updates, and the drive’s own SMR cleanup overlap, the I/O queue can stay saturated for a long time. From user space, copying files, deleting directories, downloading, extracting archives, or database writes may make the whole system feel frozen.

Start with conservative mount options

If you cannot migrate immediately, first adjust /etc/fstab:

UUID=xxxx  /data  f2fs  defaults,nodiscard,active_logs=2,gc_merge,flush_merge,lazytime  0  0

What these options do:

  • nodiscard: disables real-time discard. Mechanical disks usually do not need frequent TRIM/discard behavior like SSDs.
  • active_logs=2: F2FS supports 2, 4, or 6 active logs, and the default is commonly 6. Reducing it to 2 can reduce seek pressure from concurrent logs.
  • gc_merge: lets the background GC thread handle some foreground GC requests, reducing stalls when a process triggers slow GC.
  • flush_merge: merges cache flush requests, which can help when the device handles flush slowly.
  • lazytime: reduces metadata writes caused by some access time updates.

Do not treat checkpoint=disable as a normal tuning switch. It may reduce checkpoint pressure, but it increases risk after crashes or power loss. Kernel documentation also notes that the filesystem still needs GC while checkpoint is disabled to ensure usable space. Unless you understand the tradeoff clearly, do not use it as a long-term performance fix.
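The updated options can usually be applied without a reboot, though some F2FS options may only take effect after a full umount/mount cycle rather than a remount. The mount point /data follows the fstab example above:

```shell
# Apply the updated fstab options to the live mount.
sudo mount -o remount /data

# Verify which options are actually in effect:
grep ' /data ' /proc/mounts
```

Always confirm against /proc/mounts rather than trusting fstab; a silently ignored option is easy to miss otherwise.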

Tune the I/O scheduler

Mechanical disks and SMR disks often benefit from request merging and latency control. Check the current scheduler:

cat /sys/block/sdX/queue/scheduler

Try switching to mq-deadline:

echo mq-deadline | sudo tee /sys/block/sdX/queue/scheduler

For desktop interaction, bfq is also worth testing. Do not look only at sequential throughput. Watch whether freezes are reduced, await drops, and the system feels more responsive.
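The echo above does not survive a reboot. One common way to make the choice persistent is a udev rule; this is a sketch that matches all rotational SATA disks, so narrow the match (for example by ID_SERIAL) if other drives should keep a different scheduler. The file path is an example:

```shell
# Persist the scheduler choice for rotational disks via udev.
sudo tee /etc/udev/rules.d/60-smr-scheduler.rules <<'EOF'
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="mq-deadline"
EOF

# Reload rules and re-trigger so the setting applies immediately.
sudo udevadm control --reload && sudo udevadm trigger
```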

Limit F2FS background GC

The F2FS sysfs path depends on the actual device name. Check it first:

ls /sys/fs/f2fs/

Then adjust the GC interval for the matching device:

echo 60000 | sudo tee /sys/fs/f2fs/sdX/gc_min_sleep_time
echo 120000 | sudo tee /sys/fs/f2fs/sdX/gc_max_sleep_time

Here sdX is only an example. The actual name may be sda1, dm-0, or something else. Increasing GC sleep time reduces how often background GC competes for I/O, but space reclaim becomes slower. If the disk is nearly full, foreground GC may still be triggered, so keep enough free space.
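These sysfs values also reset on reboot. A small tmpfiles.d fragment can rewrite them at boot; the path and device directory below are examples, and sdX must be replaced with the real entry shown by `ls /sys/fs/f2fs/`:

```
# /etc/tmpfiles.d/f2fs-gc.conf (example path)
# "w" writes the given value into the file at boot.
w /sys/fs/f2fs/sdX/gc_min_sleep_time - - - - 60000
w /sys/fs/f2fs/sdX/gc_max_sleep_time - - - - 120000
```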

Better long-term options

If the drive stores important data, the safest long-term answer is to back up and change the filesystem, or use a more suitable drive.

For large mechanical disks, consider:

  • XFS: good for large files, backup drives, media libraries, archives, and sequential-write workloads.
  • EXT4: stable behavior, broad compatibility, and abundant troubleshooting material.
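A minimal migration sketch, assuming the data fits on a separate backup location and that the device and mount paths below are placeholders:

```shell
# 1. Copy everything off the F2FS volume, preserving attributes.
rsync -aHAX --info=progress2 /data/ /mnt/backup/

# 2. Re-create the filesystem as XFS (destroys existing data on sdX1).
sudo umount /data
sudo mkfs.xfs -f /dev/sdX1

# 3. Mount and copy the data back.
sudo mount /dev/sdX1 /data
rsync -aHAX --info=progress2 /mnt/backup/ /data/
```

Verify the backup copy before running mkfs; step 2 is irreversible.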

If the drive is Host-managed SMR, also confirm that your kernel, controller, filesystem, and application stack truly support zoned block devices. Otherwise, using it like a normal random-write disk can lead to unpredictable long stalls.
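If you do want a conventional filesystem on a host-managed drive, one option is the kernel's dm-zoned device-mapper target, which emulates a random-write block device on top of the zones. This sketch assumes the dm-zoned-tools package is installed and sdX is the zoned drive:

```shell
# Prepare the zoned drive for dm-zoned, then expose the mapped device.
sudo dmzadm --format /dev/sdX
sudo dmzadm --start /dev/sdX

# A regular filesystem (ext4/XFS) can then be created on the
# resulting /dev/mapper/ device.
```

This trades some capacity and throughput for compatibility, so it is a workaround rather than a replacement for a workload that actually fits SMR.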

Practical advice

This class of disk is better suited to cold data, archives, backups, media files, and sequential writes. It is a poor fit for download caches, container images, VM disks, databases, frequent archive extraction, and small-file random writes.

If you must keep using F2FS, at least do this:

  • Disable real-time discard.
  • Use active_logs=2 to reduce concurrent logs.
  • Enable gc_merge and flush_merge.
  • Keep plenty of free space.
  • Avoid placing downloads, databases, and VM images on this disk.
  • Watch iostat -x 1, not just average speed.

In short, HC620 + F2FS freezes are the result of SMR write constraints, F2FS GC, and mechanical disk tail latency stacking together. Short-term mitigation is mount-option tuning, scheduler tuning, and background GC limits. The long-term fix is to migrate to XFS/EXT4, or use the SMR drive only for workloads it actually suits.
