USB file transfer causes system freezes; ops take hours instead of minutes

Bug #500069 reported by sbec67
654
This bug affects 143 people
Affects Status Importance Assigned to Milestone
Linux
Fix Released
High
linux (Fedora)
Won't Fix
High
linux (Ubuntu)
Fix Released
High
Unassigned
Nominated for Lucid by Mudstone

Bug Description

USB Drive is a MP3 Player 2GB

sbec@Diamant:~$ lsusb
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 002: ID 046d:c50e Logitech, Inc. MX-1000 Cordless Mouse Receiver
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 004: ID 0402:5661 ALi Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
sbec@Diamant:~$

Linux Diamant 2.6.31-15-generic #50-Ubuntu SMP Tue Nov 10 14:54:29 UTC 2009 i686 GNU/Linux
Ubuntu 2.6.31-15.50-generic

to test, i issued dd command:
dd if=/dev/zero of=/media/usb-disk/test-file bs=32

while dd is running i run dstat.... this is in the log file attached.

other logs are also in the tar.gz file...

there is a huge USB performance Bug report #1972262. this Report is something simular

Revision history for this message
sbec67 (sbec) wrote :
Revision history for this message
Dexter (pogany-tamas+bug) wrote :

I have this problem too, check my logs. My USB 2.0 pendrive's speed is about 3-5 MB/s. Soooo slow.

Revision history for this message
sbec67 (sbec) wrote :

did someone took a look on this ?
This Bug is really annoying, as it takes ages to bring some Data on a USB Device ;-(

Regards

Changed in linux (Ubuntu):
status: New → In Progress
assignee: nobody → Colin King (colin-king)
Revision history for this message
Colin Ian King (colin-king) wrote :

@sbec67, can you run the following command to collect all the system specific information about your computer to help us diagnose this bug:

apport-collect 500069

please can you supply the model number of the pendrive and if you are using it via a USB hub or not.

Thanks!

Changed in linux (Ubuntu):
importance: Undecided → Medium
importance: Medium → High
Andy Whitcroft (apw)
tags: added: kernel-series-unknown
Revision history for this message
sbec67 (sbec) wrote :

@Colin: I entered "sudo apport-collect 500069"
the Device i facing the Problem is a Tech Line DFS-1002 MP3 Player.
But it seems the Problem happens with almost any Flash Memory USB Device.

@Andy: GRUB uses ( out of /boot/gruub/menu.lsi ) this Kernel Parm to boot:

kernel /boot/vmlinuz-2.6.31-17-generic root=UUID=96f8feb3-e65b-4696-b8f
5-c32638cde861 ro quiet splash

Kind regards
Simon

Revision history for this message
Cklein (pablo-cascon) wrote : apport-collect data

AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC272 Analog [ALC272 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: ALC272 Analog [ALC272 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: pablo 1645 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xf0340000 irq 22'
   Mixer name : 'Realtek ALC272'
   Components : 'HDA:10ec0272,144dca00,00100001'
   Controls : 19
   Simple ctrls : 12
DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=UUID=c1938014-8689-4db3-add3-99b0c514b946
MachineType: SAMSUNG ELECTRONICS CO., LTD. NC10
Package: linux (not installed)
ProcCmdLine: root=UUID=5f964d43-b8cd-4886-a6e5-80376a5eb7b5 ro quiet splash
ProcEnviron:
 SHELL=/bin/bash
 LANG=es_ES.UTF-8
ProcVersionSignature: Ubuntu 2.6.31-17.54-generic
RelatedPackageVersions:
 linux-backports-modules-2.6.31-17-generic N/A
 linux-firmware 1.25
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
Tags: ubuntu-unr
Uname: Linux 2.6.31-17-generic i686
UserGroups: adm admin cdrom dialout fuse lpadmin plugdev sambashare
WpaSupplicantLog:

dmi.bios.date: 09/08/2009
dmi.bios.vendor: Phoenix Technologies Ltd.
dmi.bios.version: 11CA.M015.20090908.RHU
dmi.board.name: NC10
dmi.board.vendor: SAMSUNG ELECTRONICS CO., LTD.
dmi.board.version: Not Applicable
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: SAMSUNG ELECTRONICS CO., LTD.
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnPhoenixTechnologiesLtd.:bvr11CA.M015.20090908.RHU:bd09/08/2009:svnSAMSUNGELECTRONICSCO.,LTD.:pnNC10:pvrNotApplicable:rvnSAMSUNGELECTRONICSCO.,LTD.:rnNC10:rvrNotApplicable:cvnSAMSUNGELECTRONICSCO.,LTD.:ct10:cvrN/A:
dmi.product.name: NC10
dmi.product.version: Not Applicable
dmi.sys.vendor: SAMSUNG ELECTRONICS CO., LTD.

tags: added: apport-collected
16 comments hidden view all 834 comments
Revision history for this message
David Turner (dkturner) wrote : Re: since Ubuntu karmic Filetransfer to some USB Drives got realy slow

I've also experienced very slow USB transfers, with CPU waiters piling up. The odd thing is that USB-to-USB transfers go at about four times the speed of HDD-to-USB transfers. On average I get 2MB/s from hard drive to USB, and about 8MB/s copying between flash drives.

Revision history for this message
sbec67 (sbec) wrote :

its correct.. this Report is a duplicate of #197762
But #197762 has Status "incomplete", and is Medium. & almost 1 1/2 Years old !

I think this Bug should be High.. its realy a Pain in the A..... for many Users.
Regards
Sbec

Revision history for this message
Simon Holm (odie-cs) wrote :

sbec67:
Have you tried your device with another OS with different performance results?

From your lsusb output is does not look like your device is listed? Was it plugged in when you ran lsusb?

If that doesn't turn up anything obvious you should probably see https://help.ubuntu.com/community/DiskPerformance and test whether the raw device performance is better than when formatted. Backup your data on the device first.

dd numbers with bs=1M in addition to the bs=32 would also be interesting.

Revision history for this message
MMS-Prodeia (mms-prodeia) wrote :

I do second this report!

Revision history for this message
sbec67 (sbec) wrote :

Simon Holm Thøgersen:
it was connected, and it also shows on lsusb
Bus 001 Device 004: ID 0402:5661 ALi Corp

its a Bug, witch hit many Users.
Do a google...... just one Talk:
http://ubuntuforums.org/showthread.php?t=1306333

i passed all Data with the apport-collect command.. Unit was connected while doing this command.

i did a other Test with a Sony 4GB USB stick.. there i dont see this Problem...
so it seems that that Problem happens only with certain Flash Drives
Regards

Mudstone (agovernali)
description: updated
Revision history for this message
Mathias Zubek (mathias-zubek) wrote :

Hello,

I have this problem too....
2 Installations: My PC is an Intel MoBo DG965WH; my Laptop is an ThinkpadT60.
Both with Karmnic 9.10 and patches are up to date. Maximum Write speed after transferring more than 150MB slows down to 3,5MB/s.
Adding ehci_hcd to /etc/initramfs-tools/modules as recommended in some posts was not helpful.

Thanks in advance

Revision history for this message
sbec67 (sbec) wrote :

@colin Mins: any new on this, as its assigned to U ?

Rgerads
Sbec

Revision history for this message
Badcam (kiwicameron+launchpad) wrote :

I believe that I have this very same issue on my Mint 8.0 distro. I have two 5 port USB 2.0 hubs and only 1 USB port in either device is showing as a USB 2.0, with all the rest showing 1.1 I've checked the hardware and every port is supposed to be 2.0

Running a "sudo dmesg | grep usb" swwms to show that all the devices are recognised as 2.0 but somehow being limited to 1.1:

[156037.796028] usb 10-2: new full speed USB device using uhci_hcd and address 5
[156037.941105] usb 10-2: not running at top speed; connect to a high speed hub
[156037.979263] usb 10-2: configuration #1 chosen from 1 choice

I have not attached a patch, just my Terminal info.

Revision history for this message
aleth (aleth) wrote :

I also have this issue when writing to a 4Gb USB stick. Transfer rates start high, then slow to a crawl or even stall completely. As they slow down, the whole computer becomes unresponsive for seconds at a time.
On the same machine, the same USB stick is working fine under Windows XP; read access is also no problem.

Revision history for this message
aleth (aleth) wrote :

Just in case, it might be worth pointing out there are many more reports of this problem at http://ubuntuforums.org/showthread.php?t=1306333&page=12

Revision history for this message
André Desgualdo Pereira (desgua) wrote :

This is one workaround until it's got fixed: update the kernel.

Go to http://kernel.ubuntu.com/~kernel-ppa/mainline and choose a recent linux-image file. Download and install.

Changed in linux (Ubuntu):
status: In Progress → Confirmed
Revision history for this message
blahde (daisy-ice) wrote :

@André Desgualdo Pereira:

which kernel exactly do you recommend? Linux 2.6.35-020635rc1-generic does not bring any changes - for me.

Revision history for this message
André Desgualdo Pereira (desgua) wrote :

@blahde

The 2.6.33 kernel has worked with Karmic Koala, but I haven 't tested with Lucid Lynx.

Regards.

Revision history for this message
blahde (daisy-ice) wrote :

@André Desgualdo Pereira:

Thank you. Unfortunately 2.6.33 doesn't make any difference for me either - on Lucid Lynx. Whereas I can confirm that the transfer rate from USB to USB seems to be not affected by this.

Revision history for this message
André Desgualdo Pereira (desgua) wrote :

Sorry I can't help any further.
If I found something I will post here.
Regards.

Revision history for this message
Adam Kulagowski (fidor-fidor) wrote :

I have Sandisk Backup U3. (32G). I've tested it on 4 different computers. I had always writing speed around 3MB/s, which is slow, because this pendrive is capable of doing 16-17MB/s writing speed. The only way I'm able to put files faster is to use dd with bs=64 AND oflag=direct. Using these options I have full writing speed.

dd if=ubuntu-10.04-server-amd64.iso of=/media/FE35-228F/file.bin bs=64k oflag=direct
10840+1 records in
10840+1 records out
710412288 bytes (710 MB) copied, 43,8788 s, 16,2 MB/s

What is also important I can break the copying proccess any time. Without oflag=direct dd ignores Ctrl-C or even kill -9.

This was tested on 10.04 and with similar result on "Recovery is Possible" distro (kernel 2.6.34-git16)

uname -a
Linux fidor 2.6.32-24-generic #38-Ubuntu SMP Mon Jul 5 09:20:59 UTC 2010 x86_64 GNU/Linux

Maybe it will help.

Revision history for this message
Adam Kulagowski (fidor-fidor) wrote :

I've made a typo in my previous comment: you have to specify (at least in my case) bs=64k , not bs=64. Command line example was correct. Any other value, bigger or smaller (32k, 128k, 256k) brings speed down back to 3MB/s.

One more thing. I've found second pen drive which is working correctly (full 5MB/s writing speed). There are some small differences in lsusb between those two. I'm attaching lsusb -v output.

On the working pen drive (Adata) bs doesn't really matter. Of course with bigger block size, you get bigger writing speed up to 128k. Bigger bs than 128k doesnt change anything. I've tested from 256k up to 2048k, still achieving full writing speed.

I'll try to test more USB sticks.

Revision history for this message
FriedChicken (domlyons) wrote :
Changed in linux (Ubuntu):
assignee: Colin King (colin-king) → nobody
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: Confirmed → Won't Fix
Adam Porter (alphapapa)
summary: - since Ubuntu karmic Filetransfer to some USB Drives got realy slow
+ USB file transfer causes system freezes; ops take hours instead of
+ minutes
Brad Figg (brad-figg)
tags: added: precise
Changed in linux (Ubuntu):
status: Won't Fix → Confirmed
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.2.0-12.20
Adam Porter (alphapapa)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: bot-stop-nagging.
Changed in linux:
importance: Unknown → High
status: Unknown → Confirmed
tags: added: kernel-fixed-upstream
Changed in linux (Ubuntu):
status: Confirmed → Triaged
Changed in linux:
status: Confirmed → Fix Released
penalvch (penalvch)
Changed in linux (Ubuntu):
status: Triaged → Incomplete
754 comments hidden view all 834 comments
Revision history for this message
In , vitaly.v.ch (vitaly.v.ch-linux-kernel-bugs) wrote :

I have successfully reproduce this bug on my HP Z200 under ubuntu 12.04 LTS. After some investigation I found out than main reason of this bug is very ugly bottleneck in block device layer so cores of my Z200 spend almost all time in spinning on spinlock while we have disabled IRQ on ALL cores.

AO (aofrl10n)
tags: added: trusty
penalvch (penalvch)
tags: added: bot-stop-nagging
removed: apport-collected bot-stop-nagging. slow trusty usb
Revision history for this message
In , horst-bugme-osdl (horst-bugme-osdl-linux-kernel-bugs) wrote :

I'm still seeing this.

Setup: Debian 7 Wheezy, amd64 backports kernel (3.11-0.bpo.2-amd64), ~45MB/s write of a low number of large files by rsync (fed through a GBit ethernet link) on an ext3 FS (rw,noatime,data=ordered) in a LVM2 partition on a hardware RAID5.

Observation: The machine (32-core Xeon E5-4650, 192 GB RAM), primarily servicing multiple interactive users via SSH, x2go and SunRay sessions, gets completely unusable during and quite some time after the rsync transfer. TCP connections to SunRay clients time out, IRC connections are dropped, even simple tools like "htop" don't do anything but clear the screen after being started. "iotop" shows a [jbd2/dm-1-8] process on top, reportedly doing "99.99%" I/O (but not reading or writing a single byte, maybe because it's a kernel thread?).

Once I switch from the default CFQ I/O scheduler to "deadline" (echo deadline > /sys/block/sdb/queue/scheduler), the symptoms disappear completely.

Revision history for this message
In , yanp.bugz (yanp.bugz-linux-kernel-bugs) wrote :

Still face this bug. Kernel 3.16

Is it possible to preserve 5% of IO for user/othe processes needs? Any fast download or copying eats 99.99 of IO and system is hard to use.

Revision history for this message
In , aros (aros-linux-kernel-bugs) wrote :

I'm curious: this bug was ostensibly fixed years ago however I dare everyone, who owns an Android smartphone, run a simple test. Invoke any terminal emulator and execute this command:

$ cat < /dev/zero > /sdcard/EMPTY

What's terribly unpleasant is that _all_ CPU cores become busy (more than 75% load), and the CPU jumps into the highest performance state, i.e. frequency, i.e. power consumption. Obviously this is wrong, bad and shouldn't happen. This test is kinda artificial as no Android app can create such a high IO load, but then there are multiple phones out there with either 5GHz MIMO 802.11n or 802.11ac chips which allow up to 80MB/sec throughput which can easily saturate most if not all internal MMC cards and have the same effect as the above command.

Perhaps vanilla kernel bugzilla is not a place to discuss bugs in Android, but latest Android releases usually feature kernels 3.10.x and 4.1.x without that many patches, so this bug is still there. Both these kernels are currently maintained and supported. Android by default never uses SWAP (one of the reasons for this bug).

Go figure.

P.S. Sample apps from Google Play:

* CPU Stats by takke
* Terminal Emulator by Jack Palevich

Revision history for this message
In , eugene.seppel (eugene.seppel-linux-kernel-bugs) wrote :

I've just experienced this issue with 3.19.0-32-generic on Ubuntu.
My KTorrent downloaded files to NTFS filesystem on SATA3 drive (fuse, download speed was about 100Mbit/s), simultaneously I copied files from that filesystem to USB3.0 flash drive with NTFS filesystem. That resulted poor interactive performance, mouse and windows lags. The workaround was to suspend torrent downloa until files copied.

Hardware: One AMD FX(tm)-8320 Eight-Core Processor, 8 GB RAM.

Revision history for this message
In , gooberslot (gooberslot-linux-kernel-bugs) wrote :

This bug is definitely not fixed. A simple cp from one drive to another makes a huge impact on my desktop. Trying to do an rsync is even worse. It seems to mainly be a problem with large files. My system is old (Athlon II 250) but even an old P3 running Win98 doesn't lag this bad from just copying files.

Revision history for this message
In , bes1002t (bes1002t-linux-kernel-bugs) wrote :

I'm trying to copy 50gb from one tower to another via USB 3.0 and it is really no fun. If I would copy all files at once the speed is decreasing constantly. After 30 minutes it copies with 1.0MB/s. If I copy a bunch of directories it is a littlebit better but also decreases in speed. For 2GB my Linux system needs more than an hour. This bug is definitely not fixed. On Windows this USB Stick is working without that speed loss.

OS: Fedora 24
Kernel: 4.8.15

Revision history for this message
In , bes1002t (bes1002t-linux-kernel-bugs) wrote :

I've noticed that this happens not everytime when I use exact the same USB stick. For my 2GB files (Eclipse with workspace and a project) I needed one hour to copy. It startet with 60MB/s and decreased to 500KB/s. Now I copy 16gb (android studio and some other projects) and it only needs about 15minutes. The copy speed startet with 70MB/s and at the end it was 22MB/s fast. So it also decreases but not as fast as in my 2GB copy process.

Revision history for this message
In , mikhail.v.gavrilov (mikhail.v.gavrilov-linux-kernel-bugs) wrote :

It seems Kernel developers not look this topic here, much better to write to the mailing list.

Revision history for this message
In , oleksandr (oleksandr-linux-kernel-bugs) wrote :

Does Jens' buffered writeback throttling patchset solve your issue?

Revision history for this message
In , aros (aros-linux-kernel-bugs) wrote :

(In reply to bes1002t from comment #642)
> I'm trying to copy 50gb from one tower to another via USB 3.0 and it is
> really no fun. If I would copy all files at once the speed is decreasing
> constantly. After 30 minutes it copies with 1.0MB/s. If I copy a bunch of
> directories it is a littlebit better but also decreases in speed. For 2GB my
> Linux system needs more than an hour. This bug is definitely not fixed. On
> Windows this USB Stick is working without that speed loss.
>
> OS: Fedora 24
> Kernel: 4.8.15

This bug report has nothing to do with the speed of copying data to USB flash drive. It's about substantially degraded interactivity which manifests in slowness and it's hard to believe you can perceive it via an SSH session.

I'm inclined to believe your bug is related to other subsystems like USB.

> It seems Kernel developers not look this topic here, much better to write to
> the mailing list.

Kernel bugzilla has always been neglected. Thousands of bug reports which have zero comments from prospective developers. LKML is a hit and miss too. Your developer skipped your e-mail because he/she was busy? Bad luck.

Revision history for this message
In , mpartap (mpartap-linux-kernel-bugs) wrote :

@bes1002t: I think throughput is a different issue than this, although it might well be related.

But most important would be for someone to create a I/O concurrency / latency benchmark. Maybe the Phoronix Test Suite is an adequate tool for that? It can also be used for automatic bisecting..

I clearly remember pre-2.6.18 times where I had a much inferior machine and while gent0o's emerge was compiling stuff in the background with multiple threads, I could browse the web switch between programs and play a HD stream without any hickup or stalling.

Revision history for this message
In , thomas.pi (thomas.pi-linux-kernel-bugs) wrote :

@bes1002t: Copying to a USB device always starts with the speed of the harddrive as all is cached till the write cache is full and ends with the speed of the usb drive. The write process has to wait till all data is written.

@Artem S. Tashkinov: The stall problems on a ssh session exists or existed. I have migrated an old server with CentOS 6 and copied some vm images. The ssh responsiveness was very bad. I had to wait for up to 20 seconds for tab auto to complete.

I many cases it was a swap problem, as the buffers are full and the caches need a long time to be written to a slow usb device. The server starts to swap process data. It's only a very small amount of data. I could increase the overall desktop performance with an RAM upgrade.

Revision history for this message
In , iam (iam-linux-kernel-bugs) wrote :

Try Kernel 4.10.

>Improved writeback management
>
>Since the dawn of time, the way Linux synchronizes to disk the data written to
>memory by processes (aka. background writeback) has sucked. When Linux writes
>all that data in the background, it should have little impact on foreground
>activity. That's the definition of background activity...But for a long as it
>can be remembered, heavy buffered writers have not behaved like that. For
>instance, if you do something like $ dd if=/dev/zero of=foo bs=1M count=10k,
>or try to copy files to USB storage, and then try and start a browser or any
>other large app, it basically won't start before the buffered writeback is
>done, and your desktop, or command shell, feels unreponsive. These problems
>happen because heavy writes -the kind of write activity caused by the
>background writeback- fill up the block layer, and other IO requests have to
>wait a lot to be attended (for more details, see the LWN article).
>
>This release adds a mechanism that throttles back buffered writeback, which
>makes more difficult for heavy writers to monopolize the IO requests queue,
>and thus provides a smoother experience in Linux desktops and shells than what
>people was used to. The algorithm for when to throttle can monitor the
>latencies of requests, and shrinks or grows the request queue depth
>accordingly, which means that it's auto-tunable, and generally, a user would
>not have to touch the settings. This feature needs to be enabled explicitly in
>the configuration (and, as it should be expected, there can be regressions)

Revision history for this message
In , marius (marius-linux-kernel-bugs) wrote :

Hi..

Thank you for your email.

I am sorry, but this email will soon be disabled..

Please send everything work related to <email address hidden>

Please send private mails to <email address hidden>

bye m.

Revision history for this message
In , mikhail.v.gavrilov (mikhail.v.gavrilov-linux-kernel-bugs) wrote :

> Try Kernel 4.10.
It not helps in my work load :(
still freezing mouse pointer and keyboard input

Revision history for this message
In , iam (iam-linux-kernel-bugs) wrote :

Make sure your kernel has that option enabled.

>This feature needs to be enabled explicitly in
>the configuration (and, as it should be expected, there can be regressions)

Revision history for this message
In , mikhail.v.gavrilov (mikhail.v.gavrilov-linux-kernel-bugs) wrote :

I read this https://kernelnewbies.org/Linux_4.10 and this https://kernelnewbies.org/Linux_4.10 articles, but I not seen name of this option.

Revision history for this message
In , mikhail.v.gavrilov (mikhail.v.gavrilov-linux-kernel-bugs) wrote :

Created attachment 255491
$ cat /boot/config-`uname -r`

Revision history for this message
In , aros (aros-linux-kernel-bugs) wrote :

(In reply to Mikhail from comment #651)

First, I'd recommend trying to disable SWAP completely - it might help:

$ sudo swapoff -a

If you compile your own kernel or your distro hasn't enabled them for you, here's the list of the options you need to enable:

BLK_WBT, enable support for block device writeback throttling
BLK_WBT_MQ, multiqueue writeback throttling
BLK_WBT_SQ, single queue writeback throttling

They are all under "Enable the block layer".

If disabling swap and enabling these options have no effect, please ***create a new bug report*** and provide the following information:

CPU
Motherboard and BIOS version
RAM type and volume
Storage and its type
Kernel version and its .config

And also the complete output of these utilities:

dmesg
lspci -vvv
lshw
free
vmstat (when the bug is exposed)

cat /proc/interrupts
cat /proc/iomem
cat /proc/meminfo
cat /proc/mttr

Revision history for this message
In , iam (iam-linux-kernel-bugs) wrote :

>CONFIG_BLK_WBT=y
># CONFIG_BLK_WBT_SQ is not set
>CONFIG_BLK_WBT_MQ=y

So writeback throttling is enabled only for multi queue devices in your case. I suppose you need to use blk-mq for your sd* devices to activate writeback throttling (scsi_mod.use_blk_mq=1 boot flag) or to recompile kernel with CONFIG_BLK_WBT_SQ enabled.

Revision history for this message
In , mikhail.v.gavrilov (mikhail.v.gavrilov-linux-kernel-bugs) wrote :

Created attachment 255501
all required files in one archive

Revision history for this message
In , mikhail.v.gavrilov (mikhail.v.gavrilov-linux-kernel-bugs) wrote :

After setting boot flag "scsi_mod.use_blk_mq=1", the freezes became much shorter. I'm not sure now that they are at the kernel level. More look like that window manager (GNOME mutter) is written in such a way that freezes mouse while loading list of applications. To finally defeat freezes, seems need to make the window manager not paged into the swap file.

I'm also catch vmstat output when freeze occurred:
# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r b swpd free buff cache si so bi bo in cs us sy id wa st
 2 6 15947052 205136 112592 4087608 32 41 93 119 7 23 43 19 37 1 0

Revision history for this message
In , aros (aros-linux-kernel-bugs) wrote :

Twice I asked you you to try disabling SWAP altogether and you still haven't.

I'm unsubscribing from this bug report.

Dimitrenko (paviliong6)
Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Changed in linux (Fedora):
importance: Unknown → High
status: Unknown → Won't Fix
Revision history for this message
In , zvova7890 (zvova7890-linux-kernel-bugs) wrote :

Created attachment 274511
Per deice dirty ration configuration support

Per device dirty bytes configuration

Revision history for this message
In , zvova7890 (zvova7890-linux-kernel-bugs) wrote :

Per device dirty bytes configuration. Patch is not ideal, i'm make it for smoothly flash drive wriring by passing smaller value of dirty byte per removeable device.

>> Path
# ls /sys/block/sdc/bdi/
dirty_background_bytes dirty_background_ratio dirty_bytes dirty_ratio max_ratio min_pages_to_flush min_ratio power read_ahead_kb stable_pages_required subsystem uevent

>> udev Rule for removeables device
# cat /etc/udev/rules.d/90-dirty-flash.rules
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{removable}=="1", ATTR{bdi/dirty_bytes}="4194304"

Revision history for this message
In , chriswy27 (chriswy27-linux-kernel-bugs) wrote :

Was this bug actually fixed? The status shows CLOSED CODE_FIX with a last modified date of Dec 5 2018. I don't see any updates as to what was corrected, and what version the fix will be put into?

Revision history for this message
In , gatekeeper.mail (gatekeeper.mail-linux-kernel-bugs) wrote :

Created attachment 282477
attachment-6179-0.html

This was never fixed and since bug state cheating with no commit info ever
provided even if asked directly, will never be fixed. Nobody just cares and
I guess nobody even figured out who broke the kernel by which changeset and
when. Just buy another couple of Xeons for your zupa-dupa web-serfing
desktop and pray it's enough for loads of waits when you format your
diskette. Another approach is to buy enough ram to hold whole your block
devices set there so write-outs are quick enough and you won't see
microsecond lags. This is complete workaround list they provided since the
bug opened.

вт, 23 апр. 2019 г., 18:21 <email address hidden>:

> https://bugzilla.kernel.org/show_bug.cgi?id=12309
>
> protivakid (<email address hidden>) changed:
>
> What |Removed |Added
>
> ----------------------------------------------------------------------------
> CC| |<email address hidden>
>
> --- Comment #663 from protivakid (<email address hidden>) ---
> Was this bug actually fixed? The status shows CLOSED CODE_FIX with a last
> modified date of Dec 5 2018. I don't see any updates as to what was
> corrected,
> and what version the fix will be put into?
>
> --
> You are receiving this mail because:
> You are on the CC list for the bug.

Revision history for this message
In , vi0oss (vi0oss-linux-kernel-bugs) wrote :

As far as I understand, this is kind of meta-bug: there are multiple causes and multiple fixes.

"I do bulk IO and it gets slow" sounds rather general, and problem that can resurface anytime due to some new underlying issue. So the problem cannot be really "closed for good" no matter how much technical progress is made.

For me 12309 basically stopped happening unless I deliberately tune "/proc/sys/vm/dirty_*" values to non-typical ranges and forgot to revert them back. I see system controllably slowing down processes doing bulk IO so the system in general stays reasonable. This behaviour is one of outcomes of this bug.

I don't expect meaningful technical discussion to be happen in this thread. It should just serve as a hub for linking to specific new issues.

Revision history for this message
In , powerman-asdf (powerman-asdf-linux-kernel-bugs) wrote :

Sure it's a meta bug, but for me 12309 is still actual, and I don't use any tuning for I/O subsystem at all.

Not as bad as years ago when it happens for the first time, but I still have to throttle rtorrent to download at 2.5MB/sec maximum instead of usual 10MB/s if I like to view films in mplayer at same time without jitter/freeze/lag. And that's on powerful and modern enough system with kernel 4.19.27, CPU i7-2600K @ 4.5GHz, RAM 24GB, and HDD 3TB Western Digital Caviar Green WD30EZRX-00D. This is annoying, and I remember time before 12309 when rtorrent without any throttling won't make mplayer to freeze on less powerful hardware.

Revision history for this message
In , gatekeeper.mail (gatekeeper.mail-linux-kernel-bugs) wrote :

Created attachment 282483
attachment-22369-0.html

Well, I've tried to report a new bug to investigate my own "my CPU does
nothing because waiting is too hard for it". Of no interest of any kernel
dev. So, just as Linus once said "f**k you Nvidia", the very same goes back
to linux itself. Pity some devs think that make their software linux-bound
(via udev only binding or alsa only sound out) is a good idea (gnome and
even parts of KDE). They forgot 15 years ago they picketed Adobe for having
flash for win only. Now one has to use 12309-bound crap for not having a
way to run his software on another platform.

вт, 23 апр. 2019 г., 21:29 <email address hidden>:

> https://bugzilla.kernel.org/show_bug.cgi?id=12309
>
> --- Comment #665 from _Vi (<email address hidden>) ---
> As far as I understand, this is kind of meta-bug: there are multiple
> causes and
> multiple fixes.
>
> "I do bulk IO and it gets slow" sounds rather general, and problem that can
> resurface anytime due to some new underlying issue. So the problem cannot
> be
> really "closed for good" no matter how much technical progress is made.
>
> For me 12309 basically stopped happening unless I deliberately tune
> "/proc/sys/vm/dirty_*" values to non-typical ranges and forgot to revert
> them
> back. I see system controllably slowing down processes doing bulk IO so the
> system in general stays reasonable. This behaviour is one of outcomes of
> this
> bug.
>
> I don't expect meaningful technical discussion to be happen in this
> thread. It
> should just serve as a hub for linking to specific new issues.
>
> --
> You are receiving this mail because:
> You are on the CC list for the bug.

Revision history for this message
In , mpartap (mpartap-linux-kernel-bugs) wrote :

Can 'someone' please open a bounty on creation of a VM test case, f.e. with `vagrant` or `phoronix test suite`?
Basically, a way to reproduce and quantify the perceived/actual performance difference between
> Linux 2.6.17 Released 17 June, 2006
and
> Linux 5.0 Released Sun, 3 Mar 2019

(In reply to Alex Efros from comment #666)
> Not as bad as years ago […]
> And that's on powerful and modern enough system with
> kernel 4.19.27, CPU i7-2600K @ 4.5GHz, RAM 24GB, and HDD 3TB […]
> This is annoying, and I remember time before
> 12309 when rtorrent without any throttling won't make mplayer to freeze on
> less powerful hardware.

Oh yeah, this... i can clearly remember back then when on a then mid-range machine with a lot of compiling (gentoo => 100% cpu �[U+1F923]�) and filesystem work, VLC used to play an HD video stream even under heavy load without any hiccups and micro-stuttering.. It was impressive at the time.. and then.. it broke �[U+1F928]�

Revision history for this message
In , vitaly.v.ch (vitaly.v.ch-linux-kernel-bugs) wrote :

According to my attempts to fix this bug, I totally disagree with you.

This bug is caused by pure design of current block dev layer. Methods which are good to develop code is absolutely improper for developing ideas. It's probably the key problem of the Linux Comunity. Currently, there is merged WA for block devices with a good queue such as Samsung Pro NVMe.

WBR,

Vitaly

(In reply to _Vi from comment #665)
> As far as I understand, this is kind of meta-bug: there are multiple causes
> and multiple fixes.
>
> "I do bulk IO and it gets slow" sounds rather general, and problem that can
> resurface anytime due to some new underlying issue. So the problem cannot be
> really "closed for good" no matter how much technical progress is made.
>
> For me 12309 basically stopped happening unless I deliberately tune
> "/proc/sys/vm/dirty_*" values to non-typical ranges and forgot to revert
> them back. I see system controllably slowing down processes doing bulk IO so
> the system in general stays reasonable. This behaviour is one of outcomes of
> this bug.
>
> I don't expect meaningful technical discussion to be happen in this thread.
> It should just serve as a hub for linking to specific new issues.

Revision history for this message
In , todorovic.s (todorovic.s-linux-kernel-bugs) wrote :

Had this again 20 minutes ago.
Was dopying 8.7GiB of data from one directory to another directory on the same filesystem (ext4 (rw,relatime,data=ordered)) on the same disk (Western Digital WDC WD30EZRX-00D8PB0 spinning metal disk).

The KDE UI became unresponsive (Everything other than /home and user data in on a SSD), could not launch any new applications. Opening a new tab on Firefox to go to Youtube didnt load the page, and kept saying waiting for youtube.com in the status bar (network gets halted?).

dmesg shows these, are they important?

[25013.905943] INFO: task DOMCacheThread:17496 blocked for more than 120 seconds.
[25013.905945] Tainted: P OE 4.15.0-54-generic #58-Ubuntu
[25013.905947] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25013.905949] DOMCacheThread D 0 17496 2243 0x00000000
[25013.905951] Call Trace:
[25013.905954] __schedule+0x291/0x8a0
[25013.905957] schedule+0x2c/0x80
[25013.905959] jbd2_log_wait_commit+0xb0/0x120
[25013.905962] ? wait_woken+0x80/0x80
[25013.905965] __jbd2_journal_force_commit+0x61/0xb0
[25013.905967] jbd2_journal_force_commit+0x21/0x30
[25013.905970] ext4_force_commit+0x29/0x2d
[25013.905972] ext4_sync_file+0x14a/0x3b0
[25013.905975] vfs_fsync_range+0x51/0xb0
[25013.905977] do_fsync+0x3d/0x70
[25013.905980] SyS_fsync+0x10/0x20
[25013.905982] do_syscall_64+0x73/0x130
[25013.905985] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[25013.905987] RIP: 0033:0x7fc9cb839b07
[25013.905988] RSP: 002b:00007fc9a7aeb200 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
[25013.905990] RAX: ffffffffffffffda RBX: 00000000000000a0 RCX: 00007fc9cb839b07
[25013.905992] RDX: 0000000000000000 RSI: 00007fc9a7aeaff0 RDI: 00000000000000a0
[25013.905993] RBP: 0000000000000000 R08: 0000000000000000 R09: 72732f656d6f682f
[25013.905994] R10: 0000000000000000 R11: 0000000000000293 R12: 00000000000001f6
[25013.905995] R13: 00007fc97fc5d038 R14: 00007fc9a7aeb340 R15: 00007fc987523380

Revision history for this message
In , alpha_one_x86 (alphaonex86-linux-kernel-bugs) wrote :

KDE have problem too, same copy via CLI or via Ultracopier (GUI) have no problem.
I note too KDE have UI more slow, plasma doing CPU usage in case I use the HDD...

Revision history for this message
In , howaboutsynergy (howaboutsynergy-linux-kernel-bugs) wrote :

What's the value of `vm.dirty_writeback_centisecs` ?, ie.
$ sysctl vm.dirty_writeback_centisecs

try setting it to 0 to disable it, ie.
`$ sudo sysctl -w vm.dirty_writeback_centisecs=0`

I found that this helps my network transfer not stall/stop at all(for a few seconds when that is =1000 for example) while some kinda of non-async `sync`(command)-like flushing is going on periodically while transferring GiB of data files from sftp to SSD!(via Midnight Commander, on a link limited to 10MiB per second)

vm.dirty_writeback_centisecs is how often the pdflush/flush/kdmflush processes wake up and check to see if work needs to be done.

Coupled with the above I've been using another value:
`vm.dirty_expire_centisecs=1000`
for both cases (when stall and not stall), so this one remained fixed to =1000.

vm.dirty_expire_centisecs is how long something can be in cache before it needs to be written. In this case it's 1 seconds. When the pdflush/flush/kdmflush processes kick in they will check to see how old a dirty page is, and if it's older than this value it'll be written asynchronously to disk. Since holding a dirty page in memory is unsafe this is also a safeguard against data loss.

Well, with the above, at least I'm not experiencing network stalls when copying GiB of data via Midnight Commander's sftp to my SSD until some kernel-caused sync-ing is completed in the background.

I don't know if this will work for others, but if curious about any of my other (sysctl)settings, they should be available for perusing [here](https://github.com/howaboutsynergy/q1q/tree/0a2cd4ba658067140d3f0ae89a0897af54da52a4/OSes/archlinux/etc/sysctl.d)

Revision history for this message
In , howaboutsynergy (howaboutsynergy-linux-kernel-bugs) wrote :

correction:

> In this case it's 1 seconds.

*In this case it's 10 seconds.

Also, heads up:
I found that 'tlp' in `/etc/default/tlp`, on ArchLinux, will overwrite the values set in `/etc/sysctl.d/*.conf` files if these are set to non `0`, ie.
MAX_LOST_WORK_SECS_ON_AC=10
MAX_LOST_WORK_SECS_ON_BAT=10
will set:
vm.dirty_expire_centisecs=1000
vm.dirty_writeback_centisecs=1000

regardless of what values you set them in `/etc/sysctl.d/*.conf` files.

/etc/default/tlp is owned by tlp 1.2.2-1

Not setting those (eg. commenting them out) will have tlp set the to its default of 15 sec (aka =1500). So the workaround is to set them to =0 which makes tlp not set them at all, thus the values from `/etc/sysctl.d/*.conf` files is allowed to remain as set.

Brad Figg (brad-figg)
tags: added: cscc
Revision history for this message
In , mricon (mricon-linux-kernel-bugs) wrote :

I'm making this bug private to prevent more spam from being added to it.

Revision history for this message
In , caroljames972022 (caroljames972022-linux-kernel-bugs) wrote :
Revision history for this message
In , alexwrinner (alexwrinner-linux-kernel-bugs) wrote :

[25013.905943] INFO: task DOMCacheThread:17496 blocked for more than 120 seconds.
[25013.905945] Tainted: P OE 4.15.0-54-generic #58-Ubuntu
[25013.905947] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25013.905949] DOMCacheThread D 0 17496 2243 0x00000000
[25013.905951] Call Trace:
[25013.905954] __schedule+0x291/0x8a0
[25013.905957] schedule+0x2c/0x80
[25013.905959] jbd2_log_wait_commit+0xb0/0x120
[25013.905962] ? wait_woken+0x80/0x80 cheap essay https://trustanalytica.com/online/top-cheap-essay-writing-services /0xb0
[25013.905965] __jbd2_journal_force_commit+0x61/0xb0
[25013.905967] jbd2_journal_force_commit+0x21/0x30
[25013.905970] ext4_force_commit+0x29/0x2d
[25013.905972] ext4_sync_file+0x14a/0x3b0
[25013.905975] vfs_fsync_range+0x51/0xb0
[25013.905977] do_fsync+0x3d/0x70
[25013.905980] SyS_fsync+0x10/0x20
[25013.905982] do_syscall_64+0x73/0x130
[25013.905985] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[25013.905987] RIP: 0033:0x7fc9cb839b07

Displaying first 40 and last 40 comments. View all 834 comments or add a comment.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.