------- Comment From <email address hidden> 2017-03-08 05:56 EDT-------
Just tried the newest kernel, 4.4.0-66, and I'm still running into the hang. Here are the final statements in /var/log/syslog (the lines that never make it out onto the disk):
Mar 8 11:26:31 mclint multipathd[955]: mpatha: sdb - tur checker timed out
Mar 8 11:26:31 mclint multipathd[955]: 8:16: reinstated
Mar 8 11:26:31 mclint multipathd[955]: mpatha: sdd - tur checker timed out
Mar 8 11:26:31 mclint rsyslogd-2007: action 'action 10' suspended, next retry is Wed Mar 8 11:27:01 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Mar 8 11:26:31 mclint multipathd[955]: 8:48: reinstated
Mar 8 11:26:31 mclint multipathd[955]: mpatha: sdc - tur checker timed out
Mar 8 11:26:31 mclint multipathd[955]: 8:32: reinstated
Mar 8 11:26:32 mclint multipathd[955]: mpatha: sda - tur checker timed out
Mar 8 11:26:32 mclint multipathd[955]: 8:0: reinstated
And this shows up on the sclp_line console:
? 961.419327! INFO: task cpuplugd:2604 blocked for more than 120 seconds.
? 961.419337! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419338! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419404! INFO: task irqbalance:2651 blocked for more than 120 seconds.
? 961.419406! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419407! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419450! INFO: task kworker/0:4:3801 blocked for more than 120 seconds.
? 961.419451! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419452! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419494! INFO: task kworker/1:1:4548 blocked for more than 120 seconds.
? 961.419495! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419496! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419539! INFO: task kworker/0:0H:20302 blocked for more than 120 seconds.
? 961.419540! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419541! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419764! INFO: task kworker/0:0:66641 blocked for more than 120 seconds.
? 961.419766! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419767! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 961.419895! INFO: task rm:81710 blocked for more than 120 seconds.
? 961.419896! Not tainted 4.4.0-66-generic #87-Ubuntu
? 961.419897! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 1081.419024! INFO: task systemd:1 blocked for more than 120 seconds.
? 1081.419033! Not tainted 4.4.0-66-generic #87-Ubuntu
? 1081.419035! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 1081.419148! INFO: task cpuplugd:2604 blocked for more than 120 seconds.
? 1081.419150! Not tainted 4.4.0-66-generic #87-Ubuntu
? 1081.419151! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
? 1081.419186! INFO: task irqbalance:2651 blocked for more than 120 seconds.
? 1081.419187! Not tainted 4.4.0-66-generic #87-Ubuntu
? 1081.419188! "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
I pulled a DASD dump from the system:
KERNEL: /usr/lib/debug/boot/vmlinux-4.4.0-66-generic
DUMPFILE: mclint_20170308_kernel_4_4_0-66_without_openafs.dump
CPUS: 3
DATE: Wed Mar 8 11:37:56 2017
UPTIME: 00:25:30
LOAD AVERAGE: 12.99, 11.25, 6.55
TASKS: 422
NODENAME: mclint
RELEASE: 4.4.0-66-generic
VERSION: #87-Ubuntu SMP Fri Mar 3 15:32:53 UTC 2017
MACHINE: s390x (unknown Mhz)
MEMORY: 7.8 GB
PANIC: ""
PID: 0
COMMAND: "swapper/0"
TASK: bb1538 (1 of 3) [THREAD_INFO: b7c000]
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
INFO: no panic task found
And again I see 10 multipathd tasks in the process list; this is my typical hang scenario.
crash> ps | grep multipathd
955 1 0 1e49115f0 IN 0.1 335364 8316 multipathd
971 1 0 7e8b2be0 IN 0.1 335364 8316 multipathd
972 1 0 7e8b6db0 IN 0.1 335364 8316 multipathd
977 1 1 7e8b36d8 IN 0.1 335364 8316 multipathd
978 1 0 7e8b62b8 IN 0.1 335364 8316 multipathd
979 1 2 7e8b4cc8 IN 0.1 335364 8316 multipathd
81714 1 1 7cdc8000 IN 0.1 335364 8316 multipathd
81715 1 1 7cdc95f0 IN 0.1 335364 8316 multipathd
81716 1 1 7cdcc1d0 IN 0.1 335364 8316 multipathd
81717 1 1 1e6c595f0 IN 0.1 335364 8316 multipathd
I'll compress the dump and try to find ways to make it available to you ...