Hello Matthijs
Unfortunately, the best way to keep this from happening is to fix the kernel hang itself, which occurs when the kernel calls sd_sync_cache() on every configured device before shutdown. A single I/O command hangs on all SCSI paths, and the I/O error is never propagated to the block layer (despite iSCSI having proper I/O error settings). I'm finishing the analysis of some kernel dumps so I can finally understand what is happening in the transport layer (this also happens with more recent kernels).
The workaround was to create a script that restores the iSCSI connection, waits for the login to complete and the paths to come back online, and then logs out cleanly, allowing the sd_sync_cache() operation to finish.
If you are facing this problem, I know for sure that your iSCSI connections are not being closed before the network goes down. This means you have to pay attention to how you configured your iSCSI disks:
- make sure iscsiadm was configured with "interfaces" so it works on startup:
sudo iscsiadm -m iface -I ens4 --op=new -n iface.hwaddress -v 52:54:00:b4:21:bb
sudo iscsiadm -m iface -I ens7 --op=new -n iface.hwaddress -v 52:54:00:c2:34:1b
- the discovery/login has to be done AFTER the interfaces have been added to iscsiadm
sudo iscsiadm -m discovery --op=new --op=del --type sendtargets --portal $SERVER1
sudo iscsiadm -m discovery --op=new --op=del --type sendtargets --portal $SERVER2
# iscsiadm -m node --loginall=automatic HAS TO WORK or else init scripts will fail
http://pastebin.ubuntu.com/25894472/
- configure the volumes in /etc/fstab with the "_netdev" option for systemd unit ordering
LABEL=BLUE /blue ext4 defaults,_netdev 0 1
LABEL=GREEN /green ext4 defaults,_netdev 0 1
LABEL=PURPLE /purple ext4 defaults,_netdev 0 1
LABEL=RED /red ext4 defaults,_netdev 0 1
LABEL=YELLOW /yellow ext4 defaults,_netdev 0 1
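As a quick sanity check of the fstab step above, here is a minimal sketch that reports which mount points are either missing from an fstab-style file or lack the "_netdev" option. The function name check_netdev is hypothetical (it is not part of open-iscsi or systemd), and it only inspects the file, not what systemd actually generated from it.

```shell
# check_netdev FSTAB MOUNTPOINT...
# Hypothetical helper: for each mount point, print a line if it is absent
# from FSTAB or if its options field (4th column) does not list _netdev.
check_netdev() {
    fstab="$1"; shift
    for mp in "$@"; do
        awk -v mp="$mp" '
            /^[[:space:]]*#/ { next }            # skip comment lines
            $2 == mp {
                seen = 1
                n = split($4, opts, ",")
                for (i = 1; i <= n; i++) if (opts[i] == "_netdev") ok = 1
            }
            END {
                if (!seen)    print mp ": not in fstab"
                else if (!ok) print mp ": missing _netdev"
            }
        ' "$fstab"
    done
}
```

For the volumes listed above it could be called as: check_netdev /etc/fstab /blue /green /purple /red /yellow. No output means every listed mount point is present and carries "_netdev".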
You have to make sure the open-iscsi and iscsid systemd units are started after the network is available and stopped before it disappears. That might be your problem if the configuration above is correct.
inaddy@iscsihang:~$ systemctl edit --full iscsid.service
inaddy@iscsihang:~$ systemctl edit --full open-iscsi.service
The defaults are:
[Unit]
Description=iSCSI initiator daemon (iscsid)
Documentation=man:iscsid(8)
Wants=network-online.target remote-fs-pre.target
Before=remote-fs-pre.target
After=network.target network-online.target
and
[Unit]
Description=Login to default iSCSI targets
Documentation=man:iscsiadm(8) man:iscsid(8)
Wants=network-online.target remote-fs-pre.target iscsid.service
After=network-online.target iscsid.service
Before=remote-fs-pre.target
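To confirm what systemd actually resolved from these settings on a running system, a small sketch like the following may help. The helper name lists_unit is hypothetical; it only checks whether a unit name appears in a "Key=unit unit ..." line such as the one printed by systemctl show -p After.

```shell
# lists_unit UNIT
# Hypothetical helper: reads a "Key=unit unit ..." line on stdin and
# succeeds only if UNIT appears in it as a whole word.
lists_unit() {
    tr ' =' '\n\n' | grep -Fxq "$1"
}

# Example usage (assumes a systemd host):
#   systemctl show -p After open-iscsi.service | lists_unit iscsid.service \
#       && echo "open-iscsi.service is ordered after iscsid.service"
```

Because systemd stops units in the reverse of their start order, a unit that is ordered after iscsid.service at boot is stopped before it at shutdown.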
So you can see that iscsid.service runs BEFORE open-iscsi.service. In my case, I'm configuring the network with rc-local.service (since this is my lab), so I had to guarantee the ordering there as well:
If you still face problems after configuring your system like this, you can use this script:
http://pastebin.ubuntu.com/25894592/
And provide me the DEBUG=/.shutdown.log file, created after its execution, attached to this Launchpad case. It's likely that you have hung iSCSI connections for some reason (service ordering, volumes missing from fstab so umounts are not done, etc.).
Hope it helps for now.