So v5.7 was fine, and after many reboots the bisect shows that the commit below introduced the issue.
Do I also need to find when the issue was resolved (somewhere between v5.8-rc1 and v5.9.10), or is this information enough?
54b2fcee1db041a83b52b51752dade6090cf952f is the first bad commit
commit 54b2fcee1db041a83b52b51752dade6090cf952f
Author: Keith Busch <email address hidden>
Date: Mon Apr 27 11:54:46 2020 -0700
nvme-pci: remove last_sq_tail
The nvme driver does not have enough tags to wrap the queue, and blk-mq
will no longer call commit_rqs() when there are no new submissions to
notify.
And FWIW, here is the output of my $ git bisect log:
git bisect start
# good: [3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162] Linux 5.7
git bisect good 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
# bad: [b3a9e3b9622ae10064826dccb4f7a52bd88c7407] Linux 5.8-rc1
git bisect bad b3a9e3b9622ae10064826dccb4f7a52bd88c7407
# bad: [ee01c4d72adffb7d424535adf630f2955748fa8b] Merge branch 'akpm' (patches from Andrew)
git bisect bad ee01c4d72adffb7d424535adf630f2955748fa8b
# bad: [16d91548d1057691979de4686693f0ff92f46000] Merge tag 'xfs-5.8-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
git bisect bad 16d91548d1057691979de4686693f0ff92f46000
# good: [cfa3b8068b09f25037146bfd5eed041b78878bee] Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
git bisect good cfa3b8068b09f25037146bfd5eed041b78878bee
# good: [3fd911b69b3117e03181262fc19ae6c3ef6962ce] Merge tag 'drm-misc-next-2020-05-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
git bisect good 3fd911b69b3117e03181262fc19ae6c3ef6962ce
# good: [1966391fa576e1fb2701be8bcca197d8f72737b7] mm/migrate.c: attach_page_private already does the get_page
git bisect good 1966391fa576e1fb2701be8bcca197d8f72737b7
# bad: [0c8d3fceade2ab1bbac68bca013e62bfdb851d19] bcache: configure the asynchronous registertion to be experimental
git bisect bad 0c8d3fceade2ab1bbac68bca013e62bfdb851d19
# bad: [84b8d0d7aa159652dc191d58c4d353b6c9173c54] nvmet: use type-name map for ana states
git bisect bad 84b8d0d7aa159652dc191d58c4d353b6c9173c54
# good: [72e6329f86c714785ac195d293cb19dd24507880] nvme-fc and nvmet-fc: revise LLDD api for LS reception and LS request
git bisect good 72e6329f86c714785ac195d293cb19dd24507880
# good: [e4fcc72c1a420bdbe425530dd19724214ceb44ec] nvmet-fc: slight cleanup for kbuild test warnings
git bisect good e4fcc72c1a420bdbe425530dd19724214ceb44ec
# good: [31fdad7be18992606078caed6ff71741fa76310a] nvme: consolodate io settings
git bisect good 31fdad7be18992606078caed6ff71741fa76310a
# bad: [2a5bcfdd41d68559567cec3c124a75e093506cc1] nvme-pci: align io queue count with allocted nvme_queue in nvme_probe
git bisect bad 2a5bcfdd41d68559567cec3c124a75e093506cc1
# good: [6623c5b3dfa5513190d729a8516db7a5163ec7de] nvme: clean up error handling in nvme_init_ns_head
git bisect good 6623c5b3dfa5513190d729a8516db7a5163ec7de
# good: [74943d45eef4db64b1e5c9f7ad1d018576e113c5] nvme-pci: remove volatile cqes
git bisect good 74943d45eef4db64b1e5c9f7ad1d018576e113c5
# bad: [54b2fcee1db041a83b52b51752dade6090cf952f] nvme-pci: remove last_sq_tail
git bisect bad 54b2fcee1db041a83b52b51752dade6090cf952f
# first bad commit: [54b2fcee1db041a83b52b51752dade6090cf952f] nvme-pci: remove last_sq_tail
@kaihengfeng
For reference, the trailers and diffstat of the first bad commit (54b2fcee1db041a83b52b51752dade6090cf952f, nvme-pci: remove last_sq_tail):
Signed-off-by: Keith Busch <email address hidden>
Reviewed-by: Sagi Grimberg <email address hidden>
Signed-off-by: Christoph Hellwig <email address hidden>
Signed-off-by: Jens Axboe <email address hidden>
drivers/nvme/host/pci.c | 23 ++++-------------------
1 file changed, 4 insertions(+), 19 deletions(-)
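If the fixing commit does turn out to be needed, a reverse bisect could look roughly like the sketch below. It uses git bisect's alternate terms so that "broken" marks the older side and "fixed" the newer side. The function name find_fix_start is made up for illustration, and the sketch assumes it is run inside a kernel tree that contains both tags (v5.9.10 is a stable release, so probably a linux-stable clone).

```sh
#!/bin/sh
# Sketch only: bisect for the commit that FIXED the issue rather than the
# one that introduced it. find_fix_start is a hypothetical helper name.
# Assumption: the current directory is a kernel tree containing both the
# v5.8-rc1 and v5.9.10 tags.

find_fix_start() {
    # Alternate terms avoid having to mentally invert good/bad:
    #   broken = issue still present (older side)
    #   fixed  = issue gone (newer side)
    git bisect start --term-old=broken --term-new=fixed || return 1
    git bisect broken v5.8-rc1 || return 1   # reported bad above
    git bisect fixed v5.9.10                 # reported as no longer affected
}

# After each build and reboot test, mark the checked-out commit with
#   git bisect broken   (issue still reproduces)
#   git bisect fixed    (issue is gone)
# until git names the first "fixed" commit, then clean up with
#   git bisect reset
```

Calling find_fix_start only sets up the bisect; every step in between still needs a manual build and reboot, as in the log above.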