Yes, whenever the bug gets triggered it's because the subtraction pushes the value into the negative. Without the patch, however, a negative number is not what you actually end up with.
Let's take this for example:
2016-12-05T14:41:07.903932Z qemu-system-x86_64: VQ 2 size 0x80 < last_avail_idx 0x9 - used_idx 0xa
Without patch: 0x9 - 0xA = 65535
With patch: 0x9 - 0xA = -1 (reset to 0)
Because the integers in the comparison are unsigned, they can't hold negative values, so the result wraps around to the top of the range, here 65535 (0xFFFF, the largest 16-bit unsigned value). So I thought the safest solution was to convert any negative value to 0, because comparing a negative integer against an unsigned one could produce unexpected results elsewhere, like it did here.
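To make the arithmetic concrete, here's a minimal standalone C sketch. This is not the actual QEMU code: both function names are made up for illustration, and I'm assuming the two indices are 16-bit unsigned values, which is what the 65535 result implies.

```c
#include <stdint.h>

/* Hypothetical helper: the "without patch" behaviour.
 * Both operands are promoted to int, so the difference is -1, but
 * converting that back to uint16_t wraps it around to 65535. */
static uint16_t inuse_wrapping(uint16_t last_avail_idx, uint16_t used_idx)
{
    return (uint16_t)(last_avail_idx - used_idx);
}

/* Hypothetical helper: the approach described above.
 * Do the subtraction in a signed type wide enough to hold the
 * result, then reset anything negative to 0. */
static int32_t inuse_clamped(uint16_t last_avail_idx, uint16_t used_idx)
{
    int32_t diff = (int32_t)last_avail_idx - (int32_t)used_idx;
    return diff < 0 ? 0 : diff;
}
```

Calling both with 0x9 and 0xA gives 65535 from the first and 0 from the second, which matches the numbers in the example above.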
From my understanding, the bug only gets triggered when you're migrating from an older QEMU version, or migrating a VM that was originally booted on an older version, so that's likely why upstream still works most of the time.
I spent an ungodly amount of time debugging this issue, and in every report of a similar problem I could find on Google (with the same error message), the math always put the value into the negative. I started out by doing the same thing as everyone else, trying to find which patch introduced the issue or any new commits that could help, instead of analyzing the problem itself.
Really annoying issue because, technically speaking, the code is correct; it's the behavior that isn't, because of the mixed types. My primary goal was to retain the CVE patch if possible.