A protocol fix
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
FileStore |
Fix Released
|
Critical
|
Jason Gerard DeRose |
Bug Description
Thanks to a friendly anonymous reviewer, I'm making a tweak to the protocol to fix what now seems like a really obvious problem.
Because of the way I concatenated the leaf_index and leaf_data, it was easy to manipulate things across this boundary.
For example, Leaf(2, b'2test') has the same hash as Leaf(22, b'test').
Even though I quoted this in the protocol description and apparently didn't fully take it to heart, this is exactly the sort of thing the Skein personalization strings are designed to prevent:
"""
All Skein applications (except the PRNG output production) can be personalized with the personalization input. We recommend that all application designers seriously consider doing this; we have seen many protocols where a hash that is computed in one part of the protocol can be used in an entirely different part because two hash computations were done on similar or related data, and the attacker can force the application to make the hash inputs the same. Personalizing each hash function used in the protocol summarily stops this type of attack.
"""
So I'm changing things from:
leaf_hash = hash(leaf_index + leaf_data)
To:
leaf_hash = hash(hash(
Where the inner hash is done using a different personalization string. Now you most certainly cannot shift bits between the leaf_index and leaf_data "domain" and get the same result. This puts a very strong boundary in place. The inner hash alone would protect against the sort of glaring oops the reviewer brought up, but with the distinct personalization string it also protects against subtle cases that are way beyond my limited crypto knowledge.
Although this wasn't such a glaring issue for the root_hash, I'm also making the equivalent change there, from:
root_hash = hash(file_size + leaf_hashes)
To:
root_hash = hash(hash(
Where again the inner hash is using a distinct personalization string.
4 distinct hashing "domains" calls for 4 distinct personalization strings so you can't cross the streams!
Much thanks, friendly anonymous reviewer!
Related branches
- David Jordan: Approve
-
Diff: 519 lines (+147/-110)4 files modifieddoc/filestore.rst (+35/-7)
doc/protocol.rst (+95/-78)
filestore.py (+7/-15)
test_filestore.py (+10/-10)
Changed in filestore: | |
status: | In Progress → Fix Committed |
Changed in filestore: | |
status: | Fix Committed → Fix Released |