I am trying to replicate, in Python, the hashing that Excel performs when a sheet is password-protected, but my result does not match Excel's even on dummy inputs. In the sheet's XML file, I see this:
sheetProtection algorithmName="SHA-512"
hashValue="Ua4h+FTPQI0+aCSbQ1Ya9fDYsddMzCfAypD1u1TBGmNONIy6sRfJBLDoMhOfbCv0i5Q2t1JOm4okjSvC1CsJYw=="
saltValue="Furur6jnDIFaQBhHQBXzFA=="
spinCount="100000"
To replicate this, I coded the following in Python:
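import hashlib
import base64

# Values copied from the sheetProtection element
hash_value = "Ua4h+FTPQI0+aCSbQ1Ya9fDYsddMzCfAypD1u1TBGmNONIy6sRfJBLDoMhOfbCv0i5Q2t1JOm4okjSvC1CsJYw=="
salt_value = "Furur6jnDIFaQBhHQBXzFA=="
password = "password"

pdata = password.encode('utf-8')      # password as bytes
sdata = base64.b64decode(salt_value)  # salt as raw bytes

# Start from salt + password and apply SHA-512 100000 times (spinCount),
# feeding each digest back in as the next input
hash_iter = (sdata + pdata)
for i in range(100000):
    hash_iter = hashlib.sha512(hash_iter).digest()

print(base64.b64encode(hash_iter).decode())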
which returns the following result:
9o4313eeh/ym8+GHSHW4iyh1usvNVD1DflzET5WgG9QKutn0loM24Op7/McAGr4D5H10W+DuQCD8Tj8Cn7uDOg==
What am I doing wrong here? I have tried both prepending and appending the salt, and neither gives me Excel's final hash. I have also tried different encodings when converting the plaintext password to bytes, but that doesn't seem to be the issue. I suspect the problem is in how the hash is iterated, but I am not sure what I'm doing wrong. The iteration step is described as:
H_n = H(H_{n-1} + iterator)
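For reference, here is a minimal sketch of one way the iteration formula above could be read in Python. It rests on assumptions I have not confirmed against Excel: that the password is encoded as UTF-16LE rather than UTF-8, that the initial hash is H(salt + password), and that the iterator is a 0-based counter appended to the previous digest as a 4-byte little-endian unsigned integer. The function name `excel_sheet_hash` is hypothetical.

```python
import base64
import hashlib
import struct


def excel_sheet_hash(password: str, salt_b64: str, spin_count: int) -> str:
    """Hypothetical helper: one reading of H_n = H(H_{n-1} + iterator)."""
    salt = base64.b64decode(salt_b64)

    # Assumption: the password bytes are UTF-16LE, not UTF-8.
    pw_bytes = password.encode("utf-16-le")

    # Assumption: the starting value is H_0 = H(salt + password).
    digest = hashlib.sha512(salt + pw_bytes).digest()

    for i in range(spin_count):
        # Assumption: the iterator is the 0-based loop counter, packed as a
        # 4-byte little-endian unsigned integer and appended to the digest.
        digest = hashlib.sha512(digest + struct.pack("<I", i)).digest()

    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Inputs taken from the sheetProtection attributes quoted earlier.
    print(excel_sheet_hash("password", "Furur6jnDIFaQBhHQBXzFA==", 100000))
```

The difference from my attempt is that the loop counter is mixed into every round instead of re-hashing the bare digest. Whether this reproduces the hashValue above depends on the stated assumptions holding.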