I know there are similar questions already on SO, but none of them seem to address this problem. I have inherited the following C# code, which has been used to create password hashes in a legacy .NET app. For various reasons, the C# implementation is now being migrated to PHP:

string input = "fred";
SHA256CryptoServiceProvider provider = new SHA256CryptoServiceProvider();
byte[] hashedValue = provider.ComputeHash(Encoding.ASCII.GetBytes(input));
string output = "";
string asciiString = ASCIIEncoding.ASCII.GetString(hashedValue);
foreach ( char c in asciiString ) {
   int tmp = c;
   output += String.Format("{0:x2}", 
             (uint)System.Convert.ToUInt32(tmp.ToString()));
}
return output;

My PHP code is very simple, but for the same input "fred" it doesn't produce the same result:

$output = hash('sha256', "fred");

I've traced the problem down to an encoding issue - if I change this line in the C# code:

string asciiString = ASCIIEncoding.ASCII.GetString(hashedValue);

to

string asciiString = ASCIIEncoding.UTF7.GetString(hashedValue);

then the PHP and C# outputs match (both yield d0cfc2e5319b82cdc71a33873e826c93d7ee11363f8ac91c4fa3a2cfcd2286e5).

Since I'm not able to change the .NET code, I need to work out how to replicate its results in PHP.

Thanks in advance for any help,

  • PLEASE add a system that upgrades the encoding and hashing scheme whenever a user logs in. The C# scheme is horrible. It is barely better than plaintext password storage. Commented Feb 5, 2012 at 14:09

3 Answers

I don’t know PHP well enough to answer your question; however, I must point out that your C# code is broken. Try generating the hash of these two inputs: "âèí" and "çñÿ". You will find that their hashes collide:

3f3b221c6c6e3f71223f51695d456d52223f243f3f363949443f3f763b483615

The first bug lies in this operation:

Encoding.ASCII.GetBytes(input)

This assumes that all characters within your input are US-ASCII. Any non-ASCII characters would cause the encoder to fall back to the byte value for the ? character, thereby giving (unwanted) hash collisions, as demonstrated above. Notwithstanding, this will not be an issue if your input is constrained to only allow US-ASCII characters.
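
To see the fallback in isolation, here is a minimal Python sketch (not the C# from the question). It assumes that Python's `errors="replace"` handler behaves like the .NET ASCII encoder described above, substituting `?` for each character it cannot encode; `ascii_bytes` is an illustrative helper name.

```python
import hashlib

def ascii_bytes(text: str) -> bytes:
    # Mimic Encoding.ASCII.GetBytes: every non-ASCII character becomes b"?"
    return text.encode("ascii", errors="replace")

a = ascii_bytes("\u00e2\u00e8\u00ed")  # "âèí"
b = ascii_bytes("\u00e7\u00f1\u00ff")  # "çñÿ"

print(a, b)  # both inputs collapse to the same three question marks
print(hashlib.sha256(a).digest() == hashlib.sha256(b).digest())
```

Because the two byte strings are identical before hashing, no hash function can tell the inputs apart afterwards.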

The other (more severe) bug lies in the following operation:

ASCIIEncoding.ASCII.GetString(hashedValue)

ASCII only defines mappings for values 0–127. Since the elements of your hashedValue byte array may contain any byte value (0–255), encoding them as ASCII would cause data to be lost whenever a value greater than 127 is encountered. This may lead to further “unwanted” (read: potentially maliciously generated) hash collisions, even when your original input was US-ASCII.

Given that, statistically, about half of the bytes in your hashes will be greater than 127, you are throwing away a large part of the strength of your hash algorithm. If an attacker gains access to your stored hashes, they may well be able to devise an attack that generates hash collisions by exploiting this cryptographic weakness.
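
How much of the digest survives can be estimated with a rough calculation. The Python sketch below assumes the digest bytes are uniformly distributed and that every byte above 127 collapses to `?` (0x3F); it then computes the Shannon entropy left in each byte. This is a back-of-the-envelope estimate, not a formal security analysis.

```python
import math

# After the lossy decode, 127 byte values (0-127 except 0x3F) each keep
# probability 1/256, while 0x3F absorbs its own 1/256 plus the 128 values
# above 127, giving 129/256.
probs = [1 / 256] * 127 + [129 / 256]

bits_per_byte = -sum(p * math.log2(p) for p in probs)
total_bits = 32 * bits_per_byte  # a SHA-256 digest is 32 bytes long

print(f"{bits_per_byte:.2f} bits per byte, about {total_bits:.0f} of 256 bits overall")
```

Under those assumptions, roughly 4.47 of the 8 bits in each byte remain, or about 143 bits for the whole digest.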

Edit: Notwithstanding the considerations mentioned in my posts and Jon’s, here is the PHP code that succumbs to the same weakness – so to speak – as your C# code, and thereby gives the same hash:

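// Raw binary digest (32 bytes) rather than the default hex string
$output = hash('sha256', $input, true);

for ($i = 0; $i < strlen($output); $i++) {
    // The ASCII decoder turns every byte above 0x7F into '?' (0x3F)
    if (ord($output[$i]) > 127) {
        $output[$i] = '?';
    }
}

$output = bin2hex($output);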

6 Comments

The code is more broken than that - it's then taking the binary SHA-256 hash and assuming it's valid ASCII :(
Exactly. So even if all the input is valid ASCII, the system is still vulnerable to maliciously-generated hash collisions.
Thanks. There's actually some code that validates the input is ASCII prior to it going through the hashing code. The hash is a password hash though, so clashes don't really matter - as long as the output is consistent (I think that's all that matters). Unfortunately the .NET team responsible for the code no longer exists.
Yes, consistency remains the top priority. Clashes would matter if some malicious user manages to gain access to the stored hashes in your database, since they might, due to the explained cryptographic vulnerability, be able to generate ‘fake’ passwords that give the same hash, thereby allowing them to log in with the compromised user’s account without needing to know the original password. If your database is secure, this is probably not an issue.
Weird, when I run that code on writecodeonline it matches the .NET output exactly. It doesn't match on my machine running PHP 5.3.10. Thanks for taking the time to put together the code. Will do a bit of digging through PHP now.

Could you use mb_convert_encoding (see http://php.net/manual/en/function.mb-convert-encoding.php - the page also has a link to a list of supported encodings) to convert the PHP string to ASCII from UTF7?

2 Comments

Issues in original code apart, I think this is the right direction to maintain consistency. Encode input as ASCII (bytes), generate SHA-256 hash, decode binary hash as ASCII (string).
Thanks. I've tried this as well and couldn't get the output to match. As Douglas suggests I've encoded to ASCII, generated hash and then decoded the binary output as ASCII.

I've traced the problem down to an encoding issue

Yes. You're trying to treat arbitrary binary data as if it's valid text-encoded data. It's not. You should not be using any Encoding here.

If you want the results in hex, the simplest approach is to use BitConverter.ToString:

string text = BitConverter.ToString(hashedValue).Replace("-", "").ToLower();

And yes, as pointed out elsewhere, you probably shouldn't be using ASCII to convert the text to binary at the start of the hashing process. I'd probably use UTF-8.

It's really important that you understand the problem here though, as otherwise you'll run into it in other places too. You should only use encodings such as ASCII, UTF-8 etc (on any platform) when you've genuinely got encoded text data. You shouldn't use them for images, the results of cryptography, the results of hashing, etc.
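
The same idea in a language-neutral form: a minimal Python sketch (not the C# or PHP from the question) that goes from digest bytes straight to hex, with no text decoding in between. Using UTF-8 for the input follows the suggestion above; the variable names are illustrative.

```python
import hashlib

# Text -> bytes is the only step where an encoding belongs
digest = hashlib.sha256("fred".encode("utf-8")).digest()  # 32 raw bytes

# Bytes -> hex is a pure byte-level conversion; no Encoding is involved
hex_text = digest.hex()

print(len(digest), len(hex_text))
```

Since hex is a lossless representation, `bytes.fromhex(hex_text)` returns the original digest unchanged.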

EDIT: Okay, you say you can't change the C# code... it's not clear whether that just means you've got legacy data, or whether you need to keep using the C# code regardless. You should absolutely not run this code for a second longer than you have to.

But in PHP, you may find you can get away with just replacing every byte with a value >= 0x80 in the hash with 0x3F, which is the ASCII for "question mark". If you look through your data you'll probably find there are a lot of 3F bytes in there.
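
As a cross-check of that idea outside PHP, here is a rough Python sketch of the byte replacement. `legacy_hash` is a hypothetical name, and the sketch assumes the legacy code does nothing beyond ASCII-encoding the input, hashing with SHA-256, and turning every byte >= 0x80 into 0x3F.

```python
import hashlib

def legacy_hash(password: str) -> str:
    """Hypothetical re-creation of the legacy scheme described above."""
    # Assumed input step: ASCII encoding with '?' for anything non-ASCII
    digest = hashlib.sha256(password.encode("ascii", errors="replace")).digest()
    # Bytes >= 0x80 are not valid ASCII, so each becomes 0x3F ('?')
    squashed = bytes(b if b < 0x80 else 0x3F for b in digest)
    return squashed.hex()

print(legacy_hash("fred"))
```

If the assumption holds, the printed value should contain many `3f` pairs and should be compared against a hash known to come from the legacy system before being relied on.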

If you can get this to work, I would strongly suggest that you migrate over to the true SHA-256 hash without losing information like this. Wherever you're storing the hashes, store two: the legacy one (which is all you have now) and the rehashed one. Whenever you're asked to validate that a password is correct, you should:

  • Check whether you have a "new" one; if so, only use that - ignore the legacy one.
  • If you only have a legacy one:
    • Hash the password in the broken way to check whether it's correct
    • If it is, hash it again properly and store the results in the "new" place.

Then when everyone's logged in correctly once, you'll be able to wipe out the legacy hashes.
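
The steps above could be sketched roughly as follows, in Python for illustration. Everything here is an assumption rather than part of the original system: the `record` dictionary with `legacy`, `new` and `salt` fields, the `legacy_hash` re-creation, and the use of PBKDF2 with 200,000 iterations as a stand-in for whatever properly implemented hash is chosen.

```python
import hashlib
import hmac
import os

def legacy_hash(password: str) -> str:
    # Assumed legacy scheme: SHA-256, then bytes >= 0x80 squashed to '?'
    digest = hashlib.sha256(password.encode("ascii", errors="replace")).digest()
    return bytes(b if b < 0x80 else 0x3F for b in digest).hex()

def new_hash(password: str, salt: bytes) -> str:
    # Stand-in for the new scheme; algorithm and iteration count are assumptions
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000).hex()

def verify_and_upgrade(record: dict, password: str) -> bool:
    """Check a password against a hypothetical stored record, upgrading it if needed."""
    if record.get("new"):
        # A new hash exists: use only that and ignore the legacy one
        return hmac.compare_digest(record["new"], new_hash(password, record["salt"]))
    if not hmac.compare_digest(record["legacy"], legacy_hash(password)):
        return False
    # Legacy check passed: hash again properly and store the result
    record["salt"] = os.urandom(16)
    record["new"] = new_hash(password, record["salt"])
    return True
```

A caller would load the user's record, call `verify_and_upgrade(record, submitted_password)`, and persist the record whenever the function fills in the `new` field.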

4 Comments

Thanks, but as I pointed out in the question, I can't change the C# code.
Thanks Jon. That sounds like a good approach. We have around 100k password hashes generated with this code, and within the next week or so I need to have migrated them to PHP. At that point the C# code will never see the light of day again, and I'll gradually migrate the legacy hashes to new, properly implemented versions as you've suggested.
I thought of just the same workaround :-) Pasted above thanks to writecodeonline.com/php.
The part about wiping the legacy hashes is very important. And of course the new one should not be a plain hash(md5, sha1, sha2,...) but a function like bcrypt or PBKDF2.
