2

I need to calculate a checksum for a hex serial word string using XOR. To my (limited) knowledge this has to be performed using the bitwise operator ^. Also, the data has to be converted to binary integer form. Below is my rudimentary code - but the checksum it calculates is 1000831. It should be 01001110 or 47hex. I think the error may be due to missing the leading zeros. All the formatting I've tried to add the leading zeros turns the binary integers back into strings. I appreciate any suggestions.

    word = ('010900004f')

    #divide word into 5 separate bytes
    wd1 = word[0:2] 
    wd2 = word[2:4]
    wd3 = word[4:6]
    wd4 = word[6:8]
    wd5 = word[8:10]

    #this converts a hex string to a binary string
    wd1bs = bin(int(wd1, 16))[2:] 
    wd2bs = bin(int(wd2, 16))[2:]
    wd3bs = bin(int(wd3, 16))[2:]
    wd4bs = bin(int(wd4, 16))[2:]
    wd5bs = bin(int(wd5, 16))[2:]

    #this converts binary string to binary integer
    wd1i = int(wd1bs)
    wd2i = int(wd2bs)
    wd3i = int(wd3bs)
    wd4i = int(wd4bs)
    wd5i = int(wd5bs)

    #now that I have binary integers, I can use the XOR bitwise operator to cal cksum
    checksum = (wd1i ^ wd2i ^ wd3i ^ wd4i ^ wd5i)

    #I should get 47 hex as the checksum
    print (checksum, type(checksum))
10
  • I think this has been addressed before; see this question. Commented Mar 30, 2014 at 1:55
  • 0x47 != 0b1001110. Very few odd numbers end in 0 in their binary representation. Commented Mar 30, 2014 at 2:01
  • @PyNEwbie That is true, but here we are facing an XY-problem par excellence. Commented Mar 30, 2014 at 2:07
  • What is a "binary integer"? Why are you interpreting digit strings in base 2 as if they were in base 10? I think you're getting the numbers and their representations mixed up, which is why you're going on this binary detour. Commented Mar 30, 2014 at 2:11
  • 1
    @user3284986 I always find it practical to distinguish between the "representation" of a number and its "value". 0x2a, 0b101010 and 42 all have the same value. But the value 42 can be represented as 0x2a, 0b101010 or 42. An integer is not binary, or decimal, or hexadecimal, ternary, unary or gray-coded: an integer is an integer, i.e. an element of Z. Its representation can be binary, decimal, etc, pp. Commented Mar 30, 2014 at 3:29

3 Answers

5

Why use all these conversions and the costly string functions?

(I will answer the X part of your XY-Problem, not the Y part.)

def checksum (s):
    v = int (s, 16)
    checksum = 0
    while v:
        checksum ^= v & 0xff
        v >>= 8
    return checksum

cs = checksum ('010900004f')
print (cs, bin (cs), hex (cs) )

Result is 0x47 as expected. Btw 0x47 is 0b1000111 and not as stated 0b1001110.
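
As a cross-check of the loop above, here is a minimal sketch that XORs the bytes directly instead of shifting an integer. It assumes Python 3 (for `bytes.fromhex`), and the helper name `checksum_from_bytes` is made up for illustration. It also shows that leading zeros are purely a formatting matter.

```python
from functools import reduce
from operator import xor


def checksum_from_bytes(hex_word):
    """XOR all bytes of a hex string (hypothetical helper, Python 3 assumed)."""
    data = bytes.fromhex(hex_word)  # iterating over bytes yields ints 0..255
    return reduce(xor, data, 0)


cs = checksum_from_bytes('010900004f')
# format(..., '08b') pads the binary representation to 8 digits
print(hex(cs), format(cs, '08b'))  # 0x47 01000111
```

The value is the same either way; only the printed representation changes with the format string.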


3 Comments

Thanks - that is eloquent. And it does solve the X part, rendering the Y part superfluous. Superfluous but mysterious . . .
@user3284986 Glad to help. Not that mysterious, just check out the link PyNewbie posted in his comment to your question.
@Victory I wouldn't call it a trick, but a standard approach.
1
s = '010900004f'
b = int(s, 16)
print reduce(lambda x, y: x ^ y, ((b>> 8*i)&0xff for i in range(0, len(s)/2)), 0)
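
The one-liner above is written for Python 2. A possible Python 3 port, offered only as a sketch, needs `reduce` imported from `functools`, integer division for the byte count, and `print` as a function. The names `n_bytes` and `checksum` are introduced here for readability.

```python
from functools import reduce  # reduce is not a builtin in Python 3

s = '010900004f'
b = int(s, 16)
n_bytes = len(s) // 2  # len(s) / 2 would be a float in Python 3

# pick out each byte by shifting, mask to 8 bits, and fold with XOR
checksum = reduce(lambda x, y: x ^ y,
                  ((b >> 8 * i) & 0xff for i in range(n_bytes)),
                  0)
print(hex(checksum))  # 0x47
```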

3 Comments

Will fail on python3, which I suspect OP to be using, as he uses print as a function.
@Hyperboreus: possibly, but it's tagged 2.7.
@DSM I could swear it wasn't tagged for any specific version when I wrote that comment. I must have been blind.
1

Just modify it like this.

before:

wd1i = int(wd1bs)
wd2i = int(wd2bs)
wd3i = int(wd3bs)
wd4i = int(wd4bs)
wd5i = int(wd5bs)

after:

wd1i = int(wd1bs, 2)
wd2i = int(wd2bs, 2)
wd3i = int(wd3bs, 2)
wd4i = int(wd4bs, 2)
wd5i = int(wd5bs, 2)

Why doesn't your code work?

Because you are misunderstanding how int(wd1bs) behaves. See the doc here. By default, Python's int function expects the string wd1bs to be in base 10, but you expect it to treat its argument as base 2. So you need to write int(wd1bs, 2).
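
To make the difference concrete, here is a small sketch (Python 3 assumed; the variable names are illustrative only) that runs both interpretations over the same bit strings. Reading the digits as decimal reproduces the 1000831 from the question, while passing base 2 gives the expected checksum.

```python
hex_bytes = ['01', '09', '00', '00', '4f']

# same conversion as in the question: hex string -> string of binary digits
bit_strings = [bin(int(h, 16))[2:] for h in hex_bytes]
print(bit_strings)  # ['1', '1001', '0', '0', '1001111']

wrong = 0
right = 0
for bits in bit_strings:
    wrong ^= int(bits)     # digits read as a base-10 number
    right ^= int(bits, 2)  # digits read as a base-2 number

print(wrong)                             # 1000831
print(hex(right), format(right, '08b'))  # 0x47 01000111
```

The missing leading zeros in the bit strings never mattered: int('1001', 2) and int('00001001', 2) are the same value.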


Or you can rewrite your entire code like this, so that you don't need the bin function at all. This code is basically the same as @Hyperboreus's answer. :)

w = int('010900004f', 16)
w1 = (0xff00000000 & w) >> 4*8
w2 = (0x00ff000000 & w) >> 3*8
w3 = (0x0000ff0000 & w) >> 2*8
w4 = (0x000000ff00 & w) >> 1*8
w5 = (0x00000000ff & w)

checksum = w1 ^ w2 ^ w3 ^ w4 ^ w5

print hex(checksum)
#'0x47'

And this is an even shorter one.

import binascii
word = '010900004f'
print hex(reduce(lambda a, b: a ^ b, (ord(i) for i in binascii.unhexlify(word))))
#0x47

4 Comments

All your lines wX = (0x.... can be written as wX = (w >> Y*8) & 0xff. Just shift first and mask afterwards; then the mask is always 0xff.
@Hyperboreus: Ah, this is a smarter way, thanks. My code is always very verbose... :)
@user3284986: I added an explanation. See what's wrong in your code.
@user2931409 - Many thanks to you. You are spot on about my incorrect assumption regarding the base 10 default. I don't mind verbosity if it aids readability - which is what a python newbie like myself needs . . . . But eventually, we all want to type as little as possible :-)
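
The shift-then-mask form suggested in the comments above could be sketched as follows. This assumes Python 3 and the fixed five-byte word from the question; `parts` is a name introduced here for illustration.

```python
w = int('010900004f', 16)

# shift the wanted byte down to the lowest position, then mask with 0xff;
# y counts bytes from the right, so y = 4 is the leftmost byte of the word
parts = [(w >> y * 8) & 0xff for y in range(4, -1, -1)]
print(parts)  # [1, 9, 0, 0, 79]

checksum = 0
for p in parts:
    checksum ^= p
print(hex(checksum))  # 0x47
```

Because the mask is the same for every byte, the five separate mask constants collapse into one expression inside a loop.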
