I am trying to implement Theil's index (http://en.wikipedia.org/wiki/Theil_index) in Python to measure inequality of revenue in a list.
The formula is essentially Shannon's entropy, so it involves logarithms. My problem is that a few revenues in my list are 0, and log(0) is undefined, so the formula breaks. I don't think adding a tiny float to 0 would help either, since log(tinyFloat) is a huge negative number (tending to -inf), which I'd expect to distort the index.
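Written out, the quantity is roughly the following. The notation is introduced here for illustration and is an assumption: s_i is the share of revenue i in the total, and n is the number of entries.

```latex
H(s) = -\sum_{i=1}^{n} s_i \ln s_i, \qquad \sum_{i=1}^{n} s_i = 1, \qquad T = \ln n - H(s)
```

The term s_i ln s_i is the one that cannot be evaluated directly when s_i = 0.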
[EDIT] Here's a snippet (taken from another, much cleaner and freely available, implementation):
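from math import exp, log

def error_if_not_in_range01(value):
    # Raises for value <= 0, so a zero revenue share is rejected here.
    if (value <= 0) or (value > 1):
        raise ValueError(str(value) + ' is not in (0,1]!')

def error_if_not_1(value, tol=1e-9):
    # Not shown in the original snippet; assumed to check that the shares sum to 1.
    if abs(value - 1.0) > tol:
        raise ValueError(str(value) + ' is not 1!')

def H(x):
    # Shannon entropy of the shares x[i], which must sum to 1.
    entropy = 0.0
    total = 0.0
    for x_i in x:  # work on all x[i]
        print(x_i)  # debug output
        error_if_not_in_range01(x_i)
        total += x_i
        group_negentropy = x_i * log(x_i)
        entropy += group_negentropy
    error_if_not_1(total)
    return -entropy

def T(x):
    print(x)  # debug output
    n = len(x)
    maximum_entropy = log(n)
    actual_entropy = H(x)
    redundancy = maximum_entropy - actual_entropy
    inequality = 1 - exp(-redundancy)
    return redundancy, inequality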
Is there any way out of this problem?
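A minimal sketch of one possible way out, assuming the usual convention that 0*log(0) is taken as 0 (the limit of x*log(x) as x approaches 0 from above), so zero revenues simply contribute nothing to the entropy. The names `entropy_with_zeros` and `theil_redundancy` are hypothetical and not part of the implementation quoted above. The input is assumed to be a list of raw non-negative revenues, which the sketch normalizes to shares itself.

```python
from math import exp, log


def entropy_with_zeros(shares):
    """Shannon entropy of shares, treating 0*log(0) as 0 by skipping zeros."""
    return -sum(s * log(s) for s in shares if s > 0)


def theil_redundancy(revenues):
    """Return (redundancy, inequality) for a list of non-negative revenues.

    Hypothetical helper: normalizes the revenues to shares that sum to 1,
    then computes log(n) minus the entropy of those shares.
    """
    if any(r < 0 for r in revenues):
        raise ValueError("revenues must be non-negative")
    total = float(sum(revenues))
    if total <= 0:
        raise ValueError("revenues must have a positive sum")
    shares = [r / total for r in revenues]
    redundancy = log(len(shares)) - entropy_with_zeros(shares)
    inequality = 1 - exp(-redundancy)
    return redundancy, inequality


if __name__ == "__main__":
    # Two zero revenues no longer raise an error.
    print(theil_redundancy([0, 0, 10, 30]))
```

With this convention, a list where one entry holds all the revenue gives a redundancy of log(n), and a perfectly equal list gives a redundancy of approximately 0. Whether skipping zeros is appropriate depends on how the zeros should be interpreted in the data.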

