
The sum of the elements of an array needs to be one while each element adheres to certain maximum and minimum constraints. If an element is smaller than the minimum allowed value, it may be set to zero.

For example, take the input array (shown in Python):

import numpy as np

arr = np.array([0.1, 0.1, 0.8, 0.01])
max_value = 0.5
min_value = 0.1

Then a naive approach might be:

# enforce minimum constraint
arr = np.where(arr < min_value, 0.0, arr)          # arr = [0.1, 0.1, 0.8, 0.0]
# enforce maximum constraint
arr = np.where(arr > max_value, max_value, arr)    # arr = [0.1, 0.1, 0.5, 0.0]
# normalize so sum == 1
arr /= arr.sum()   # arr = [0.14285714, 0.14285714, 0.71428571, 0.] -- violates max constraint

I've tried many approaches for this problem. I've tried splitting into the cases where the sum is greater than 1 and less than 1. If it is larger than one, I tried subtracting the excess from the other weights, but keeping the minimum constraints satisfied becomes a problem, even with some proportional subtraction. The same goes for adding to the weights until the sum reaches 1: there the maximum constraints are the problem. The sum of the elements must be == 1. The maximum constraint will always be chosen so that a sum of 1 is possible: max_value > 1/len(arr).
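Those feasibility requirements can be tested up front. The sketch below assumes the semantics described above (entries below min_value are zeroed, and every surviving entry must end up in [min_value, max_value]); `feasible` is a hypothetical helper name, and these are necessary conditions only:

```python
import numpy as np

def feasible(arr, max_value, min_value):
    # k entries survive the minimum cut; their total can range from
    # k * min_value up to k * max_value, so 1 must fall in that interval.
    k = int(np.count_nonzero(arr >= min_value))
    return k * min_value <= 1.0 <= k * max_value

feasible(np.array([0.1, 0.1, 0.8, 0.01]), 0.5, 0.1)   # True: 3 survivors, 0.3 <= 1 <= 1.5
feasible(np.array([0.33, 0.33, 0.33]), 0.2, 0.1)      # False: 3 * 0.2 = 0.6 < 1
```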

Another approach could be to conditionally scale the elements based on some factor derived from the constraints.
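One way that conditional-scaling idea could look (a sketch, not a definitive implementation: pin capped entries at max_value and rescale only the still-free entries; `cap_and_rescale` is a hypothetical name, and rescaling can in principle push a free entry below min_value, which this sketch does not repair):

```python
import numpy as np

def cap_and_rescale(arr, max_value, min_value, max_iter=100):
    w = np.where(arr < min_value, 0.0, arr).astype(float)
    for _ in range(max_iter):
        w = np.minimum(w, max_value)            # pin overshooting entries at the max
        free = (w > 0.0) & (w < max_value)      # entries that may still be scaled
        if not free.any():
            break
        remaining = 1.0 - w[~free].sum()        # budget left for the free entries
        if remaining <= 0.0:                    # pinned entries already use the budget
            break
        w[free] *= remaining / w[free].sum()    # scale free entries to fill the budget
        if w.max() <= max_value:                # no new overshoot -> converged
            break
    return w

cap_and_rescale(np.array([0.1, 0.1, 0.8, 0.01]), 0.5, 0.1)
# -> [0.25 0.25 0.5  0.  ]
```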


I think you will have to add quite a bit of code to achieve this. You have already seen the problem: after the first normalization there can be values larger than your maximum. One idea is to take this "overhead" and split it among the remaining values, but there are several options for doing so:

import numpy as np

arr = np.array([0.1, 0.1, 0.8, 0.01])
max_value = 0.5
min_value = 0.1

def normalize_with_constraints(to_normalize, val_max, val_min):
    # zero out entries below the minimum, cap entries above the maximum
    to_normalize = np.where(to_normalize < val_min, 0.0, to_normalize)
    to_normalize = np.where(to_normalize > val_max, val_max, to_normalize)
    to_normalize /= to_normalize.sum()
    overhead = np.sum(np.where(to_normalize > val_max, to_normalize - val_max, 0.0))
    while overhead > 0.0:
        # candidates: entries above the minimum that are not at the cap
        to_distribute = np.argwhere(
            np.where(to_normalize >= val_max, 0.0, to_normalize) > val_min)
        to_normalize = np.where(to_normalize > val_max, val_max, to_normalize)
        num_entries = to_distribute.shape[0]
        if not num_entries:
            return None
        distr_add = overhead / num_entries
        for index in to_distribute.flatten():
            to_normalize[index] += distr_add
        overhead = np.sum(np.where(to_normalize > val_max, to_normalize - val_max, 0.0))
    return to_normalize


output = normalize_with_constraints(arr, max_value, min_value)
# will be [0.25 0.25 0.5  0.  ]

But this approach has a flaw: an array with values 0.01, 0.01, 0.6, 0.05 leaves no place to distribute the overhead, since everything except 0.6 falls below the minimum. There are similar issues with other combinations of minimum and maximum values; a solution does not exist for every combination. In these cases I decided to return None.

You could instead first add the overhead to the "empty" positions (those zeroed out by the minimum constraint) and only then distribute the remainder to the other positions:

import numpy as np

arr = np.array([0.1, 0.1, 0.8, 0.01])
max_value = 0.5
min_value = 0.1

def normalize_with_constraints(to_normalize, val_max, val_min):
    to_normalize = np.where(to_normalize < val_min, 0.0, to_normalize)
    to_normalize = np.where(to_normalize > val_max, val_max, to_normalize)
    to_normalize /= to_normalize.sum()
    overhead = np.sum(np.where(to_normalize > val_max, to_normalize - val_max, 0.0))

    # split overhead to "empty" positions first:
    if overhead > val_min:
        to_distribute = np.argwhere(to_normalize == 0.0)
        for index in to_distribute.flatten():
            if overhead > val_max:
                to_normalize[index] = val_max
                overhead -= val_max
                if overhead < val_min:
                    break
            else:
                to_normalize[index] = overhead
                overhead = 0.0
                break
    # cap the entries the overhead was taken from, so the sum stays 1
    to_normalize = np.where(to_normalize > val_max, val_max, to_normalize)

    # split remaining overhead to the other positions
    while overhead > 0.0:
        to_distribute = np.argwhere(
            np.where(to_normalize >= val_max, 0.0, to_normalize) > val_min)
        to_normalize = np.where(to_normalize > val_max, val_max, to_normalize)
        num_entries = to_distribute.shape[0]
        if not num_entries:
            return None
        distr_add = overhead / num_entries
        for index in to_distribute.flatten():
            to_normalize[index] += distr_add
        overhead = np.sum(np.where(to_normalize > val_max, to_normalize - val_max, 0.0))
    return to_normalize


output = normalize_with_constraints(arr, max_value, min_value)
# will be [0.14285714 0.14285714 0.5        0.21428571]

But this can also run into issues: if your maximum value is smaller than the inverse of the array length, normalization is impossible in any case (e.g. max_value = 0.2 and arr = [0.33, 0.33, 0.33], where the capped sum can never exceed 3 * 0.2 = 0.6).
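That limit can be checked before calling the function at all. A minimal guard, using the example values above (the bound len(arr) * max_value >= 1 holds because each entry contributes at most max_value):

```python
import numpy as np

arr = np.array([0.33, 0.33, 0.33])
max_value = 0.2

# The capped sum can never exceed len(arr) * max_value = 0.6 < 1.
solvable = len(arr) * max_value >= 1.0
print(solvable)   # False -> no valid normalization exists
```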

As I said, these are just some suggestions, as you were not very specific about what you want to achieve in the end.
