
I have a 2D NumPy array called arm_resets that contains positive integers. Every value in the first column is below 360. In every column other than the first, I need to replace each value of 360 or more with the value from the first column of the same row. I thought this would be relatively easy to do. Here's what I have:

i = 300
over_360 = arm_resets[:, [i]] >= 360
print(arm_resets[:, [i]][over_360])
print(arm_resets[:, [0]][over_360])
arm_resets[:, [i]][over_360] = arm_resets[:, [0]][over_360]
print(arm_resets[:, [i]][over_360])

And here's what prints:

[3600 3609 3608 ... 3600 3611 3605]
[ 0  9  8 ...  0 11  5]
[3600 3609 3608 ... 3600 3611 3605]

Every value shown in the first print (the first three and last three) is above 360, so the third print should show the values from the second print instead. Why is this not working?

edit: reproducible example:

import numpy as np
import pandas as pd

# start values chosen to match the printed output below
df = pd.DataFrame({"start":[1,2,3,4],"freq":[1,5,6,9]})
periods = 6
arm_resets = df[["start"]].values
freq = df[["freq"]].values
arm_resets = np.pad(arm_resets,((0,0),(0,periods-1)))
for i in range(1,periods):
    arm_resets[:,[i]] = arm_resets[:,[i-1]] + freq
    #over_360 = arm_resets[:,[i]] >= periods
    #arm_resets[:,[i]][over_360] = arm_resets[:,[0]][over_360]
arm_resets

With those two lines commented out, here's what prints:

array([[ 1,  2,  3,  4,  5,  6],
       [ 2,  7, 12, 17, 22, 27],
       [ 3,  9, 15, 21, 27, 33],
       [ 4, 13, 22, 31, 40, 49]])

What I would expect:

array([[1, 2, 3, 4, 5, 1],
       [2, 2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4, 4]])

Now if it helps, the final 2d array I'm actually trying to create is a 1/0 array that indicates which are filled in, so in this example I'd want this:

array([[0, 1, 1, 1, 1, 1],
       [0, 0, 1, 0, 0, 0],
       [0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 1, 0]])

The code I use to achieve this from the above arm_resets is this:

fin = np.zeros((len(arm_resets),periods),dtype=int)
for i in range(len(arm_resets)):
    # mark every column index that appears in row i of arm_resets
    fin[i,arm_resets[i]] = 1
  • All your numbers are >= 3600 what gives? Commented Oct 25, 2021 at 23:43
  • None of the prose matches the code or output Commented Oct 25, 2021 at 23:44
  • Most of the numbers in the later columns will be over 360 but many may not be. Commented Oct 25, 2021 at 23:53
  • Show an example with a 3x4 array or so. Making a proper minimal reproducible example is a skill that seems about as hard to learn as good debugging. It's good not only for SO posts, but also for your own debugging. An array with 300 elements is hard to visualize. One with 3 or 4 is easy. Commented Oct 25, 2021 at 23:58
  • Edited to add example of smaller array with expectations. Commented Oct 26, 2021 at 3:11

2 Answers


The expression arm_resets[:, [i]] uses fancy (advanced) indexing, and therefore makes a copy of the ith column of the data. arm_resets[:, [i]][over_360] = ... therefore calls __setitem__ on a temporary array that is discarded as soon as the statement executes. If you want to assign through the mask, call __setitem__ on the original array directly. Note that over_360 must then be a 1-D row mask, e.g. over_360 = arm_resets[:, i] >= 360:

arm_resets[over_360, [i]] = ...

You also don't need to make the index into a list. It's generally better to use simple indices, especially when doing assignments, since they create views rather than copies:

arm_resets[over_360, i] = ...

With slicing, even the following should work, since it calls __setitem__ on a view:

arm_resets[:, i][over_360] = ...
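
The copy-versus-view behaviour described above can be checked on a small array. This is only a sketch: the 3x3 data below is made up for illustration and is not from the question, and over_360 is built as a 1-D mask.

```python
import numpy as np

# Hypothetical small array for illustration (not from the question).
arm_resets = np.array([[1, 400, 3],
                       [2, 5, 500],
                       [3, 361, 7]])
i = 1

# A list index returns a copy; a plain integer index returns a view.
print(np.shares_memory(arm_resets, arm_resets[:, [i]]))  # False
print(np.shares_memory(arm_resets, arm_resets[:, i]))    # True

# 1-D boolean mask over the rows of column i.
over_360 = arm_resets[:, i] >= 360

# Chained assignment writes into the temporary copy, so nothing changes.
arm_resets[:, [i]][over_360[:, None]] = -1
assert arm_resets[0, i] == 400

# Direct assignment writes into arm_resets itself.
arm_resets[over_360, i] = arm_resets[over_360, 0]
print(arm_resets[:, i])  # [1 5 3]
```

np.shares_memory is a convenient way to tell whether an indexing expression gave you a view before you rely on assigning through it.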

That still handles only one column i at a time. In fact, you can process the entire matrix in one step, without looping, if you use integer indices rather than a boolean mask. Indices are useful here because the row index of each match lets you pick the item from the same row of the first column:

rows, cols = np.nonzero(arm_resets[:, 1:] >= 360)
# cols is relative to the slice that starts at column 1, so shift it by one
arm_resets[rows, cols + 1] = arm_resets[rows, 0]
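
As a rough sketch, here is the index-based approach applied to the small array from the question's edit, using 6 as the cutoff in place of 360. The name threshold is introduced here for illustration. Because the comparison runs on arm_resets[:, 1:], the column indices are shifted by one before assigning, and the replacement values come from column 0.

```python
import numpy as np

# The unmodified output from the question's small example.
arm_resets = np.array([[1, 2, 3, 4, 5, 6],
                       [2, 7, 12, 17, 22, 27],
                       [3, 9, 15, 21, 27, 33],
                       [4, 13, 22, 31, 40, 49]])
threshold = 6  # hypothetical name; 360 in the real data

# Row and column positions of every value at or above the threshold,
# looking only at the columns after the first.
rows, cols = np.nonzero(arm_resets[:, 1:] >= threshold)

# cols is relative to the slice starting at column 1, so shift it back.
arm_resets[rows, cols + 1] = arm_resets[rows, 0]
print(arm_resets)
```

Note that this replaces values in an array that has already been filled. It does not reproduce a loop in which each later column is computed from the previous column after that column has been reset.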


You can use np.where()

first_col = arm_resets[:,0] # first column, 1-D
first_col = first_col.reshape(first_col.size,1) # reshape to a 2-D column of shape (n, 1)
arm_resets = np.where(arm_resets >= 360,first_col,arm_resets)

See the np.where documentation for details. In short, it evaluates arm_resets >= 360 element-wise: where the condition is true it takes the value from first_col (which broadcasts across the columns), and where it is false it keeps the value from arm_resets.

Edit: As suggested by Mad Physicist, you can use arm_resets[:,0,None] directly instead of creating the first_col variable.

arm_resets = np.where(arm_resets >= 360,arm_resets[:,0,None],arm_resets)
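
A small sketch of how the broadcasting works here, on made-up 3x3 data (not from the question): indexing with None keeps the first column as shape (n, 1), so np.where can stretch it across every column.

```python
import numpy as np

# Hypothetical data for illustration; the first column is below 360.
arm_resets = np.array([[10, 400, 20],
                       [30, 40, 720],
                       [50, 360, 359]])

# Shape (3, 1): one first-column value per row, broadcast across columns.
first_col = arm_resets[:, 0, None]

# Take first_col where the value is 360 or more, else keep the original.
result = np.where(arm_resets >= 360, first_col, arm_resets)
print(result)
# [[ 10  10  20]
#  [ 30  40  30]
#  [ 50  50 359]]
```

Unlike in-place assignment, np.where returns a new array, so the result has to be bound to a name (here result) or assigned back to arm_resets.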

3 Comments

Python variables can't start with numbers. What is c? You misspelled boolean.
Yeah, I messed up the variable names. Thanks for pointing out the error.
Try arm_resets[:, 0, None]. You can write a one-liner that way
