Is there an easy way to delete similar values from an array (subject to a condition) without using a for loop? For example, let's say I have the array

np.array([1.2, 3.4, 3.5, 8.9, 10.9])

In this case, I would set a condition such as difference < 0.3 and get as output

np.array([1.2, 3.4, 8.9, 10.9])

I haven't found anything similar to this question online. Of course there is the function np.unique(), but that only handles exactly equal values.

  • You may need your own function to do this. Commented Nov 28, 2022 at 8:00
  • Can you specify your problem a little more? Which values should be deleted if an array contains 3.4, 3.5, 3.6? And what about a run of similar values, for example [3.4, 3.5, 3.6, 3.7, 3.8, 3.9]? Depending on the answer, you will need to define your own function. Commented Nov 28, 2022 at 8:03
  • What would the output be for [1.2, 1.5, 1.7]? Here you can keep either [1.5] or [1.2, 1.7]. Commented Nov 28, 2022 at 8:03
  • For example, for the array [1.2 1.5 1.7] I want to keep only the first element. I need this for a data set, which in my case is [1266.20287121 1269.10287778 1420.00321996 1428.10323833 1432.30324785 1804.80409253 1808.50410092], and I want to keep only [1266.20287121 1420.00321996 1804.80409253]. Commented Nov 28, 2022 at 8:05
  • What are the criteria for keeping values? Should the first, the average, or some other value be kept? What if a removed value decides whether another is removed: which values should be kept out of 3.4, 3.6, 3.8? Are the values always sorted: which values should be kept out of 3.4, 1.2, 3.5? Commented Nov 28, 2022 at 8:13

1 Answer

If you want to delete every value that lies too close to the value immediately before it, you can compute the successive differences and use boolean indexing:

import numpy as np

a = np.array([1.2, 3.4, 3.5, 8.9, 10.9])

out = a[np.r_[True, np.diff(a)>=0.3]]

Or, if the values are not guaranteed to be in ascending order and you want to compare absolute differences:

out = a[np.r_[True, np.abs(np.diff(a))>=0.3]]

Output:

array([ 1.2,  3.4,  8.9, 10.9])
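
The same expression can be wrapped in a small helper and applied to the data set from the comments. This is only a sketch: the name drop_close_successors and the threshold of 10.0 are assumptions for illustration (the asker did not state a threshold for that data set), and the empty-input guard is an addition.

```python
import numpy as np

def drop_close_successors(values, min_gap):
    """Keep the first element and every element whose absolute
    difference to its immediate predecessor is at least min_gap."""
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        # A one-element mask cannot index an empty array, so return early.
        return values
    keep = np.r_[True, np.abs(np.diff(values)) >= min_gap]
    return values[keep]

# Data set quoted in the comments; min_gap=10.0 is an assumed threshold.
data = np.array([1266.20287121, 1269.10287778, 1420.00321996,
                 1428.10323833, 1432.30324785, 1804.80409253,
                 1808.50410092])
result = drop_close_successors(data, min_gap=10.0)
print(result)
```

With this assumed threshold the three values the asker wanted to keep (1266.20287121, 1420.00321996, 1804.80409253) are the ones that remain.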

Intermediate results for a = np.array([1.2, 3.4, 3.5, 8.9, 10.9]):

np.diff(a)
# array([2.2, 0.1, 5.4, 2. ])

np.diff(a)>=0.3
# array([ True, False,  True,  True])

np.r_[True, np.diff(a)>=0.3]
# array([ True,  True, False,  True,  True])
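
One caveat, related to the 3.4, 3.6, 3.8 case raised in the comments: the mask compares each value with its immediate predecessor, not with the last value that was kept, so a chain of closely spaced values collapses to its first element. If the comparison should instead be against the last kept value, each decision depends on the previous one, and a plain loop is one simple way to express that. The sketch below contrasts the two behaviours; drop_close_to_last_kept is a hypothetical helper, not part of NumPy.

```python
import numpy as np

def drop_close_to_last_kept(values, min_gap):
    """Keep a value only if it is at least min_gap away from the
    most recently kept value (hypothetical helper, uses a loop)."""
    values = np.asarray(values, dtype=float)
    kept = []
    for v in values:
        if not kept or abs(v - kept[-1]) >= min_gap:
            kept.append(v)
    return np.array(kept)

chain = np.array([3.4, 3.6, 3.8])

# Pairwise rule from the answer: both successive gaps are about 0.2,
# so only the first element survives.
pairwise = chain[np.r_[True, np.abs(np.diff(chain)) >= 0.3]]

# Last-kept rule: 3.8 is about 0.4 away from 3.4, so it is kept too.
anchored = drop_close_to_last_kept(chain, 0.3)

print(pairwise)
print(anchored)
```

Which rule is right depends on the data; for the asker's data set both give the same result as long as the kept clusters are well separated.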

1 Comment

Thank you very much! This is exactly what I searched for :)
