0

For instance, I have thousands of rows, and one of the columns is 'cow_id', where each cow ID appears in several rows. I want to replace those IDs with numbers starting from 1, just to make them easier to remember.

df['cow_id'].unique().tolist()

resulting in:

 5603,
 5606,
 5619,
 4330,
 5587,
 4967,
 5554,
 4879,
 4151,
 5501,
 4723,
 4908,
 3963,
 4023,
 4573,
 3986,
 5668,
 4882,
 5645,
 5548

How do I change each unique ID into a new number, such as:

5603 -> 1
5606 -> 2

3 Answers

3

Try groupby with ngroup, which by default numbers the groups in sorted order of cow_id:

df.groupby('cow_id').ngroup()+1

Or try pd.factorize, which numbers the IDs in order of first appearance:

pd.factorize(df['cow_id'])[0]+1

As the documentation puts it, pd.factorize encodes the object as an enumerated type or categorical variable.

Note that pd.factorize returns two values, the integer codes and the unique values; the [0] above selects the codes.
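To make the difference between the two approaches concrete, here is a minimal sketch. The DataFrame is hypothetical sample data built from a few of the IDs in the question, and the column names by_sorted and by_appearance are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: a few IDs from the question, each repeated
df = pd.DataFrame({"cow_id": [5603, 5606, 5603, 4330, 5606, 4330, 5619]})

# ngroup() numbers the groups in sorted key order by default,
# so the smallest ID (4330) gets 1
df["by_sorted"] = df.groupby("cow_id").ngroup() + 1

# factorize returns a (codes, uniques) pair; the codes follow the order
# of first appearance, so the first ID seen (5603) gets 1
codes, uniques = pd.factorize(df["cow_id"])
df["by_appearance"] = codes + 1

print(df)
print(list(uniques))  # unique IDs in order of first appearance
```

If you want 5603 -> 1 and 5606 -> 2 as in the question, the factorize version is probably the one you want, since it follows the order in which the IDs appear.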

1

What you are looking for is usually called categorical encoding. The scikit-learn library in Python has several preprocessing tools for this, and LabelEncoder should do the job for you. See https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html#sklearn.preprocessing.LabelEncoder
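A minimal sketch of how that could look, assuming scikit-learn is installed. The cow_ids list is hypothetical sample data, and the + 1 is only there because the question wants numbering to start at 1 (LabelEncoder codes start at 0).

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample data: a few IDs from the question
cow_ids = [5603, 5606, 5603, 4330, 5606]

encoder = LabelEncoder()

# fit_transform assigns 0-based codes following the sorted unique values;
# add 1 so the numbering starts from 1
labels = encoder.fit_transform(cow_ids) + 1

# inverse_transform maps codes back to the original IDs
original = encoder.inverse_transform(labels - 1)

print(labels)
print(encoder.classes_)  # sorted unique IDs
print(original)
```

Note that the codes follow the sorted IDs here, not the order of appearance.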

Also keep in mind that encodings like these can introduce bias into your dataset, because some algorithms treat the labels as ordered, i.e., 1 < 2 < ... < 54. This blog post explains the different encodings and when to use each: https://towardsdatascience.com/encoding-categorical-features-21a2651a065c

Let me know if you have any questions.

1

Here is the result using pandas.Categorical. The benefit is that you keep the original data and can flip back and forth. Here I create a variable called "c" that holds both the original categories and the new codes.
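Something along these lines, as a rough sketch. The DataFrame is hypothetical sample data, and the cow_num column name and the restored variable are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: a few IDs from the question
df = pd.DataFrame({"cow_id": [5603, 5606, 5603, 4330, 5606]})

# c holds both the original categories and the integer codes
c = pd.Categorical(df["cow_id"])

# codes are 0-based positions into c.categories (sorted unique IDs),
# so add 1 to start the numbering from 1
df["cow_num"] = c.codes + 1

# flip back: look up each code in the categories to recover the original ID
restored = c.categories.take(c.codes)

print(df)
print(list(c.categories))
print(list(restored))
```

As with ngroup, the numbering follows the sorted IDs by default.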
