How do I merge rows that share the same value when reading from an Excel file?

import pandas as pd
import numpy as np
df = pd.read_excel("testfile.xlsx")
print(df)

File example: testdata.xlsx

Identifier   Dates
123456       1/1/2021
789101       2/2/2021
221342       3/3/2021
231344       1/1/2021
134562       2/2/2021
135650       2/2/2021
135677       2/2/2021
2246         1/1/2021
24682        3/3/2021
245684       1/1/2021

Output data wanted (merge the data corresponding to a certain date):

2/2/2021   789101 134562 135650 135677  
1/1/2021   245684   2246 231344
3/3/2021   24682  221342
  • You want to groupby. Commented Oct 8, 2021 at 16:43
  • Do you want separate columns for each of the fields? (If so, this is a pivot.) Commented Oct 8, 2021 at 16:44
  • No, I want to put all the data with the same date on one line. For example, the date 2/2/2021 has multiple Identifiers; I want all the identifiers for 2/2/2021 on one line, and so on. Commented Oct 8, 2021 at 18:06

1 Answer

Does this solve your problem?

df.groupby(['Dates'])['Identifier'].apply(list)
Dates
1/1/2021      [123456, 231344, 2246, 245684]
2/2/2021    [789101, 134562, 135650, 135677]
3/3/2021                     [221342, 24682]
Name: Identifier, dtype: object
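
To check this without the Excel file, here is a minimal sketch that rebuilds the sample rows from the question in memory. It assumes the columns are named exactly Identifier and Dates and that the dates are plain strings; `merged` is just an illustrative name.

```python
import pandas as pd

# Sample rows copied from the question; dates kept as plain strings (assumption).
df = pd.DataFrame({
    "Identifier": [123456, 789101, 221342, 231344, 134562,
                   135650, 135677, 2246, 24682, 245684],
    "Dates": ["1/1/2021", "2/2/2021", "3/3/2021", "1/1/2021", "2/2/2021",
              "2/2/2021", "2/2/2021", "1/1/2021", "3/3/2021", "1/1/2021"],
})

# Collect the identifiers for each date into a list, one row per date.
# reset_index() turns the grouped Series back into a two-column DataFrame.
merged = df.groupby("Dates")["Identifier"].agg(list).reset_index()
print(merged)
```

If this runs on the sample rows but not on the real file, the difference is probably in how the file was read (column names or the dtype of Dates) rather than in the groupby itself.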

If you don't want a list but a space-separated string, as your question indicates, then try this:

df.astype({'Identifier':str}).groupby(['Dates'])['Identifier'].apply(' '.join)
Dates
1/1/2021      123456 231344 2246 245684
2/2/2021    789101 134562 135650 135677
3/3/2021                   221342 24682
Name: Identifier, dtype: object
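
If the real file behaves differently from the sample, two possible causes (both guesses, since the question does not show the actual error) are header names with stray spaces and a Dates column that pd.read_excel has already parsed into timestamps. The sketch below tidies both before grouping. The `raw` frame is a stand-in for the read_excel result and `join_ids_by_date` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Stand-in for what pd.read_excel might return (assumption): padded header
# names and a Dates column already parsed to datetime64.
raw = pd.DataFrame({
    "Identifier ": [123456, 789101, 221342, 231344],
    " Dates": pd.to_datetime(
        ["2021-01-01", "2021-02-02", "2021-03-03", "2021-01-01"]
    ),
})


def join_ids_by_date(frame: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: tidy the frame, then join identifiers per date."""
    # Strip whitespace from the header names so the column lookups succeed.
    tidy = frame.rename(columns=lambda c: str(c).strip())
    # Parse (or re-parse) the dates and drop any time-of-day component.
    tidy["Dates"] = pd.to_datetime(tidy["Dates"]).dt.normalize()
    # Identifiers must be strings before they can be joined with spaces.
    ids = tidy["Identifier"].astype(str)
    return ids.groupby(tidy["Dates"]).agg(" ".join)


result = join_ids_by_date(raw)
print(result)
```

Printing df.columns and df.dtypes right after reading the file is a quick way to see whether either assumption applies to your data.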

4 Comments

I want to put all the data with the same date on one line. For example, the date 2/2/2021 has multiple Identifiers; I want all the identifiers for 2/2/2021 on one line, and so on.
Did you get a chance to try out the approach above? I believe it solves what you mention in your comment.
I did, but it didn't work.
Could you elaborate on why it didn't work? What output are you getting? Did you try running it with the dummy data that you posted yourself?