
I have a dataframe with information about all employees of a given company. Every employee should have an ID and a corresponding manager ID.

Example:
import pandas as pd

data = pd.DataFrame({'Parent':['a','a','b','c','c','f','q','z','k'],
                     'Child':['b','c','d','f','g','h','k','q','w']})

a
├── b
│   └── d
└── c
    ├── f
    │   └── h
    └── g
z
└── q
    └── k
        └── w

(example: w reports to k and k reports to q and q reports to z)

I would like to get a new dataframe that lists, for every employee, the chain of managers above them, as follows:

 child  level1  level2  level x
   a      a        -        -
   b      a        -        -
   d      a        b        -
   c      a        -        -
   f      a        c        -
   h      a        c        f
   g      a        c        -
   z      z        -        -
   q      z        -        -
   k      z        q        -
   w      z        q        k

I do not know upfront how many levels there are, which is why I have used 'level x'. I guess I somehow need a recursive pattern to iterate over the dataframe.
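
The "recursive pattern" mentioned above can be pictured as a loop that follows each employee's manager link upward until it reaches someone with no manager. The following is only a minimal sketch of that idea in plain Python, not part of the original post: the function name chain_to_top and the parent_of dictionary are hypothetical, and it assumes every employee has at most one manager.

```python
# Hypothetical helper: walk from one employee up to the top-level manager.
# Assumes each child appears at most once, i.e. has a single manager.
def chain_to_top(person, parent_of):
    """Return the managers above `person`, top-level manager first."""
    chain = []
    seen = {person}
    while person in parent_of:
        person = parent_of[person]
        if person in seen:  # guard against a cyclic reporting structure
            raise ValueError('cycle in reporting structure')
        seen.add(person)
        chain.append(person)
    return chain[::-1]  # collected bottom-up, so reverse it

# (parent, child) pairs taken from the example dataframe
pairs = [('a', 'b'), ('a', 'c'), ('b', 'd'), ('c', 'f'), ('c', 'g'),
         ('f', 'h'), ('q', 'k'), ('z', 'q'), ('k', 'w')]
parent_of = {child: parent for parent, child in pairs}

print(chain_to_top('w', parent_of))  # ['z', 'q', 'k']
print(chain_to_top('a', parent_of))  # [] because a reports to no one
```

Because the loop stops as soon as a person has no entry in parent_of, the number of levels does not need to be known in advance.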

  • Could you make this question more specific? On the left you want to have the child and then you always want to start from top level inheritance to lowest level inheritance? Commented Apr 4, 2023 at 15:45
  • Please provide enough code so others can better understand or reproduce the problem. Commented Apr 4, 2023 at 16:41
  • Are you still looking for help with this question? Commented Apr 5, 2023 at 16:29
  • The bottom part of your expected output doesn't look right. 'z' isn't a parent. Commented Apr 7, 2023 at 4:15
  • The second part depicts the reporting structure from high level (left) to lower level (right). z (as well as a) are the two top level managers reporting to no-one. Commented Apr 7, 2023 at 14:54

1 Answer


I'm posting this code in the hope that someone other than the OP finds it useful, since the OP seems to have lost interest.

Note that the output does NOT exactly meet the OP's requirements.

import pandas as pd

def get_manager(row, column, data):
    manager_ids = data.index[data['Child'] == row[column]].tolist()
    return data['Parent'][manager_ids[0]] if manager_ids else '-'

data = pd.DataFrame({'Parent': ['a','a','b','c','c','f','q','z','k'],
                     'Child':  ['b','c','d','f','g','h','k','q','w']})
staff = sorted(set(list(data['Parent']) + list(data['Child'])))
df = pd.DataFrame(staff, columns=[0])  # we start with all staff in first column
for i in range(len(staff)):  # can't have more than len(staff) columns
    df[i+1] = df.apply(lambda row: get_manager(row, i, data), axis=1)
    if sum(df[i+1].str.count('-')) == len(staff):
        break  # when no higher level managers
print(df)  # we could stop here but the OP wants the order reversed.
for index, row in df.iterrows():
    row = list(row)
    row.reverse()  # We want the top managers first
    i = len(row) - 1 - row[::-1].index('-')  # index of last '-'
    row = row[i+1:] + row[:i]  # we rotate the -'s to the end and drop the 1st col.
    print('  '.join(row))

Output:

    0  1  2  3  4
0   a  -  -  -  -
1   b  a  -  -  -
2   c  a  -  -  -
3   d  b  a  -  -
4   f  c  a  -  -
5   g  c  a  -  -
6   h  f  c  a  -
7   k  q  z  -  -
8   q  z  -  -  -
9   w  k  q  z  -
10  z  -  -  -  -
a  -  -  -
a  b  -  -
a  c  -  -
a  b  d  -
a  c  f  -
a  c  g  -
a  c  f  h
z  q  k  -
z  q  -  -
z  q  k  w
z  -  -  -
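
If a dataframe with named level columns is wanted rather than printed rows, the same climb can be written with Series.map. This is only a sketch under stated assumptions, not the answer author's method: the function name top_down_levels and the fill parameter are hypothetical, each child is assumed to have exactly one manager (map needs a unique index), and the hierarchy is assumed to be free of cycles. Unlike the table in the question, top-level managers get only '-' here instead of listing themselves.

```python
import pandas as pd

def top_down_levels(data, fill='-'):
    """Hypothetical helper: one row per person, managers listed top-down."""
    # Child -> Parent lookup; assumes every child appears exactly once.
    parent_of = data.set_index('Child')['Parent']
    staff = sorted(set(data['Parent']) | set(data['Child']))
    current = pd.Series(staff, index=staff)
    chains = {name: [] for name in staff}

    # Each pass climbs one level; at most len(staff) passes are needed.
    for _ in range(len(staff)):
        current = current.map(parent_of)  # NaN where no manager exists
        known = current.dropna()
        if known.empty:
            break
        for name, manager in known.items():
            chains[name].insert(0, manager)  # prepend: top manager first

    depth = max((len(chain) for chain in chains.values()), default=0)
    rows = [[name] + chain + [fill] * (depth - len(chain))
            for name, chain in chains.items()]
    columns = ['child'] + [f'level{i + 1}' for i in range(depth)]
    return pd.DataFrame(rows, columns=columns)

data = pd.DataFrame({'Parent': ['a','a','b','c','c','f','q','z','k'],
                     'Child':  ['b','c','d','f','g','h','k','q','w']})
print(top_down_levels(data).to_string(index=False))
```

The number of level columns comes from the longest chain found, so it adapts to however deep the reporting structure is.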

1 Comment

Doesn't meet the OP's exact requirements but does almost exactly what I want it to do 2 years later ....
