I have two pandas DataFrames. The first has 3401 rows and 1 column; the second has 4 rows and 3 columns.

But what I get is (example output of my script):

 DataFrame1      |     DataFrame2 

 - email1        |     -Id1  -Project1 -Descr1
 - email2        |     -Id2  -Project2 -Descr2
 - email3        |     -Id3  -Project3 -Descr3
 - email4        |     -Id4  -Project4 -Descr4
 - email5        |     -None -None     -None 
  ... ....       |      ... ...   
 - email3401     |     -None -None     -None

What I want, for every email, is something like this:

 - email1, Id1, Project1, Descr1, Id2, Project2, ... , Id4, Project4, Descr4
 - email2, Id1, Project1, Descr1, Id2, Project2, ... , Id4, Project4, Descr4
 ... ...
 - email3401, Id1, Project1, Descr1, Id2, Project2, ... , Id4, Project4, Descr4

Thanks for any advice!

Here is my code:

     import glob
     import os

     import pandas as pd

     path = r"/Users/kd/path"
     allFiles = glob.glob(path + "/*.csv")
     frame = pd.DataFrame()
     file_names = []
     j = 0
     for file_ in allFiles:
         # File name without extension; its last character is the frame number.
         name = os.path.splitext(file_)[0]
         i = int(name[-1])
         file_names.append(name)
         df = pd.read_csv(file_, index_col=None, header=0)
         key = "self.dfInternautes%s" % i
         if j > 0:
             # Place the new columns side by side (rows are aligned on the index).
             globals()[key] = pd.concat([globals()[key], df], axis=1)
         else:
             globals()[key] = df
         j += 1
  • So you want all rows to be identical (Id1, Project1, Descr1, Id2, Project2, ... , Id4, Project4, Descr4) except for the first column (mail1, mail2, ...)? Commented Jun 16, 2016 at 10:49
  • @IanS Yes, that's exactly what I want! Commented Jun 16, 2016 at 10:57

1 Answer

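To flatten a DataFrame into a single row, use stack(). Then iterate over the stacked Series and add each value as a new column of the first DataFrame; assigning a single value to a column repeats it on every row.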

>>> df1
        0
0  email1
1  email2
2  email3
3  email4
4  email5
5  email6
>>> df2
     0         1       2
0  Id1  Project1  Descr1
1  Id2  Project2  Descr2
2  Id3  Project3  Descr3
3  Id4  Project4  Descr4
>>> st = df2.stack()
>>> st
0  0         Id1
   1    Project1
   2      Descr1
1  0         Id2
   1    Project2
   2      Descr2
2  0         Id3
   1    Project3
   2      Descr3
3  0         Id4
   1    Project4
   2      Descr4
dtype: object
>>> df = df1.copy()
>>> for i in st.index: df[i] = st[i]
... 
>>> df
        0 (0, 0)    (0, 1)  (0, 2) (1, 0)    (1, 1)  (1, 2) (2, 0)    (2, 1)  \
0  email1    Id1  Project1  Descr1    Id2  Project2  Descr2    Id3  Project3   
1  email2    Id1  Project1  Descr1    Id2  Project2  Descr2    Id3  Project3   
2  email3    Id1  Project1  Descr1    Id2  Project2  Descr2    Id3  Project3   
3  email4    Id1  Project1  Descr1    Id2  Project2  Descr2    Id3  Project3   
4  email5    Id1  Project1  Descr1    Id2  Project2  Descr2    Id3  Project3   
5  email6    Id1  Project1  Descr1    Id2  Project2  Descr2    Id3  Project3   

   (2, 2) (3, 0)    (3, 1)  (3, 2)  
0  Descr3    Id4  Project4  Descr4  
1  Descr3    Id4  Project4  Descr4  
2  Descr3    Id4  Project4  Descr4  
3  Descr3    Id4  Project4  Descr4  
4  Descr3    Id4  Project4  Descr4  
5  Descr3    Id4  Project4  Descr4  

Optionally, rename the columns:

df.columns = ['email', 'Id1', 'Project1', 'Descr1', 'Id2', 'Project2', 'Descr2', 'Id3', 'Project3', 'Descr3', 'Id4', 'Project4', 'Descr4']
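
For reference, the steps above can be put together as one self-contained script. This is only a sketch: the sample data mirrors the session above, and the column names "email", "Id", "Project" and "Descr" are assumptions chosen for illustration, not names taken from the asker's CSV files. Here the flat column names are generated from the stacked index instead of being typed by hand.

```python
import pandas as pd

# Sample data shaped like the frames above (column names are assumed).
df1 = pd.DataFrame({"email": [f"email{i}" for i in range(1, 7)]})
df2 = pd.DataFrame(
    [[f"Id{i}", f"Project{i}", f"Descr{i}"] for i in range(1, 5)],
    columns=["Id", "Project", "Descr"],
)

# Flatten df2 row by row into one Series indexed by (row, column) pairs.
st = df2.stack()

# Build flat names such as "Id1", "Project1", "Descr1", "Id2", ...
names = [f"{col}{row + 1}" for row, col in st.index]

# Assigning a scalar to a new column repeats it on every row of the frame.
result = df1.copy()
for name, value in zip(names, st.to_numpy()):
    result[name] = value

print(result.shape)
```

With the real data, df1 and df2 would come from pd.read_csv instead of being built inline, and the loop would stay the same.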