I have a CSV file called mrh.csv whose first two rows form the header:

Name,Height,Age
"",Metres,""
A,-1,25
B,95,-1

I am using the following code to read it into a DataFrame:

import pandas as pd
pd.read_csv('mrh.csv', header=[0,1], na_values=[-1,''])

This results in a DataFrame with the following contents:

    Name                Height  Age
    Unnamed: 0_level_1  Metres  Unnamed: 2_level_1

0   A                   NaN     25.0
1   B                   95.0    NaN

Using the na_values parameter of read_csv I can flag the data values recorded as -1 in the file as missing, but missing values in the second header row, written as "" (I also tried -1), are displayed as Unnamed: x_level_y (or as -1 if that is used instead).

Is there a way to not display the missing values - to remove the Unnamed: x_level_y or substitute it with a meaningful value?

Desired output 1:

    Name  Height  Age
          Metres    

0   A     NaN     25.0
1   B     95.0    NaN

Desired output 2:

    Name  Height  Age
    -     Metres  - 

0   A     NaN     25.0
1   B     95.0    NaN
  • What do you mean by a meaningful value? Can you show the output you want to get? Commented Jan 2, 2018 at 11:25
  • @Dark I have updated the question with the desired output. Commented Jan 2, 2018 at 12:13

3 Answers

1

You can create a new MultiIndex and assign it to the columns:

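import pandas as pd

df = pd.read_csv('mrh.csv', header=[0, 1], na_values=[-1, ''])

# Keep level 0 as is; blank out the auto-generated "Unnamed: ..." labels in level 1
a = df.columns.get_level_values(level=0)
b = df.columns.get_level_values(level=1).str.replace('^Unnamed:.*', '', regex=True)
df.columns = [a, b]
print(df)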
  Name Height   Age
       Metres      
0    A    NaN  25.0
1    B   95.0   NaN

Or:

# Same approach, substituting '-' as a placeholder instead of an empty string
a = df.columns.get_level_values(level=0)
b = df.columns.get_level_values(level=1).str.replace('^Unnamed:.*', '-', regex=True)
df.columns = [a, b]
print(df)
  Name Height   Age
     - Metres     -
0    A    NaN  25.0
1    B   95.0   NaN

3 Comments

This is almost the same as mine.
Hmm, are you angry? I hope not. If you want, you can add this solution to your answer and I will remove mine.
It's OK, let it stay. Mine still points to the bug that needs to be fixed.
1

I don't think it's possible using read_csv alone, but you can modify the column index after loading:

from io import StringIO

import pandas as pd

txt = '''Name,Height,Age
"",Metres,""
A,-1,25
B,95,-1'''

df = pd.read_csv(StringIO(txt), header=[0, 1], na_values=['-1', ''])

# The assignment is deliberately written twice (see the note after the output)
df.columns = df.columns.set_levels(df.columns.get_level_values(level=1).str.replace('^Unnamed:.*', '', regex=True), level=1)
df.columns = df.columns.set_levels(df.columns.get_level_values(level=1).str.replace('^Unnamed:.*', '', regex=True), level=1)

Output:

   Name Height   Age
        Metres      
0    A    NaN  25.0
1    B   95.0   NaN

To see why df.columns is assigned twice, you can check here. It's still mysterious.

Edit: set_levels is still buggy, so you can use the unique level values instead:

df.columns = df.columns.set_levels(df.columns.levels[1].str.replace('^Unnamed:.*', '', regex=True), level=1)
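
As a possible alternative that sidesteps set_levels entirely, here is a minimal sketch using DataFrame.rename with level=1. The helper name blank_unnamed and its placeholder argument are made up for illustration. One reason to consider it: recent pandas releases appear to require unique level values, so mapping several Unnamed labels to the same string through set_levels may raise a ValueError, whereas rename rebuilds the column labels one by one.

```python
from io import StringIO

import pandas as pd

txt = '''Name,Height,Age
"",Metres,""
A,-1,25
B,95,-1'''

df = pd.read_csv(StringIO(txt), header=[0, 1], na_values=[-1, ''])


def blank_unnamed(label, placeholder=''):
    # Hypothetical helper: swap pandas' auto-generated "Unnamed: ..." labels
    # for a placeholder and leave every other label untouched.
    return placeholder if str(label).startswith('Unnamed:') else label


# level=1 restricts the renaming to the second header row
df = df.rename(columns=blank_unnamed, level=1)
print(df)
```

To get the second desired output, the same call could be made with a small wrapper such as lambda c: blank_unnamed(c, '-').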

7 Comments

It looks like a bug; the last line should be df.columns = df.columns.set_levels(df.columns.get_level_values(level=1),level=1)
@jezrael You can check the link to the question I posted. Let me wait till the bug is fixed; I'm waiting for an answer to my question.
I'd really like to answer it, but I have no idea how ;)
But I think if your solution is buggy, it's better not to use it ;)
@jezrael How about we fix it? It is still a good function; it just needs a bug fix.
0
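import pandas as pd

# Note: this overwrites mrh.csv, replacing the empty header cells with "-"
pd.read_csv("mrh.csv").fillna("-").to_csv("mrh.csv", index=False)
# Re-read the rewritten file, this time treating both rows as the header
df1 = pd.read_csv("mrh.csv", header=[0, 1], na_values=[-1, ''])
df1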

output:

  Name Height   Age
     - Metres     -
0    A    NaN  25.0
1    B   95.0   NaN
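
If overwriting mrh.csv is undesirable, the same two-pass idea can be sketched with an in-memory buffer instead of the file. This is only an illustration under a few assumptions: the CSV content is inlined as csv_text so the snippet is self-contained, the names raw and buf are made up, and only the first data row (the second header row) is filled so that real missing data values are not turned into "-".

```python
from io import StringIO

import pandas as pd

csv_text = '''Name,Height,Age
"",Metres,""
A,-1,25
B,95,-1'''

# First pass: single header row, everything kept as text,
# so the second header row arrives as data row 0.
raw = pd.read_csv(StringIO(csv_text), dtype=str)

# Fill only that row's empty cells with the placeholder.
raw.iloc[0] = raw.iloc[0].fillna('-')

# Write the patched table to a buffer rather than back to disk.
buf = StringIO()
raw.to_csv(buf, index=False)
buf.seek(0)

# Second pass: parse the buffer with both header rows.
df1 = pd.read_csv(buf, header=[0, 1], na_values=[-1, ''])
print(df1)
```

With a real file, StringIO(csv_text) would be replaced by the path 'mrh.csv' in the first read_csv call, and the file on disk stays untouched.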

2 Comments

If possible I would like to avoid modifying the original file.
While this code snippet may be the solution, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion
