I am new to Python modelling and am having trouble with a line of code that is probably very basic for many of you.

I am using Python 2.7 and have successfully used xlwings to copy a named range from an external workbook into a pd.DataFrame. Everything works fine except df.index and df.columns. Currently the code assigns the numbers 1 to n (based on the number of rows and columns) as the index and column names.

Is there a way to use the first column of my imported data as df.index and the first row as df.columns?

Can someone please help me get something like this:

 df = pd.DataFrame(myExcelRange, df.index = 'first column values', df.columns = 'first row values')

The shape and name of myExcelRange could be different each time.

Any guidance would be much appreciated.

Example:

> myExcelRange

ITEM    Dan Jane    Fan 
A   77  78  40
B   89  53  72  
C   20  19  79  
D   81  54  93  
E   77  76  99  

pandas is returning

    0   1   2   3
0   ITEM    Dan Jane    Fan
1   77  78  40  0
2   89  53  72  0
3   20  19  79  0
4   81  54  93  0
5   77  76  99  0

desired

ITEM    Dan Jane    Fan 
A   76  89  100 
B   59  72  24  
C   69  73  19  
D   70  92  43  
E   65  94  30  
  • What is type(myExcelRange)? Commented Jan 20, 2017 at 6:29
  • It's a 'list' type. Commented Jan 20, 2017 at 6:38
  • Can you add a sample like myExcelRange = ['a','b','c'] or myExcelRange = [['a','b','c'],['d','e','f']] and the desired output? Commented Jan 20, 2017 at 6:41
  • Or do you need to select the first value in the first column and index with df = pd.DataFrame(myExcelRange).iat[0,0] or df = pd.DataFrame(myExcelRange).iloc[0,0]? Or rename only the first index value and the first column value? Commented Jan 20, 2017 at 6:43
  • I have been playing with it and used df = df.set_index(0), which seems to use the first column values as the index. However, I still need to find some set_column kind of function. Commented Jan 20, 2017 at 6:44

1 Answer

You can call set_index with the first column, then select the first row with iloc and assign it to df.columns, and finally remove the first row from the data, also with iloc:

import pandas as pd

myExcelRange = [['a','b','c'],['d','e','f'],['g','h','i']]
df = pd.DataFrame(myExcelRange)
print (df)
   0  1  2
0  a  b  c
1  d  e  f
2  g  h  i

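# use the first column as the index
df = df.set_index(0)
# use the first row as the column labels
df.columns = df.iloc[0,:]
# for a nicer df, remove the index and column names
df.index.name = None
df.columns.name = None
# drop the first row, which now duplicates the column labels
df = df.iloc[1:,:]

print (df)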
   b  c
d  e  f
g  h  i
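
If the range always arrives as a list of rows, the same result can probably be reached without the intermediate frame by splitting off the header row before building the DataFrame. This is only a sketch: frame_from_range is a hypothetical helper name, and it assumes the first row holds the column labels and the first column holds the row labels, as in the question's sample.

```python
import pandas as pd

def frame_from_range(rows):
    """Hypothetical helper: build a DataFrame from a list of rows.

    Assumes rows[0] holds the column labels and the first cell of
    every following row holds that row's label.
    """
    header = list(rows[0])
    body = [list(r) for r in rows[1:]]
    df = pd.DataFrame(body, columns=header)
    # The top-left cell names the first column; use that column as the index.
    return df.set_index(header[0])

# Sample shaped like the question's range (first three data rows only).
my_excel_range = [
    ['ITEM', 'Dan', 'Jane', 'Fan'],
    ['A', 77, 78, 40],
    ['B', 89, 53, 72],
    ['C', 20, 19, 79],
]
df = frame_from_range(my_excel_range)
print(df)
```

Because nothing here depends on the number of rows or columns, it should cope with a range whose shape changes between runs. The index keeps the name of the top-left cell (ITEM here); set df.index.name = None if you would rather not show it.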

As Alex Fung mentioned, it may also be possible to read the workbook directly with read_excel and its index_col parameter:

df = pd.read_excel('file.xlsx', index_col=0)
print (df)
      Dan  Jane  Fan
ITEM                
A      77    78   40
B      89    53   72
C      20    19   79
D      81    54   93