6

I have a csv data file with a header indicating the column names.

xy   wz  hi kq
0    10  5  6
1    2   4  7
2    5   2  6

I run:

X = np.array(pd.read_csv('gbk_X_1.csv').values)

I want to get the column names:

['xy', 'wz', 'hi', 'kq']

I read this post but the solution provides me with None.

4
  • np.genfromtxt() and names=True option might help. See stackoverflow.com/questions/12336234/… Commented Dec 1, 2017 at 7:41
  • I think you need pd.read_csv('gbk_X_1.csv').columns.tolist() Commented Dec 1, 2017 at 7:46
  • Is your problem getting the structured array or getting the names out of the structured array? If the latter: list(x.dtype.fields). Commented Dec 1, 2017 at 8:05
  • Yes, it is also possible to use: X = np.genfromtxt('gbk_X_1.csv', dtype=float, delimiter=',', names=True); print(X.dtype.names) Commented Dec 1, 2017 at 9:33

2 Answers

4

Use the following code:

import re

# Read only the header line; the file is closed automatically
with open('f.csv', 'r') as f:
    header = f.readline()

columns = re.sub(' +', ' ', header)   # collapse runs of spaces into a single space
columns = columns.strip().split(' ')  # split on that single space

print(columns)

This assumes the file is space-delimited, like this:

xy   wz  hi kq
0    10  5  6
1    2   4  7
2    5   2  6
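
As a possible simplification, str.split() called with no argument already splits on any run of whitespace, so the regular expression may not be needed. The sketch below uses an in-memory io.StringIO object (the name sample is hypothetical) as a stand-in for the space-delimited file above:

```python
import io

# Hypothetical in-memory stand-in for the space-delimited file shown above
sample = io.StringIO(
    "xy   wz  hi kq\n"
    "0    10  5  6\n"
    "1    2   4  7\n"
    "2    5   2  6\n"
)

# With no argument, split() treats any run of whitespace as one separator
# and ignores the trailing newline
columns = sample.readline().split()

print(columns)
```

With a real file, the same readline().split() call would apply to the object returned by open().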

4

Let's assume your csv file looks like

xy,wz,hi,kq
0,10,5,6
1,2,4,7
2,5,2,6

Then use pd.read_csv to load the file into a DataFrame

import pandas as pd

df = pd.read_csv('gbk_X_1.csv')

The dataframe now looks like

df

   xy  wz  hi  kq
0   0  10   5   6
1   1   2   4   7
2   2   5   2   6

Its three main components are the

  • data which you can access via the values attribute

    df.values
    
    array([[ 0, 10,  5,  6],
           [ 1,  2,  4,  7],
           [ 2,  5,  2,  6]])
    
  • index which you can access via the index attribute

    df.index
    
    RangeIndex(start=0, stop=3, step=1)
    
  • columns which you can access via the columns attribute

    df.columns
    
    Index(['xy', 'wz', 'hi', 'kq'], dtype='object')
    

If you want the columns as a list, use the tolist method

df.columns.tolist()

['xy', 'wz', 'hi', 'kq']
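
To tie this back to the question, here is a minimal sketch that gets both the NumPy array and the column names from a single read. It assumes the comma-separated layout above and uses an in-memory string (csv_text, a hypothetical name) in place of gbk_X_1.csv. It also shows nrows=0, which should parse only the header when the data itself is not needed:

```python
import io

import pandas as pd

# Hypothetical in-memory copy of the comma-separated file shown above
csv_text = "xy,wz,hi,kq\n0,10,5,6\n1,2,4,7\n2,5,2,6\n"

# One read gives both the data and the header
df = pd.read_csv(io.StringIO(csv_text))
X = df.values                 # 2-D NumPy array of the data
names = df.columns.tolist()   # column names as a plain list

# If only the names are needed, nrows=0 skips the data rows
names_only = pd.read_csv(io.StringIO(csv_text), nrows=0).columns.tolist()

print(X.shape)
print(names)
print(names_only)
```

If the file is whitespace-delimited, as in the question, passing sep=r'\s+' to pd.read_csv should parse it the same way.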
