
I have a .txt file that contains text headers and numerical data. I am working with Python 2.7 and using pandas and NumPy. The file is structured as in the picture below:

enter image description here

The data for this file is available here. I want to get a list of all the tags in this file, including repeats. For the picture above, the list should look like this:

[Tag1, Tag1, Tag1, Tag5, Tag5, Tag6, Tag6]

At present, I am reading the file using:

df = pd.read_csv('dum.txt',sep='\t', header=[0,1], index_col=0)

When I try lst = df.columns.levels[1], I get Index([u'Tag1', u'Tag5', u'Tag6'], dtype='object', name=u'Tag') as my output instead of the list that I desire.

How can I get the full list of tags, i.e. [Tag1, Tag1, Tag1, Tag5, Tag5, Tag6, Tag6]? Thanks in advance.

1 Answer


You can use get_level_values(1) instead of levels[1], then convert to list using tolist():

>>> df.columns.get_level_values(1).tolist()
['Tag1', 'Tag1', 'Tag1', 'Tag5', 'Tag5', 'Tag6', 'Tag6']

The reason is that levels[1] gives you, as you saw, only the unique labels of that level, whereas get_level_values(1) returns one label per column. From the pandas documentation:

Return vector of label values for requested level, equal to the length of the index
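To see the difference side by side, here is a minimal sketch that builds a two-level column index by hand instead of reading dum.txt. The first-level labels ('A' to 'G') and the level name 'Signal' are made up for illustration; only the tag values and the level name 'Tag' come from the question.

```python
import pandas as pd

# Hypothetical first-level labels; the real ones come from the file's first header row.
signals = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
# Second-level labels, matching the tags in the question.
tags = ['Tag1', 'Tag1', 'Tag1', 'Tag5', 'Tag5', 'Tag6', 'Tag6']

# 'Signal' is an assumed level name; 'Tag' matches the name shown in the question's output.
columns = pd.MultiIndex.from_arrays([signals, tags], names=['Signal', 'Tag'])
df = pd.DataFrame([[0] * len(tags)], columns=columns)

# levels[1] holds each distinct label of the level once.
unique_tags = df.columns.levels[1].tolist()

# get_level_values(1) holds one label per column, repeats included.
all_tags = df.columns.get_level_values(1).tolist()

# The level can also be selected by name instead of position.
all_tags_by_name = df.columns.get_level_values('Tag').tolist()

print(unique_tags)
print(all_tags)
```

Selecting the level by name can be easier to read than a positional index when the header rows are labelled, as they appear to be here.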


2 Comments

Nice answer! :) I deleted. +1
Thanks sacul! Your answer was very helpful. :)
