How to check if a CSV has a header using Python?

Question

I have a CSV file and I want to check whether the first row contains only strings (i.e. whether it is a header). I'm trying to avoid extras like pandas. I'm thinking of using an if statement along the lines of: if row[0] is a string, print "this is a CSV". But I don't really know how to do that. Any suggestions?

Thanks for your suggestions everyone, I think I've found a way to do it.
– plshelp, Commented Oct 22, 2016 at 15:32

ChrisD · Accepted Answer · 2016-10-22 14:51:57Z

11

Python has a built-in csv module that could help, e.g.:

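import csv

# Python 3: open in text mode with newline=''.
# (The original used 'rb', which only works on Python 2.)
with open('example.csv', newline='') as csvfile:
    sniffer = csv.Sniffer()
    # Guess from a sample of the file whether the first row is a header
    has_header = sniffer.has_header(csvfile.read(2048))
    # Rewind so the file can be parsed from the start
    csvfile.seek(0)
    # ...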

answered Oct 22, 2016 at 14:51

ChrisD


5 Comments

azhar22k Over a year ago

Thanks. It worked well for me. But can you please explain why you passed 2048 rather than some other number?

ChrisD Over a year ago

@AzharKhan 2048 is an entirely arbitrary number. It just needs to be big enough to read in at least two or three CSV rows. You could instead read in a few lines to a string and pass that to has_header.

Inês Martins Over a year ago

sniffer.has_header always returns True... I tested several CSV files... :/

Usman Shabbir Over a year ago

It's working but for a large file, it takes too much time.

Mitch Haile Over a year ago

In Python 3, you might need to change 'rb' to newline='' (Python 3 is a lot more specific about bytes vs. strings, but changing to 'r' may assume a newline delimiter).
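
Following the comments above, a Python 3 variant might pass a handful of lines to has_header instead of a fixed number of bytes. This is only a sketch: sniff_header and n_lines are names made up for this example, the io.StringIO objects stand in for files opened with newline='', and has_header remains a heuristic that can misjudge some files.

```python
import csv
import io
from itertools import islice


def sniff_header(lines, n_lines=20):
    """Guess whether the first row is a header, using only the first n_lines lines.

    `lines` is any iterable of text lines, e.g. a file opened with newline=''.
    """
    sample = ''.join(islice(lines, n_lines))
    return csv.Sniffer().has_header(sample)


# In-memory stand-ins for real files, so the example runs on its own.
with_header = io.StringIO("name,age\nalice,30\nbob,25\ncarol,41\n")
without_header = io.StringIO("1,2\n3,4\n5,6\n7,8\n")

print(sniff_header(with_header))
print(sniff_header(without_header))
```

Because only the first few lines are read, the cost does not grow with the size of the file.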

pietz · Accepted Answer · 2019-11-18 10:19:49Z

4

Here is a function I use with pandas in order to analyze whether header should be set to 'infer' or None:

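import pandas as pd

def identify_header(path, n=5, th=0.9):
    # Read the same small sample twice: once with a header row, once without
    df1 = pd.read_csv(path, header='infer', nrows=n)
    df2 = pd.read_csv(path, header=None, nrows=n)
    # Fraction of columns whose dtype is the same in both reads
    sim = (df1.dtypes.values == df2.dtypes.values).mean()
    return 'infer' if sim < th else None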

Based on a small sample, the function checks the similarity of dtypes with and without a header row. If the dtypes match for a certain percentage of columns, it is assumed that there is no header present. I found a threshold of 0.9 to work well for my use cases. This function is also fairly fast as it only reads a small sample of the csv file.

answered Nov 18, 2019 at 10:19

pietz


2 Comments

greendino Over a year ago

If the CSV files are big, this could be a problem.

pietz Over a year ago

@FoggyMindedGreenhorn Why? We don't read the entire file here.

Joe Bashe · Accepted Answer · 2016-10-22 14:47:34Z

3

I'd do something like this:

is_header = not any(cell.isdigit() for cell in csv_table[0])

Given a CSV table csv_table, grab the top (zeroth) row. Iterate through the cells and check if they contain any pure digit strings. If so, it's not a header. Negate that with a not in front of the whole expression.

Results:

In [1]: not any(cell.isdigit() for cell in ['2','1'])
Out[1]: False

In [2]: not any(cell.isdigit() for cell in ['2','gravy'])
Out[2]: False

In [3]: not any(cell.isdigit() for cell in ['gravy','gravy'])
Out[3]: True
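
One limitation of str.isdigit() is that it returns False for values such as '-2' or '3.14', so a first row of decimals would be mistaken for a header. Below is a minimal sketch of a variant that uses float() instead; the helper names is_number and first_row_is_header are made up for this example.

```python
def is_number(cell):
    """Return True if the cell parses as a number (integer, decimal, negative, exponent)."""
    try:
        float(cell)
    except ValueError:
        return False
    return True


def first_row_is_header(first_row):
    """Treat the row as a header only if it is non-empty and no cell is numeric."""
    return bool(first_row) and not any(is_number(cell) for cell in first_row)


print(first_row_is_header(['2', '1']))          # False
print(first_row_is_header(['-2.5', 'gravy']))   # False
print(first_row_is_header(['gravy', 'gravy']))  # True
```

Note that float() also accepts strings such as 'nan' and 'inf', so a header cell with one of those names would count as numeric here.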

answered Oct 22, 2016 at 14:47

Joe Bashe


Frankthetank · Accepted Answer · 2023-02-24 21:15:28Z

2

For files that are not necessarily in '.csv' format, this is very useful:

built-in function in Python to check Header in a Text file

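def check_header(filename):
    with open(filename) as f:
        first = f.read(1)
        # A first character that could start a number suggests data, not a header.
        # An empty file yields '' here, which also returns False.
        return first not in '.-0123456789'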

Answer by: https://stackoverflow.com/users/908494/abarnert

Post link: https://stackoverflow.com/a/15671103/7763184

edited Feb 24, 2023 at 21:15

answered Mar 10, 2021 at 6:19

Frankthetank


John · Accepted Answer · 2020-03-11 13:45:41Z

I faced exactly the same problem with sniffer.has_header returning the wrong result, and even wrote a very simple checker that worked in my case:

    has_header = ''.join(next(some_csv_reader)).isalpha()

I knew it wasn't perfect, but it seemed to work: join the cells of the first row and check whether the result is purely alphabetic. Then I put it into my own method and it failed. :( That's when I saw the "light".
The trouble was not with has_header but with my code: I also wanted to detect the delimiter before parsing the actual .csv, and all that sniffing has a "cost", because every read advances the position in the file.
So for has_header to work as it should, make sure you have reset the file position before using it. In my case the method is:

import csv

def _get_data(self, filename):
    sniffer = csv.Sniffer()
    with open(filename, 'rt', newline='') as csvfile:
        # Detect the delimiter from a sample, then rewind
        dialect = sniffer.sniff(csvfile.read(2048))
        csvfile.seek(0)
        # Check for a header on the same sample, then rewind again
        has_header = sniffer.has_header(csvfile.read(2048))
        #has_header = ''.join(next(training_data)).isalpha()
        csvfile.seek(0)
        # Only now create the reader, starting from the top of the file
        training_data = csv.reader(csvfile, delimiter=dialect.delimiter)
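
The effect described above can be reproduced in isolation. This is only an illustrative sketch; the io.StringIO object stands in for an open file and the sample data is made up.

```python
import csv
import io

f = io.StringIO("name,age\nalice,30\nbob,25\n")

# Sniffing reads from the file object and moves its position forward.
csv.Sniffer().sniff(f.read(2048))
rows_without_rewind = list(csv.reader(f))  # nothing left to read

# After rewinding, the reader sees the file from the top again.
f.seek(0)
rows_after_rewind = list(csv.reader(f))

print(rows_without_rewind)  # []
print(rows_after_rewind)    # [['name', 'age'], ['alice', '30'], ['bob', '25']]
```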

Abhijit · Accepted Answer · 2021-01-10 09:38:28Z

0

I think the best way to check this is simply to read the first line of the file and compare it against the header string you expect, rather than relying on any library.
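
A minimal sketch of that idea, assuming you already know which column names to expect. EXPECTED_COLUMNS and has_expected_header are hypothetical names, and the csv module is used here only to split the first line into cells.

```python
import csv
import io

EXPECTED_COLUMNS = ["name", "age"]  # hypothetical: replace with your own column names


def has_expected_header(lines, expected=EXPECTED_COLUMNS):
    """Return True if the first row matches the expected column names.

    `lines` is any iterable of text lines, e.g. a file opened with newline=''.
    The comparison ignores case and surrounding whitespace.
    """
    first_row = next(csv.reader(lines), None)  # None for an empty file
    if first_row is None:
        return False
    normalized = [cell.strip().lower() for cell in first_row]
    return normalized == [name.lower() for name in expected]


print(has_expected_header(io.StringIO("Name, Age\nalice,30\n")))  # True
print(has_expected_header(io.StringIO("alice,30\nbob,25\n")))     # False
```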

answered Jan 10, 2021 at 9:38

Abhijit


daniel lugo · Accepted Answer · 2021-07-18 05:36:26Z

0

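Simply use try and except:

import pandas as pd

try:
    data = pd.read_csv('file.csv', encoding='ISO-8859-1')
    print('csv file has header::::::')
except Exception:
    # Caveat: read_csv does not raise merely because a header is missing
    # (it uses the first row as column names), so this branch only runs
    # when the file cannot be read or parsed.
    print('csv file has no header::::::')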

answered Jul 18, 2021 at 5:36

daniel lugo


Freddy Mcloughlan · Accepted Answer · 2022-04-18 07:21:15Z

0

An updated version of ChrisD's answer with fallback for empty files:

import csv

# filename is the path to your CSV file
with open(filename, "r", newline="") as f:
    try:
        has_headings = csv.Sniffer().has_header(f.read(1024))
    except csv.Error:
        # The file seems to be empty
        has_headings = False

https://docs.python.org/3/library/csv.html#csv.Sniffer.has_header

answered Apr 18, 2022 at 7:21

Freddy Mcloughlan
