11

Hi I'd like to get a table from a database, but include the field names so I can use them from column headings in e.g. Pandas where I don't necessarily know all the field names in advance

so if my database looks like

table test1

 a | b | c 
---+---+---
 1 | 2 | 3
 1 | 2 | 3
 1 | 2 | 3
 1 | 2 | 3
 1 | 2 | 3

How can I do a

import psycopg2 as pq
cn = pq.connect('dbname=mydb user=me')
cr = cn.cursor()
cr.execute('SELECT * FROM test1;')
tmp = cr.fetchall()
tmp

such that tmp shows

[('a','b','c'),(1,2,3),(1,2,3),(1,2,3),(1,2,3),(1,2,3)]

Thanks

3
  • 2
    python.org/dev/peps/pep-0249/#description Commented Jun 17, 2013 at 8:20
  • @IgnacioVazquez-Abrams Nice. I was going to suggest fetching from INFORMATION_SCHEMA, but that's much more flexible as it isn't limited to just tables. Commented Jun 17, 2013 at 8:24
  • Thanks @IgnacioVazquez-Abrams I'm not clear how to use that as an answer to my question. cr.description(test1) doesn't work. If you put it into an answer I'd be happy to mark it up. Commented Jun 17, 2013 at 8:43

6 Answers 6

17

If what you want is a dataframe with the data from the db table as its values and the dataframe column names being the field names you read in from the db, then this should do what you want:

import psycopg2 as pq
cn = pq.connect('dbname=mydb user=me')
cr = cn.cursor()
cr.execute('SELECT * FROM test1;')
tmp = cr.fetchall()

# Extract the column names
col_names = []
for elt in cr.description:
    col_names.append(elt[0])

# Create the dataframe, passing in the list of col_names extracted from the description
df = pd.DataFrame(tmp, columns=col_names)
Sign up to request clarification or add additional context in comments.

Comments

13

The column names are available as cr.description[0][0], cr.description[1][0], etc. If you want it in exactly the format you show, you need to do some work to extract it and stick it in front of the result set.

Comments

7

You could also map over it which looks a bit nicer:

cursor.execute(open("blah.sql", "r").read())
data = cursor.fetchall()
cols = list(map(lambda x: x[0], cursor.description))
df = DataFrame(data, columns=cols)

Comments

6

You can use two loop cases, to not use pandas:

temp = []
for x in result:
    temp2 = {}
    c = 0
    for col in cursor.description:
        temp2.update({str(col[0]): x[c]})
        c = c+1
    temp.append(temp2)
print(temp)

This will prints any like this:

[{'column1':'foo1','column2':'foo1'},{'column1':'foo2','column2':'foo2'},...]

I hope this help you! Cheers

1 Comment

I think this is the best answer - simple and actually works.
0
import psycopg2 as pq
cn = pq.connect('dbname=mydb user=me')
cr = cn.cursor()
cr.execute('SELECT * FROM test1;')
tmp = cr.fetchall() #Hi, these are your codes that build a connection to a psql server

cols = []
for col in tmp.description:
    cols.append(col[0]) #Collect all column names into an empty list, cols    
tmp.insert(0, tuple(cols)) #insert elements by list.insert(index, new_item) method

The output is

[('a','b','c'),(1,2,3),(1,2,3),(1,2,3),(1,2,3),(1,2,3)]

Comments

0

Crossposting this from my answer to a nearly identical question.

You shouldn't have to assign column names in a separate statement nor do any looping.

import psycopg2, pandas

con=psycopg2.connect(
    dbname=DBNAME, 
    host=HOST, 
    port=PORT, 
    user=USER, 
    password=PASSWORD
)

sql = """
select * from x
"""

d = pandas.read_sql_query(sql,con)

con.close()

print(type(d))

print(pandas.DataFrame.head(d))

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.