I'm trying to query from a PostgreSQL database with ANSI drivers but for some queries it fails, giving me the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xfd in position 10: ordinal not in range(128)

Here is the function to set the connection and query:

import psycopg2
import pandas as pd
def query_cdk_database(query):
    conn = psycopg2.connect(host="some_host", port = xxx, 
                            database="xxx", user="xxxx",
                            password="xxx", client_encoding ='auto')
    cur = conn.cursor()
    cur.execute(query)
    dat = cur.fetchall()
    cur.close()
    conn.close()
    return dat

It works for the majority of queries, but some of them break. Here is one that always fails:

query = u"SELECT * FROM ed.VehicleSales_v"
x = query_cdk_database(query)

It returns the following error:

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-26-06ace3c63c62> in <module>
      1 query = u"SELECT * FROM ed.VehicleSales_v;"
----> 2 x = query_raw(query)

<ipython-input-20-6abf4dcf327f> in query_raw(query)
      7     cur = conn.cursor()
      8     cur.execute(query)
----> 9     dat = cur.fetchall()
     10     cur.close()
     11     conn.close()

UnicodeDecodeError: 'ascii' codec can't decode byte 0xfd in position 10: ordinal not in range(128)
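For readers unfamiliar with this error, here is a minimal sketch that reproduces the same message outside the database. The `sample` bytes are made up for illustration, and decoding as Latin-1 is only an assumption; the actual encoding of the stored data is not known from the traceback.

```python
# Hypothetical sample: ten ASCII bytes followed by the byte 0xfd from the traceback.
sample = b"abcdefghij\xfd"

try:
    sample.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError as exc:
    ascii_ok = False
    failing_position = exc.start  # index of the first byte that ASCII cannot represent
    print(exc)

# 0xfd is also not valid as a standalone byte in UTF-8.
try:
    sample.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Assumption: the data is Latin-1. Every byte value is valid in Latin-1,
# so this never raises, but the result is only correct if the guess is right.
as_latin1 = sample.decode("latin-1")
print(as_latin1)
```

The point is that the failure happens while decoding the returned bytes, not while sending the query, which is consistent with the traceback pointing at `fetchall()`.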

To solve this problem I have tried the following:

  • Changing the "client_encoding" connection parameter in psycopg2.connect to several different values.
  • Looping through all the columns one by one to detect which one raises the error, but individually none of them does.
  • Converting the query string to unicode or to other codecs.
  • Downloading the raw data manually from pgAdmin and reading it with pandas, which works but emits warnings such as "DtypeWarning: Columns (4,55,70,153) have mixed types" (this could be a separate question).

1 Answer

I solved the problem by using the pyodbc package instead of psycopg2. Here is the connection string:

import pyodbc
import pandas as pd

conn_str = (
    "DRIVER={PostgreSQL Unicode};"
    "DATABASE=adp_report;"
    "UID=db_name;"
    "PWD=password;"
    "SERVER=111.111.11.11;"
    "PORT=5432;"
    )

Note that "DRIVER={PostgreSQL Unicode};" must be written literally as shown; change the other arguments to match your setup. Here is a handy function that opens a connection with that string and runs a query against the database:

def query_db(query):
    conn = pyodbc.connect(conn_str)
    dat = pd.read_sql(query, conn)
    conn.close()
    return dat
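
If the connection details change between environments, the string can also be assembled from its parts. This is only a sketch: `build_conn_str` is a hypothetical helper, not part of pyodbc, and the default port and driver name are simply the values used above.

```python
def build_conn_str(database, uid, pwd, server, port=5432,
                   driver="PostgreSQL Unicode"):
    """Assemble an ODBC connection string in the same key order as above.

    Hypothetical helper for illustration; it does not escape special
    characters (such as ';' or '}') inside the values.
    """
    parts = {
        "DRIVER": "{" + driver + "}",  # the driver name goes inside braces
        "DATABASE": database,
        "UID": uid,
        "PWD": pwd,
        "SERVER": server,
        "PORT": port,
    }
    # Each key=value pair is terminated by a semicolon.
    return "".join(f"{key}={value};" for key, value in parts.items())


conn_str = build_conn_str("adp_report", "db_name", "password", "111.111.11.11")
print(conn_str)
```

The resulting `conn_str` can be passed to `pyodbc.connect` exactly like the hand-written one.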