Insert pandas df into local Microsoft SQL Server database table using df.to_sql

Created connection_url for sqlalchemy engine:

from sqlalchemy.engine import URL

connection_url = URL.create(
    "mssql+pyodbc",
    username="",
    password="",
    host="localhost",
    port=1433,
    database="priority",
    query={
        "driver": "ODBC Driver 17 for SQL Server",
        "authentication": "ActiveDirectoryIntegrated",
    },
)

Used connection_url to create engine:

import sqlalchemy

engine = sqlalchemy.create_engine(connection_url)

Attempted to insert df into SQL database table 'test1' with df.to_sql:

df.to_sql('test1', engine, if_exists='replace')

Received the following error:

OperationalError: (pyodbc.OperationalError) ('08001', '[08001] [Microsoft][ODBC Driver 17 for SQL Server]TCP Provider: No connection could be made because the target machine actively refused it.\r\n (10061) (SQLDriverConnect); [08001] [Microsoft][ODBC Driver 17 for SQL Server]Login timeout expired (0); [08001] [Microsoft][ODBC Driver 17 for SQL Server]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. (10061)')

What I've tried

  1. Opened the SQL Port 1433
  • Ran the following in the cmd (administrator mode)
netsh advfirewall firewall add rule name= "SQL Port" dir=in action=allow protocol=TCP localport=1433
  • Checked the port has been added
netsh firewall show state

Which gave the following:

Ports currently open on all network interfaces: Port Protocol Version Program 1433 TCP Any (null)

  2. Replaced 'localhost' with '127.0.0.1'

  3. Replaced 'localhost' with the hostname of SQL Server

  • I found the hostname of SQL Server by executing the following SQL query

SELECT HOST_NAME() AS HostName

Which returned the following (changed the name for security)

LAPTOP-NumDig

[Edit update] 4. Updated the ODBC driver to ODBC Driver 17 (https://learn.microsoft.com/en-us/sql/connect/odbc/download-odbc-driver-for-sql-server?view=sql-server-ver16).

[Edit update] 5. Replaced "authentication": "ActiveDirectoryIntegrated" in query with "Trusted_Connection": "yes"
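A note on the error itself: "actively refused" usually means nothing is listening on that host and port, which a firewall rule alone does not change. One common cause (an assumption here, not confirmed in this question) is that the TCP/IP protocol is disabled for the instance in SQL Server Configuration Manager, or that the instance listens on a different port. A minimal sketch for checking this from Python using only the standard library; the helper name port_is_listening is hypothetical:

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port taken from the connection_url in the question.
    print(port_is_listening("localhost", 1433))
```

If this prints False while the plain pyodbc connection still works, the working connection is probably not going over TCP port 1433 at all, which would be consistent with the error above.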

Prior connections have worked with pyodbc package

I have been able to connect to the SQL database 'priority' and created the table 'test1' using the pyodbc package in Python.

  • Connect to SQL
import pyodbc

conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=LAPTOP-NumDig;'
                      'Database=priority;'
                      'UID=;'  # username
                      'PWD=;'  # password
                      )
cursor = conn.cursor()
  • Create table called 'test1' in 'priority' database
cursor.execute("""
CREATE TABLE test1 (
    PersonID int,
    LastName varchar(255),
    FirstName varchar(255),
    Address varchar(255),
    City varchar(255)
);""")

Let me know your thoughts

What I want to do is insert a DataFrame as a table in SQL Server. Any suggestions would be great :)

FYI - Software & package versions:

  • VSCode 1.71.2
  • Python 3.9.13
  • Microsoft SQL Server 2019 on Windows 10 Home 10.0 (Build 22000: )
  • sqlalchemy 1.4.41
  • pyodbc 4.0.34
  • Try using "Trusted_Connection": "yes" instead of "authentication": "ActiveDirectoryIntegrated" Commented Sep 19, 2022 at 16:34
  • @GordThompson thank you for your suggestion. I tried connection_url=mssql+pyodbc://:@localhost:1433/priority-tool?Trusted_Connection=yes&driver=ODBC+Driver+17+for+SQL+Server, and unfortunately I received the same error message as before. Commented Sep 20, 2022 at 8:14

1 Answer

I found an alternative approach that works: passing the exact pyodbc connection string through to SQLAlchemy (https://pydoc.dev/sqlalchemy/latest/sqlalchemy.dialects.mssql.pyodbc.html).

import sqlalchemy
from sqlalchemy.engine import URL

connection_string = "DRIVER={SQL Server};SERVER=LAPTOP-NumDig;DATABASE=priority;UID=;PWD="

connection_url = URL.create("mssql+pyodbc", query={"odbc_connect": connection_string})

engine = sqlalchemy.create_engine(connection_url)

df.to_sql('test1', engine, if_exists='replace')
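For reference, the same pass-through can also be written as a plain URL string, with the ODBC connection string URL-encoded into the odbc_connect query parameter. A minimal sketch using only the standard library; the helper name build_odbc_connect_url is hypothetical, and the resulting string is what would be handed to sqlalchemy.create_engine:

```python
from urllib.parse import quote_plus


def build_odbc_connect_url(connection_string: str) -> str:
    """Wrap a raw ODBC connection string in a mssql+pyodbc URL."""
    # quote_plus escapes characters such as ';', '=', '{' and spaces so the
    # whole ODBC string survives as a single query-parameter value.
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(connection_string)


connection_string = "DRIVER={SQL Server};SERVER=LAPTOP-NumDig;DATABASE=priority;UID=;PWD="
url = build_odbc_connect_url(connection_string)
print(url)
```

Using URL.create with query={"odbc_connect": ...} as above avoids doing the encoding by hand, so the string form is mainly useful when the URL has to live in a config file or environment variable.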

Note: inserting this way took a long time to upload the DataFrame into the SQL table.

[Edit update] To make the df.to_sql() run faster, include the chunksize parameter and method='multi'. (https://towardsdatascience.com/dramatically-improve-your-database-inserts-with-a-simple-upgrade-6dfa672f1424).

I used the following code to speed up the pandas df to SQL table upload:

df.to_sql('test1', engine, if_exists='replace', chunksize=20, method='multi') 
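A caveat when picking chunksize with method='multi': each chunk becomes one multi-row INSERT, and SQL Server limits a statement to 2100 parameters and a VALUES list to 1000 rows, so rows per chunk times columns has to stay under the parameter limit. The sketch below estimates an upper bound; the helper name safe_chunksize is hypothetical, and it assumes the DataFrame index is written as one extra column (the to_sql default, index=True):

```python
def safe_chunksize(n_columns: int, include_index: bool = True,
                   max_params: int = 2100, max_rows: int = 1000) -> int:
    """Estimate the largest chunksize usable with method='multi' on SQL Server."""
    # Each inserted row consumes one parameter per column written.
    cols = n_columns + (1 if include_index else 0)
    # Stay strictly below the parameter limit, and never exceed the
    # row limit of a single VALUES list.
    rows_by_params = (max_params - 1) // cols
    return max(1, min(max_rows, rows_by_params))


# The five-column test1 table plus the index column.
print(safe_chunksize(5))
```

With a DataFrame in hand this could be called as safe_chunksize(len(df.columns)) and the result passed as chunksize. chunksize=20 is well inside the limit for a narrow table, so a larger value may reduce the number of round trips.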

2 Comments

"took a long time to upload a df" - engine = sqlalchemy.create_engine(connection_url, fast_executemany=True) should help with that.
@GordThompson thanks for the suggestion, I use the following solution (adding chunksize parameter and method='multi') to speed up the upload process: df.to_sql('test1', engine, if_exists='replace', chunksize=20, method='multi')
