Python (pyodbc)

Connect from Python via pyodbc, the most common Python ODBC binding.

Category: bi-tools

Syntax

pyodbc.connect('DSN=DeltaForge;Uid=...;Pwd=...')

Description

## Install ``` pip install pyodbc ``` On Linux the build needs the unixODBC headers. On macOS the build needs iODBC headers; the Homebrew `unixodbc` package and the iODBC framework can co-exist, but pyodbc by default builds against unixODBC. ## Connect The `pyodbc.connect` argument is a connection string. Anything documented in the Connection String reference works: `DSN=`, `Driver=`, `Server=`, `Token=`, etc. For security, prefer omitting `Pwd=` and storing the password in the OS keychain via the DSN setup GUI's **Save password** flow. The driver looks the value up at connect time. ## Cursor lifecycle The driver supports forward-only cursors only. The pyodbc default (`SQL_CURSOR_FORWARD_ONLY`) is the right choice; do not change `cn.cursor(scrollable=True)`. Close the cursor explicitly when done (or use a `with` block). Long-lived cursors hold server-side resources until closed. ## Pandas integration ``` import pyodbc import pandas as pd cn = pyodbc.connect('DSN=DeltaForge;Uid=...;Pwd=...') df = pd.read_sql('SELECT * FROM ...', cn) ``` `read_sql` calls `cursor.execute` and `fetchall`, then constructs a DataFrame. For wide / long results, pyodbc's per-row Python overhead dominates; consider Polars or DuckDB's Arrow integration for faster ingestion. ## Polars / Arrow Polars's `read_database` understands ODBC connections via the `connectorx` backend, which uses Arrow as the intermediate format. The driver natively transports Arrow to the client, so the round-trip is efficient end-to-end. For analytical workloads this is the preferred path. ## Parameters ``` cur.execute('SELECT * FROM t WHERE x = ? AND y > ?', x_value, y_value) ``` The `?` placeholder is the only parameterisation form. Named parameters (`:foo`) are not supported by pyodbc; build the parameter list positionally. ## NULL handling Fetch returns Python `None` for SQL NULL. Pandas converts to `NaN` for numeric columns and `None` for object columns. The distinction matters when the downstream code branches on falsiness.

Examples

# DSN-based
import pyodbc
cn = pyodbc.connect('DSN=DeltaForge;Uid=alice@example.com;Pwd=hunter2')

# Driver-string form (no system DSN required)
cn = pyodbc.connect(
    'Driver={DeltaForge ODBC Driver};'
    'Server=https://df.example.com;'
    'Uid=alice@example.com;'
    'Pwd=hunter2'
)

# Read into pandas
import pandas as pd
df = pd.read_sql('SELECT * FROM analytics.marts.fact_orders LIMIT 1000', cn)

# Read into Polars (zero-copy via Arrow)
import polars as pl
df = pl.read_database('SELECT * FROM analytics.marts.fact_orders LIMIT 1000', connection=cn)

Pitfalls

Calling `cursor.fetchone()` in a tight loop is the slowest pattern for large result sets. Prefer `fetchall()` for small results, or batch with `fetchmany(N)` for medium ones.
pyodbc converts Decimal columns to Python `Decimal`; for large analytical aggregates this can be 100x slower than letting Pandas / Arrow handle the type natively. Cast to DOUBLE on the SQL side if exact precision is not needed.
An `OperationalError: HYC00 Optional feature not implemented` from a `cursor.scroll()` call indicates the application asked for backward scrolling. The driver is forward-only; rewrite the algorithm to fetch once and process in memory.

Python (pyodbc)

Syntax

Description

Examples

Pitfalls

See Also