Type mapping (Delta Forge to ODBC)

How Delta Forge column types appear over ODBC and which ODBC C types are best for each.

Category: data-types

Syntax

<delta-forge-type> -> <ODBC SQL type> -> <recommended ODBC C type>

Description

## Mapping table | Delta Forge type | ODBC SQL type | Recommended ODBC C type | Notes | |---|---|---|---| | `BOOLEAN` | `SQL_BIT` | `SQL_C_BIT` | 0 / 1, NOT NULL or 1-byte indicator | | `TINYINT` | `SQL_TINYINT` | `SQL_C_STINYINT` | signed | | `SMALLINT` | `SQL_SMALLINT` | `SQL_C_SSHORT` | | | `INTEGER` | `SQL_INTEGER` | `SQL_C_SLONG` | | | `BIGINT` | `SQL_BIGINT` | `SQL_C_SBIGINT` | | | `FLOAT` | `SQL_REAL` | `SQL_C_FLOAT` | 32-bit IEEE 754 | | `DOUBLE` | `SQL_DOUBLE` | `SQL_C_DOUBLE` | 64-bit IEEE 754 | | `DECIMAL(p, s)` | `SQL_DECIMAL` | `SQL_C_NUMERIC` | precision and scale preserved; long-division formatter for `SQL_C_CHAR` | | `DATE` | `SQL_TYPE_DATE` | `SQL_C_TYPE_DATE` | Gregorian, year >= 1 | | `TIME` | `SQL_TYPE_TIME` | `SQL_C_TYPE_TIME` | microsecond resolution | | `TIMESTAMP` | `SQL_TYPE_TIMESTAMP` | `SQL_C_TYPE_TIMESTAMP` | microsecond resolution | | `TIMESTAMP_NTZ` | `SQL_TYPE_TIMESTAMP` | `SQL_C_TYPE_TIMESTAMP` | no timezone applied | | `STRING` / `VARCHAR(n)` | `SQL_VARCHAR` / `SQL_WVARCHAR` | `SQL_C_CHAR` / `SQL_C_WCHAR` | UTF-8 in narrow form, UTF-16 in wide | | `BINARY(n)` | `SQL_VARBINARY` | `SQL_C_BINARY` | raw bytes | | `STRUCT<...>` | `SQL_VARCHAR` | `SQL_C_CHAR` | rendered as JSON; not directly addressable per-field over ODBC | | `ARRAY<T>` | `SQL_VARCHAR` | `SQL_C_CHAR` | rendered as JSON array | | `MAP<K, V>` | `SQL_VARCHAR` | `SQL_C_CHAR` | rendered as JSON object | ## Choosing the C type When the application is free to choose (e.g. `pyodbc` accepting any type), pick the **natural** ODBC C type for the column. Mixing target types per cell on the same column is generally slower than picking one and sticking with it. The .NET `OdbcDataReader` issues each column at its natural type automatically. Python and R users who call `cursor.execute().fetchall()` get the driver-default mapping which uses the **Recommended ODBC C type** column above. ## NULL handling NULL is signalled via the indicator argument to `SQLBindCol` / `SQLGetData`: - `SQL_NULL_DATA` (-1): the cell is NULL. - A non-negative byte length: the cell holds that many bytes (or characters for `SQL_C_CHAR` / `SQL_C_WCHAR`). - `SQL_NTS` for `SQLBindParameter` only: the input string is null-terminated. The driver always populates the indicator. Applications that ignore it (e.g. by passing `NULL` for the indicator pointer) will fail to distinguish empty strings from NULL strings. ## STRING width The driver reports column width via `SQLDescribeCol`: - For `VARCHAR(n)`, the reported width is `n` characters. - For `STRING` (unbounded), the reported width is a sentinel large enough for any practical value but may not match the engine's actual maximum. Downstream tools sometimes preallocate buffers based on this; if the actual cell is smaller, it is fine. If the cell is unexpectedly larger (extremely unlikely), the driver returns `01004` and the application must resize and re-fetch.

Examples

// pyodbc: cur.description shows the ODBC SQL type per column
for col in cur.description:
    print(col[0], col[1])  # name, type code

// .NET: OdbcDataReader.GetSchemaTable() returns the SQL type
var schema = reader.GetSchemaTable();

Pitfalls

Mixing C target types on the same column (e.g. fetching column 0 first as `SQL_C_CHAR` then as `SQL_C_NUMERIC`) bypasses the staging cache. Pick one target type per column and stick with it.
Decimal columns fetched as `SQL_C_DOUBLE` lose precision silently; this is an ODBC-level conversion, not a driver bug. Fetch as `SQL_C_NUMERIC` or `SQL_C_CHAR` to preserve precision.
STRUCT and ARRAY render as JSON. Applications that expect to address sub-fields per ODBC column must flatten on the SQL side (`SELECT t.s.a, t.s.b FROM t`) rather than relying on the ODBC layer to expose nested structure.

Type mapping (Delta Forge to ODBC)

Syntax

Description

Examples

Pitfalls

See Also