# mssql/pyodbc.py
# Copyright (C) 2005-2021 the SQLAlchemy authors and contributors
# <see AUTHORS file>
#
# This module is part of SQLAlchemy and is released under
# the MIT License: http://www.opensource.org/licenses/mit-license.php

r"""
.. dialect:: mssql+pyodbc
    :name: PyODBC
    :dbapi: pyodbc
    :connectstring: mssql+pyodbc://<username>:<password>@<dsnname>
    :url: http://pypi.python.org/pypi/pyodbc/

Connecting to PyODBC
--------------------

The URL here is to be translated to PyODBC connection strings, as
detailed in `ConnectionStrings <https://code.google.com/p/pyodbc/wiki/ConnectionStrings>`_.

DSN Connections
^^^^^^^^^^^^^^^

A DSN connection in ODBC means that a pre-existing ODBC datasource is
configured on the client machine.  The application then specifies the name
of this datasource, which encompasses details such as the specific ODBC driver
in use as well as the network address of the database.  Assuming a datasource
is configured on the client, a basic DSN-based connection looks like::

    engine = create_engine("mssql+pyodbc://scott:tiger@some_dsn")

The above URL will pass the following connection string to PyODBC::

    dsn=some_dsn;UID=scott;PWD=tiger

If the username and password are omitted, the DSN form will also add
the ``Trusted_Connection=yes`` directive to the ODBC string.
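
For example, a URL with no credentials such as ``mssql+pyodbc://some_dsn``
(an illustrative DSN name; the exact string emitted may vary by version) would
pass an ODBC string along the lines of::

    dsn=some_dsn;Trusted_Connection=Yes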
Hostname Connections
^^^^^^^^^^^^^^^^^^^^

Hostname-based connections are also supported by pyodbc.  These are often
easier to use than a DSN and have the additional advantage that the specific
database name to connect to may be specified locally in the URL, rather
than it being fixed as part of a datasource configuration.

When using a hostname connection, the driver name must also be specified in the
query parameters of the URL.  As these names usually have spaces in them, the
name must be URL encoded, which means using plus signs for spaces::

    engine = create_engine("mssql+pyodbc://scott:tiger@myhost:port/databasename?driver=SQL+Server+Native+Client+10.0")

Other keywords interpreted by the Pyodbc dialect to be passed to
``pyodbc.connect()`` in both the DSN and hostname cases include:
``odbc_autotranslate``, ``ansi``, ``unicode_results``, ``autocommit``,
``authentication``.
Note that in order for the dialect to recognize these keywords
(including the ``driver`` keyword above) they must be all lowercase.
Multiple additional keyword arguments must be separated by an
ampersand (``&``), not a semicolon::

    engine = create_engine(
        "mssql+pyodbc://scott:tiger@myhost:port/databasename"
        "?driver=ODBC+Driver+17+for+SQL+Server"
        "&authentication=ActiveDirectoryIntegrated"
    )

Pass through exact Pyodbc string
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A PyODBC connection string can also be sent in pyodbc's format directly, as
specified in `the PyODBC documentation
<https://github.com/mkleehammer/pyodbc/wiki/Connecting-to-databases>`_,
using the parameter ``odbc_connect``.  A :class:`_sa.engine.URL` object
can help make this easier::

    from sqlalchemy.engine import URL

    connection_string = "DRIVER={SQL Server Native Client 10.0};SERVER=dagger;DATABASE=test;UID=user;PWD=password"
    connection_url = URL.create("mssql+pyodbc", query={"odbc_connect": connection_string})

    engine = create_engine(connection_url)

.. _mssql_pyodbc_access_tokens:

Connecting to databases with access tokens
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some database servers are set up to only accept access tokens for login.  For
example, SQL Server allows the use of Azure Active Directory tokens to connect
to databases.  This requires creating a credential object using the
``azure-identity`` library.  More information about the authentication step can be
found in `Microsoft's documentation
<https://docs.microsoft.com/en-us/azure/developer/python/azure-sdk-authenticate?tabs=bash>`_.

After getting an engine, the credentials need to be sent to ``pyodbc.connect``
each time a connection is requested.  One way to do this is to set up an event
listener on the engine that adds the credential token to the dialect's connect
call.  This is discussed more generally in :ref:`engines_dynamic_tokens`.  For
SQL Server in particular, this is passed as an ODBC connection attribute with
a data structure `described by Microsoft
<https://docs.microsoft.com/en-us/sql/connect/odbc/using-azure-active-directory#authenticating-with-an-access-token>`_.

The following code snippet will create an engine that connects to an Azure SQL
database using Azure credentials::

    import struct

    from sqlalchemy import create_engine, event
    from sqlalchemy.engine.url import URL
    from azure import identity

    SQL_COPT_SS_ACCESS_TOKEN = 1256  # Connection option for access tokens, as defined in msodbcsql.h
    TOKEN_URL = "https://database.windows.net/"  # The token URL for any Azure SQL database

    connection_string = "mssql+pyodbc://@my-server.database.windows.net/myDb?driver=ODBC+Driver+17+for+SQL+Server"

    engine = create_engine(connection_string)

    azure_credentials = identity.DefaultAzureCredential()

    @event.listens_for(engine, "do_connect")
    def provide_token(dialect, conn_rec, cargs, cparams):
        # remove the "Trusted_Connection" parameter that SQLAlchemy adds
        cargs[0] = cargs[0].replace(";Trusted_Connection=Yes", "")

        # create token credential
        raw_token = azure_credentials.get_token(TOKEN_URL).token.encode("utf-16-le")
        token_struct = struct.pack(f"<I{len(raw_token)}s", len(raw_token), raw_token)

        # apply it to keyword arguments
        cparams["attrs_before"] = {SQL_COPT_SS_ACCESS_TOKEN: token_struct}

.. tip::

    The ``Trusted_Connection`` token is currently added by the SQLAlchemy
    pyodbc dialect when no username or password is present.  This needs
    to be removed per Microsoft's
    `documentation for Azure access tokens
    <https://docs.microsoft.com/en-us/sql/connect/odbc/using-azure-active-directory#authenticating-with-an-access-token>`_,
    stating that a connection string when using an access token must not contain
    ``UID``, ``PWD``, ``Authentication`` or ``Trusted_Connection`` parameters.

Pyodbc Pooling / connection close behavior
------------------------------------------

PyODBC uses internal `pooling
<https://github.com/mkleehammer/pyodbc/wiki/The-pyodbc-Module#pooling>`_ by
default, which means connections will be longer lived than they are within
SQLAlchemy itself.  As SQLAlchemy has its own pooling behavior, it is often
preferable to disable pyodbc's pooling.  This can only be done globally at
the PyODBC module level, **before** any connections are made::

    import pyodbc

    pyodbc.pooling = False

    # don't use the engine before pooling is set to False
    engine = create_engine("mssql+pyodbc://user:pass@dsn")

If this variable is left at its default value of ``True``, **the application
will continue to maintain active database connections**, even when the
SQLAlchemy engine itself fully discards a connection or if the engine is
disposed.

.. seealso::

    `pooling <https://github.com/mkleehammer/pyodbc/wiki/The-pyodbc-Module#pooling>`_ -
    in the PyODBC documentation.

Driver / Unicode Support
------------------------

PyODBC works best with Microsoft ODBC drivers, particularly in the area
of Unicode support on both Python 2 and Python 3.

Using the FreeTDS ODBC drivers on Linux or OSX with PyODBC is **not**
recommended; there have historically been many Unicode-related issues
in this area, including before Microsoft offered ODBC drivers for Linux
and OSX.  Now that Microsoft offers drivers for all platforms, they are
the recommended choice for PyODBC.  FreeTDS remains relevant for
non-ODBC drivers such as pymssql, where it works very well.

Rowcount Support
----------------

Pyodbc only has partial support for rowcount.  See the notes at
:ref:`mssql_rowcount_versioning` for important notes when using ORM
versioning.

.. _mssql_pyodbc_fastexecutemany:

Fast Executemany Mode
---------------------

The Pyodbc driver has added support for a "fast executemany" mode of execution
which greatly reduces round trips for a DBAPI ``executemany()`` call when using
Microsoft ODBC drivers, for **limited size batches that fit in memory**.  The
feature is enabled by setting the flag ``.fast_executemany`` on the DBAPI
cursor when an executemany call is to be used.  The SQLAlchemy pyodbc SQL
Server dialect supports setting this flag automatically when the
``fast_executemany`` parameter is passed to
:func:`_sa.create_engine`; note that the ODBC driver must be the Microsoft
driver in order to use this flag::

    engine = create_engine(
        "mssql+pyodbc://scott:tiger@mssql2017:1433/test?driver=ODBC+Driver+13+for+SQL+Server",
        fast_executemany=True)
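
With the flag enabled, the fast path applies whenever an "executemany"-style
statement is emitted, i.e. when a list of parameter dictionaries is passed to
a single ``Connection.execute()`` call.  A minimal sketch, assuming a
hypothetical :class:`_schema.Table` named ``my_table`` with a ``data``
column::

    with engine.begin() as conn:
        conn.execute(
            my_table.insert(),
            [{"data": "d1"}, {"data": "d2"}, {"data": "d3"}],
        )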
.. warning:: The pyodbc fast_executemany mode **buffers all rows in memory** and is
   not compatible with very large batches of data.  A future version of SQLAlchemy
   may support this flag as a per-execution option instead.

.. versionadded:: 1.3

.. seealso::

    `fast executemany <https://github.com/mkleehammer/pyodbc/wiki/Features-beyond-the-DB-API#fast_executemany>`_
    - on github

.. _mssql_pyodbc_setinputsizes:

Setinputsizes Support
---------------------

The pyodbc ``cursor.setinputsizes()`` method can be used if necessary.  To
enable this hook, pass ``use_setinputsizes=True`` to :func:`_sa.create_engine`::

    engine = create_engine("mssql+pyodbc://...", use_setinputsizes=True)

The behavior of the hook can then be customized, as may be necessary
particularly if fast_executemany is in use, via the
:meth:`.DialectEvents.do_setinputsizes` hook.  See that method for usage
examples.
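
As a rough illustration only (the listener name and the choice of types to
filter are hypothetical and depend on the application), such a hook might
remove the pyodbc ``SQL_WVARCHAR`` type from the inputsizes dictionary so that
the driver determines that type on its own, using the ``engine`` created
above::

    import pyodbc

    from sqlalchemy import event

    @event.listens_for(engine, "do_setinputsizes")
    def _skip_wvarchar(inputsizes, cursor, statement, parameters, context):
        # drop entries for which setinputsizes() should not be called
        for bindparam, dbapitype in list(inputsizes.items()):
            if dbapitype is pyodbc.SQL_WVARCHAR:
                del inputsizes[bindparam]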
.. versionchanged:: 1.4.1  The pyodbc dialects will not use setinputsizes
   unless ``use_setinputsizes=True`` is passed.

"""  # noqa

import datetime
import decimal
import re
import struct

from .base import BINARY
from .base import DATETIMEOFFSET
from .base import MSDialect
from .base import MSExecutionContext
from .base import VARBINARY
from ... import exc
from ... import types as sqltypes
from ... import util
from ...connectors.pyodbc import PyODBCConnector


class _ms_numeric_pyodbc(object):

    """Turns Decimals with adjusted() < 0 or > 7 into strings.

    The routines here are needed for older pyodbc versions
    as well as current mxODBC versions.

    """
    def bind_processor(self, dialect):
        super_process = super(_ms_numeric_pyodbc, self).bind_processor(dialect)

        if not dialect._need_decimal_fix:
            return super_process

        def process(value):
            if self.asdecimal and isinstance(value, decimal.Decimal):
                adjusted = value.adjusted()
                if adjusted < 0:
                    return self._small_dec_to_string(value)
                elif adjusted > 7:
                    return self._large_dec_to_string(value)

            if super_process:
                return super_process(value)
            else:
                return value

        return process

    # these routines needed for older versions of pyodbc.
    # as of 2.1.8 this logic is integrated.

    def _small_dec_to_string(self, value):
        return "%s0.%s%s" % (
            (value < 0 and "-" or ""),
            "0" * (abs(value.adjusted()) - 1),
            "".join([str(nint) for nint in value.as_tuple()[1]]),
        )

    def _large_dec_to_string(self, value):
        _int = value.as_tuple()[1]
        if "E" in str(value):
            result = "%s%s%s" % (
                (value < 0 and "-" or ""),
                "".join([str(s) for s in _int]),
                "0" * (value.adjusted() - (len(_int) - 1)),
            )
        else:
            if (len(_int) - 1) > value.adjusted():
                result = "%s%s.%s" % (
                    (value < 0 and "-" or ""),
                    "".join([str(s) for s in _int][0 : value.adjusted() + 1]),
                    "".join([str(s) for s in _int][value.adjusted() + 1 :]),
                )
            else:
                result = "%s%s" % (
                    (value < 0 and "-" or ""),
                    "".join([str(s) for s in _int][0 : value.adjusted() + 1]),
                )
        return result


class _MSNumeric_pyodbc(_ms_numeric_pyodbc, sqltypes.Numeric):
    pass


class _MSFloat_pyodbc(_ms_numeric_pyodbc, sqltypes.Float):
    pass


class _ms_binary_pyodbc(object):
    """Wraps binary values in dialect-specific Binary wrapper.

    If the value is null, return a pyodbc-specific BinaryNull
    object to prevent pyODBC [and FreeTDS] from defaulting binary
    NULL types to SQLWCHAR and causing implicit conversion errors.

    """

    def bind_processor(self, dialect):
        if dialect.dbapi is None:
            return None

        DBAPIBinary = dialect.dbapi.Binary

        def process(value):
            if value is not None:
                return DBAPIBinary(value)
            else:
                # pyodbc-specific
                return dialect.dbapi.BinaryNull

        return process


class _ODBCDateTime(sqltypes.DateTime):
    """Add bind processors to handle datetimeoffset behaviors"""

    has_tz = False

    def bind_processor(self, dialect):
        def process(value):
            if value is None:
                return None
            elif isinstance(value, util.string_types):
                # if a string was passed directly, allow it through
                return value
            elif not value.tzinfo or (not self.timezone and not self.has_tz):
                # for DateTime(timezone=False)
                return value
            else:
                # for DATETIMEOFFSET or DateTime(timezone=True)
                #
                # Convert to string format required by T-SQL
                dto_string = value.strftime("%Y-%m-%d %H:%M:%S.%f %z")
                # offset needs a colon, e.g., -0700 -> -07:00
                # "UTC offset in the form (+-)HHMM[SS[.ffffff]]"
                # backend currently rejects seconds / fractional seconds
                dto_string = re.sub(
                    r"([\+\-]\d{2})([\d\.]+)$", r"\1:\2", dto_string
                )
                return dto_string

        return process


class _ODBCDATETIMEOFFSET(_ODBCDateTime):
    has_tz = True


class _VARBINARY_pyodbc(_ms_binary_pyodbc, VARBINARY):
    pass


class _BINARY_pyodbc(_ms_binary_pyodbc, BINARY):
    pass


class MSExecutionContext_pyodbc(MSExecutionContext):
    _embedded_scope_identity = False

    def pre_exec(self):
        """where appropriate, issue "select scope_identity()" in the same
        statement.

        Background on why "scope_identity()" is preferable to "@@identity":
        http://msdn.microsoft.com/en-us/library/ms190315.aspx

        Background on why we attempt to embed "scope_identity()" into the same
        statement as the INSERT:
        http://code.google.com/p/pyodbc/wiki/FAQs#How_do_I_retrieve_autogenerated/identity_values?

        """
        super(MSExecutionContext_pyodbc, self).pre_exec()

        # don't embed the scope_identity select into an
        # "INSERT .. DEFAULT VALUES"
        if (
            self._select_lastrowid
            and self.dialect.use_scope_identity
            and len(self.parameters[0])
        ):
            self._embedded_scope_identity = True

            self.statement += "; select scope_identity()"

    def post_exec(self):
        if self._embedded_scope_identity:
            # Fetch the last inserted id from the manipulated statement
            # We may have to skip over a number of result sets with
            # no data (due to triggers, etc.)
            while True:
                try:
                    # fetchall() ensures the cursor is consumed
                    # without closing it (FreeTDS particularly)
                    row = self.cursor.fetchall()[0]
                    break
                except self.dialect.dbapi.Error:
                    # no way around this - nextset() consumes the previous set
                    # so we need to just keep flipping
                    self.cursor.nextset()

            self._lastrowid = int(row[0])
        else:
            super(MSExecutionContext_pyodbc, self).post_exec()


class MSDialect_pyodbc(PyODBCConnector, MSDialect):
    supports_statement_cache = True

    # mssql still has problems with this on Linux
    supports_sane_rowcount_returning = False

    execution_ctx_cls = MSExecutionContext_pyodbc

    colspecs = util.update_copy(
        MSDialect.colspecs,
        {
            sqltypes.Numeric: _MSNumeric_pyodbc,
            sqltypes.Float: _MSFloat_pyodbc,
            BINARY: _BINARY_pyodbc,
            # support DateTime(timezone=True)
            sqltypes.DateTime: _ODBCDateTime,
            DATETIMEOFFSET: _ODBCDATETIMEOFFSET,
            # SQL Server dialect has a VARBINARY that is just to support
            # "deprecate_large_types" w/ VARBINARY(max), but also we must
            # handle the usual SQL standard VARBINARY
            VARBINARY: _VARBINARY_pyodbc,
            sqltypes.VARBINARY: _VARBINARY_pyodbc,
            sqltypes.LargeBinary: _VARBINARY_pyodbc,
        },
    )

    def __init__(
        self, description_encoding=None, fast_executemany=False, **params
    ):
        if "description_encoding" in params:
            self.description_encoding = params.pop("description_encoding")
        super(MSDialect_pyodbc, self).__init__(**params)
        self.use_scope_identity = (
            self.use_scope_identity
            and self.dbapi
            and hasattr(self.dbapi.Cursor, "nextset")
        )
        self._need_decimal_fix = self.dbapi and self._dbapi_version() < (
            2,
            1,
            8,
        )
        self.fast_executemany = fast_executemany

    def _get_server_version_info(self, connection):
        try:
            # "Version of the instance of SQL Server, in the form
            # of 'major.minor.build.revision'"
            raw = connection.exec_driver_sql(
                "SELECT CAST(SERVERPROPERTY('ProductVersion') AS VARCHAR)"
            ).scalar()
        except exc.DBAPIError:
            # SQL Server docs indicate this function isn't present prior to
            # 2008.  Before we had the VARCHAR cast above, pyodbc would also
            # fail on this query.
            return super(MSDialect_pyodbc, self)._get_server_version_info(
                connection, allow_chars=False
            )
        else:
            version = []
            r = re.compile(r"[.\-]")
            for n in r.split(raw):
                try:
                    version.append(int(n))
                except ValueError:
                    pass
            return tuple(version)

    def on_connect(self):
        super_ = super(MSDialect_pyodbc, self).on_connect()

        def on_connect(conn):
            if super_ is not None:
                super_(conn)

            self._setup_timestampoffset_type(conn)

        return on_connect

    def _setup_timestampoffset_type(self, connection):
        # output converter function for datetimeoffset
        def _handle_datetimeoffset(dto_value):
            tup = struct.unpack("<6hI2h", dto_value)
            return datetime.datetime(
                tup[0],
                tup[1],
                tup[2],
                tup[3],
                tup[4],
                tup[5],
                tup[6] // 1000,
                util.timezone(
                    datetime.timedelta(hours=tup[7], minutes=tup[8])
                ),
            )

        odbc_SQL_SS_TIMESTAMPOFFSET = -155  # as defined in SQLNCLI.h
        connection.add_output_converter(
            odbc_SQL_SS_TIMESTAMPOFFSET, _handle_datetimeoffset
        )

    def do_executemany(self, cursor, statement, parameters, context=None):
        if self.fast_executemany:
            cursor.fast_executemany = True
        super(MSDialect_pyodbc, self).do_executemany(
            cursor, statement, parameters, context=context
        )

    def is_disconnect(self, e, connection, cursor):
        if isinstance(e, self.dbapi.Error):
            code = e.args[0]
            if code in {
                "08S01",
                "01000",
                "01002",
                "08003",
                "08007",
                "08S02",
                "08001",
                "HYT00",
                "HY010",
                "10054",
            }:
                return True
        return super(MSDialect_pyodbc, self).is_disconnect(
            e, connection, cursor
        )


dialect = MSDialect_pyodbc