Changeset 797 for trunk/docs


Ignore:
Timestamp:
Jan 29, 2016, 5:43:22 PM (3 years ago)
Author:
cito
Message:

Cache typecast functions and make them configurable

The typecast functions used by the pgdb module are now cached
using a local and a global Typecasts class. The local cache is
bound to the connection and knows how to cast composite types.

Also added functions that allow registering custom typecast
functions on the global and local level.

Also added a chapter on type adaptation and casting to the docs.

Location:
trunk/docs/contents/pgdb
Files:
1 added
4 edited
1 copied

Legend:

Unmodified
Added
Removed
  • trunk/docs/contents/pgdb/connection.rst

    r796 r797  
    8484    A dictionary with the various type codes for the PostgreSQL types
    8585
    86 You can request the dictionary either via a PostgreSQL type name (which
    87 (is equal to the DB-API 2 *type_code*) or via a PostgreSQL type OIDs.
    88 
    89 The values are *type_code* strings carrying additional attributes:
    90 
    91         - *oid* -- the OID of the type
    92         - *len*  -- the internal size
    93         - *type*  -- ``'b'`` = base, ``'c'`` = composite, ...
    94         - *category*  -- ``'A'`` = Array, ``'B'`` = Boolean, ...
    95         - *delim*  -- delimiter to be used when parsing arrays
    96         - *relid*  -- the table OID for composite types
    97 
    98 For details, see the PostgreSQL documentation on `pg_type
    99 <http://www.postgresql.org/docs/current/static/catalog-pg-type.html>`_.
    100 
    101 The :attr:`Connection.type_cache` also provides a method :meth:`columns`
    102 that returns the names and type OIDs of the columns of composite types.
     86This can be used for getting more information on the PostgreSQL database
     87types or changing the typecast functions used for the connection.  See the
     88description of the :class:`TypeCache` class for details.
    10389
    10490.. versionadded:: 5.0
  • trunk/docs/contents/pgdb/index.rst

    r710 r797  
    1414    cursor
    1515    types
     16    typecache
     17    adaptation
  • trunk/docs/contents/pgdb/module.rst

    r751 r797  
    3636
    3737    con = connect(dsn='myhost:mydb', user='guido', password='234$')
     38
     39
     40get/set/reset_typecast -- Control the global typecast functions
     41---------------------------------------------------------------
     42
     43PyGreSQL uses typecast functions to cast the raw data coming from the
     44database to Python objects suitable for the particular database type.
     45These functions take a single string argument that represents the data
     46to be casted and must return the casted value.
     47
     48PyGreSQL provides built-in typecast functions for the common database types,
     49but if you want to change these or add more typecast functions, you can use
     50the following functions.
     51
     52.. note::
     53
     54    The following functions are not part of the DB-API 2 standard.
     55
     56.. method:: get_typecast(typ)
     57
     58    Get the global cast function for the given database type
     59
     60    :param str typ: PostgreSQL type name or type code
     61    :returns: the typecast function for the specified type
     62    :rtype: function or None
     63
     64.. versionadded:: 5.0
     65
     66.. method:: set_typecast(typ, cast)
     67
     68    Set a global typecast function for the given database type(s)
     69
     70    :param typ: PostgreSQL type name or type code, or list of such
     71    :type typ: str or list
     72    :param cast: the typecast function to be set for the specified type(s)
     73    :type typ: str or int
     74
     75.. versionadded:: 5.0
     76
     77.. method:: reset_typecast([typ])
     78
     79    Reset the typecasts for the specified (or all) type(s) to their defaults
     80
     81    :param str typ: PostgreSQL type name or type code, or list of such,
     82        or None to reset all typecast functions
     83    :type typ: str, list or None
     84
     85.. versionadded:: 5.0
     86
     87Note that database connections cache types and their cast functions using
     88connection specific :class:`TypeCache` objects.  You can also get, set and
     89reset typecast functions on the connection level using the methods
     90:meth:`TypeCache.get_typecast`, :meth:`TypeCache.set_typecast` and
     91:meth:`TypeCache.reset_typecast` of the :attr:`Connection.type_cache`.  This
     92will not affect other connections or future connections. In order to be sure
     93a global change is picked up by a running connection, you must reopen it or
     94call :meth:`TypeCache.reset_typecast` on the :attr:`Connection.type_cache`.
    3895
    3996
  • trunk/docs/contents/pgdb/typecache.rst

    r786 r797  
    1 Cursor -- The cursor object
    2 ===========================
     1TypeCache -- The internal cache for database types
     2==================================================
    33
    44.. py:currentmodule:: pgdb
    55
    6 .. class:: Cursor
    7 
    8 These objects represent a database cursor, which is used to manage the context
    9 of a fetch operation. Cursors created from the same connection are not
    10 isolated, i.e., any changes done to the database by a cursor are immediately
    11 visible by the other cursors. Cursors created from different connections can
    12 or can not be isolated, depending on the level of transaction isolation.
    13 The default PostgreSQL transaction isolation level is "read committed".
    14 
    15 Cursor objects respond to the following methods and attributes.
    16 
    17 Note that ``Cursor`` objects also implement both the iterator and the
    18 context manager protocol, i.e. you can iterate over them and you can use them
    19 in a ``with`` statement.
    20 
    21 description -- details regarding the result columns
    22 ---------------------------------------------------
    23 
    24 .. attribute:: Cursor.description
    25 
    26     This read-only attribute is a sequence of 7-item named tuples.
    27 
    28     Each of these named tuples contains information describing
    29     one result column:
    30 
    31         - *name*
    32         - *type_code*
    33         - *display_size*
    34         - *internal_size*
    35         - *precision*
    36         - *scale*
    37         - *null_ok*
    38 
    39     The values for *precision* and *scale* are only set for numeric types.
    40     The values for *display_size* and *null_ok* are always ``None``.
    41 
    42     This attribute will be ``None`` for operations that do not return rows
    43     or if the cursor has not had an operation invoked via the
    44     :meth:`Cursor.execute` or :meth:`Cursor.executemany` method yet.
    45 
    46 .. versionchanged:: 5.0
    47     Before version 5.0, this attribute was an ordinary tuple.
    48 
    49 rowcount -- number of rows of the result
    50 ----------------------------------------
    51 
    52 .. attribute:: Cursor.rowcount
    53 
    54     This read-only attribute specifies the number of rows that the last
    55     :meth:`Cursor.execute` or :meth:`Cursor.executemany` call produced
    56     (for DQL statements like SELECT) or affected (for DML statements like
    57     UPDATE or INSERT). It is also set by the :meth:`Cursor.copy_from` and
    58     :meth':`Cursor.copy_to` methods. The attribute is -1 in case no such
    59     method call has been performed on the cursor or the rowcount of the
    60     last operation cannot be determined by the interface.
    61 
    62 close -- close the cursor
    63 -------------------------
    64 
    65 .. method:: Cursor.close()
    66 
    67     Close the cursor now (rather than whenever it is deleted)
    68 
    69     :rtype: None
    70 
    71 The cursor will be unusable from this point forward; an :exc:`Error`
    72 (or subclass) exception will be raised if any operation is attempted
    73 with the cursor.
    74 
    75 execute -- execute a database operation
    76 ---------------------------------------
    77 
    78 .. method:: Cursor.execute(operation, [parameters])
    79 
    80     Prepare and execute a database operation (query or command)
    81 
    82     :param str operation: the database operation
    83     :param parameters: a sequence or mapping of parameters
    84     :returns: the cursor, so you can chain commands
    85 
    86 Parameters may be provided as sequence or mapping and will be bound to
    87 variables in the operation. Variables are specified using Python extended
    88 format codes, e.g. ``" ... WHERE name=%(name)s"``.
    89 
    90 A reference to the operation will be retained by the cursor. If the same
    91 operation object is passed in again, then the cursor can optimize its behavior.
    92 This is most effective for algorithms where the same operation is used,
    93 but different parameters are bound to it (many times).
    94 
    95 The parameters may also be specified as list of tuples to e.g. insert multiple
    96 rows in a single operation, but this kind of usage is deprecated:
    97 :meth:`Cursor.executemany` should be used instead.
    98 
    99 Note that in case this method raises a :exc:`DatabaseError`, you can get
    100 information about the error condition that has occurred by introspecting
    101 its :attr:`DatabaseError.sqlstate` attribute, which will be the ``SQLSTATE``
    102 error code associated with the error.  Applications that need to know which
    103 error condition has occurred should usually test the error code, rather than
    104 looking at the textual error message.
    105 
    106 executemany -- execute many similar database operations
    107 -------------------------------------------------------
    108 
    109 .. method:: Cursor.executemany(operation, [seq_of_parameters])
    110 
    111     Prepare and execute many similar database operations (queries or commands)
    112 
    113     :param str operation: the database operation
    114     :param seq_of_parameters: a sequence or mapping of parameter tuples or mappings
    115     :returns: the cursor, so you can chain commands
    116 
    117 Prepare a database operation (query or command) and then execute it against
    118 all parameter tuples or mappings found in the sequence *seq_of_parameters*.
    119 
    120 Parameters are bounded to the query using Python extended format codes,
    121 e.g. ``" ... WHERE name=%(name)s"``.
    122 
    123 callproc -- Call a stored procedure
    124 -----------------------------------
    125 
    126 .. method:: Cursor.callproc(self, procname, [parameters]):
    127 
    128     Call a stored database procedure with the given name
    129 
    130     :param str procname: the name of the database function
    131     :param parameters: a sequence of parameters (can be empty or omitted)
    132 
    133 This method calls a stored procedure (function) in the PostgreSQL database.
    134 
    135 The sequence of parameters must contain one entry for each input argument
    136 that the function expects. The result of the call is the same as this input
    137 sequence; replacement of output and input/output parameters in the return
    138 value is currently not supported.
    139 
    140 The function may also provide a result set as output. These can be requested
    141 through the standard fetch methods of the cursor.
     6.. class:: TypeCache
    1427
    1438.. versionadded:: 5.0
    1449
    145 fetchone -- fetch next row of the query result
    146 ----------------------------------------------
     10The internal :class:`TypeCache` of PyGreSQL is not part of the DB-API 2
     11standard, but is documented here in case you need full control and
     12understanding of the internal handling of database types.
    14713
    148 .. method:: Cursor.fetchone()
     14The TypeCache is essentially a dictionary mapping PostgreSQL internal
     15type names and type OIDs to DB-API 2 "type codes" (which are also returned
     16as the *type_code* field of the :attr:`Cursor.description` attribute).
    14917
    150     Fetch the next row of a query result set
     18These type codes are strings which are equal to the PostgreSQL internal
     19type name, but they are also carrying additional information about the
     20associated PostgreSQL type in the following attributes:
    15121
    152     :returns: the next row of the query result set
    153     :rtype: named tuple or None
     22        - *oid* -- the OID of the type
     23        - *len*  -- the internal size
     24        - *type*  -- ``'b'`` = base, ``'c'`` = composite, ...
     25        - *category*  -- ``'A'`` = Array, ``'B'`` = Boolean, ...
     26        - *delim*  -- delimiter to be used when parsing arrays
     27        - *relid*  -- the table OID for composite types
    15428
    155 Fetch the next row of a query result set, returning a single named tuple,
    156 or ``None`` when no more data is available. The field names of the named
    157 tuple are the same as the column names of the database query as long as
    158 they are valid Python identifiers.
     29For details, see the PostgreSQL documentation on `pg_type
     30<http://www.postgresql.org/docs/current/static/catalog-pg-type.html>`_.
    15931
    160 An :exc:`Error` (or subclass) exception is raised if the previous call to
    161 :meth:`Cursor.execute` or :meth:`Cursor.executemany` did not produce
    162 any result set or no call was issued yet.
     32In addition to the dictionary methods, the :class:`TypeCache` provides
     33the following methods:
    16334
    164 .. versionchanged:: 5.0
    165     Before version 5.0, this method returned ordinary tuples.
     35.. method:: TypeCache.get_fields(typ)
    16636
    167 fetchmany -- fetch next set of rows of the query result
    168 -------------------------------------------------------
     37    Get the names and types of the fields of composite types
    16938
    170 .. method:: Cursor.fetchmany([size=None], [keep=False])
     39    :param typ: PostgreSQL type name or OID of a composite type
     40    :type typ: str or int
     41    :returns: a list of pairs of field names and types
     42    :rtype: list
    17143
    172     Fetch the next set of rows of a query result
     44.. method:: TypeCache.get_typecast(typ)
    17345
    174     :param size: the number of rows to be fetched
    175     :type size: int or None
    176     :param keep: if set to true, will keep the passed arraysize
    177     :tpye keep: bool
    178     :returns: the next set of rows of the query result
    179     :rtype: list of named tuples
     46    Get the cast function for the given database type
    18047
    181 Fetch the next set of rows of a query result, returning a list of named
    182 tuples. An empty sequence is returned when no more rows are available.
    183 The field names of the named tuple are the same as the column names of
    184 the database query as long as they are valid Python identifiers.
     48    :param str typ: PostgreSQL type name or type code
     49    :returns: the typecast function for the specified type
     50    :rtype: function or None
    18551
    186 The number of rows to fetch per call is specified by the *size* parameter.
    187 If it is not given, the cursor's :attr:`arraysize` determines the number of
    188 rows to be fetched. If you set the *keep* parameter to True, this is kept as
    189 new :attr:`arraysize`.
     52.. method:: TypeCache.set_typecast(typ, cast)
    19053
    191 The method tries to fetch as many rows as indicated by the *size* parameter.
    192 If this is not possible due to the specified number of rows not being
    193 available, fewer rows may be returned.
     54    Set a typecast function for the given database type(s)
    19455
    195 An :exc:`Error` (or subclass) exception is raised if the previous call to
    196 :meth:`Cursor.execute` or :meth:`Cursor.executemany` did not produce
    197 any result set or no call was issued yet.
     56    :param typ: PostgreSQL type name or type code, or list of such
     57    :type typ: str or list
     58    :param cast: the typecast function to be set for the specified type(s)
     59    :type typ: str or int
    19860
    199 Note there are performance considerations involved with the *size* parameter.
    200 For optimal performance, it is usually best to use the :attr:`arraysize`
    201 attribute. If the *size* parameter is used, then it is best for it to retain
    202 the same value from one :meth:`Cursor.fetchmany` call to the next.
     61.. method:: TypeCache.reset_typecast([typ])
    20362
    204 .. versionchanged:: 5.0
    205     Before version 5.0, this method returned ordinary tuples.
     63    Reset the typecasts for the specified (or all) type(s) to their defaults
    20664
    207 fetchall -- fetch all rows of the query result
    208 ----------------------------------------------
     65    :param str typ: PostgreSQL type name or type code, or list of such,
     66        or None to reset all typecast functions
     67    :type typ: str, list or None
    20968
    210 .. method:: Cursor.fetchall()
     69.. method:: TypeCache.typecast(typ, value)
    21170
    212     Fetch all (remaining) rows of a query result
     71    Cast the given value according to the given database type
    21372
    214     :returns: the set of all rows of the query result
    215     :rtype: list of named tuples
     73    :param str typ: PostgreSQL type name or type code
     74    :returns: the casted value
    21675
    217 Fetch all (remaining) rows of a query result, returning them as list of
    218 named tuples. The field names of the named tuple are the same as the column
    219 names of the database query as long as they are valid Python identifiers.
    220 
    221 Note that the cursor's :attr:`arraysize` attribute can affect the performance
    222 of this operation.
    223 
    224 .. versionchanged:: 5.0
    225     Before version 5.0, this method returned ordinary tuples.
    226 
    227 arraysize - the number of rows to fetch at a time
    228 -------------------------------------------------
    229 
    230 .. attribute:: Cursor.arraysize
    231 
    232     The number of rows to fetch at a time
    233 
    234 This read/write attribute specifies the number of rows to fetch at a time with
    235 :meth:`Cursor.fetchmany`. It defaults to 1, meaning to fetch a single row
    236 at a time.
    237 
    238 Methods and attributes that are not part of the standard
    239 --------------------------------------------------------
    24076
    24177.. note::
    24278
    243    The following methods and attributes are not part of the DB-API 2 standard.
    244 
    245 .. method:: Cursor.copy_from(stream, table, [format], [sep], [null], [size], [columns])
    246 
    247     Copy data from an input stream to the specified table
    248 
    249     :param stream: the input stream
    250         (must be a file-like object, a string or an iterable returning strings)
    251     :param str table: the name of a database table
    252     :param str format: the format of the data in the input stream,
    253         can be ``'text'`` (the default), ``'csv'``, or ``'binary'``
    254     :param str sep: a single character separator
    255         (the default is ``'\t'`` for text and ``','`` for csv)
    256     :param str null: the textual representation of the ``NULL`` value,
    257         can also be an empty string (the default is ``'\\N'``)
    258     :param int size: the size of the buffer when reading file-like objects
    259     :param list column: an optional list of column names
    260     :returns: the cursor, so you can chain commands
    261 
    262     :raises TypeError: parameters with wrong types
    263     :raises ValueError: invalid parameters
    264     :raises IOError: error when executing the copy operation
    265 
    266 This method can be used to copy data from an input stream on the client side
    267 to a database table on the server side using the ``COPY FROM`` command.
    268 The input stream can be provided in form of a file-like object (which must
    269 have a ``read()`` method), a string, or an iterable returning one row or
    270 multiple rows of input data on each iteration.
    271 
    272 The format must be text, csv or binary. The sep option sets the column
    273 separator (delimiter) used in the non binary formats. The null option sets
    274 the textual representation of ``NULL`` in the input.
    275 
    276 The size option sets the size of the buffer used when reading data from
    277 file-like objects.
    278 
    279 The copy operation can be restricted to a subset of columns. If no columns are
    280 specified, all of them will be copied.
    281 
    282 .. versionadded:: 5.0
    283 
    284 .. method:: Cursor.copy_to(stream, table, [format], [sep], [null], [decode], [columns])
    285 
    286     Copy data from the specified table to an output stream
    287 
    288     :param stream: the output stream (must be a file-like object or ``None``)
    289     :param str table: the name of a database table or a ``SELECT`` query
    290     :param str format: the format of the data in the input stream,
    291         can be ``'text'`` (the default), ``'csv'``, or ``'binary'``
    292     :param str sep: a single character separator
    293         (the default is ``'\t'`` for text and ``','`` for csv)
    294     :param str null: the textual representation of the ``NULL`` value,
    295         can also be an empty string (the default is ``'\\N'``)
    296     :param bool decode: whether decoded strings shall be returned
    297         for non-binary formats (the default is True in Python 3)
    298     :param list column: an optional list of column names
    299     :returns: a generator if stream is set to ``None``, otherwise the cursor
    300 
    301     :raises TypeError: parameters with wrong types
    302     :raises ValueError: invalid parameters
    303     :raises IOError: error when executing the copy operation
    304 
    305 This method can be used to copy data from a database table on the server side
    306 to an output stream on the client side using the ``COPY TO`` command.
    307 
    308 The output stream can be provided in form of a file-like object (which must
    309 have a ``write()`` method). Alternatively, if ``None`` is passed as the
    310 output stream, the method will return a generator yielding one row of output
    311 data on each iteration.
    312 
    313 Output will be returned as byte strings unless you set decode to true.
    314 
    315 Note that you can also use a ``SELECT`` query instead of the table name.
    316 
    317 The format must be text, csv or binary. The sep option sets the column
    318 separator (delimiter) used in the non binary formats. The null option sets
    319 the textual representation of ``NULL`` in the output.
    320 
    321 The copy operation can be restricted to a subset of columns. If no columns are
    322 specified, all of them will be copied.
    323 
    324 .. versionadded:: 5.0
    325 
    326 .. method:: Cursor.row_factory(row)
    327 
    328     Process rows before they are returned
    329 
    330     :param list row: the currently processed row of the result set
    331     :returns: the transformed row that the fetch methods shall return
    332 
    333 This method is used for processing result rows before returning them through
    334 one of the fetch methods. By default, rows are returned as named tuples.
    335 You can overwrite this method with a custom row factory if you want to
    336 return the rows as different kids of objects. This same row factory will then
    337 be used for all result sets. If you overwrite this method, the method
    338 :meth:`Cursor.build_row_factory` for creating row factories dynamically
    339 will be ignored.
    340 
    341 Note that named tuples are very efficient and can be easily converted to
    342 dicts (even OrderedDicts) by calling ``row._asdict()``. If you still want
    343 to return rows as dicts, you can create a custom cursor class like this::
    344 
    345     class DictCursor(pgdb.Cursor):
    346 
    347         def row_factory(self, row):
    348             return {key: value for key, value in zip(self.colnames, row)}
    349 
    350     cur = DictCursor(con)  # get one DictCursor instance or
    351     con.cursor_type = DictCursor  # always use DictCursor instances
    352 
    353 .. versionadded:: 4.0
    354 
    355 .. method:: Cursor.build_row_factory()
    356 
    357     Build a row factory based on the current description
    358 
    359     :returns: callable with the signature of :meth:`Cursor.row_factory`
    360 
    361 This method returns row factories for creating named tuples. It is called
    362 whenever a new result set is created, and :attr:`Cursor.row_factory` is
    363 then assigned the return value of this method. You can overwrite this method
    364 with a custom row factory builder if you want to use different row factories
    365 for different result sets. Otherwise, you can also simply overwrite the
    366 :meth:`Cursor.row_factory` method. This method will then be ignored.
    367 
    368 The default implementation that delivers rows as named tuples essentially
    369 looks like this::
    370 
    371     def build_row_factory(self):
    372         return namedtuple('Row', self.colnames, rename=True)._make
    373 
    374 .. versionadded:: 5.0
    375 
    376 .. attribute:: Cursor.colnames
    377 
    378     The list of columns names of the current result set
    379 
    380 The values in this list are the same values as the *name* elements
    381 in the :attr:`Cursor.description` attribute. Always use the latter
    382 if you want to remain standard compliant.
    383 
    384 .. versionadded:: 5.0
    385 
    386 .. attribute:: Cursor.coltypes
    387 
    388     The list of columns types of the current result set
    389 
    390 The values in this list are the same values as the *type_code* elements
    391 in the :attr:`Cursor.description` attribute. Always use the latter
    392 if you want to remain standard compliant.
    393 
    394 .. versionadded:: 5.0
     79    Note that the :class:`TypeCache` is always bound to a database connection.
     80    You can also get, set and reset typecast functions on a global level using
     81    the functions :func:`pgdb.get_typecast`, :func:`pgdb.set_typecast` and
     82    :func:`pgdb.reset_typecast`.  If you do this, the current database
     83    connections will continue to use their already cached typecast functions
     84    unless you call the :meth:`TypeCache.reset_typecast` method on the
     85    :attr:`Connection.type_cache` of the running connections.
  • trunk/docs/contents/pgdb/types.rst

    r796 r797  
    33
    44.. py:currentmodule:: pgdb
     5
     6.. _type_constructors:
    57
    68Type constructors
     
    7072    SQL ``NULL`` values are always represented by the Python *None* singleton
    7173    on input and output.
     74
     75.. _type_objects:
    7276
    7377Type objects
Note: See TracChangeset for help on using the changeset viewer.