Changeset 683 for trunk


Ignore:
Timestamp:
Jan 1, 2016, 6:11:01 PM (4 years ago)
Author:
cito
Message:

Return rows as named tuples in pgdb

By default, we now return result rows as named tuples in pgdb.

Note that named tuples can be accessed like normal lists and tuples,
and easily converted to these. They can also be easily converted to
(ordered) dictionaries by calling row._asdict(). Therefore the need
for alternative Cursor types with different row types has been greatly
reduced, so I have simplified the implementation in the last revision
by removing the added Cursor classes and cursor() methods again,
leaving only the old row_factory method for customizing the returned
row types. I complemented this with a new build_row_factory method,
because different named tuple classes must be created for different
result sets, so a static row_factory method is not so appropriate.

Tests and documentation for these changes are included.

Location:
trunk
Files:
4 edited

Legend:

Unmodified
Added
Removed
  • trunk/docs/changelog.rst

    r681 r683  
    77- The supported versions are Python 2.6, 2.7, 3.3, 3.4 and 3.5.
    88- The supported PostgreSQL versions are 9.0, 9.1, 9.2, 9.3 and 9.4.
     9- The DB-API 2 module now always returns result rows as named tuples
     10  instead of simply lists as before. The documentation explains how
     11  you can restore the old behavior or use custom row objects instead.
    912- The names of the various types supported by the classic and DB-API 2
    1013  modules have been renamed to become simpler, more intuitive and in
  • trunk/docs/pgdb.rst

    r681 r683  
    191191------------------------------------
    192192
    193 .. method:: Connection.cusor()
     193.. method:: Connection.cursor()
    194194
    195195    Return a new cursor object using the connection
     
    200200This method returns a new :class:`Cursor` object that can be used to
    201201operate on the database in the way described in the next section.
     202
     203Attributes that are not part of the standard
     204--------------------------------------------
     205
     206.. note::
     207
     208   The following attributes are not part of the DB-API 2 standard.
     209
     210.. attribute:: cursor_type
     211
     212    The default cursor type used by the connection
     213
     214If you want to use your own custom subclass of the :class:`Cursor` class
     215with he connection, set this attribute to you custom cursor class. You will
     216then get your custom cursor whenever you call :meth:`Connection.cursor`.
    202217
    203218
     
    319334
    320335    :returns: the next row of the query result set
    321     :rtype: tuple or None
    322 
    323 Fetch the next row of a query result set, returning a single tuple,
    324 or ``None`` when no more data is available.
     336    :rtype: named tuple or None
     337
     338Fetch the next row of a query result set, returning a single named tuple,
     339or ``None`` when no more data is available. The field names of the named
     340tuple are the same as the column names of the database query as long as
     341they are valid Python identifiers.
    325342
    326343An :exc:`Error` (or subclass) exception is raised if the previous call to
     
    340357    :tpye keep: bool
    341358    :returns: the next set of rows of the query result
    342     :rtype: list of tuples
    343 
    344 Fetch the next set of rows of a query result, returning a list of tuples.
    345 An empty sequence is returned when no more rows are available.
     359    :rtype: list of named tuples
     360
     361Fetch the next set of rows of a query result, returning a list of named
     362tuples. An empty sequence is returned when no more rows are available.
     363The field names of the named tuple are the same as the column names of
     364the database query as long as they are valid Python identifiers.
    346365
    347366The number of rows to fetch per call is specified by the *size* parameter.
     
    371390
    372391    :returns: the set of all rows of the query result
    373     :rtype: list of tuples
    374 
    375 Fetch all (remaining) rows of a query result, returning them as list of tuples.
     392    :rtype: list of named tuples
     393
     394Fetch all (remaining) rows of a query result, returning them as list of
     395named tuples. The field names of the named tuple are the same as the column
     396names of the database query as long as they are valid Python identifiers.
     397
    376398Note that the cursor's :attr:`arraysize` attribute can affect the performance
    377399of this operation.
    378400
    379 row_factory -- process a row of the query result
    380 ------------------------------------------------
    381 
    382 .. method:: Cursor.row_factory(row)
    383 
    384     Process rows before they are returned
    385 
    386     :param tuple row: the currently processed row of the result set
    387     :returns: the transformed row that the cursor methods shall return
    388 
    389 Note that this method is not part of the DB-API 2 standard.
    390 
    391 You can overwrite this method with a custom row factory, e.g.
    392 if you want to return rows as dicts instead of tuples::
    393 
    394     class DictCursor(pgdb.Cursor):
    395 
    396         def row_factory(self, row):
    397             return {desc[0]:value
    398                 for desc, value in zip(self.description, row)}
    399 
    400     cur = DictCursor(con)
    401 
    402401arraysize - the number of rows to fetch at a time
    403402-------------------------------------------------
     
    408407
    409408This read/write attribute specifies the number of rows to fetch at a time with
    410 :meth:`Cursor.fetchmany`. It defaults to 1 meaning to fetch a single row
     409:meth:`Cursor.fetchmany`. It defaults to 1, meaning to fetch a single row
    411410at a time.
     411
     412Methods and attributes that are not part of the standard
     413--------------------------------------------------------
     414
     415.. note::
     416
     417   The following methods and attributes are not part of the DB-API 2 standard.
     418
     419.. method:: Cursor.row_factory(row)
     420
     421    Process rows before they are returned
     422
     423    :param tuple row: the currently processed row of the result set
     424    :returns: the transformed row that the fetch methods shall return
     425
     426This method is used for processing result rows before returning them through
     427one of the fetch methods. By default, rows are returned as named tuples.
     428You can overwrite this method with a custom row factory if you want to
     429return the rows as different kids of objects. This same row factory will then
     430be used for all result sets. If you overwrite this method, the method
     431:meth:`Cursor.build_row_factory` for creating row factories dynamically
     432will be ignored.
     433
     434Note that named tuples are very efficient and can be easily converted to
     435dicts (even OrderedDicts) by calling ``row._asdict()``. If you still want
     436to return rows as dicts, you can create a custom cursor class like this::
     437
     438    class DictCursor(pgdb.Cursor):
     439
     440        def row_factory(self, row):
     441            return {key: value for key, value in zip(self.colnames, row)}
     442
     443    cur = DictCursor(con)  # get one DictCursor instance or
     444    con.cursor_type = DictCursor  # always use DictCursor instances
     445
     446
     447.. method:: Cursor.build_row_factory()
     448
     449    Build a row factory based on the current description
     450
     451    :returns: callable with the signature of :meth:`Cursor.row_factory`
     452
     453This method returns row factories for creating named tuples. It is called
     454whenever a new result set is created, and :attr:`Cursor.row_factory` is
     455then assigned the return value of this method. You can overwrite this method
     456with a custom row factory builder if you want to use different row factories
     457for different result sets. Otherwise, you can also simply overwrite the
     458:meth:`Cursor.row_factory` method. This method will then be ignored.
     459
     460The default implementation that delivers rows as named tuples essentially
     461looks like this::
     462
     463    def build_row_factory(self):
     464        return namedtuple('Row', self.colnames, rename=True)._make
     465
     466.. attribute:: Cursor.colnames
     467
     468    The list of columns names of the current result set
     469
     470The values in this list are the same values as the *name* elements
     471in the :attr:`Cursor.description` attribute. Always use the latter
     472if you want to remain standard compliant.
     473
     474.. attribute:: Cursor.coltypes
     475
     476    The list of columns types of the current result set
     477
     478The values in this list are the same values as the *type_code* elements
     479in the :attr:`Cursor.description` attribute. Always use the latter
     480if you want to remain standard compliant.
    412481
    413482
     
    473542    Used to describe the ``oid`` column of PostgreSQL database tables
    474543
    475 The following more specific types are not part of the DB-API 2 standard:
     544.. note::
     545
     546  The following more specific types are not part of the DB-API 2 standard.
    476547
    477548.. class:: BOOL
  • trunk/module/pgdb.py

    r682 r683  
    229229    """Cursor object."""
    230230
    231     row_factory = tuple  # the factory for creating result rows
    232 
    233231    def __init__(self, dbcnx):
    234232        """Create a cursor object for the database connection."""
     
    237235        self._type_cache = dbcnx._type_cache
    238236        self._src = self._cnx.source()
     237        # the official attribute for describing the result columns
     238        self.description = None
     239        # unofficial attributes for convenience and performance
    239240        self.colnames = self.coltypes = None
    240         self.description = None
     241        if self.row_factory is Cursor.row_factory:
     242            # the row factory needs to be determined dynamically
     243            self.row_factory = None
     244        else:
     245            self.build_row_factory = None
    241246        self.rowcount = -1
    242247        self.arraysize = 1
     
    303308        """Close the cursor object."""
    304309        self._src.close()
     310        self.description = None
    305311        self.colnames = self.coltypes = None
    306         self.description = None
    307312        self.rowcount = -1
    308313        self.lastrowid = None
     
    326331            # don't do anything without parameters
    327332            return
     333        self.description = None
    328334        self.colnames = self.coltypes = None
    329         self.description = None
    330335        self.rowcount = -1
    331336        # first try to execute all queries
    332         totrows = 0
     337        rowcount = 0
    333338        sql = "BEGIN"
    334339        try:
     
    348353                rows = self._src.execute(sql)
    349354                if rows:  # true if not DML
    350                     totrows += rows
     355                    rowcount += rows
    351356                else:
    352357                    self.rowcount = -1
     
    367372            self.description = description
    368373            self.lastrowid = None
    369         else:
    370             self.rowcount = totrows
     374            if self.build_row_factory:
     375                self.row_factory = self.build_row_factory()
     376        else:
     377            self.rowcount = rowcount
    371378            self.lastrowid = self._src.oidstatus()
    372379        # return the cursor object, so you can write statements such as
     
    435442        pass  # unsupported, but silently passed
    436443
     444    @staticmethod
     445    def row_factory(row):
     446        """Process rows before they are returned.
     447
     448        You can overwrite this statically with a custom row factory, or
     449        you can build a row factory dynamically with build_row_factory().
     450
     451        For example, you can create a Cursor class that returns rows as
     452        Python dictionaries like this:
     453
     454            class DictCursor(pgdb.Cursor):
     455
     456                def row_factory(self, row):
     457                    return {desc[0]: value
     458                        for desc, value in zip(self.description, row)}
     459
     460            cur = DictCursor(con)  # get one DictCursor instance or
     461            con.cursor_type = DictCursor  # always use DictCursor instances
     462
     463        """
     464        raise NotImplementedError
     465
     466    def build_row_factory(self):
     467        """Build a row factory based on the current description.
     468
     469        This implementation builds a row factory for creating named tuples.
     470        You can overwrite this method if you want to dynamically create
     471        different row factories whenever the column description changes.
     472
     473        """
     474        colnames = self.colnames
     475        if colnames:
     476            try:
     477                try:
     478                    return namedtuple('Row', colnames, rename=True)._make
     479                except TypeError:  # Python 2.6 and 3.0 do not support rename
     480                    colnames = [v if v.isalnum() else 'column_%d' % n
     481                             for n, v in enumerate(colnames)]
     482                    return namedtuple('Row', colnames)._make
     483            except ValueError:  # there is still a problem with the field names
     484                colnames = ['column_%d' % n for n in range(len(colnames))]
     485                return namedtuple('Row', colnames)._make
     486
    437487
    438488CursorDescription = namedtuple('CursorDescription',
    439489    ['name', 'type_code', 'display_size', 'internal_size',
    440490     'precision', 'scale', 'null_ok'])
    441 
    442 
    443 class ListCursor(Cursor):
    444     """Cursor object that returns rows as lists."""
    445 
    446     row_factory = list
    447 
    448 
    449 class DictCursor(Cursor):
    450     """Cursor object that returns rows as dictionaries."""
    451 
    452     def row_factory(self, row):
    453         """Turn a row from a tuple into a dictionary."""
    454         # not using dict comprehension to stay compatible with Py 2.6
    455         return dict((key, value) for key, value in zip(self.colnames, row))
    456 
    457 
    458 class OrderedDictCursor(Cursor):
    459     """Cursor object that returns rows as ordered dictionaries."""
    460 
    461     def row_factory(self, row):
    462         """Turn a row from a tuple into an ordered dictionary."""
    463         return OrderedDict(
    464             (key, value) for key, value in zip(self.colnames, row))
    465 
    466 
    467 class NamedTupleCursor(Cursor):
    468     """Cursor object that returns rows as named tuples."""
    469 
    470     @property
    471     def colnames(self):
    472         return self._colnames
    473 
    474     @colnames.setter
    475     def colnames(self, value):
    476         self._colnames = value
    477         if value:
    478             try:
    479                 try:
    480                     factory = namedtuple('Row', value, rename=True)._make
    481                 except TypeError:  # Python 2.6 and 3.0 do not support rename
    482                     value = [v if v.isalnum() else 'column_%d' % n
    483                              for n, v in enumerate(value)]
    484                     factory = namedtuple('Row', value)._make
    485             except ValueError:
    486                 value = ['column_%d' % n for n in range(len(value))]
    487                 factory = namedtuple('Row', value)._make
    488         else:
    489             factory = tuple
    490         self._colnames, self.row_factory = value, factory
    491491
    492492
     
    514514        self._type_cache = TypeCache(cnx)
    515515        self.cursor_type = Cursor
    516         self.row_type = tuple
    517516        try:
    518517            self._cnx.source()
     
    580579            raise _op_error("connection has been closed")
    581580
    582     def cursor(self, cls=None):
     581    def cursor(self):
    583582        """Return a new cursor object using the connection."""
    584583        if self._cnx:
    585584            try:
    586                 return (cls or self.cursor_type)(self)
     585                return self.cursor_type(self)
    587586            except Exception:
    588587                raise _op_error("invalid connection")
    589588        else:
    590589            raise _op_error("connection has been closed")
    591 
    592     def list_cursor(self):
    593         """Return a new tuple cursor object using the connection."""
    594         return self.cursor(ListCursor)
    595 
    596     def tuple_cursor(self):
    597         """Return a new tuple cursor object using the connection."""
    598         return self.cursor(Cursor)
    599 
    600     def named_tuple_cursor(self):
    601         """Return a new named tuple cursor object using the connection."""
    602         return self.cursor(NamedTupleCursor)
    603 
    604     def dict_cursor(self):
    605         """Return a new dict cursor object using the connection."""
    606         return self.cursor(DictCursor)
    607 
    608     def ordered_dict_cursor(self):
    609         """Return a new ordered dict cursor object using the connection."""
    610         return self.cursor(OrderedDictCursor)
    611590
    612591    if shortcutmethods:  # otherwise do not implement and document this
  • trunk/module/tests/test_dbapi20.py

    r682 r683  
    6767        dbapi20.DatabaseAPI20Test.tearDown(self)
    6868
    69     def test_cursortype(self):
     69    def test_cursor_type(self):
     70
     71        class TestCursor(pgdb.Cursor):
     72            pass
     73
    7074        con = self._connect()
    7175        self.assertIs(con.cursor_type, pgdb.Cursor)
    7276        cur = con.cursor()
    7377        self.assertIsInstance(cur, pgdb.Cursor)
    74         self.assertNotIsInstance(cur, pgdb.ListCursor)
    75         cur.close()
    76         con.cursor_type = pgdb.ListCursor
     78        self.assertNotIsInstance(cur, TestCursor)
     79        con.cursor_type = TestCursor
     80        cur = con.cursor()
     81        self.assertIsInstance(cur, TestCursor)
     82        cur = con.cursor()
     83        self.assertIsInstance(cur, TestCursor)
     84        con = self._connect()
     85        self.assertIs(con.cursor_type, pgdb.Cursor)
    7786        cur = con.cursor()
    7887        self.assertIsInstance(cur, pgdb.Cursor)
    79         self.assertIsInstance(cur, pgdb.ListCursor)
    80         cur.close()
    81         cur = con.cursor()
    82         self.assertIsInstance(cur, pgdb.ListCursor)
    83         cur.close()
    84         con.close()
    85         con = self._connect()
    86         self.assertIs(con.cursor_type, pgdb.Cursor)
    87         cur = con.cursor()
    88         self.assertIsInstance(cur, pgdb.Cursor)
    89         self.assertNotIsInstance(cur, pgdb.ListCursor)
    90         cur.close()
    91         con.close()
    92 
    93     def test_list_cursor(self):
    94         con = self._connect()
    95         cur = con.list_cursor()
    96         self.assertIsInstance(cur, pgdb.ListCursor)
    97         cur.execute("select 1, 2, 3")
     88        self.assertNotIsInstance(cur, TestCursor)
     89
     90    def test_row_factory(self):
     91
     92        class TestCursor(pgdb.Cursor):
     93
     94            def row_factory(self, row):
     95                return dict(('column %s' % desc[0], value)
     96                    for desc, value in zip(self.description, row))
     97
     98        con = self._connect()
     99        con.cursor_type = TestCursor
     100        cur = con.cursor()
     101        self.assertIsInstance(cur, TestCursor)
     102        res = cur.execute("select 1 as a, 2 as b")
     103        self.assertIs(res, cur, 'execute() should return cursor')
    98104        res = cur.fetchone()
    99         cur.close()
    100         con.close()
     105        self.assertIsInstance(res, dict)
     106        self.assertEqual(res, {'column a': 1, 'column b': 2})
     107        cur.execute("select 1 as a, 2 as b union select 3, 4 order by 1")
     108        res = cur.fetchall()
    101109        self.assertIsInstance(res, list)
    102         self.assertEqual(res, [1, 2, 3])
    103 
    104     def test_tuple_cursor(self):
    105         con = self._connect()
    106         cur = con.tuple_cursor()
    107         self.assertIsInstance(cur, pgdb.Cursor)
    108         cur.execute("select 1, 2, 3")
     110        self.assertEqual(len(res), 2)
     111        self.assertIsInstance(res[0], dict)
     112        self.assertEqual(res[0], {'column a': 1, 'column b': 2})
     113        self.assertIsInstance(res[1], dict)
     114        self.assertEqual(res[1], {'column a': 3, 'column b': 4})
     115
     116    def test_build_row_factory(self):
     117
     118        class TestCursor(pgdb.Cursor):
     119
     120            def build_row_factory(self):
     121                keys = [desc[0] for desc in self.description]
     122                return lambda row: dict((key, value)
     123                    for key, value in zip(keys, row))
     124
     125        con = self._connect()
     126        con.cursor_type = TestCursor
     127        cur = con.cursor()
     128        self.assertIsInstance(cur, TestCursor)
     129        cur.execute("select 1 as a, 2 as b")
    109130        res = cur.fetchone()
    110         cur.close()
    111         con.close()
    112         self.assertIsInstance(res, tuple)
    113         self.assertEqual(res, (1, 2, 3))
    114         self.assertRaises(AttributeError, getattr, res, '_fields')
    115 
    116     def test_named_tuple_cursor(self):
    117         con = self._connect()
    118         cur = con.named_tuple_cursor()
    119         self.assertIsInstance(cur, pgdb.NamedTupleCursor)
    120         cur.execute("select 1 as abc, 2 as de, 3 as f")
     131        self.assertIsInstance(res, dict)
     132        self.assertEqual(res, {'a': 1, 'b': 2})
     133        cur.execute("select 1 as a, 2 as b union select 3, 4 order by 1")
     134        res = cur.fetchall()
     135        self.assertIsInstance(res, list)
     136        self.assertEqual(len(res), 2)
     137        self.assertIsInstance(res[0], dict)
     138        self.assertEqual(res[0], {'a': 1, 'b': 2})
     139        self.assertIsInstance(res[1], dict)
     140        self.assertEqual(res[1], {'a': 3, 'b': 4})
     141
     142    def test_cursor_with_named_columns(self):
     143        con = self._connect()
     144        cur = con.cursor()
     145        res = cur.execute("select 1 as abc, 2 as de, 3 as f")
     146        self.assertIs(res, cur, 'execute() should return cursor')
    121147        res = cur.fetchone()
    122         cur.close()
    123         con.close()
    124148        self.assertIsInstance(res, tuple)
    125149        self.assertEqual(res, (1, 2, 3))
     
    128152        self.assertEqual(res.de, 2)
    129153        self.assertEqual(res.f, 3)
    130 
    131     def test_named_tuple_cursor_with_bad_names(self):
    132         con = self._connect()
    133         cur = con.named_tuple_cursor()
    134         self.assertIsInstance(cur, pgdb.NamedTupleCursor)
     154        cur.execute("select 1 as one, 2 as two union select 3, 4 order by 1")
     155        res = cur.fetchall()
     156        self.assertIsInstance(res, list)
     157        self.assertEqual(len(res), 2)
     158        self.assertIsInstance(res[0], tuple)
     159        self.assertEqual(res[0], (1, 2))
     160        self.assertEqual(res[0]._fields, ('one', 'two'))
     161        self.assertIsInstance(res[1], tuple)
     162        self.assertEqual(res[1], (3, 4))
     163        self.assertEqual(res[1]._fields, ('one', 'two'))
     164
     165    def test_cursor_with_unnamed_columns(self):
     166        con = self._connect()
     167        cur = con.cursor()
    135168        cur.execute("select 1, 2, 3")
    136169        res = cur.fetchone()
     
    159192        else:
    160193            self.assertEqual(res._fields, ('abc', '_1'))
    161         cur.close()
    162         con.close()
    163 
    164     def test_dict_cursor(self):
    165         con = self._connect()
    166         cur = con.dict_cursor()
    167         self.assertIsInstance(cur, pgdb.DictCursor)
    168         cur.execute("select 1 as abc, 2 as de, 3 as f")
    169         res = cur.fetchone()
    170         cur.close()
    171         con.close()
    172         self.assertIsInstance(res, dict)
    173         self.assertEqual(res, {'abc': 1, 'de': 2, 'f': 3})
    174         self.assertRaises(TypeError, res.popitem, last=True)
    175 
    176     def test_ordered_dict_cursor(self):
    177         con = self._connect()
    178         cur = con.ordered_dict_cursor()
    179         self.assertIsInstance(cur, pgdb.OrderedDictCursor)
    180         cur.execute("select 1 as abc, 2 as de, 3 as f")
    181         try:
    182             res = cur.fetchone()
    183         except pgdb.NotSupportedError:
    184             if OrderedDict is None:
    185                 return
    186             self.fail('OrderedDict supported by Python, but not by pgdb')
    187         finally:
    188             cur.close()
    189             con.close()
    190         self.assertIsInstance(res, dict)
    191         self.assertEqual(res, {'abc': 1, 'de': 2, 'f': 3})
    192         self.assertEqual(res.popitem(last=True), ('f', 3))
    193 
    194     def test_row_factory(self):
    195 
    196         class DictCursor(pgdb.Cursor):
    197 
    198             def row_factory(self, row):
    199                 # not using dict comprehension to stay compatible with Py 2.6
    200                 return dict(('column %s' % desc[0], value)
    201                     for desc, value in zip(self.description, row))
    202 
    203         con = self._connect()
    204         cur = DictCursor(con)
    205         ret = cur.execute("select 1 as a, 2 as b")
    206         self.assertTrue(ret is cur, 'execute() should return cursor')
    207         self.assertEqual(cur.fetchone(), {'column a': 1, 'column b': 2})
    208194
    209195    def test_colnames(self):
     
    227213        self.assertEqual(types, ['int2', 'int4', 'int8'])
    228214
    229     def test_description_named(self):
     215    def test_description_fields(self):
    230216        con = self._connect()
    231217        cur = con.cursor()
     
    236222        desc = desc[0]
    237223        self.assertIsInstance(desc, tuple)
     224        self.assertEqual(len(desc), 7)
    238225        self.assertEqual(desc.name, 'col')
    239226        self.assertEqual(desc.type_code, 'int8')
Note: See TracChangeset for help on using the changeset viewer.