Opened 4 years ago

Last modified 5 months ago

#53 new enhancement

Consider avoiding encoding/decoding bytea in queries

Reported by: cito Owned by:
Priority: major Milestone: 6.0
Component: C Module Version: 4.1
Keywords: bytea Cc:


Currently, when executing queries that contain bytea values on input or output, these are encoded from bytes and decoded to bytes on the level of the pgdb module and the pg.DB wrapper class.

This is needed because on the lowest level we use the PQexec() method or the PQexecParams() method without setting paramLengths and paramFormats. In both cases, Postgres only uses text format for input and output, so we need to encode and decode.

If we would always use PQexecParams() and set paramLengths and paramFormats, we could avoid the encoding and decoding between bytes and bytea text format, by passing these values in binary format.

(Using binary could also speed up passing other parameters with types that have the same binary representation in Python and Postgres. But that could be brittle because it might depend on the Python and Postgres versions. However, bytes are always bytes, so it would be useful in the case of bytea. Also bytea values are usually big, so time and memory demand for encoding/decoding are more relevant for these.)

Change History (1)

comment:1 Changed 5 months ago by cito

  • Milestone changed from 5.1 to 6.0

I see two problems if we want to use binary only for bytea values:

  1. For input, it is possible to send only some of the parameters as binary. In Python 3, maybe we could do this for parameters sent in as bytes. But in Python 2, we can't discern strings from bytes in the input values. So I suggested postponing this to a future Python-3-only-version of PyGres.
  2. For output, it is currently not even possible to send only some of the result columns as parameters. Also, we would need to analyze the query first to find which columns are bytea value.

If we decide to use binary for all values, this is difficult because the binary representations must match and according to the PostgreSQL docs it also requires knowledge of the internal representation expected by the backend.

Postponing this to 6.0 for now.

Note: See TracTickets for help on using tickets.