I want to insert bytes into my PostgreSQL (9.5.7) database column with the type bytea, using the Psycopg2 (2.7.1) copy_from() method.

I can insert my bytes with the following code:

psycopg2_cursor.copy_from(
    StringIO("\x30\x40\x50"),
    "my_table",
)

By executing a SELECT into my table after the insertion, I get the expected value from the bytea column:

\x304050

Now, I want to prepend my bytes with the byte 0:

psycopg2_cursor.copy_from(
    StringIO("\x00\x30\x40\x50"),
    "my_table",
)

I get the error: psycopg2.DataError: invalid byte sequence for encoding "UTF-8": 0x00. From my understanding, this error should only be triggered when inserting a null byte into a text field, and the insertion should work as expected with a bytea field. Am I missing something? Is there a simple way to insert a null byte into a bytea column?

Thanks!

  • what's your standard_conforming_strings setting? Commented Jun 12, 2017 at 7:42
  • did you try StringIO('\x30\x40\x50') instead?.. Commented Jun 12, 2017 at 7:57

2 Answers

https://www.postgresql.org/docs/current/static/sql-copy.html

the following characters must be preceded by a backslash if they appear as part of a column value: backslash itself, newline, carriage return, and the current delimiter character.

just realized you are using COPY, so you have to escape backslash:

t=# copy b from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> \\x00
>> \.
COPY 1
t=# copy b from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> \x00
>> \.
ERROR:  invalid byte sequence for encoding "UTF8": 0x00
CONTEXT:  COPY b, line 1: "\x00"

this should do the trick:

psycopg2_cursor.copy_from(
    StringIO("\\x00\\x30\\x40\\x50"),
    "my_table",
)
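
Note that the psql session above puts two backslashes into the COPY data (\\x00), whereas the Python literal "\\x00" contains only one, so COPY would still decode \x00 as a raw null byte. Below is a minimal sketch of one way to build the text-format field from Python, assuming Python 3 and a table with a single bytea column. The helper name bytea_copy_field is hypothetical, and the copy_from() call is left commented out because it needs a live connection:

```python
import binascii
from io import StringIO


def bytea_copy_field(data):
    """Return the COPY text-format representation of a bytea value.

    COPY first unescapes the doubled backslash to a single one, and the
    bytea hex input format then parses the remaining "\\x<hex digits>".
    """
    # "\\\\x" is two backslash characters followed by "x".
    return "\\\\x" + binascii.hexlify(data).decode("ascii")


payload = b"\x00\x30\x40\x50"
copy_buffer = StringIO(bytea_copy_field(payload) + "\n")

# With an open cursor (as in the question), this would be:
# psycopg2_cursor.copy_from(copy_buffer, "my_table")
```

The buffer then holds the same kind of line that was typed at the psql prompt, with the whole value written as one hex string rather than one escape per byte.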

6 Comments

I added the following parameter to my psycopg2 connection: options="-c standard_conforming_strings=on", but I still get the same error. I tried StringIO('\x30\x40\x50') and it works correctly, but I really need to insert this null byte. I also tried a BytesIO instead of a StringIO, with the same result.
try that? select decode('00203040','hex') - is it producing your wanted result?..
Yes, it produces the wanted result when using it in PSQL (INSERT INTO my_table VALUES(decode('001060', 'hex'));), but I need to insert my bytes with the copy_from() method from my python code, not directly with a SQL query.
last attempt: psycopg2_cursor.copy_from( StringIO("\\x00\\x30\\x40\\x50"), "my_table", ) ?..
Still the same (invalid byte sequence for encoding "UTF8": 0x00). The good news is that I can insert bytes from my Python code when using cursor.execute("INSERT INTO my_table VALUES(decode('005566', 'hex'));"), but for performance reasons (details on my GitHub: github.com/jean553/massive-insert-postgresql-tornado), I would prefer to insert with copy_from().

To insert binary data with COPY you would need to use the binary format, which is probably not what you want. Use the psycopg2.extras.execute_values function instead:

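import psycopg2.extras
from psycopg2.extensions import Binary

# Wrap each value in Binary so it is adapted as bytea; bytes literals
# work on both Python 2 and Python 3.
binaries = [[Binary(b'\x00\x20')], [Binary(b'\x00\x30')]]

insert_query = 'insert into t (b) values %s'
psycopg2.extras.execute_values(
    cursor, insert_query, binaries, page_size=100
)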
