4

I have data in CSV file I am trying to insert into a postgresSQL table using pgloader. The input file is from MS SQL server export, and NULL values are already explicitly cast as NULL.

My pgloader scripts seems to fail for the keywords NULL, noticeably for integer and timestamp fields.

I really don't know what I am missing. Your help will be much appreciated.

I can successfully insert into the table from psql console:

insert into raw.a2
(NUM , F_FILENO , F_FOLIONO , F_DOC_TYPE , F_DOCDATE , F_BATCH , F_BOX , F_BLUCPY , F_ROUTOPOST , F_ROUTOUSR , F_WFCREATE , LINKEDFILE , DATECREATE , USERCREATE , DATEUPDATE , USERUPDATE , MEDIA , PGCOUNT , GROUPNUM , SUBJECT , PRI , F_FILECAT)
values
(
16,'18',3,'Nomination Details',NULL,NULL,NULL,1,NULL,NULL,1,'00000016.TIF','2011-02-08 13:02:11.000','isaac','2012-01-12 08:52:31.000','henrey','Multi',4,1.0,0,'-',NULL
);

INSERT 0 1

file-sample

1,'6',1,'Details',2011-02-22 00:00:00.000,NULL,NULL,1,NULL,NULL,2,'00000001.TIF',2011-02-08 09:42:24.000,'kevin',2011-10-27 09:08:42.000,'james','Multi',1,1.0,0,'-',NULL
2,'6',2,'Bio data',NULL,NULL,NULL,1,NULL,NULL,2,'00000002.TIF',2011-02-08 10:25:11.000,'kevin',2012-11-19 16:20:49.000,'pattie','Multi',4,1.0,0,'-',NULL
4,'10',1,'Details',2011-02-22 00:00:00.000,NULL,NULL,1,NULL,NULL,2,'00000004.TIF',2011-02-08 10:43:38.000,'kevin',2014-07-18 10:46:06.000,'brian','Multi',1,1.0,0,'-',NULL

pgloader commands

pgloader --type csv --with truncate --with "fields optionally enclosed by '''" --with "fields terminated by ','" --set "search_path to 'raw'" - "postgresql://postgres:postgres@localhost/doc_db?a2" < null_test

Table

                               Table "raw.a2"
   Column    |            Type             | Collation | Nullable | Default 
-------------+-----------------------------+-----------+----------+---------
 num         | integer                     |           | not null | 
 f_fileno    | character varying(15)       |           |          | 
 f_foliono   | integer                     |           |          | 
 f_doc_type  | character varying(50)       |           |          | 
 f_docdate   | timestamp without time zone |           |          | 
 f_batch     | integer                     |           |          | 
 f_box       | integer                     |           |          | 
 f_blucpy    | integer                     |           |          | 
 f_routopost | integer                     |           |          | 
 f_routousr  | character varying(49)       |           |          | 
 f_wfcreate  | integer                     |           |          | 
 linkedfile  | character varying(255)      |           |          | 
 datecreate  | timestamp without time zone |           |          | 
 usercreate  | character varying(50)       |           |          | 
 dateupdate  | timestamp without time zone |           |          | 
 userupdate  | character varying(50)       |           |          | 
 media       | character varying(5)        |           |          | 
 pgcount     | smallint                    |           |          | 
 groupnum    | double precision            |           |          | 
 subject     | smallint                    |           |          | 
 pri         | character varying(1)        |           |          | 
 f_filecat   | character varying(50)       |           |          | 

Indexes:

"a2_pkey" PRIMARY KEY, btree (num)

Output/Error

2019-07-24T05:55:24.231000Z WARNING Target table "\"raw\".\"a2\"" has 1 indexes defined against it.
2019-07-24T05:55:24.237000Z WARNING That could impact loading performance badly.
2019-07-24T05:55:24.237000Z WARNING Consider the option 'drop indexes'.
2019-07-24T05:55:24.460000Z ERROR PostgreSQL ["\"raw\".\"a2\""] Database error 22P02: invalid input syntax for integer: "NULL"
CONTEXT: COPY a2, line 1, column f_batch: "NULL"
2019-07-24T05:55:24.461000Z ERROR PostgreSQL ["\"raw\".\"a2\""] Database error 22007: invalid input syntax for type timestamp: "NULL"
CONTEXT: COPY a2, line 1, column f_docdate: "NULL"
2
  • Try to replace the word NULL in the file with empty string (without the quotes), for example: 1,'6',1,'Details',2011-02-22 00:00:00.000,,,1,,,2,'00000001.TIF',... Commented Jul 24, 2019 at 6:33
  • 1
    This has worked. Thank you very much. I will now just need to update my SQL server export script to spew out "" in place of the text NULL for NULL values. Commented Jul 24, 2019 at 7:41

1 Answer 1

1

From the docs of pgloader:

null if

This option takes an argument which is either the keyword blanks or a double-quoted string.

When blanks is used and the field value that is read contains only space characters, then it’s automatically converted to an SQL NULL value.

When a double-quoted string is used and that string is read as the field value, then the field value is automatically converted to an SQL NULL value.

Looks like you're missing --with 'null if "NULL"' in your command.

Otherwise, you should be able to load the CSV data directly from psql:

\copy raw.a2 (NUM, F_FILENO, F_FOLIONO, F_DOC_TYPE, F_DOCDATE, F_BATCH, F_BOX, F_BLUCPY, F_ROUTOPOST, F_ROUTOUSR, F_WFCREATE, LINKEDFILE, DATECREATE, USERCREATE, DATEUPDATE, USERUPDATE, MEDIA, PGCOUNT, GROUPNUM, SUBJECT, PRI, F_FILECAT) FROM 'file.csv' WITH (FORMAT csv, NULL 'NULL')
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks a lot. I tried that, but it either seems to a directive pgloader doesn't understand, or I have to provide more parameters with that directive. I may just have to replace the text "NULL" in my input string with "" for now.
You might what to try just leaving the null fields out like ( from your 1st example line) "1,'6',1,'Details',2011-02-22 00:00:00.000,,,1,,,2,'00000001.TIF',..." Changing to "" may cause it to generate a 0 length string. NULL and 0-length string are not the same in Postgres.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.