10

I'm looking for an awk command, or similar tool, to transform standard well formatted SQL insert script into csv file.
By standard I mean there is no database vendor specific stuff anywhere.
By well formatted I mean the case where each line of the sql script has a full column set to insert, even if there are NULLs. Also the order of fields to insert is the same.
Sample input SQL script:

INSERT INTO tbl VALUES (1, 'asd', 923123123, 'zx');
INSERT INTO tbl VALUES (1, NULL, 923123123, 'zxz');
INSERT INTO tbl VALUES (3, 'asd3', 923123123, NULL);

Optionally:

INSERT INTO tbl (colA, colB, colC, colD) VALUES (1, 'asd', 923123123, 'zx');

Expected output should be a csv file:

1,'asd',923123123,'zx'
1,,923123123,'zxz'
3,'asd3',923123123,

Looking for a performance efficient solution.

2
  • 1
    csv is practically evil. if text data never contains a single-quote (aka tick = ') then i guess its ok. Other test cases to add include which would simulate common name and address data: INSERT INTO tbl VALUES (1, NULL, 923123123, 'foo '' bar'); INSERT INTO tbl VALUES (1, NULL, 923123123, 'foo , bar'); INSERT INTO tbl VALUES (1, NULL, 923123123, 'foo ,'', bar'); Commented Jun 30, 2020 at 20:31
  • csv is most widely used data exchange format, if it is an evil, it is still something we have to comprehend. If you have decent tool that extracts data to csv, then you will be good. Commented Jun 30, 2020 at 20:45

4 Answers 4

9
+100
$ awk -F' *[(),]+ *' -v OFS=, '{for (i=2;i<NF;i++) printf "%s%s", ($i=="NULL"?"":$i), (i<(NF-1)?OFS:ORS)}' file
1,'asd',923123123,'zx'
1,,923123123,'zxz'
3,'asd3',923123123,

I'd recommend you test all potential solutions with this input:

$ cat file
INSERT INTO tbl VALUES (1, NULL, 923123123, 'foo NULL bar');

$ awk -F' *[(),]+ *' -v OFS=, '{for (i=2;i<NF;i++) printf "%s%s", ($i=="NULL"?"":$i), (i<(NF-1)?OFS:ORS)}' file
1,,923123123,'foo NULL bar'

to make sure the string NULL and blank chars are not deleted when they appear as part of a literal string.

Sign up to request clarification or add additional context in comments.

1 Comment

FPAT may be of interest as well: stackoverflow.com/questions/45420535/… Just noticed you both asked and answered that which is awesome.
6

Try this with GNU grep and sed:

grep -oP '\(\K[^)]*(?=\);)' file | sed 's/NULL//g;s/ //g'

Output from all four lines:

1,'asd',923123123,'zx'
1,,923123123,'zxz'
3,'asd3',923123123,
1,'asd',923123123,'zx'

or only with GNU sed:

sed 's/.*(\([^)]*\));/\1/;s/NULL//g;s/ //g' file

Output from all four lines:

1,'asd',923123123,'zx'
1,,923123123,'zxz'
3,'asd3',923123123,
1,'asd',923123123,'zx'

1 Comment

Neither of those check for the word NULL being a field of it's own rather than part of a field so it will delete NULL within literal strings, and also it deletes all spaces including those within literal strings.
2
awk -F'[()]' -v OFS=, '{gsub(/NULL|;/,"")}{gsub(/, /,",")}{print $(NF -1)}' file
1,'asd',923123123,'zx'
1,,923123123,'zxz'
3,'asd3',923123123,

Comments

0

This Python helper: https://github.com/jamesmishra/mysqldump-to-csv may be of interest. It may be a bit hacky, and it is a bit encoding broken, but it is a start, and handle extremely long INSERT lines as produced by mysqldump without blowing up.

https://stackoverflow.com/a/28168617/895245 mentions it, sample usage:

wget https://raw.githubusercontent.com/jamesmishra/mysqldump-to-csv/master/mysqldump_to_csv.py
mysqldump database table | python mysqldump_to_csv.py > table.csv

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.