I have several CSV files I'd like to combine by matching column headers but still keep the unmatched columns, for example:

Input file1.csv:

col1,col2,col3,col5
a,b,c,d
d,e,b,g
c,a,d,h

Input file2.csv:

col1,col3,col4,col5
g,d,b,c
o,e,x,h
b,n,w,e

Desired output:

col1,col2,col3,col4,col5
a,b,c,,d
d,e,b,,g
c,a,d,,h
g,,d,b,c
o,,e,x,h
b,,n,w,e

  • I assume you mean CSV files! CVS is a version control system Commented Jan 10, 2023 at 11:54
  • Maybe have a look at something like github.com/BurntSushi/xsv for working with CSV files directly. Either that, or import into an RDBMS like Postgres, or use something like Pandas in Python. Commented Jan 10, 2023 at 11:57

1 Answer

I would use Miller (available for several operating systems):

mlr --csv unsparsify file1.csv file2.csv
col1,col2,col3,col5,col4
a,b,c,d,
d,e,b,g,
c,a,d,h,
g,,d,c,b
o,,e,h,x
b,,n,e,w

Remark: the columns are output in the order in which they first appear, which is why col4 ends up last here. If need be, you can specify a custom ordering, but you'll need to know the column names in advance.
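If Miller is not available, the same idea (take the union of all headers, then pad the missing cells with empty strings) can be sketched with Python's standard `csv` module. This is only an illustration, not what Miller does internally: the function name `merge_csv_by_header` and the `sort_columns` flag are made up for this example, and the sample data is inlined from the question rather than read from disk. Sorting the column names alphabetically happens to give the col1..col5 order asked for in the question.

```python
import csv
import io


def merge_csv_by_header(sources, sort_columns=True):
    """Merge CSV sources on their headers, keeping unmatched columns.

    `sources` is an iterable of open text files (or file-like objects).
    Hypothetical helper written for this example only.
    """
    columns = []  # column names in order of first appearance
    rows = []
    for source in sources:
        reader = csv.DictReader(source)
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)
        rows.extend(reader)

    if sort_columns:
        # Alphabetical order; pass sort_columns=False to keep first-seen order.
        columns = sorted(columns)

    out = io.StringIO()
    # restval="" fills in cells for columns a given row does not have.
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Sample data copied from the question, inlined so the sketch is self-contained.
file1 = io.StringIO("col1,col2,col3,col5\na,b,c,d\nd,e,b,g\nc,a,d,h\n")
file2 = io.StringIO("col1,col3,col4,col5\ng,d,b,c\no,e,x,h\nb,n,w,e\n")

merged = merge_csv_by_header([file1, file2])
print(merged, end="")
```

With real files, you would pass objects opened with `open(path, newline="")` instead of the `io.StringIO` stand-ins. Note that all rows are held in memory, so this is meant for modestly sized inputs.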

1 Comment

Excellent! Worked exactly as intended, many thanks!
