2

I would like to merge four .txt files into a single file. However, this is not a simple concatenation but an interleaving of the input files: file1 supplies the first three columns, and the columns of files 2-4 must be pasted one at a time, in turn. Thus we have:

file1:

file1 <- '  AX-1   1    125    
            AX-2   2    456
            AX-3   3    3445'
file1 <- read.table(text=file1, header=F)
write.table(file1, "file1.txt", col.names=F, row.names=F, quote=F) 

file2:

file2 <- '  AX-1   AA  AB  AA    
            AX-2   AA  AA  AB
            AX-3   BB  NA  AB'
file2 <- read.table(text=file2, header=F)
write.table(file2, "file2.txt", col.names=F, row.names=F, quote=F)

file3:

file3 <- '  AX-1   0.20  -0.89  0.005    
            AX-2   0  -0.56  -0.003
            AX-3   1.2  0.002  0.005'
file3 <- read.table(text=file3, header=F)
write.table(file3, "file3.txt", col.names=F, row.names=F, quote=F)

file4:

file4 <- '  AX-1   1  0  0.56    
            AX-2   0  0.56  0
            AX-3   1  0  0.55'
file4 <- read.table(text=file4, header=F)
write.table(file4, "file4.txt", col.names=F, row.names=F, quote=F)

My expected output file would be something like:

out <- 'AX-1   1    125  AA  0.2  1 AB -0.89 0 AA 0.005 0.56
        AX-2   2    456  AA  0   0 AA -0.56 0.56 AB -0.003 0
        AX-3   3    3445  BB  1.2  1 NA  0.002 0 AB 0.005 0.55'
out <- read.table(text=out, header=F)
write.table(out, "out.txt", col.names=F, row.names=F, quote=F)

Thus, in out: columns 1-3 come from file1; columns 4, 7 and 10 come from file2; columns 5, 8 and 11 come from file3; and columns 6, 9 and 12 come from file4.

I have an idea of how to do it in R, but my original files are very large and it would take a long time. I would be grateful if someone has an idea of how to do this directly in bash.

1
  • Perhaps explain what the code does, for those of us who know Bash but not R, Commented Jan 29, 2016 at 9:52

2 Answers

3

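This should work, provided every file is sorted on its first column, as join requires (a1 to a4 stand for the four input files):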

$ join a1 a2 | join - a3 | join - a4 | awk '{printf "%s %s %s %s %s %s %s %s %s %s %s %s\n", $1, $2, $3, $4, $7, $10, $5, $8, $11, $6, $9, $12}'
AX-1 1 125 AA 0.20 1 AB -0.89 0 AA 0.005 0.56
AX-2 2 456 AA 0 0 AA -0.56 0.56 AB -0.003 0
AX-3 3 3445 BB 1.2 1 NA 0.002 0 AB 0.005 0.55
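
With more than three data columns per file, listing every field in the printf above becomes unwieldy. The following is a minimal sketch of the same join pipeline with the field order generated by a loop. It assumes that files 2-4 all have the same number of data columns (`n`, here 3), that file1 contributes `fixed` fields (key included, here 3), and that every file is sorted on the key. The scratch directory `$tmp` and the sample data are only there to make the snippet self-contained; the file names follow the question.

```shell
#!/usr/bin/env bash
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'AX-1 1 125' 'AX-2 2 456' 'AX-3 3 3445' > "$tmp/file1.txt"
printf '%s\n' 'AX-1 AA AB AA' 'AX-2 AA AA AB' 'AX-3 BB NA AB' > "$tmp/file2.txt"
printf '%s\n' 'AX-1 0.20 -0.89 0.005' 'AX-2 0 -0.56 -0.003' 'AX-3 1.2 0.002 0.005' > "$tmp/file3.txt"
printf '%s\n' 'AX-1 1 0 0.56' 'AX-2 0 0.56 0' 'AX-3 1 0 0.55' > "$tmp/file4.txt"

# After the three joins each line holds: the fields of file1, then the
# n data columns of file2, then those of file3, then those of file4.
join "$tmp/file1.txt" "$tmp/file2.txt" |
join - "$tmp/file3.txt" |
join - "$tmp/file4.txt" |
awk -v n=3 -v fixed=3 '{
    line = $1
    for (i = 2; i <= fixed; i++)      # fields that come from file1
        line = line " " $i
    for (c = 1; c <= n; c++)          # data column index within each file
        for (f = 0; f < 3; f++)       # f = 0, 1, 2 selects file2, file3, file4
            line = line " " $(fixed + f * n + c)
    print line
}' > "$tmp/out.txt"

cat "$tmp/out.txt"
```

Only `n` and `fixed` need to change for wider files; the number of interleaved files (3) is hard-coded in the inner loop.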

1

Try this:

 paste file1 file2 file3 file4 | awk '{ print $1 " " $2 " " $3 " " $5 " " $9 " " $13 " " $6 " " $10 " " $14 " " $7 " " $11 " " $15 }'

This works only if the rows are in the same order in every file; otherwise the join approach suggested by Mauro is the better choice.
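
Because paste glues line k of every file together without looking at the keys, it may be worth checking the row order before relying on it. A rough sketch, assuming the layout from the question (three fields in file1, four in each of the others, so the keys land in fields 1, 4, 8 and 12 of the pasted line); `keys_aligned` is a hypothetical helper name and `$tmp` a scratch directory used only to keep the example self-contained:

```shell
#!/usr/bin/env bash
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'AX-1 1 125' 'AX-2 2 456' 'AX-3 3 3445' > "$tmp/file1.txt"
printf '%s\n' 'AX-1 AA AB AA' 'AX-2 AA AA AB' 'AX-3 BB NA AB' > "$tmp/file2.txt"
printf '%s\n' 'AX-1 0.20 -0.89 0.005' 'AX-2 0 -0.56 -0.003' 'AX-3 1.2 0.002 0.005' > "$tmp/file3.txt"
printf '%s\n' 'AX-1 1 0 0.56' 'AX-2 0 0.56 0' 'AX-3 1 0 0.55' > "$tmp/file4.txt"

# keys_aligned (hypothetical helper): succeeds only if the key column
# matches on every line of the four files passed as arguments.
keys_aligned() {
    paste "$1" "$2" "$3" "$4" |
    awk '$1 != $4 || $1 != $8 || $1 != $12 { bad = 1 }
         END { exit (bad ? 1 : 0) }'
}

if keys_aligned "$tmp/file1.txt" "$tmp/file2.txt" "$tmp/file3.txt" "$tmp/file4.txt"
then
    rows_aligned=yes   # safe to use the paste approach
else
    rows_aligned=no    # sort the files and use join instead
fi
echo "rows aligned: $rows_aligned"
```

The check reads each file once, so it adds little cost compared with sorting the files for join.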
