When assigning by reference with a data.table using a column from a second data.table, the results are inconsistent. When there are no matches by the key columns of both data.tables, it appears the assigment expression y := y is totally ignored - not even NAs are returned.
library(data.table)
dt1 <- data.table(id = 1:2, x = 3:4, key = "id")
dt2 <- data.table(id = 3:4, y = 5:6, key = "id")
print(dt1[dt2, y := y])
## id x # Would have also expected column: y
## 1: 1 3 # NA
## 2: 2 4 # NA
However, when there is a partial match, non-matching columns have a placeholder NA.
dt2[, id := 2:3]
print(dt1[dt2, y := y])
## id x y
## 1: 1 3 NA # <-- placeholder NA here
## 2: 2 4 5
This wreaks havoc on later code that assumes a y column exists in all cases. Otherwise I keep having to write cumbersome additional checks to take into account both cases.
Is there an elegant way around this inconsistency?
ycolumn first...dt1[, y:=NA_integer_];dt1[dt2, y:=y][]yisn't in the joined table, I notice I made a mistake in my code more quickly. But yes what you suggested is probably better as it's more robust