What is happening here is that you are joining on a float which cannot be represented exactly in binary, and whose value is altered by the round trip through csv. Here's a simple example:
```r
library(data.table)
dat1 <- data.table(x = 1/3, label = "one third")
dat2 <- data.table(x = 1/3)
```
It's possible (though inadvisable - see below) to merge these:
```r
merge(dat1, dat2)
# Key: <x>
#            x     label
#        <num>    <char>
# 1: 0.3333333 one third
```
However, let's write dat1 to a csv and read it back in. Floating-point numbers are stored in binary with finite precision, so fractions whose denominator is not a power of 2 (like 1/3) cannot be represented exactly. When written to a csv, such values are rounded to a limited number of decimal digits, and reading them back introduces tiny errors.
```r
fwrite(dat1, "./dat.csv")
dat1_fread <- fread("./dat.csv")
all.equal(dat1, dat1_fread)
# TRUE
identical(dat1, dat1_fread)
# FALSE
```
You can see that dat1 and dat1_fread are all.equal(), since all.equal() is a utility that tests R objects for 'near equality'. However, they are not identical(), and joins require exact matches. So joining on x returns no results:
```r
merge(dat1_fread, dat2)
# Empty data.table (0 rows and 2 cols): x,label
```
You can see the issue if you print the values with high enough precision:
```r
sprintf("%.20f", dat1$x)
# [1] "0.33333333333333331483"
sprintf("%.20f", dat2$x)
# [1] "0.33333333333333331483"
sprintf("%.20f", dat1_fread$x)
# [1] "0.33333333333333298176"
```
Interestingly, the text stored in dat.csv is 0.333333333333333, which is not quite the value you get when it is read back in. When fwrite() saves the csv, it rounds 1/3 to 15 significant digits. When fread() reads the csv, it sees a number which cannot be represented exactly in binary, so it stores the closest representable double, 0.33333333333333298176.... This is a fundamental limitation of floating-point arithmetic, not an R issue.
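You can confirm what fread() actually has to work with by inspecting the raw text of the csv (a quick check, reusing the file written above):

```r
library(data.table)

# write the table and look at the literal text fwrite() produced
fwrite(data.table(x = 1/3, label = "one third"), "./dat.csv")
readLines("./dat.csv")
# the numeric field holds only 15 significant digits: "0.333333333333333,one third"
```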
## How to resolve this
The best plan is not to join on a float at all. If you must, you can round the key before joining, provided you can afford to lose some precision (in practice you can, since you already can't rely on a csv round trip preserving it). If you don't want to round, don't save your file to csv; serialize to a binary format instead. For example, rds:
```r
saveRDS(dat1, "./dat.rds")
dat1_rds <- readRDS("./dat.rds")
merge(dat1_rds, dat2)
# Key: <x>
#            x     label
#        <num>    <char>
# 1: 0.3333333 one third
```
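If you do choose the rounding route instead, the idea looks like this (a sketch, not from the original answer; 10 decimal places is an arbitrary choice, just comfortably fewer digits than the csv preserves):

```r
library(data.table)

dat1 <- data.table(x = 1/3, label = "one third")
dat2 <- data.table(x = 1/3)
fwrite(dat1, "./dat.csv")
dat1_fread <- fread("./dat.csv")

# round both keys to a common, coarser precision before joining
dat1_fread[, x := round(x, 10)]
dat2[, x := round(x, 10)]
merge(dat1_fread, dat2)  # the row now matches
```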
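Another option, if you need to keep the tables exactly as they are, is data.table's rolling join with roll = "nearest", which matches each key to the closest value rather than requiring exact equality (a sketch, not part of the original answer):

```r
library(data.table)

dat1 <- data.table(x = 1/3, label = "one third")
dat2 <- data.table(x = 1/3)
fwrite(dat1, "./dat.csv")
dat1_fread <- fread("./dat.csv")

# match each x in dat2 to the nearest x in dat1_fread
setkey(dat1_fread, x)
dat1_fread[dat2, roll = "nearest"]
```

Note that roll = "nearest" always finds a match, however far away, so it only makes sense when every key genuinely has a counterpart in the other table.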