join expects its inputs to be sorted on their respective join fields, not on the whole line. The sort must be lexical and use the same collation order as the one join will use to compare fields, so the same locale, at least in the LC_CTYPE and LC_COLLATE categories.
For sort, by default, fields are delimited by the transition from a non-blank to a blank. It's the same for join, except that leading blanks are ignored, as when sort is called with -b. With both sort and join, one can instead specify a single-character¹ field separator with the -t option.
As the POSIX specification of the join utility puts it:
The files file1 and file2 shall be ordered in the collating sequence of sort -b on the fields on which they shall be joined, by default the first in each line. All selected output shall be written in the same collating sequence.
-t char
Use character char as a separator, for both input and output. Every appearance of char in a line shall be significant. When this option is specified, the collating sequence shall be the same as sort without the -b option.
So, if joining file1 and file2 (where fields are blank-separated) on the first field, you need:
join <(sort -bk1,1 file1) <(sort -bk1,1 file2)
(here assuming a shell with support for ksh-style process substitution such as ksh, zsh or bash)
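To see why -b matters for blank-separated input, here is a minimal sketch on made-up sample lines (the data is hypothetical, and the C locale is forced so the byte order is predictable). Without -b, the leading blanks are part of the first field and a space sorts before any letter. With -b, the order matches what join expects:

```shell
# Hypothetical sample: the first line has leading blanks before its key.
input=$(printf '  b 1\na 2')

# Without -b, field 1 of the first line is "  b", and space sorts before "a".
without_b=$(printf '%s\n' "$input" | LC_ALL=C sort -k1,1)

# With -b, leading blanks are skipped, so the keys compared are "b" and "a".
with_b=$(printf '%s\n' "$input" | LC_ALL=C sort -b -k1,1)

printf 'without -b:\n%s\n' "$without_b"
printf 'with -b:\n%s\n' "$with_b"
```

Only the second ordering (a before b) is the one join assumes when no -t is given.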
And if the fields are TAB-separated:
join -t $'\t' <(sort -t $'\t' -k1,1 file1) <(sort -t $'\t' -k1,1 file2)
Now, en_US.UTF-8 is probably not the best choice of locale: it will give non-deterministic outcomes if the input contains sequences of bytes that can't be decoded as UTF-8, the decoding and the complex en_US collation order are costly to process, and at least on GNU systems, those human collation orders have characters that sort the same, so they can give non-deterministic outcomes even on valid UTF-8 encoded text.
If you don't care about the actual order the join keys in the files are sorted in as long as the files are joined, using the C/POSIX locale would be much more efficient and reliable as it's a single-byte locale with no decoding taking place, and the comparison function is just a byte-to-byte comparison.
LC_ALL=C join -t $'\t' <(LC_ALL=C sort -t $'\t' -k1,1 file1) \
<(LC_ALL=C sort -t $'\t' -k1,1 file2)
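As a quick end-to-end illustration of the C-locale approach, the sketch below builds two small TAB-separated files (the contents and the temporary file names are made up), sorts them on the first field, verifies the order with the standard sort -c, and joins them. It avoids process substitution and $'\t' so it should also run in a plain sh:

```shell
# Hypothetical sample data in a temporary directory.
tmp=$(mktemp -d)
tab=$(printf '\t')
printf 'b\t2\na\t1\nc\t3\n' > "$tmp/file1"
printf 'c\tz\na\tx\n' > "$tmp/file2"

# Sort both inputs on the first TAB-separated field, byte-wise.
LC_ALL=C sort -t "$tab" -k1,1 "$tmp/file1" > "$tmp/sorted1"
LC_ALL=C sort -t "$tab" -k1,1 "$tmp/file2" > "$tmp/sorted2"

# sort -c exits non-zero if the input is not already sorted on that key.
LC_ALL=C sort -c -t "$tab" -k1,1 "$tmp/sorted1" || echo 'sorted1 is not sorted' >&2
LC_ALL=C sort -c -t "$tab" -k1,1 "$tmp/sorted2" || echo 'sorted2 is not sorted' >&2

# Join in the same locale and with the same separator as the sort.
joined=$(LC_ALL=C join -t "$tab" "$tmp/sorted1" "$tmp/sorted2")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

Only keys a and c appear in both files, so only those two lines are output, each with the value from file1 followed by the value from file2.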
Now beware that tab and newline are as valid as any other character in a file or directory name, and neither sort nor join has any provision to escape the field separator or record delimiter.
The GNU implementations of sort and join have a -z option to process NUL-delimited records instead of newline-delimited ones, which helps since NUL cannot occur in a file name, but that won't help for TAB.
So to process arbitrary file paths, your options are either to encode those TAB/NL one way or another, such as \t, \n (and \\ for backslash), or with URI encoding (%09, %0A and %25 for %).
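The backslash encoding could be sketched as below. encode_path is a hypothetical helper name, not an existing tool. It walks the string one character at a time using only POSIX parameter expansion, which is slow on long strings but avoids relying on any particular sed or awk escaping rules:

```shell
# Hypothetical helper: print $1 with backslash, TAB and newline encoded
# as \\, \t and \n. The body runs in a subshell so variables stay local.
encode_path() (
  tab=$(printf '\t')
  nl='
'
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    case $c in
      '\')    out="$out\\\\" ;;
      "$tab") out="$out\\t" ;;
      "$nl")  out="$out\\n" ;;
      *)      out="$out$c" ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)

# Example use with a made-up directory name containing a TAB.
encode_path "$(printf 'directory with\ttab')"
```

Each backslash in the input is doubled, so the encoding stays unambiguous and can be reversed before the paths are used again.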
Or use the more advanced forms of TSV as recognised by mlr for instance where fields with newlines and/or tabs are quoted like in CSVs. But you can't process those with sort/join, though you could use mlr instead which would also have the benefit of allowing you to use headers and not require the inputs to be sorted (though its join supports a -s for sorted input which helps for sorted files that can't fit in memory).
~$ sed -n l file1
dir\tvalue$
directory\t0.000106811523$
directory_1\t1.059814453265$
directory_123\t0.564987182688$
directory_123123\t0.564987182688$
"directory$
with newline"\t0.123$
"directory with\ttab"\t0.456$
~$ mlr --tsv join -j dir --lp file1. --rp file2. -f file1 file1
dir file1.value file2.value
directory 0.000106811523 0.000106811523
directory_1 1.059814453265 1.059814453265
directory_123 0.564987182688 0.564987182688
directory_123123 0.564987182688 0.564987182688
"directory
with newline" 0.123 0.123
"directory with tab" 0.456 0.456
¹ Beware that with many sort implementations, including current versions of GNU sort, that character has to be single-byte, which is the case of TAB.