Table Schema
For the two tables, the CREATE queries are given below:
Table1: (file_path_key, dir_path_key)
create table Table1(file_path_key varchar(500), dir_path_key varchar(500), primary key(file_path_key)) engine = innodb;
Example: file_path_key = /home/playstation/a.txt
dir_path_key = /home/playstation/
Table2: (file_path_key, hash_key)
create table Table2(file_path_key varchar(500) not null, hash_key bigint(20) not null, foreign key (file_path_key) references Table1(file_path_key) on update cascade on delete cascade) engine = innodb;
Objective:
Given a hash value *H* and a directory string *D*, I need to find all the
entries in Table2 whose hash equals *H*, such that the corresponding file entry
in Table1 doesn't have *D* as its directory.
In this particular case, Table1 has around 40,000 entries and Table2 has 5,000,000 entries, which makes my current query really slow.
select distinct s1.file_path_key from Table1 as s1 join (select * from Table2 where hash_key = H) as s2 on s1.file_path_key = s2.file_path_key and s1.dir_path_key != D;
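One thing that may be worth trying, as a sketch rather than a measured fix: the schema above declares no index on hash_key, so the filter on *H* probably has to scan all 5,000,000 rows of Table2. Adding a secondary index and joining Table2 directly, instead of through a derived table, lets the optimizer look up the matching hashes first. The index name idx_table2_hash and the literal values 12345 and '/home/playstation/' are hypothetical stand-ins for *H* and *D*.

```sql
-- Assumption: hash_key currently has no index of its own.
-- idx_table2_hash is a made-up name; pick whatever fits your conventions.
create index idx_table2_hash on Table2(hash_key);

-- Same result set as the original query, written as a plain join so the
-- hash_key filter can use the new index before Table1 is consulted.
-- 12345 stands in for H, '/home/playstation/' for D.
select distinct t1.file_path_key
from Table2 as t2
join Table1 as t1
  on t1.file_path_key = t2.file_path_key
where t2.hash_key = 12345
  and t1.dir_path_key <> '/home/playstation/';
```

Running explain on both versions before and after creating the index should show whether Table2 is still scanned in full.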
file_path_key could be turned into just file (which would potentially reduce mismatches). Too bad you're not using an RDBMS that supports recursive CTEs - they work perfectly for folder structures.