Join 2 tables based on row-specific time frames (crossing order information) KDB+/Q

Question

I have two tables with order information, and am looking to find whether or not each order in one table has crossing potential with the other. Example tables:

t1:([]ric:`a`b`c;side:`buy`sell`buy;size:100 1000 400; startTime:09:31 09:40 09:40; endTime:09:33 09:45 09:41)

t2:([]ric:`a`b`c;side:`sell`buy`buy;size:90 1000 400; startTime:09:30 09:40 09:41; endTime:09:32 09:45 09:42)

I'm trying to find a way to iterate through each entry in t1 and query t2, and check if the side is opposite, and the time range overlaps, and if so then output the crossed size. Expected result for these table would be:

ric a can cross 90 shares, ric b can cross all 1000, and ric c cannot cross.

Is this better achieved with a window join using the time frames? What I was imagining as the most intuitive approach was to find a way to break t1 into a list of rows and pass each row into a function to query t2, but open to suggestions if that's inefficient.

EDIT: For more context, I'm essentially trying to come up with a faux-matching engine for these tables. Window join is starting to look like the better option than breaking one of the tables with an iterator, but I'm unfamiliar with using window join with 2 time ranges and filters for opposing buy/sell and size (although the filters could come after the fact if that data is retained after the join?)

rianoc · Accepted Answer · 2025-02-26 21:55:15Z

2

wj cannot be directly used as it finds events in windows, it does not capture overlapping windows.

An initial version might look like:

q)i:{$[0h~type x;x[;y];x[y]]}
q)f:{$[0h~type x;[x:flip x;$[1~count x;first x;x]];x]}
q)overlap:{if[0 in count each (x;y);:`boolean$()];max (i[x;0] within f y;i[x;1] within f y;i[y;0] within f x;i[y;1] within f x)}

q)raze {select from t2 where ric=x`ric,not side=x`side,overlap[x`startTime`endTime;flip (startTime;endTime)]} each t1
ric side size startTime endTime
-------------------------------
a   sell 90   09:30     09:32
b   buy  1000 09:40     09:45

You may want to prevent multiple rows in t1 matching the same row in t2 and only take each row from t2 a single time:

q)select from t2 where i in distinct raze {exec i from t2 where ric=x`ric,not side=x`side,overlap[x`startTime`endTime;flip (startTime;endTime)]} each t1
ric side size startTime endTime
-------------------------------
a   sell 90   09:30     09:32
b   buy  1000 09:40     09:45

If you want to have a row for each of t1 and it's corresponding matches:

{x[`matchSum]:exec sum size from t2 where ric=x`ric,not side=x`side,overlap[x`startTime`endTime;flip (startTime;endTime)];x} each update ind:i from t1
ric side size startTime endTime ind matchSum
--------------------------------------------
a   buy  100  09:31     09:33   0   190
b   sell 1000 09:40     09:45   1   1000
c   buy  400  09:40     09:41   2   0

edited Feb 26 at 21:55

answered Feb 26 at 9:58

rianoc

4,27812 silver badges14 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Cole Feb 26 at 14:33

Thanks for the answer! The latter point you have doesn't consider a row in t1 that matches to multiple rows in t2 though. Suppose

t2:([]ric:`a`b`c`d`a;side:`sell`buy`buy`sell`sell;size:90 1000 400 50 100; startTime:09:30 09:40 09:41 09:41 09:32; endTime:09:32 09:45 09:42 09:42 09:33)

the resulting output will have 2 entries for ric=a, though it's pulling both from the same singular order in t1

rianoc Feb 26 at 14:47

The examples returns matched rows from t2 so the 2 matches is as designed. You'll want to tweak anything like this in several ways to meet your exact requirements and data, it's just an example as a starting point based off the question. An extra example has been added to the answer which returns a row for each in t1 instead.

Cole Apr 8 at 21:01

Is there any way to make this more computationally efficient? My understanding is that it's iterating through each row of t1 and then looking in t2 for overlaps; if t1/t2 are much larger this balloons into quite an inefficient function

Collectives™ on Stack Overflow

Join 2 tables based on row-specific time frames (crossing order information) KDB+/Q

1 Answer 1

3 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

3 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related