DB - Query processing - Index Nested Loop Join

Question

I have a relation R with 10 blocks and a relation S with 1000 blocks. Attribute A has 50 unique values in R and 5000 unique values in S. Each block holds 100 records. Assume a uniform distribution of the different values in each relation. S has a clustering index on the join attribute A.

The question is: how many blocks of S store any of the records that participate in the join with R? I need to answer with the best and the worst case.

I thought that since R has 50 unique values for A and S has a clustering index on A, each unique value would take a minimum of 1 block and a maximum of 2, so the answer is 50 or 100.

But why can't I put 5 unique values in each block, so that the maximum number of blocks is 10?

Do you have a clustered index for attribute A on relation R as well?
– Johnny Graber, Commented Jan 10, 2012 at 17:23
No, I don't have a clustered index for attribute A on relation R.
– fgfjhgrjr erjhm, Commented Jan 10, 2012 at 17:40
I think the scenario is missing some information. My results currently range from 1 (min) to 951 (max) blocks that need to be read. Can you please edit your question to give the exact wording of the exercise?
– Johnny Graber, Commented Jan 10, 2012 at 18:12

Johnny Graber · Accepted Answer · 2012-01-11 16:58:38Z

As far as I understand this is the situation:

S has 1000 blocks with 100 records/block, which gives at most 100000 records. Among those 100000 records there are 5000 unique (different) values for attribute A.

Edit:

If they are all evenly distributed, every unique value for A has 20 rows in S (100000 rows / 5000 values). If all of the 50 unique values for A in R are present in S, then 50 groups of 20 rows are fetched.

In the best case they are all stored together (thanks to the clustered index) and you need to read 10 blocks: 50 values for A * 20 rows with the same value in S / 100 records per block = 10 blocks.

In the worst case the 20 rows for every value of A span 2 blocks. This would lead to 100 blocks you need to read from S.
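The arithmetic above can be checked with a short sketch. The function name `join_block_bounds` and its parameters are made up for illustration. It assumes a perfectly uniform distribution and that every value of A in R also occurs in S.

```python
import math


def join_block_bounds(s_blocks, records_per_block, s_distinct, r_distinct):
    """Return (rows_per_value, best_case_blocks, worst_case_blocks) for S.

    Assumes a uniform distribution and that every value of A in R
    also occurs in S (both are assumptions of this sketch).
    """
    total_rows = s_blocks * records_per_block      # 1000 * 100 = 100000
    rows_per_value = total_rows // s_distinct      # 100000 / 5000 = 20
    matching_rows = r_distinct * rows_per_value    # 50 * 20 = 1000

    # Best case: all matching rows are adjacent and aligned to block boundaries.
    best = math.ceil(matching_rows / records_per_block)

    # Worst case: the groups are far apart and each one straddles a boundary.
    # A contiguous run of n rows touches at most ceil((n - 1) / B) + 1 blocks.
    per_group = math.ceil((rows_per_value - 1) / records_per_block) + 1
    worst = min(r_distinct * per_group, s_blocks)

    return rows_per_value, best, worst


print(join_block_bounds(1000, 100, 5000, 50))
```

With the numbers from the question this gives 20 rows per value, 10 blocks in the best case and 100 in the worst case.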

To your second question:

Since you have a clustered index containing column A, all rows with the same value of A are stored together. They only use more than one block if they don't fit into one, or if the block was already partly filled by other values and the remaining space is too small for the whole group.
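A small simulation may make this easier to see. It is only a sketch: `blocks_touched` is a made-up helper, the values of A are modelled as the integers 0 to 4999, and `offset` is a hypothetical knob standing in for any misalignment between group boundaries and block boundaries. None of this is part of the original exercise.

```python
def blocks_touched(values, rows_per_value, records_per_block, offset=0):
    """Count the distinct blocks of S that hold rows for the given A values.

    S is modelled as one sorted run of rows: value v occupies
    rows_per_value consecutive positions starting at
    offset + v * rows_per_value (assumption: uniform distribution).
    """
    touched = set()
    for v in values:
        first = offset + v * rows_per_value
        last = first + rows_per_value - 1
        # Every block between the first and the last row of the group is read.
        touched.update(range(first // records_per_block,
                             last // records_per_block + 1))
    return len(touched)


adjacent = range(50)             # the 50 values of R are neighbours in S
spread = range(0, 5000, 100)     # the 50 values of R are far apart in S

print(blocks_touched(adjacent, 20, 100))           # 5 values share each block
print(blocks_touched(spread, 20, 100))             # one block per value
print(blocks_touched(spread, 20, 100, offset=90))  # every group straddles
```

Under these assumptions the three calls give 10, 50 and 100 blocks. Five values share a block only when the matching values happen to be neighbours in S, which is why 10 is the best case rather than the maximum.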

Attention: I may not have fully understood your initial question, and therefore my answer could be totally wrong!
