I have a relation R with 10 blocks and a relation S with 1000 blocks. Attribute A has 50 unique values in R and 5000 unique values in S. Each block holds 100 records. Note that we assume a uniform distribution of the different values in each relation. S has a clustering index on the join attribute A.

The question is: how many blocks of S store any of the records that participate in the join with R? I need to answer for both the best case and the worst case.

I thought that since R has 50 unique values for A and S has a clustering index on A, each unique value would take at least 1 block and at most 2, so the answer would be 50 or 100.

But why can't I put 5 unique values in each block, so that the maximum number of blocks is 10?

  • Do you have a clustered index for attribute A on relation R as well? Commented Jan 10, 2012 at 17:23
  • No, I don't have a clustered index for attribute A on relation R. Commented Jan 10, 2012 at 17:40
  • I think the scenario is missing some information. My results currently range from 1 (min) to 951 (max) blocks to read. Can you please edit your question to give the exact wording of the exercise? Commented Jan 10, 2012 at 18:12
  • Did you get an answer to the homework in the meantime? Commented Feb 13, 2012 at 22:19

1 Answer

As far as I understand this is the situation:

S has 1000 blocks with 100 records per block, which gives 100,000 records at most. Those 100,000 records contain 5000 unique (different) values for attribute A.

Edit:

If they are all evenly distributed, every unique value of A has 20 rows in S (100,000 rows / 5000 values). If all of the 50 unique values of A in R are present in S, then 50 groups of 20 rows are fetched.

In the best case they are all stored together (thanks to the clustered index) and you need to read 10 blocks: 50 values of A * 20 rows with the same value in S / 100 records per block = 10 blocks.

In the worst case the 20 rows for each value of A straddle a block boundary and use 2 blocks. This leads to 50 * 2 = 100 blocks you need to read from S.
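The arithmetic for both cases can be checked with a small sketch. The function name `join_block_bounds` and its parameters are made up for illustration. It assumes every value of A has exactly the same number of rows in S, that all matching values of R exist in S, and that in the worst case no two matching groups share a block.

```python
import math

def join_block_bounds(s_blocks, rows_per_block, s_distinct, r_distinct):
    """Return (rows_per_value, best_case_blocks, worst_case_blocks).

    Hypothetical helper: assumes a perfectly uniform distribution and
    that every distinct value of A in R also occurs in S.
    """
    total_rows = s_blocks * rows_per_block
    rows_per_value = total_rows // s_distinct

    # Best case: all matching groups are adjacent and start on a block boundary.
    matching_rows = r_distinct * rows_per_value
    best = math.ceil(matching_rows / rows_per_block)

    # Worst case: every group sits alone and starts as late as possible
    # in a block, so a run of n rows touches ceil((n - 1) / B) + 1 blocks.
    blocks_per_group = math.ceil((rows_per_value - 1) / rows_per_block) + 1
    worst = min(r_distinct * blocks_per_group, s_blocks)

    return rows_per_value, best, worst

rows_per_value, best, worst = join_block_bounds(1000, 100, 5000, 50)
print(rows_per_value, best, worst)
```

With the numbers from the question this gives 20 rows per value, 10 blocks in the best case and 100 blocks in the worst case.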

To your second question:

Since you have a clustered index on column A, all rows with the same value of A are stored together. They use more than one block only if they don't fit into a single block, or if the block was already partly filled by other values and has too little room left for them.
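A rough way to see when a group spills into a second block is to count the blocks that a contiguous run of rows touches. The helper `blocks_spanned` below is hypothetical, and it assumes the rows of S are laid out in index order with block `i` holding rows `i * rows_per_block` up to `(i + 1) * rows_per_block - 1`.

```python
def blocks_spanned(start_row, group_rows, rows_per_block):
    """Number of blocks touched by group_rows consecutive rows.

    start_row is the 0-based position of the group's first row in S
    (an assumed layout, not something stated in the question).
    """
    first_block = start_row // rows_per_block
    last_block = (start_row + group_rows - 1) // rows_per_block
    return last_block - first_block + 1

# Group starts at a block boundary: fits in one block.
aligned = blocks_spanned(0, 20, 100)
# Block already holds 90 rows of other values: the group spills over.
spilled = blocks_spanned(90, 20, 100)
print(aligned, spilled)
```

A group of 20 rows stays in one block as long as at least 20 free slots remain in the block where it starts. Otherwise it continues in the next block.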

Attention: I may not have fully understood your initial question, so my answer could be totally wrong!
