0

I have a (MySQL) table containing dates of the last scan of hosts combined with a report ID:

+--------------+---------------------+--------+
| host         | last_scan           | report |
+--------------+---------------------+--------+
| 112.86.115.0 | 2012-01-03 01:39:30 |      4 |
| 112.86.115.1 | 2012-01-03 01:39:30 |      4 |
| 112.86.115.2 | 2012-01-03 02:03:40 |      4 |
| 112.86.115.2 | 2012-01-03 04:33:47 |      5 |
| 112.86.115.1 | 2012-01-03 04:20:23 |      5 |
| 112.86.115.6 | 2012-01-03 04:20:23 |      5 |
| 112.86.115.2 | 2012-01-05 04:29:46 |      8 |
| 112.86.115.6 | 2012-01-05 04:17:35 |      8 |
| 112.86.115.5 | 2012-01-05 04:29:48 |      8 |
| 112.86.115.4 | 2012-01-05 04:17:37 |      8 |
+--------------+---------------------+--------+

I want to select a list of all hosts with the date of the last scan and the corresponding report id. I have built the following nested query, but I am sure it can be done in a single query:

SELECT rh.host, rh.report, rh.last_scan
FROM report_hosts rh
WHERE rh.report = (
    SELECT rh2.report
    FROM report_hosts rh2
    WHERE rh2.host = rh.host
    ORDER BY rh2.last_scan DESC
    LIMIT 1
)
GROUP BY rh.host

Is it possible to do this with a single, non-nested query?

2 Answers 2

3

No, but you can do a JOIN in your query

SELECT x.*
FROM report_hosts x
INNER JOIN (
    SELECT host,MAX(last_scan) AS last_scan FROM report_hosts GROUP BY host
) y ON x.host=y.host AND x.last_scan=y.last_scan

Your query is doing a filesort, which is very inefficient. My solutions doesn't. It's very advisable to create an index on this table

ALTER TABLE `report_hosts` ADD INDEX ( `host` , `last_scan` ) ;

Else your query will do a filesort twice.

Sign up to request clarification or add additional context in comments.

1 Comment

This seems reasonable. Thank's for the index!
0

If you want to select from the report_hosts table only once then you could use a sort of 'RANK OVER PARTITION' method (available in Oracle but not, sadly, in MySQL). Something like this should work:

select h.host,h.last_scan as most_recent_scan,h.report
from
(
select rh.*,
case when @curHost != rh.host then @rank := 1 else @rank := @rank+1 end as rank,
case when @curHost != rh.host then @curHost := rh.host end
from report_hosts rh
cross join (select @rank := null,@curHost = null) t
order by host asc,last_scan desc
) h
where h.rank = 1;

Granted it is still nested but it does avoid the 'double select' problem. Not sure if it will be more efficient or not - kinda depends what indexes you have and volume of data.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.