3

I was wondering how to properly extract the number of hours between two given timestamps.

For instance, when the following SQL query gets executed:

    select x, extract(HOUR FROM x) as result
    from
    (select (TIMESTAMP'2021-01-22T05:00:00' - TIMESTAMP'2021-01-01T09:00:00') as x)

The result is 20, while I'd expect it to be 500.

This seems odd to me, since the value of x itself shows the expected difference.

Can anyone please explain what I'm doing wrong, and perhaps suggest another way to write the query so that it returns the desired result?

Thanks in advance!

3 Answers

3

I think you have to do the math yourself here, as datediff in Spark SQL only works in days. This worked for me:

    SELECT (unix_timestamp(to_timestamp('2021-01-22T05:00:00'))
          - unix_timestamp(to_timestamp('2021-01-01T09:00:00'))) / 60 / 60 AS diffInHours

I ran this in a Synapse notebook rather than Databricks, but I expect the result to be the same.

The unix_timestamp function converts each timestamp to a Unix timestamp (in seconds), so you can do arithmetic on it. Subtracting one from the other gives the number of seconds between the two timestamps. Divide by 60 to get the number of minutes, and by 60 again to get the number of hours.
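To sanity-check that arithmetic outside Spark, here is a minimal sketch in plain Python (standard library only, not Spark SQL). The helper name `hours_between` is made up for illustration, and it assumes both timestamps are in UTC so that no daylight-saving shift affects the result.

```python
from datetime import datetime, timezone


def hours_between(start_iso: str, end_iso: str) -> float:
    """Mirror unix_timestamp(end) - unix_timestamp(start), then / 60 / 60."""
    # Assumption: treat both values as UTC, so the epoch seconds are unambiguous.
    start = datetime.fromisoformat(start_iso).replace(tzinfo=timezone.utc)
    end = datetime.fromisoformat(end_iso).replace(tzinfo=timezone.utc)
    # Seconds since the Unix epoch for each timestamp, then the difference.
    diff_seconds = end.timestamp() - start.timestamp()
    # Seconds -> minutes -> hours.
    return diff_seconds / 60 / 60


print(hours_between("2021-01-01T09:00:00", "2021-01-22T05:00:00"))  # 500.0
```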


2 Comments

Seems a bit long to me, but as long as it gets the job done it's fine by me :) Thank you very much!
Might be worth checking if there’s something new in Spark 3.

0

It's because extract only extracts the hour component of the interval. So in your case the difference is 20 days and 20 hours, so it gives you back 20 hours.

To add the date component you could do this:

    select x, extract(HOUR FROM x) + (extract(DAY FROM x) * 24) as result
    from
    (select (TIMESTAMP'2021-01-22T05:00:00' - TIMESTAMP'2021-01-01T09:00:00') as x)

Not sure if this is more convenient than the other answer, but it's a different approach to consider.
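As a quick check of the 20 versus 500 arithmetic, here is a small plain-Python sketch (not Spark SQL) that splits the same difference into a day part and an hour part. The helper name `interval_parts` is hypothetical, and the sketch assumes the end timestamp is not earlier than the start.

```python
from datetime import datetime


def interval_parts(start_iso: str, end_iso: str) -> tuple[int, int]:
    """Return (whole days, leftover whole hours) of end - start."""
    # Assumption: end >= start; negative differences are normalized differently.
    delta = datetime.fromisoformat(end_iso) - datetime.fromisoformat(start_iso)
    days = delta.days                    # the DAY field of the interval
    hour_field = delta.seconds // 3600   # the HOUR field (always below 24)
    return days, hour_field


days, hour_field = interval_parts("2021-01-01T09:00:00", "2021-01-22T05:00:00")
print(hour_field)              # 20, the hour component on its own
print(days * 24 + hour_field)  # 500, the total number of hours
```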


0

If you have two timestamps rather than an interval, you can use the timediff function.

Example usage (note that it does not round up: a difference of 27 hours 59 minutes returns 27):

    WITH Dates(t1, t2) AS
    (
        SELECT
            '2000-01-01 00:00:00'::timestamp,
            '2000-01-02 03:59:00'::timestamp
    )
    SELECT timediff(HOUR, t1, t2)
    FROM Dates
    -- 27

And if you are starting with just an interval, here is a workaround that uses some timestamp functions like unix_millis and timestamp_millis.

    DECLARE OR REPLACE VARIABLE var_interval INTERVAL DAY TO SECOND;
    SET VARIABLE var_interval = INTERVAL '1 00:30:36.000' DAY TO SECOND; -- 1 day, 30 minutes, 36 seconds

    SELECT unix_millis(timestamp_millis(0) + var_interval)/1000/60/60 as hours_in_interval
    -- 24.509999999999998

As far as I know, there is unfortunately no built-in function that returns the total length of an interval in a single unit such as hours.
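For comparison, the same interval-to-hours conversion can be sketched in plain Python (not Spark SQL), which makes it easy to confirm that 1 day, 30 minutes and 36 seconds comes out at roughly 24.51 hours. The helper name `interval_hours` is made up for this example.

```python
from datetime import timedelta


def interval_hours(days: int = 0, hours: int = 0,
                   minutes: int = 0, seconds: float = 0.0) -> float:
    """Total length of a day-to-second interval, expressed in hours."""
    interval = timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
    # total_seconds() folds every field into one number of seconds.
    return interval.total_seconds() / 3600


print(interval_hours(days=1, minutes=30, seconds=36))  # 24.51
```

The tiny difference from 24.509999999999998 comes from floating-point rounding: here the seconds are divided once by 3600, rather than by 1000, 60 and 60 in turn.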
