I have a table which has 3 columns:

task_name -- data type is varchar(50)
start_date -- data type is date
end_date -- data type is date

I want to list every date that falls within at least one task's range (between start_date and end_date, inclusive), together with the number of tasks whose range contains it, sorted by that count in descending order.

For example if the data is :

task_name  start_date  end_date
ABC        2024-12-01  2025-01-31
DEF        2025-01-15  2025-02-10
GHI        2025-01-31  2025-02-03

then I want the result this way:

date        count  (comment)
2025-01-31  3      only date with the 3 tasks running
2025-01-15  2      2025-01-15 to 2025-01-30 has ABC and DEF overlapping
2025-01-16  2
...         2
2025-01-29  2
2025-01-30  2
                   no 2025-01-31 here as it is the only date with count 3
2025-02-01  2      2025-02-01 to 2025-02-03 has DEF and GHI overlapping
2025-02-02  2
2025-02-03  2
2024-12-01  1      all other dates between 2024-12-01 and 2025-01-14, and 2025-02-04 to 2025-02-10, have count = 1
2024-12-02  1
...         1
2025-01-14  1
2025-02-04  1
...         1
2025-02-10  1

How can I do that?

  • Do we agree that your expected results lack some dates, and that in reality under count 2 you want every date between 01-15 and 01-30 (end of ABC, start of DEF)? And under count 1 all other dates (2024-12-01 to 01-14; 02-04 to 02-10)? Commented Jun 6 at 21:14
  • … As something like this working fiddle would show? I think I'll base an answer on it (and simplify it: segments doesn't seem necessary); feel free to play with it in the meantime. For now, time to sleep! Commented Jun 6 at 21:26
  • Are the dates '2025-01-31' and '2025-02-04' query parameters? Which version of SQL Server is used? Commented Jun 6 at 21:49
  • @GuillaumeOutters thanks for pointing out the dates discrepancy; I have updated my question. Commented Jun 7 at 1:01
  • @ValNik there are no query parameters; the SQL Server version is 15.0. Commented Jun 7 at 1:05

3 Answers

A recursive CTE fits your case.

A CTE (Common Table Expression) is a named, temporary result set that you define at the beginning of a query with WITH and that exists only for that statement.
A recursive CTE is a CTE that references itself, and is used to build rows iteratively.

You need this because SQL Server 2019 has no built-in way to expand a start/end pair into one row per date (GENERATE_SERIES only arrived in SQL Server 2022). You have to generate the rows one by one.

With a recursive CTE, you tell SQL:
Start at start_date. Then, each time, add a day. Repeat until you reach end_date.

Then you use this to generate every date in each task's range, which you can then count.

Example:

-- Create CTE
WITH DateExpanded AS (
    SELECT 
        task_name,
        start_date AS work_date,
        end_date
    FROM tasks
    UNION ALL
    SELECT 
        task_name,
        DATEADD(DAY, 1, work_date),
        end_date
    FROM DateExpanded
    WHERE work_date < end_date
)
-- Now group by date and count the tasks active on each date
SELECT 
    work_date,
    COUNT(*) AS occurrences
FROM DateExpanded
GROUP BY work_date
ORDER BY occurrences DESC, work_date
-- MAXRECURSION matters when a task is long: the default limit is 100 recursion levels,
-- and here each task needs one level per day between its start_date and end_date.
-- Set it to the longest task duration you expect (in days) rather than a blanket value.
-- Avoid 0: it removes the limit entirely, so a faulty CTE could loop until it is cancelled.
OPTION (MAXRECURSION 365);

SQL Server CTE Docs
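
To try the query, and to pick a MAXRECURSION value instead of guessing one, something like the following could be used. It is only a minimal sketch: it assumes the table is named tasks (as in the query above) and holds the question's three sample rows, and the alias longest_task_days is purely illustrative.

```sql
-- Sample data from the question, in a table named tasks (the name the query above assumes).
CREATE TABLE tasks (
    task_name  varchar(50),
    start_date date,
    end_date   date
);

INSERT INTO tasks (task_name, start_date, end_date) VALUES
    ('ABC', '2024-12-01', '2025-01-31'),
    ('DEF', '2025-01-15', '2025-02-10'),
    ('GHI', '2025-01-31', '2025-02-03');

-- Each task is expanded independently, one recursion level per extra day,
-- so the deepest recursion equals the longest task's length in days.
SELECT MAX(DATEDIFF(DAY, start_date, end_date)) AS longest_task_days
FROM tasks;
```

With the sample rows the longest task is ABC (2024-12-01 to 2025-01-31, 61 days), so even the default limit of 100 would be enough here.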

With the OP's updated desired result, the date range runs from the minimum start_date to the maximum end_date in the table.
The query has three parts:

  1. Generate the date range with a recursive CTE ("dates").
  2. Join the source table to the date range.
  3. Count the active tasks for each date.

with dates as(
  select min(start_date) as dt,max(end_date) as maxDt
  from sample
  union all
  select dateadd(day,1,dt) as dt,maxDt
  from dates
  where dt<maxDt
)
select dt,count(task_name) as qty
from dates d
left join sample t on d.dt between t.start_date and t.end_date
group by dt
order by count(task_name) desc,dt
option (maxrecursion 1000)
;

The maxrecursion option should be at least the number of days between the minimum and maximum dates in the source table.

Fiddle

Old part of the answer

If start_date and end_date are query parameters (or can be hardcoded), first generate the sequence of dates in this range.
Then join the target table rows (table "sample" in the example) that are active on each date,
and count the rows (active tasks for that date).

The first query uses generate_series for this date range (available from SQL Server 2022).
The second example uses a recursive query to generate the date range (works in earlier versions).

task_name  start_date  end_date
ABC        2024-12-01  2025-01-31
DEF        2025-01-15  2025-02-10
GHI        2025-01-31  2025-02-03

declare @start_date date ='2025-01-31';
declare @end_date date ='2025-02-04';
with dates as(
  select dateadd(day,value,@start_date) as dt
  from generate_series(0,datediff(day,@start_date,@end_date))
)
select dt,count(task_name) as qty
from dates d
left join sample t on d.dt between t.start_date and t.end_date
group by dt
order by dt -- or order by count(task_name)
;
dt qty
2025-01-31 3
2025-02-01 2
2025-02-02 2
2025-02-03 2
2025-02-04 1

We use count(task_name) instead of count(*) so that dates without active tasks get a count of 0.
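
The difference between the two counts can be seen on a tiny case. This is just a sketch with made-up data: demo_dates and demo_tasks are hypothetical inline tables, not part of the question.

```sql
-- Hypothetical two-day calendar and a single one-day task, inlined with VALUES.
WITH demo_dates(dt) AS (
    SELECT CAST(v.dt AS date)
    FROM (VALUES ('2025-03-01'), ('2025-03-02')) AS v(dt)
),
demo_tasks(task_name, start_date, end_date) AS (
    SELECT v.task_name, CAST(v.s AS date), CAST(v.e AS date)
    FROM (VALUES ('XYZ', '2025-03-01', '2025-03-01')) AS v(task_name, s, e)
)
SELECT
    d.dt,
    COUNT(*)           AS count_star,  -- also counts the NULL-extended row of the left join
    COUNT(t.task_name) AS count_name   -- skips rows where task_name is NULL
FROM demo_dates AS d
LEFT JOIN demo_tasks AS t
    ON d.dt BETWEEN t.start_date AND t.end_date
GROUP BY d.dt
ORDER BY d.dt;
```

For 2025-03-02, which no task covers, count_star is 1 while count_name is 0; for 2025-03-01 both are 1.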

Example with a recursively generated date range:

declare @start_date date ='2025-01-31';
declare @end_date date ='2025-02-04';
with dates as(
  select @start_date as dt
  union all
  select dateadd(day,1,dt) 
  from dates
  where dt<@end_date
)
select dt,count(task_name) as qty
from dates d
left join sample t on d.dt between t.start_date and t.end_date
group by dt
order by dt
;

The column names "date" and "count" are not ideal because they match SQL keywords and may need quoting in some contexts, so "dt" and "qty" are used instead.

fiddle
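
On SQL Server 2022 or later (which is not the OP's version), generate_series could presumably also expand each task directly through CROSS APPLY, without building a calendar first. The following is an untested sketch under that assumption, reusing the "sample" table and the "dt"/"qty" names from above; it needs database compatibility level 160.

```sql
-- Assumes SQL Server 2022+; GENERATE_SERIES exposes its numbers in a column named value.
SELECT
    DATEADD(DAY, g.value, t.start_date) AS dt,
    COUNT(*)                            AS qty
FROM sample AS t
CROSS APPLY GENERATE_SERIES(0, DATEDIFF(DAY, t.start_date, t.end_date)) AS g
GROUP BY DATEADD(DAY, g.value, t.start_date)
ORDER BY qty DESC, dt;
```

Because the rows come from the tasks themselves, dates with no active task simply do not appear, so count(*) is enough in this variant.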

4 Comments

As the OP mentioned that the dates are not hardcoded, would you adapt your answer to use min(start_date), max(end_date) instead of @start_date, @end_date? (It deserved my vote anyway for its explanations, clarity and hints, but that would bring it more in line with the OP's need.)
Why not use CROSS APPLY with GENERATE_SERIES?
@MatBailie: OP told (in the comments of the question) that he was limited to SQL Server 15.0 (2019). I just reflected it on the question's tags
The previous desired-result example assumed the date range was set as parameters, which seems more usual to me. Selecting the maximum activity values is still a slightly different task. However, I agree, the question is asked like this.

Work with date ranges, not individual days

It is usually more efficient to work with date ranges first (rather than immediately denormalizing them to day-by-day rows):

with
  -- Determine every date where we have a count change: that is, every start_date or end_date:
  cuts as
  (
    select start_date d from t
    union -- Not union all, so that identical dates get merged: we want one occurrence of each.
    select dateadd(day, 1, end_date) from t -- The end date's cut happens on the next day at 00:00
  ),
  -- Now transform each task into as many entries as it crosses cuts;
  -- then group by cut date, and count how many tasks it crossed.
  crossings as
  (
    select c.d, count(t.task_name) count -- count(task_name), not count(*): count only the rows that joined to a task, so that the last (closing) cut date gets 0 instead of 1.
    from cuts c left join t on c.d between t.start_date and t.end_date
    group by c.d
  ),
  -- A date range runs from one cut to the next one (lead()), since the cuts cover every start or end date appearing in our tasks table.
  slices as
  (
    select
      d start_date,
      dateadd(day, -1, lead(d) over (order by d)) end_date, -- Get back to "last day of the dates range" instead of "first day after the range ended".
      count
    from crossings
  )
select * from slices order by count desc, start_date;

which returns:

start_date  end_date    count
2025-01-31  2025-01-31  3
2025-01-15  2025-01-30  2
2025-02-01  2025-02-03  2
2024-12-01  2025-01-14  1
2025-02-04  2025-02-10  1
2025-02-11  null        0

(as seen in the first block of this fiddle)
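
To see how lead() turns cuts into ranges in isolation, here is a small sketch that hardcodes the six cut dates the sample data produces (every start_date, and every end_date plus one day). The inline VALUES list is an illustration only; the real query derives these dates from the table.

```sql
-- The six cut dates for the sample data, hardcoded for illustration.
WITH cuts(d) AS (
    SELECT CAST(v.d AS date)
    FROM (VALUES ('2024-12-01'), ('2025-01-15'), ('2025-01-31'),
                 ('2025-02-01'), ('2025-02-04'), ('2025-02-11')) AS v(d)
)
SELECT
    d AS start_date,
    -- The next cut opens the following range, so this range ends the day before it.
    DATEADD(DAY, -1, LEAD(d) OVER (ORDER BY d)) AS end_date
FROM cuts
ORDER BY d;
```

The last cut has no successor, so lead() yields NULL for it; that is the open-ended closing row with a count of 0.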

… Then optionally get your daily results table

Now that we have a clean view of disjoint date ranges, each with a stable count of running tasks,
if we want a more verbose day-by-day view we just have to "denormalize" each range into its individual days.

Here we can use one of two general techniques:

Joining with a series of numbers

Each date range is (end_date - start_date + 1) days long.
If we have a source of consecutive numbers from 0 to that length minus 1, we can join to it and emit one row per number, whose date is dateadd(day, thisnumber, start_date).

There are various possible sources of numbers, and plenty has already been written about them elsewhere.
Let's just choose one of them to demonstrate:

with
  -- Everything from our first part, but we replace the final select * from slices with what follows:
  -- Expand each range into its days by joining to a table of numbers.
  alldays as
  (
    select dateadd(day, v.number, start_date) date, count
    from slices
    join master..spt_values v                           -- The spt_values hack: its type 'P' rows hold the numbers 0 to 2047, which covers ranges of up to 2048 days.
    on v.type = 'P'                                     -- Part of the spt_values hack.
    and v.number <= datediff(day, start_date, end_date) -- No more than the targeted range!
    where end_date is not null -- The null end_date represents "everything after our last end_date". It has no days to list, so filter it out.
  )
  )
select date, count from alldays order by count desc, date;
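
If relying on master..spt_values feels too hackish, one of the other possible number sources is a small tally built on the fly. The sketch below is an assumption-laden stand-in, not the fiddle's code: digits, numbers and slice are hypothetical names, and slice hardcodes a single range from the results above.

```sql
-- Hypothetical stand-in for master..spt_values: the numbers 0..999 built from ten digits.
WITH digits(n) AS (
    SELECT v.n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(n)
),
numbers(number) AS (
    SELECT a.n + 10 * b.n + 100 * c.n
    FROM digits AS a
    CROSS JOIN digits AS b
    CROSS JOIN digits AS c
),
-- One hardcoded range, standing in for a row of slices.
slice(start_date, end_date) AS (
    SELECT CAST('2025-02-01' AS date), CAST('2025-02-03' AS date)
)
SELECT DATEADD(DAY, n.number, s.start_date) AS dt
FROM slice AS s
JOIN numbers AS n
    ON n.number <= DATEDIFF(DAY, s.start_date, s.end_date)
ORDER BY dt;
```

This lists 2025-02-01, 2025-02-02 and 2025-02-03. Adding a fourth digits cross join would extend the tally to 0..9999 if ranges longer than 1000 days are expected.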

Recursing over the date ranges

A recursive Common Table Expression is another way of iterating (more than recursing) over our date ranges.
Note that we could use a single day-by-day recursion (from 2024-12-01 to 2025-02-10), but it is more efficient to run one recursion per distinct date range, all progressing side by side: instead of one chain of about 70 iterations, we get 5 recursions, of which the longest (12-01 to 01-14) runs for no more than 45 iterations.

with
  -- Everything from our first part, but we replace the final select * from slices with what follows:
  -- Recurse over the days of each range.
  alldays as
  (
    select start_date date, count, end_date from slices where end_date is not null -- The null end_date represents "everything after our last end_date". We don't want to recurse infinitely, so remove it from the ranges to walk through.
    union all
    select dateadd(day, 1, date), count, end_date from alldays -- As we have computed our disjoint time spans, we know that the count is stable…
    where date < end_date -- … until we reach the end of the span.
  )
select date, count from alldays order by count desc, date
option (maxrecursion 100) -- 100 is the default. If a single date range can span more than about 100 days, increase this value.
;

Demo time!

Both solutions run in the same fiddle as above, returning your intended result:

date count
2025-01-31 3
2025-01-15 2
2025-01-16 2

2 Comments

Please don't teach the comma (implicit) join notation. You have a join predicate in the WHERE clause, use INNER JOIN!
@MatBailie given the totally hackish nature of spt_values from the start, I wasn't trying to make this part clean. But you're right, it could be misinterpreted as a general good practice, so I have removed the ambiguity.
