4

I have the following data:

Product Price   StartDate                   EndDate
Apples  4.9     2010-03-01 00:00:00.000     2010-03-01 00:00:00.000
Apples  4.9     2010-03-02 00:00:00.000     2010-03-02 00:00:00.000
Apples  2.5     2010-03-03 00:00:00.000     2010-03-03 00:00:00.000
Apples  4.9     2010-03-05 00:00:00.000     2010-03-05 00:00:00.000
Apples  4.9     2010-03-06 00:00:00.000     2010-03-06 00:00:00.000
Apples  4.9     2010-03-09 00:00:00.000     2010-03-09 00:00:00.000
Apples  2.5     2010-03-10 00:00:00.000     2010-03-10 00:00:00.000
Apples  4.9     2010-03-11 00:00:00.000     2010-03-11 00:00:00.000
Apples  4.9     2010-03-12 00:00:00.000     2010-03-12 00:00:00.000
Apples  4.9     2010-03-13 00:00:00.000     2010-03-13 00:00:00.000
Apples  4.9     2010-03-15 00:00:00.000     2010-03-15 00:00:00.000
Apples  4.9     2010-03-16 00:00:00.000     2010-03-16 00:00:00.000

I want to group by product and price and return min(StartDate) and max(EndDate), but the grouping should also respect the start and end dates, so that each run of rows with the same price becomes its own group. Something like below:

Desired result

Apples  4.9     2010-03-01 00:00:00.000     2010-03-02 00:00:00.000
Apples  2.5     2010-03-03 00:00:00.000     2010-03-03 00:00:00.000
Apples  4.9     2010-03-05 00:00:00.000     2010-03-09 00:00:00.000
Apples  2.5     2010-03-10 00:00:00.000     2010-03-10 00:00:00.000
Apples  4.9     2010-03-11 00:00:00.000     2010-03-16 00:00:00.000
3
  • Welcome to StackOverflow: if you post code, XML or data samples, please highlight those lines in the text editor and click on the "code samples" button ( { } ) on the editor toolbar to nicely format and syntax highlight it! Commented Dec 24, 2012 at 11:46
  • What is the purpose of the EndDate column? It appears to always be equal to StartDate. Is this assumption true? If so, please remove the EndDate column from your example data. If it's not true, I wish you would provide the most "tricky" data instead of the most uniform/boring data so that people providing answers can determine the proper query to always provide the correct result. Commented Dec 27, 2012 at 2:53
  • So just for clarity: even though you show no data for 2010-03-14, you want to see the final row for Apples span it as 2010-03-11 through 2010-03-16? Commented Dec 27, 2012 at 21:58

5 Answers

3

My approach.

Data:

create table t ( producte varchar(50), 
                 price money, 
                 start_date date,
                 end_date date);

insert into t values
( 'apple', 4.9, '2012-01-01', '2012-01-01' ),
( 'apple', 4.9, '2012-01-02', '2012-01-02' ),
( 'apple', 8, '2012-01-04', '2012-01-04' ),
( 'cat', 5, '2012-01-01', '2012-01-01' ),
( 'cat', 6, '2012-01-02', '2012-01-02' ),
( 'cat', 6, '2012-01-03', '2012-01-03' );

Query:

with start_dates as (
  -- Anchor: rows that start a group, i.e. no row with the same product
  -- and price ends on the previous day
  select 
    t.producte, t.price, t.start_date, t.end_date, t.start_date as gr_date    
  from 
    t left outer join 
    t t1 on 
        t.price = t1.price and
        t.producte = t1.producte and
        t.start_date = dateadd(day, 1, t1.end_date)
  where t1.producte is null
  union all
  -- Recursive step: append the row that starts the day after the
  -- previous row ends, carrying the group's start date along
  select 
      t.producte, t.price, t.start_date, t.end_date, gr_date
  from
      t inner join 
      start_dates t1 on  
        t.price = t1.price and
        t.producte = t1.producte and
        t.start_date = dateadd(day, 1, t1.end_date)
)
select t.producte, t.price, min(t.start_date), max(t.end_date)
from start_dates t
group by t.producte, gr_date, t.price

Results:

| PRODUCTE | PRICE |   COLUMN_2 |   COLUMN_3 |
----------------------------------------------
|    apple |   4.9 | 2012-01-01 | 2012-01-02 |
|    apple |     8 | 2012-01-04 | 2012-01-04 |
|      cat |     5 | 2012-01-01 | 2012-01-01 |
|      cat |     6 | 2012-01-02 | 2012-01-03 |

Explanation

This is a recursive CTE. The anchor query takes the initial date of each group of prices. The recursive query then follows each group forward, row by row, to the last date with that price.
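To make the two parts of the CTE concrete, here is a rough Python sketch of the same logic, not the answer's own code. It assumes the rows fit in memory as (product, price, start_date, end_date) tuples, that (product, price, start_date) is unique, and that end_date is never before start_date. The name `chain_groups` and the variable `sample_rows` are made up for illustration. Like the query, it links a row to the previous one only when it starts exactly one day after the previous row ends, so a gap in the dates starts a new group.

```python
from datetime import date, timedelta

def chain_groups(rows):
    """rows: iterable of (product, price, start_date, end_date) tuples."""
    rows = list(rows)
    one_day = timedelta(days=1)
    # Index rows by (product, price, start_date) to find a successor quickly.
    by_start = {(p, pr, s): (p, pr, s, e) for p, pr, s, e in rows}
    # (product, price, end_date) of every row, used to detect group starts.
    ends = {(p, pr, e) for p, pr, s, e in rows}

    result = []
    for p, pr, s, e in rows:
        # Anchor part: skip rows that continue another row's chain.
        if (p, pr, s - one_day) in ends:
            continue
        # Recursive part: follow the chain one day at a time.
        last_end = e
        nxt = by_start.get((p, pr, last_end + one_day))
        while nxt is not None:
            last_end = nxt[3]
            nxt = by_start.get((p, pr, last_end + one_day))
        result.append((p, pr, s, last_end))
    return sorted(result)

# Same sample data as in the answer above.
sample_rows = [
    ("apple", 4.9, date(2012, 1, 1), date(2012, 1, 1)),
    ("apple", 4.9, date(2012, 1, 2), date(2012, 1, 2)),
    ("apple", 8, date(2012, 1, 4), date(2012, 1, 4)),
    ("cat", 5, date(2012, 1, 1), date(2012, 1, 1)),
    ("cat", 6, date(2012, 1, 2), date(2012, 1, 2)),
    ("cat", 6, date(2012, 1, 3), date(2012, 1, 3)),
]
chained = chain_groups(sample_rows)
```

On the sample data this yields the same four groups as the results table above.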


3 Comments

@user1926569, I updated the query after you accepted the answer. Please review it.
Please remove any incorrect queries from your answer. The purpose of editing is not to provide a historical record but the best possible answer. Having incorrect queries and evidence of how the answer changed over time is not the best possible answer.
@ErikE, thanks for your comment. Fixed. Is it right now? Regards.
3
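SELECT  product, price, MIN(start_date), MAX(end_date)
FROM    (
        SELECT  product, price, start_date, end_date,
                -- rn1 numbers all rows of a product; rn2 numbers the rows of
                -- each (product, price) pair. rn2 - rn1 stays constant within
                -- a run of consecutive rows that share a price.
                ROW_NUMBER() OVER (PARTITION BY product ORDER BY start_date) rn1,
                ROW_NUMBER() OVER (PARTITION BY product, price ORDER BY start_date) rn2
        FROM    mytable
        ) q
GROUP BY
        product, price, rn2 - rn1
ORDER BY
        product, MIN(start_date), price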
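Why grouping on rn2 - rn1 works may be easier to see outside SQL. The following is a minimal Python sketch of the same row-number arithmetic, not the answer's code. It assumes rows are (product, price, start_date, end_date) tuples with dates as ISO strings, so that plain string comparison orders them. The names `islands` and `apples` are illustrative.

```python
from collections import defaultdict

def islands(rows):
    """Group consecutive rows (by start date) that share product and price."""
    rows = sorted(rows, key=lambda r: (r[0], r[2]))  # product, start_date
    rn1 = defaultdict(int)  # running row number per product
    rn2 = defaultdict(int)  # running row number per (product, price)
    groups = {}
    for product, price, start, end in rows:
        rn1[product] += 1
        rn2[(product, price)] += 1
        # Within a run both counters grow by 1, so the difference is constant.
        # A row with another price bumps only rn1, which shifts the difference.
        key = (product, price, rn2[(product, price)] - rn1[product])
        lo, hi = groups.get(key, (start, end))
        groups[key] = (min(lo, start), max(hi, end))
    merged = [(p, pr, lo, hi) for (p, pr, _), (lo, hi) in groups.items()]
    return sorted(merged, key=lambda g: (g[0], g[2], g[1]))

# The question's data: (start date, price) pairs for Apples.
days = [("2010-03-01", 4.9), ("2010-03-02", 4.9), ("2010-03-03", 2.5),
        ("2010-03-05", 4.9), ("2010-03-06", 4.9), ("2010-03-09", 4.9),
        ("2010-03-10", 2.5), ("2010-03-11", 4.9), ("2010-03-12", 4.9),
        ("2010-03-13", 4.9), ("2010-03-15", 4.9), ("2010-03-16", 4.9)]
apples = [("Apples", price, d, d) for d, price in days]
merged = islands(apples)
```

With the question's data, `merged` contains the five rows of the desired result. Missing dates do not split a group, because only the row order matters, not the distance between dates.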

5 Comments

It's giving me the same result that
{select product, price, MIN(start_date), MAX(end_date) from mytable GROUP BY product, price} would give.
@user1926569: sure, my fault. Please try the updated query: sqlfiddle.com/#!3/cf7ad/16
Best answer on the page yet! Quassnoi, is there any chance you saw this post of mine? How long have you known about this subtract Row_Number() technique to find changing groups?
@ErikE: no, did not see your post. stackoverflow.com/questions/5662545/… and there is also a number of posts on this matter on my blog.
3

Here is a SQLFiddle demo

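with t2 as 
(
  select t1.*,
         -- number of earlier rows for this product with a different price;
         -- it is constant within a run of equal prices
         (select count(Price) 
            from t 
           where startdate < t1.startdate 
             and Price <> t1.price
             and Product = t1.Product
         ) as rng  
  from t as t1
)
select Product, Price, min(startDate), max(EndDate)  
from t2
group by Product, Price, RNG
order by 3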


1

Here's a suggestion: for each row, find the latest earlier date on which the price was different, and group on that date. For example, for any row between 2010-03-11 and 2010-03-16, you must retrieve the date 2010-03-10, because this is the latest earlier date with a different price (2.5 versus 4.9). The first rows will return a null date, but that shouldn't be a problem.

However, for a very long table, this kind of query could become very slow. If speed becomes a problem, consider adding a column and filling it incrementally with a cursor: loop through the rows by date and change the column's value each time you see a new price. The final grouping is then trivial.
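The incremental idea can be sketched in a few lines of Python. This only illustrates the single pass. It is not a T-SQL cursor, and it assumes rows are (product, price, start_date, end_date) tuples with dates as ISO strings. The names `label_runs`, `summarise` and `sample` are hypothetical.

```python
def label_runs(rows):
    """Walk the rows by product and date, adding a group number to each row."""
    labelled = []
    group_id = 0
    prev = None  # (product, price) of the previous row
    for product, price, start, end in sorted(rows, key=lambda r: (r[0], r[2])):
        if (product, price) != prev:
            group_id += 1  # new price (or new product): start a new group
            prev = (product, price)
        labelled.append((product, price, start, end, group_id))
    return labelled

def summarise(labelled):
    """Collapse labelled rows to one (product, price, first, last) per group."""
    out = {}
    for product, price, start, end, gid in labelled:
        if gid in out:
            p, pr, lo, hi = out[gid]
            out[gid] = (p, pr, min(lo, start), max(hi, end))
        else:
            out[gid] = (product, price, start, end)
    return [out[g] for g in sorted(out)]

# First five rows of the question's data.
sample = [
    ("Apples", 4.9, "2010-03-01", "2010-03-01"),
    ("Apples", 4.9, "2010-03-02", "2010-03-02"),
    ("Apples", 2.5, "2010-03-03", "2010-03-03"),
    ("Apples", 4.9, "2010-03-05", "2010-03-05"),
    ("Apples", 4.9, "2010-03-06", "2010-03-06"),
]
labelled = label_runs(sample)
summary = summarise(labelled)
```

Here `label_runs` plays the role of the cursor that fills the extra column, and `summarise` is the trivial final grouping on that column.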

Here's something:

Select Product, Price, Min(StartDate) as StartDate, PreviousDate
from (
    Select product, price, StartDate,
           (Select max(StartDate)
            from table_2 t3
            where t3.price <> t2.price
              and t3.StartDate < t2.StartDate
              and t3.Product = t2.Product) as PreviousDate
    from table_2 t2) SQ
Group by Product, Price, PreviousDate
Order by PreviousDate


0

I believe this is the best-performing solution so far:

WITH Calc AS (
   SELECT *,
      Grp = DateAdd(day, -Row_Number()
         OVER (PARTITION BY Product, Price ORDER BY StartDate), StartDate
      )
   FROM dbo.PriceHistory
)
SELECT Product, Price, FromDate = Min(StartDate), ToDate = Max(StartDate)
FROM Calc
GROUP BY Product, Price, Grp
ORDER BY FromDate;

Try this out yourself

2 Comments

It does not return what the @op asked for.
You're right, Quassnoi. I missed that the OP wants to skip gaps. I'll return to this soon.
