I have a view that represents the connection status of each user to a system, as in the table below:

---------------------------------------
|id |   date     | User  |  Connexion |
|1  | 01/01/2018 |  A    |      1     |
|2  | 02/01/2018 |  A    |      0     |
|3  | 03/01/2018 |  A    |      1     |
|4  | 04/01/2018 |  A    |      1     |
|5  | 05/01/2018 |  A    |      0     |
|6  | 06/01/2018 |  A    |      0     |
|7  | 07/01/2018 |  A    |      0     |
|8  | 08/01/2018 |  A    |      1     |
|9  | 09/01/2018 |  A    |      1     |
|10 | 10/01/2018 |  A    |      1     |
|11 | 11/01/2018 |  A    |      1     |
---------------------------------------

The target output is the length of each run of consecutive succeeded or failed connections, ordered by date, like this:

---------------------------------------------------------------
|StartDate   |   EndDate     |   User   | Connexion |  Length |
|01/01/2018  |   01/01/2018  |     A    |    1      |      1  |
|02/01/2018  |   02/01/2018  |     A    |    0      |      1  |
|03/01/2018  |   04/01/2018  |     A    |    1      |      2  |
|05/01/2018  |   07/01/2018  |     A    |    0      |      3  |
|08/01/2018  |   11/01/2018  |     A    |    1      |      4  |
---------------------------------------------------------------
And what attempts have you made so far? Could you post the SQL you've tried and tell us why it didn't work please? Commented Nov 13, 2018 at 15:54

1 Answer

This is called a gaps-and-islands problem. A good solution, provided your database supports window functions, is a difference of row numbers:

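-- [User] and [date] are bracket-quoted (SQL Server syntax) because they are keywords
select min([date]) as StartDate, max([date]) as EndDate, [User], connexion, count(*) as length
from (select t.*,
             row_number() over (partition by [User] order by [date]) as seqnum,
             row_number() over (partition by [User], connexion order by [date]) as seqnum_uc
      from t
     ) t
group by [User], connexion, (seqnum - seqnum_uc)
order by [User], StartDate;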

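Why this works is a little tricky to explain. seqnum numbers every row for a user, while seqnum_uc numbers only the rows that share the same connexion value. Within a run of consecutive equal values both counters increase by one per row, so their difference stays constant; as soon as the value changes, seqnum keeps advancing while the other value's counter does not, so the difference changes. Two runs with different connexion values can share the same difference, which is why connexion must also be in the group by. If you stare at the results of the subquery, you'll see how the difference is constant for the groups that you care about.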
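To make the subquery results concrete, here is a minimal sketch that runs the same technique against the question's sample data in an in-memory SQLite database from Python. It rests on several assumptions that are not in the original: the columns are renamed to the hypothetical conn_date and user_name to avoid the keywords, the dates are rewritten as ISO strings so that text ordering is chronological, and the SQLite build must support window functions (3.25 or later).

```python
import sqlite3

# Sample data from the question. Dates are rewritten as ISO strings
# (an assumption of this sketch) so that text ordering is chronological.
# conn_date and user_name are hypothetical replacements for date and user.
connexions = [1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1]
rows = [(f"2018-01-{day:02d}", "A", c) for day, c in enumerate(connexions, start=1)]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (conn_date text, user_name text, connexion integer)")
conn.executemany("insert into t values (?, ?, ?)", rows)

# The subquery from the answer: one counter per user, one per (user, value).
subquery = """
    select t.*,
           row_number() over (partition by user_name order by conn_date) as seqnum,
           row_number() over (partition by user_name, connexion order by conn_date) as seqnum_uc
    from t
"""

# Inspect the subquery: the last column is seqnum - seqnum_uc.
inner = conn.execute(
    f"""
    select conn_date, connexion, seqnum, seqnum_uc, seqnum - seqnum_uc
    from ({subquery}) s
    order by conn_date
    """
).fetchall()
diffs = [r[4] for r in inner]

# Group on the difference (plus user and value) to collapse each run.
islands = conn.execute(
    f"""
    select min(conn_date), max(conn_date), user_name, connexion, count(*) as length
    from ({subquery}) s
    group by user_name, connexion, (seqnum - seqnum_uc)
    order by min(conn_date)
    """
).fetchall()

for r in inner:
    print(r)
for r in islands:
    print(r)
```

The difference column comes out as 0, 1, 1, 1, 3, 3, 3, 4, 4, 4, 4. The row for the 2nd (connexion 0) and the rows for the 3rd and 4th (connexion 1) all have difference 1, so grouping on the difference alone would merge them; adding connexion to the group by keeps them apart.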

Note: You should not use user or date for the names of columns. These are keywords in SQL (of one type or another). If you do use them, you have to clutter up your SQL with escape characters, which just makes the code harder to write, read, and debug.

2 Comments

[User] and [Date] please
Definitely [User], as USER is a reserved word. date is only a keyword, so it can be used unquoted for object names; however, I agree that it should be [quoted].
