2

I have a table, let's call it 'entries', that looks like this (simplified):

id [pk]
user_id [fk]
created [date]
processed [boolean, default false]

and I want to create an UPDATE query which will set the processed flag to true on all entries except for the latest 3 for each user (latest in terms of the created column). So, for the following entries:

1,456,2009-06-01,false
2,456,2009-05-01,false
3,456,2009-04-01,false
4,456,2009-03-01,false

Only entry 4 would have its processed flag changed to true.
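
For anyone who wants to reproduce the example, here is a minimal setup sketch. The column types are guesses based on the simplified layout above, and the foreign key on user_id is left out so the script runs on its own.

```sql
-- Sketch only: types are assumed, and the user_id foreign key is omitted.
create table entries (
  id        integer primary key,
  user_id   integer not null,
  created   date    not null,
  processed boolean not null default false
);

-- The four sample rows; processed falls back to its default of false.
insert into entries (id, user_id, created) values
  (1, 456, '2009-06-01'),
  (2, 456, '2009-05-01'),
  (3, 456, '2009-04-01'),
  (4, 456, '2009-03-01');
```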

Anyone know how I can do this?

2 Answers

6

I don't know Postgres, but this is standard SQL and may work for you.

update entries set
  processed = true
where (
  select count(*)
  from entries as E
  where E.user_id = entries.user_id
  and E.created > entries.created
) >= 3

In other words, update the processed column to true whenever there are three or more entries for the same user_id with later created dates. I'm assuming the [created] column is unique for a given user_id. If not, you'll need an additional criterion to pin down what you mean by "latest".
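
If created can repeat within a user_id, one possible tie-breaker is the id column. The sketch below assumes that, among entries with the same created date, the one with the higher id counts as later. That ordering is my assumption, not something the question specifies.

```sql
-- Count the entries that come strictly later in (created, id) order.
-- Treating a higher id as "later" on equal dates is an assumption.
update entries set
  processed = true
where (
  select count(*)
  from entries as E
  where E.user_id = entries.user_id
  and (
    E.created > entries.created
    or (E.created = entries.created and E.id > entries.id)
  )
) >= 3;
```

Because id is the primary key, no two rows tie under this ordering, so exactly the latest three rows per user are left alone.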

In SQL Server you can do this, which is a little easier to follow and will probably execute more efficiently:

with T(id, user_id, created, processed, rk) as (
  select
    id, user_id, created, processed,
    row_number() over (
      partition by user_id
      order by created desc, id
    )
  from entries
)
  update T set
    processed = 1  -- SQL Server has no boolean type; assumes processed is a bit column
  where rk > 3;

Updating a CTE is a non-standard feature, and not all database systems support row_number.

2 Comments

Yes, your SQL query worked perfectly. I tried something just like this earlier, but it didn't work for me. I can't be sure why, because I didn't keep that query, but I think I was selecting count(*), user_id in the subquery for some reason, and I don't know why I would have done that.
Thanks. By the way, I changed > to >= after reading depesz's solution, which was like mine, except correct. :) Be sure you don't keep my original off-by-one error.
4

First, let's start with a query that will list all rows to be updated:

select e.id
from entries as e
where (
    select count(*)
    from entries as e2
    where e2.user_id = e.user_id
        and e2.created > e.created
) > 2

This lists the ids of all records for which more than 2 other records exist with the same user_id and a later created value.

That is, it lists all records except the latest 3 per user.

Now, we can:

update entries as e
set processed = true
where (
    select count(*)
    from entries as e2
    where e2.user_id = e.user_id
        and e2.created > e.created
) > 2;

One thing, though: it can be slow. In that case you might be better off with a custom aggregate or (if you're on 8.4 or later) window functions.
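
As a rough sketch of the window-function route, assuming PostgreSQL 8.4 or later and the entries table from the question, something like the following should work. The alias names (ranked, rn) and the id tie-breaker are my own additions for illustration.

```sql
-- Rank each user's entries from newest to oldest,
-- then flag everything ranked after the third.
update entries
set processed = true
where id in (
    select ranked.id
    from (
        select id,
               row_number() over (
                   partition by user_id
                   order by created desc, id desc  -- id breaks ties (assumption)
               ) as rn
        from entries
    ) as ranked
    where ranked.rn > 3
);
```

This ranks all rows in one pass instead of running a count for every row. Whether it is actually faster depends on the data and the available indexes, so it is worth comparing both versions with EXPLAIN ANALYZE.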
