I have to query once every 10 minutes for the number of users that have been active in the last 1, 24, 7*24 and 30*24 hours (i.e. the last hour, day, week and month), from a data pool where we store one row per user action.
When a user does something, we store the hashed userId, the hashed action, the timestamp and the group the user belongs to in a table. This table is used for a lot of statistical purposes (e.g. deciding which features are used most, which features lead to user loss, and so on).
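A rough sketch of the table (the column names match the query below; the types are only indicative, not our exact DDL):
-- Rough sketch of the table; types are indicative, not the exact schema.
CREATE TABLE public.user_activity (
    "user"      text      NOT NULL,  -- hashed userId
    "action"    text      NOT NULL,  -- hashed action
    "timestamp" timestamp NOT NULL,  -- when the action happened
    "system"    text      NOT NULL   -- group/system the user belongs to
);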
However, the most frequent query on this table is the one that gets the number of unique users in a given period of time.
SELECT
    count(user) as "1m",
    count(*) FILTER (WHERE "timestamp" >= (now() - interval '7 days')::timestamp) as "1w",
    count(*) FILTER (WHERE "timestamp" >= (now() - interval '1 day')::timestamp) as "1d",
    count(*) FILTER (WHERE "timestamp" >= (now() - interval '1 hour')::timestamp) as "1h"
FROM (
    SELECT
        "user" as "user",
        (max(timestamp) + interval '1 hour')::timestamp as "timestamp"
    FROM public.user_activity
    WHERE
        public.user_activity."timestamp" >= (now() - interval '1 month')::timestamp
        AND "system" = 'enterprise'
    GROUP BY "user"
) as a
So, in the subquery:
- we select the entries whose timestamp is within the last month and that belong to a given system
- we group these entries by user
- we then select the userId and the last timestamp of that grouped user
This subquery usually returns between 10k and 100k rows (but it should keep working for more, too).
Then we run another query on top of this subquery:
- we count the number of entries as the number of users active in the last month
- we count the filtered number of entries whose timestamp is newer than a specific point in time
The whole query runs over a few million rows (and the table is growing rapidly).
How can I improve the query so it runs faster, and which indexes would be beneficial? (We are on AWS RDS and are hitting the IOPS limit of our 100 GB SSD.)
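For illustration, the kind of index I have in mind would cover the columns the subquery filters and groups on; the index name below is made up, and whether this is actually the right index is part of my question:
-- Candidate composite index (name is hypothetical): the equality column first,
-- then the range column, then "user", so the subquery could in principle be
-- answered with an index-only scan over the last month's rows.
-- CONCURRENTLY so the live table is not locked while the index is built.
CREATE INDEX CONCURRENTLY user_activity_system_ts_user_idx
    ON public.user_activity ("system", "timestamp", "user");
I have not verified that Postgres would actually choose an index-only scan for the subquery with this index, so I am mainly asking whether this is the right direction or whether a different index layout (or a different approach altogether) would be better.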
Note: count(*) as "1m" would be slightly faster than count(user) as "1m".
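A toy example (not our data) of why the two differ: count(*) counts rows, while count(expr) only counts rows where the expression is not NULL, so it has to perform a per-row NULL check:
-- Toy data: count(*) counts all rows, count(u) skips the NULL value.
SELECT count(*) AS all_rows, count(u) AS non_null_u
FROM (VALUES ('a'), ('b'), (NULL)) AS t(u);
-- all_rows = 3, non_null_u = 2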