Heroku sends Postgres metric logs as "samples" to DataDog. I want a parser that can extract the data from those logs so I can turn them into DataDog metrics. I had a pattern that worked for a while, but it broke recently because Heroku added additional metrics on May 15th, 2024.

1 Answer

I figured out how to make the parser more tolerant of new fields being added (since I'm sure Heroku will do this again).

In "Advanced Settings", add this helper rule:

_any_sample (?:(?:\s+sample#[^=]+=[^ ]+)?)+

This matches zero or more samples that I don't explicitly call out, so a new sample appearing between two known ones no longer breaks the match.
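To see what the helper does, here is a minimal Python sketch that emulates it outside DataDog. It is only an illustration: the log line is made up, "sample#new-metric" is a hypothetical stand-in for a field Heroku might add later, and Python's `re` module is not DataDog's grok engine. The helper is written with `*` instead of the nested optional group, which should accept the same strings.

```python
import re

# Python rendering of the _any_sample helper: zero or more " sample#key=value" pairs.
# Uses * rather than the nested (?:(?:...)?)+ form from the helper rule.
ANY_SAMPLE = r"(?:\s+sample#[^=]+=[^ ]+)*"

# Two explicit fields with ANY_SAMPLE before, between, and after them.
# The choice of db_size and tables is illustrative only.
PATTERN = re.compile(
    r"source=(?P<source>\S+)\s+addon=(?P<addon>\S+)"
    + ANY_SAMPLE
    + r" sample#db_size=(?P<db_size>[\d.]+)bytes"
    + ANY_SAMPLE
    + r" sample#tables=(?P<tables>\d+)"
    + ANY_SAMPLE
    + r"$"
)

# Made-up log line; "sample#new-metric" is a hypothetical newly added sample.
line = (
    "source=DATABASE addon=postgresql-example-12345 "
    "sample#current_transaction=1234 sample#db_size=4469950648bytes "
    "sample#new-metric=7 sample#tables=14 sample#active-connections=35"
)

match = PATTERN.match(line)
fields = match.groupdict() if match else {}
print(fields)
```

The unknown samples are consumed by the helper and discarded, while the named fields are still extracted. The named fields must still appear in the order the pattern lists them.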

I also have a few rules for the other sample types (Redis, Heroku memory).

Then I use three different rules for Postgres:

  1. One for followers (includes the "follower-lag-commits" metric)
  2. One for the lead (without "follower-lag-commits")
  3. A minimal one, in case Heroku removes or changes some samples, so I don't lose everything

The samples have to appear in the same order as in the rule; the minimal rule will hopefully still match if small changes happen.
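The three-rule fallback can be sketched in Python as follows. This assumes rules are tried top to bottom and the first match wins, which is why the most specific rule goes first. The `rule` and `parse` helpers, the reduced field lists, and the log lines are all hypothetical and only stand in for the full rules.

```python
import re

# Zero or more samples that a rule does not name explicitly.
ANY = r"(?:\s+sample#[^=]+=[^ ]+)*"

def rule(*names):
    """Build a pattern requiring the named samples in order, with unknown samples allowed around them."""
    body = "".join(
        ANY + r" sample#" + re.escape(n) + r"=(?P<" + n.replace("-", "_") + r">[^ ]+)"
        for n in names
    )
    return re.compile(r"source=\S+\s+addon=\S+" + body + ANY + r"$")

# Most specific rule first; field lists are shortened for illustration.
RULES = [
    ("postgresFollower", rule("db_size", "tables", "follower-lag-commits", "wal-percentage-used")),
    ("postgresLead", rule("db_size", "tables", "wal-percentage-used")),
    ("postgresMinimal", rule("db_size", "tables")),
]

def parse(line):
    """Return (rule_name, fields) for the first rule that matches, else (None, {})."""
    for name, pattern in RULES:
        m = pattern.match(line)
        if m:
            return name, m.groupdict()
    return None, {}

# Made-up log lines.
follower = ("source=DB addon=pg-example sample#db_size=10bytes sample#tables=3 "
            "sample#follower-lag-commits=0 sample#wal-percentage-used=0.1")
lead = "source=DB addon=pg-example sample#db_size=10bytes sample#tables=3 sample#wal-percentage-used=0.1"
changed = "source=DB addon=pg-example sample#db_size=10bytes sample#tables=3"

for line in (follower, lead, changed):
    print(parse(line)[0])
```

Because the unnamed-sample helper would let the lead rule absorb the follower-only sample, the lead rule also matches follower lines in this sketch. Listing the follower rule first is what keeps that metric from being dropped.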

Rules

serverMetrics source\=%{notSpace:source}\.1\s+dyno\=heroku\.*\.%{notSpace:dyno}\s+sample#load_avg_1m\=%{number:load_avg_1m}\s+sample#load_avg_5m\=%{number:load_avg_5m}\s+sample#load_avg_15m\=%{number:load_avg_15m}

memRuntimeMetrics source\=%{notSpace:source}\s+dyno\=%{notSpace:dyno}\s+sample#memory_total\=%{number:memory_total}MB\s+sample#memory_rss\=%{number:memory_rss}MB\s+sample#memory_cache\=%{number:memory_cache}MB\s+sample#memory_swap\=%{number:memory_swap}MB\s+sample#memory_pgpgin\=%{number:memory_pgpgin}pages\s+sample#memory_pgpgout\=%{number:memory_pgpgout}pages\s+sample#memory_quota\=%{number:memory_quota}MB

redisMetrics source\=%{notSpace:source}\s+addon\=%{notSpace:addon}\s+sample#active-connections\=%{number:active_connections}\s+sample#load-avg-1m\=%{number:load_avg_1m}\s+sample#load-avg-5m\=%{number:load_avg_5m}\s+sample#load-avg-15m\=%{number:load_avg_15m}\s+sample#read-iops\=%{number:read_iops}\s+sample#write-iops\=%{number:write_iops}\s+sample#memory-total\=%{number:memory_total}kB\s+sample#memory-free\=%{number:memory_free}kB\s+sample#memory-cached\=%{number:memory_cached}kB\s+sample#memory-redis\=%{number:memory_redis}bytes\s+sample#hit-rate\=%{number:hit_rate}\s+sample#evicted-keys\=%{number:evicted_keys}

postgresFollower source\=%{notSpace:source}\s+addon\=%{notSpace:addon}%{_any_sample} sample#current_transaction\=%{number:current_transaction}%{_any_sample} sample#db_size\=%{number:db_size}bytes%{_any_sample} sample#db-max-size\=%{number:db_max_size}bytes%{_any_sample} sample#db-size-percentage-used\=%{number:db_size_percentage_used}%{_any_sample} sample#tables\=%{number:tables}%{_any_sample} sample#active-connections\=%{number:active_connections}%{_any_sample} sample#waiting-connections\=%{number:waiting_connections}%{_any_sample} sample#max-connections\=%{number:max_connections}%{_any_sample} sample#connections-percentage-used\=%{number:connections_percentage_used}%{_any_sample} sample#index-cache-hit-rate\=%{number:index_cache_hit_rate}%{_any_sample} sample#table-cache-hit-rate\=%{number:table_cache_hit_rate}%{_any_sample} sample#load-avg-1m\=%{number:load_avg_1m}%{_any_sample} sample#load-avg-5m\=%{number:load_avg_5m}%{_any_sample} sample#load-avg-15m\=%{number:load_avg_15m}%{_any_sample} sample#read-iops\=%{number:read_iops}%{_any_sample} sample#write-iops\=%{number:write_iops}%{_any_sample} sample#max-iops\=%{number:max_iops}%{_any_sample} sample#iops-percentage-used\=%{number:iops_percentage_used}%{_any_sample} sample#tmp-disk-used\=%{number:tmp_disk_used}%{_any_sample} sample#tmp-disk-available\=%{number:tmp_disk_available}%{_any_sample} sample#memory-total\=%{number:memory_total}kB%{_any_sample} sample#memory-free\=%{number:memory_free}kB%{_any_sample} sample#memory-percentage-used\=%{number:memory_percentage_used}%{_any_sample} sample#memory-cached\=%{number:memory_cached}kB%{_any_sample} sample#memory-postgres\=%{number:memory_postgres}kB%{_any_sample} sample#follower-lag-commits\=%{number:follower_lag_commits}%{_any_sample} sample#wal-percentage-used\=%{number:wal_percentage_used}%{_any_sample} sample#rollback-from\=%{date("yyyy-MM-dd'T'HH:mmz"):rollback_from}%{_any_sample}

postgresLead source\=%{notSpace:source}\s+addon\=%{notSpace:addon}%{_any_sample} sample#current_transaction\=%{number:current_transaction}%{_any_sample} sample#db_size\=%{number:db_size}bytes%{_any_sample} sample#db-max-size\=%{number:db_max_size}bytes%{_any_sample} sample#db-size-percentage-used\=%{number:db_size_percentage_used}%{_any_sample} sample#tables\=%{number:tables}%{_any_sample} sample#active-connections\=%{number:active_connections}%{_any_sample} sample#waiting-connections\=%{number:waiting_connections}%{_any_sample} sample#max-connections\=%{number:max_connections}%{_any_sample} sample#connections-percentage-used\=%{number:connections_percentage_used}%{_any_sample} sample#index-cache-hit-rate\=%{number:index_cache_hit_rate}%{_any_sample} sample#table-cache-hit-rate\=%{number:table_cache_hit_rate}%{_any_sample} sample#load-avg-1m\=%{number:load_avg_1m}%{_any_sample} sample#load-avg-5m\=%{number:load_avg_5m}%{_any_sample} sample#load-avg-15m\=%{number:load_avg_15m}%{_any_sample} sample#read-iops\=%{number:read_iops}%{_any_sample} sample#write-iops\=%{number:write_iops}%{_any_sample} sample#max-iops\=%{number:max_iops}%{_any_sample} sample#iops-percentage-used\=%{number:iops_percentage_used}%{_any_sample} sample#tmp-disk-used\=%{number:tmp_disk_used}%{_any_sample} sample#tmp-disk-available\=%{number:tmp_disk_available}%{_any_sample} sample#memory-total\=%{number:memory_total}kB%{_any_sample} sample#memory-free\=%{number:memory_free}kB%{_any_sample} sample#memory-percentage-used\=%{number:memory_percentage_used}%{_any_sample} sample#memory-cached\=%{number:memory_cached}kB%{_any_sample} sample#memory-postgres\=%{number:memory_postgres}kB%{_any_sample} sample#wal-percentage-used\=%{number:wal_percentage_used}%{_any_sample} sample#rollback-from\=%{date("yyyy-MM-dd'T'HH:mmz"):rollback_from}%{_any_sample}

postgresMinimal source\=%{notSpace:source}\s+addon\=%{notSpace:addon}%{_any_sample} sample#current_transaction\=%{number:current_transaction}%{_any_sample} sample#db_size\=%{number:db_size}bytes%{_any_sample} sample#db-max-size\=%{number:db_max_size}bytes%{_any_sample} sample#db-size-percentage-used\=%{number:db_size_percentage_used}%{_any_sample} sample#tables\=%{number:tables}%{_any_sample} sample#active-connections\=%{number:active_connections}%{_any_sample} sample#waiting-connections\=%{number:waiting_connections}%{_any_sample} sample#max-connections\=%{number:max_connections}%{_any_sample} sample#connections-percentage-used\=%{number:connections_percentage_used}%{_any_sample} sample#index-cache-hit-rate\=%{number:index_cache_hit_rate}%{_any_sample} sample#table-cache-hit-rate\=%{number:table_cache_hit_rate}%{_any_sample} sample#load-avg-1m\=%{number:load_avg_1m}%{_any_sample} sample#load-avg-5m\=%{number:load_avg_5m}%{_any_sample} sample#load-avg-15m\=%{number:load_avg_15m}%{_any_sample} sample#read-iops\=%{number:read_iops}%{_any_sample} sample#write-iops\=%{number:write_iops}%{_any_sample} sample#max-iops\=%{number:max_iops}%{_any_sample} sample#iops-percentage-used\=%{number:iops_percentage_used}%{_any_sample} sample#tmp-disk-used\=%{number:tmp_disk_used}%{_any_sample} sample#tmp-disk-available\=%{number:tmp_disk_available}%{_any_sample} sample#memory-total\=%{number:memory_total}kB%{_any_sample} sample#memory-free\=%{number:memory_free}kB%{_any_sample} sample#memory-percentage-used\=%{number:memory_percentage_used}%{_any_sample} sample#memory-cached\=%{number:memory_cached}kB%{_any_sample} sample#memory-postgres\=%{number:memory_postgres}kB%{_any_sample} sample#wal-percentage-used\=%{number:wal_percentage_used}%{_any_sample}