I have an Elasticsearch cluster deployed within an AWS VPC. It holds millions of records, each with a timestamp stored as Unix epoch milliseconds (new Date().getTime()). I am trying to pull one record per time slot, based on min/max hour and minute values.

Index Mapping: { timestamp: "date", ...rest of record }

Elasticsearch query:

let params = {
  query: {
    bool: {
      must: [{
          range: {
            timestamp: {
              gte: (unix date),
              lte: (unix date)
            }
          }
        },
        {
          script: {
            script: {
              source: "long datestamp = doc['timestamp'].value.getMillis(); " +
                "Date dt = new java.util.Date(datestamp*1L); " +
                "Calendar instance = Calendar.getInstance(); " +
                "instance.setTime(dt); " +
                "int hod = instance.get(Calendar.HOUR_OF_DAY); " +
                "int tod = instance.get(Calendar.MINUTE);  " +
                "if (hod >= params.hourMin && hod <= params.hourMax && (hod === params.hourMin && tod >= params.timeMin || hod === params.hourMax && tod <= params.timeMax)) { return true; } else { return false }",
              params: {
                hourMin: 7,
                hourMax: 8,
                timeMin: 30,
                timeMax: 10
              }
            }
          }
        }
      ]
    }
  },
  from: 0,
  size: 500
};
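A side note on the condition in the script: as written, it only matches documents whose hour equals hourMin or hourMax, so a wider window (say 7:30 to 9:10) would drop everything in hour 8, and a window inside a single hour would match every minute of that hour. A minimal sketch of one alternative, comparing minutes since midnight, is below in plain JavaScript. The helper name inWindow is hypothetical, and it assumes the window does not wrap past midnight. The same arithmetic could be ported into the Painless source.

```javascript
// Hypothetical helper, not part of the original query.
// Assumption: the window does not cross midnight (start <= end).
function inWindow(hod, tod, p) {
  const t = hod * 60 + tod;                    // minutes since midnight for the document
  const start = p.hourMin * 60 + p.timeMin;    // window start in minutes since midnight
  const end = p.hourMax * 60 + p.timeMax;      // window end in minutes since midnight
  return t >= start && t <= end;
}

// Same parameters as the query above: 07:30 through 08:10.
const windowParams = { hourMin: 7, hourMax: 8, timeMin: 30, timeMax: 10 };
console.log(inWindow(7, 45, windowParams));
console.log(inWindow(8, 30, windowParams));
```

Comparing a single number keeps the boundary cases (first hour, last hour, hours in between) in one expression.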

Issue: I often run into the following error while searching:

  1. "dynamic method [java.lang.Long, getMillis/0] not found"

It shows up on roughly every fourth or fifth query.

Question:

  1. Is there a better way? I have pored over the Elasticsearch docs regarding intervals, histograms, etc. and came up with the query above. I am not sure whether it is the most efficient or the most robust method.

  2. If this is a community-accepted approach to finding records within an interval, how do I mitigate the errors I am encountering? Do I skip over a specific record, or reformat the Unix timestamp another way?

Appreciate your support ahead of time.

  • Are you absolutely sure that all your documents have a non-null timestamp field? Commented Feb 25, 2021 at 9:51
  • A MUCH better way would be to store the hour and time components as new fields into each of your documents. Script queries can be awfully slow depending on the volume of data you have. Commented Feb 25, 2021 at 9:52
  • I did search for documents where the timestamp field does not exist, based on this question: discuss.elastic.co/t/…. It returned an empty array of results. Commented Feb 26, 2021 at 17:04
  • Agreed with hour and time being added to each record to make searching easier. I will implement this moving forward. It's the pain of reindexing existing records I am attempting to avoid. Commented Feb 26, 2021 at 17:06
  • You don't need to reindex anything. You can just do it with the update by query API endpoint and a small script. Commented Feb 26, 2021 at 17:11
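The update-by-query suggestion from the comments could look roughly like the sketch below. It is not a tested migration. The field names hour, minute and minuteOfDay are hypothetical, the script assumes _source.timestamp holds epoch milliseconds, and it uses UTC (the original script used the JVM default zone via Calendar.getInstance(), which may differ). The body would be sent to the index's _update_by_query endpoint. Once the fields exist, the time-of-day filter no longer needs a script.

```javascript
// Request body for POST <index>/_update_by_query.
// Assumptions: _source.timestamp is epoch milliseconds; UTC is the desired zone;
// hour, minute and minuteOfDay are hypothetical new fields.
const updateBody = {
  script: {
    lang: "painless",
    source:
      "long ms = ctx._source.timestamp; " +
      "ZonedDateTime dt = ZonedDateTime.ofInstant(Instant.ofEpochMilli(ms), ZoneId.of('Z')); " +
      "ctx._source.hour = dt.getHour(); " +
      "ctx._source.minute = dt.getMinute(); " +
      "ctx._source.minuteOfDay = dt.getHour() * 60 + dt.getMinute();"
  },
  // Only touch documents that do not have the new field yet.
  query: { bool: { must_not: { exists: { field: "minuteOfDay" } } } }
};

// Script-free replacement for the original query, using the new field.
// gte / lte are the same epoch-millisecond bounds as in the question.
function timeOfDayQuery(gte, lte, p) {
  return {
    bool: {
      filter: [
        { range: { timestamp: { gte: gte, lte: lte } } },
        {
          range: {
            minuteOfDay: {
              gte: p.hourMin * 60 + p.timeMin,
              lte: p.hourMax * 60 + p.timeMax
            }
          }
        }
      ]
    }
  };
}
```

With the parameters from the question (07:30 through 08:10), the second range clause covers minuteOfDay values 450 through 490.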
