I am trying to sort documents by a field inside the first object of an array. Each of the documents below has an array, and I want to retrieve the documents sorted by the city name of their first array element. For the documents below, I want the third document first, because its city name starts with "L" ('London'), then the second, "M" ('moscow'), then the first, "N" ('NYC').

Each document is a record that:

  1. has an array (called 'houses')
  2. the array contains objects, each holding an object (called 'address')
  3. the object has a field (called 'city')

I want to sort the documents by the address.city of the first array element.

    GET hello/_mapping
    {
      "hello": {
        "mappings": {
          "jack": {
            "properties": {
              "houses": {
                "type": "nested",
                "properties": {
                  "address": {
                    "properties": {
                      "city": {
                        "type": "text",
                        "fields": {
                          "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }

These are the documents that I indexed (shown as a search response):

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "hello",
        "_type": "jack",
        "_id": "2",
        "_score": 1,
        "_source": {
          "houses": [
            {
              "address": {
                "city": "moscow"
              }
            },
            {
              "address": {
                "city": "belgrade"
              }
            },
            {
              "address": {
                "city": "Sacramento"
              }
            }
          ]
        }
      },
      {
        "_index": "hello",
        "_type": "jack",
        "_id": "1",
        "_score": 1,
        "_source": {
          "houses": [
            {
              "address": {
                "city": "NYC"
              }
            },
            {
              "address": {
                "city": "PARIS"
              }
            },
            {
              "address": {
                "city": "TLV"
              }
            }
          ]
        }
      },
      {
        "_index": "hello",
        "_type": "jack",
        "_id": "3",
        "_score": 1,
        "_source": {
          "houses": [
            {
              "address": {
                "city": "London"
              }
            }
          ]
        }
      }
    ]
  }
}

1 Answer

Try this. Of course, add a check inside the script if the field could be empty. Note that it could be pretty slow, because Elasticsearch won't have this value indexed. Adding a dedicated main address field would be much faster, and would be the right way to do it.

{
  "sort": {
    "_script": {
      "script": "params._source.houses[0].address.city",
      "type": "string",
      "order": "asc"
    }
  }
}

You have to use _source instead of doc[yourfield] because you don't know in which order Elasticsearch stores the values of your array.
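A check for a missing first element could look something like the sketch below. It assumes Painless is the default script language (as in Elasticsearch 5.x and later) and that documents without a usable first city should fall back to an empty string, which puts them first in ascending order. Both are assumptions, not something stated in this answer. The lower-casing is also an addition: a string sort is expected to compare raw values, which would place 'NYC' before 'moscow'.

```json
{
  "sort": {
    "_script": {
      "script": "/* assumption: '' is the fallback for a missing first city */ def h = params._source.houses; if (h == null || h.isEmpty() || h[0].address == null || h[0].address.city == null) { return ''; } return h[0].address.city.toLowerCase();",
      "type": "string",
      "order": "asc"
    }
  }
}
```

This version still reads _source for every matching document, so it only avoids the null pointer; it does not make the sort any cheaper.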

EDIT: to check that the field exists:

{
  "query": {
    "nested": {
      "path": "houses",
      "query": {
        "bool": {
          "must": [
            {
              "exists": {
                "field": "houses.address"
              }
            }
          ]
        }
      }
    }
  },
  "sort": {
    "_script": {
      "script" : "params._source.houses[0].address.city",
      "type": "string",
      "order": "asc"
    }
  }
}

3 Comments

It returns a null pointer exception because some records have no value at index [0].
Hi, thanks a lot. It works when the collection is small, but in my production environment, where there are 5M records, it returns a timeout: { "statusCode": 504, "error": "Gateway Timeout", "message": "Client request timeout" }
Yes, that's expected. As I said, Elasticsearch does not index a field it does not need, so the script has to read every one of your records. You have to add a new field to your type (main_address = the address[0] you want) and query this new indexed field. Elasticsearch will then respond almost immediately.
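The suggestion in the last comment might be sketched as follows, in the same console style as the mapping request in the question. The field name main_city, the lower-cased value, and the choice to fill it from the application at index time are all assumptions for illustration. The request paths assume the typed 5.x/6.x API that the question's mapping implies.

```
PUT hello/_mapping/jack
{
  "properties": {
    "main_city": { "type": "keyword" }
  }
}

PUT hello/jack/3
{
  "main_city": "london",
  "houses": [
    { "address": { "city": "London" } }
  ]
}

GET hello/jack/_search
{
  "sort": [
    { "main_city": { "order": "asc" } }
  ]
}
```

Existing documents would need to be reindexed with the new field populated before the sort returns them in the intended order. Documents that lack main_city sort last by default.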
