
I'm seeing high response times on Elasticsearch searches (took is around 5000 ms), but when I check the profile output, the query time is low (~15 ms). I think this only happens when the request rate is high, yet the CPU is far from saturated (~50%).
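To make the gap concrete, here is a minimal Python sketch of how took can be compared with the profiled query time. It assumes the response was parsed from JSON into a dict and follows the Profile API layout (profile.shards[].searches[].query[].time_in_nanos). The helper names and the sample numbers are only illustrative, not real output from my cluster.

```python
def profiled_query_ms(response):
    """Return the profiled query time per shard, in milliseconds.

    Only the top-level query entries are summed, since each one
    already includes the time of its children.
    """
    per_shard = {}
    for shard in response.get("profile", {}).get("shards", []):
        nanos = 0
        for search in shard.get("searches", []):
            nanos += sum(q["time_in_nanos"] for q in search.get("query", []))
        per_shard[shard["id"]] = nanos / 1e6
    return per_shard


def unexplained_ms(response):
    """took minus the slowest shard's profiled query time."""
    slowest = max(profiled_query_ms(response).values(), default=0.0)
    return response["took"] - slowest


# Illustrative response with the rough numbers from this question.
sample = {
    "took": 5000,
    "profile": {
        "shards": [
            {
                "id": "[node][index][0]",  # hypothetical shard id
                "searches": [
                    {"query": [{"type": "MatchAllDocsQuery",
                                "time_in_nanos": 15_000_000}]}
                ],
            }
        ]
    },
}

gap = unexplained_ms(sample)
```

With numbers like mine, almost all of took is spent outside the part the query profile accounts for.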

I'm requesting many items (size=4096), but I set _source=false to exclude the document data. If I lower size to 10, responses are very fast (took=35). With from=4096 and size=10, responses are still very fast. I'm using track_total_hits=true, but removing it doesn't seem to make any difference. The index is quite small (<2GB) and should definitely fit in cache (96GB RAM). Heap usage is around 75% most of the time. CPU usage varies between 20% and 80%.
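For reference, the three request variants look roughly like this when written as search bodies. This is a sketch in Python: the match_all query is a placeholder for my real query, and the function and variant names are made up for illustration.

```python
import json


def search_body(size, from_=0, track_total_hits=True):
    """Build a search request body with the parameters described above."""
    return {
        "from": from_,
        "size": size,
        "_source": False,             # do not return document data
        "track_total_hits": track_total_hits,
        "profile": True,              # include profiling output
        "query": {"match_all": {}},   # placeholder, not my real query
    }


VARIANTS = {
    "slow_large_page": search_body(size=4096),            # took ~5000 ms
    "fast_small_page": search_body(size=10),              # took ~35 ms
    "fast_deep_page": search_body(size=10, from_=4096),   # still fast
}

for name, body in VARIANTS.items():
    print(name, json.dumps(body))
```

The deep-page variant still has to rank more than 4096 hits but only returns 10 of them, so the slowdown seems tied to the number of hits returned rather than to the number of hits the query has to collect.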

I've looked at perf events and I'm seeing a lot of CPU time (40 cores at ~30% in osq_lock()) coming from locks taken during ZFS reads, so I suspect this might be a problem with the filesystem.

[Screenshot: perf top during slow response times]

I would not expect filesystem reads to be the bottleneck. I'm a bit surprised it's reading from disk at all; I assumed that would not be necessary with _source=false, since I'm really only interested in the document IDs. Could it be that ZFS is just not suited for Elasticsearch?

Software versions:

  • Elasticsearch: 7.10.1 (official Docker image)
  • ZFS: 0.7.8
