I want to query Wikidata entities and their labels in multiple languages, but for some reason querying the labels is very slow.

My base query looks like this (find 3 life forms that have unicode characters associated with them), which takes around 200ms to run:

SELECT ?lifeform ?unicode_character WHERE {
  ?lifeform wdt:P31 wd:Q16521;
            wdt:P487 ?unicode_character.
}
LIMIT 3

What I want is to add labels in 4 languages to the result: English (en), German (de), Spanish (es) and French (fr). In my opinion that shouldn't make the query more difficult, because it is just additional information on the results. It doesn't change the number of results or which results are found.

Here is what an answer would look like:

| lifeform | unicode_character | label_en | label_de | label_fr | label_es |
|---|---|---|---|---|---|
| wd:Q80117 | 🍤 | Caridea | Caridea | Caridea | camarón |
| wd:Q726 | 🐎 | horse | Hauspferd | cheval | caballo |
| wd:Q71516 | 🐪 | Camelus dromedarius | Dromedar | dromadaire | dromedario |

My first approach was based on this Stack Overflow answer:

SELECT ?lifeform ?unicode_character ?label_en ?label_de ?label_fr ?label_es WHERE {
  ?lifeform wdt:P31 wd:Q16521;
            wdt:P487 ?unicode_character.
  
  OPTIONAL { ?lifeform rdfs:label ?label_en filter (lang(?label_en) = "en"). }
  OPTIONAL { ?lifeform rdfs:label ?label_de filter (lang(?label_de) = "de"). }
  OPTIONAL { ?lifeform rdfs:label ?label_fr filter (lang(?label_fr) = "fr"). }
  OPTIONAL { ?lifeform rdfs:label ?label_es filter (lang(?label_es) = "es"). }
}
LIMIT 3

This works, but brings up the time to almost 7000ms.

So I tried this answer next, resulting in this query:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?lifeform ?unicode_character ?label_en ?label_de ?label_fr ?label_es WHERE {
  ?lifeform wdt:P31 wd:Q16521;
            wdt:P487 ?unicode_character.
  ?lifeform rdfs:label ?label_en, ?label_de, ?label_fr, ?label_es.
  FILTER(
    ((LANG(?label_de)) = "de") && 
    ((LANG(?label_en)) = "en") &&
    ((LANG(?label_fr)) = "fr") &&
    ((LANG(?label_es)) = "es"))
}
LIMIT 3

This doesn't even terminate (but I think it is close to terminating, because if I reduce it to 2 or 3 languages it does).

(I also did ask the AI for help and it has a lot of suggestions, none of which was even a valid query.)

Honestly, I am a bit baffled by how hard this seems to be. I get that searching for patterns in a graph might be hard, but this seems to be a very easy task: just look up a few associated values for those 3 entities. How does this justify a roughly 35x increase in time (from around 200ms to almost 7000ms)? Adding other variables to the query, like optional images and parent taxons on the ?lifeform, doesn't seem to be such a hit on performance.

Later I hope to increase the size of the request (let's say 10 items with 10 languages each). So I am interested in a better approach, not a trick that makes this one specific query work but doesn't generalize.

So my questions are:

  • Why do my queries take so long to complete when the base query was reasonably fast?
  • How can I query labels in multiple languages in a more performant way?
  • hi, looks like the query optimizer alone isn't that smart for your query. Commented Sep 15 at 6:42
  • You could work around this using a subquery, i.e. SELECT ?lifeform ?unicode_character ?label_en ?label_de ?label_fr ?label_es WHERE { { SELECT * { ?lifeform wdt:P31 wd:Q16521; wdt:P487 ?unicode_character. } LIMIT 3 } OPTIONAL { ?lifeform rdfs:label ?label_en filter (lang(?label_en) = "en"). } OPTIONAL { ?lifeform rdfs:label ?label_de filter (lang(?label_de) = "de"). } OPTIONAL { ?lifeform rdfs:label ?label_fr filter (lang(?label_fr) = "fr"). } OPTIONAL { ?lifeform rdfs:label ?label_es filter (lang(?label_es) = "es"). } } Commented Sep 15 at 6:43
  • @UninformedUser This works and I don't get it. I guess I was spoiled by advanced SQL-query-optimizers. Thank you. Work this comment into an answer and I will gladly accept it as the correct one. Commented Sep 15 at 18:55
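
Since the question asks for something that generalizes to more languages, the subquery workaround from the comment could be generated rather than typed by hand. The following is only a rough sketch: `build_label_query` is a hypothetical helper name, not part of any Wikidata tooling, and it assumes the same two triple patterns (`wdt:P31 wd:Q16521` and `wdt:P487`) as the original query. The idea, as far as the comment suggests, is that the inner `SELECT ... LIMIT` pins down the few entities first, so the `OPTIONAL` label lookups only have to run for those rows.

```python
import re


def build_label_query(languages, limit=3):
    """Build a SPARQL query that limits the entities first, then attaches labels.

    Hypothetical helper for illustration; it only assembles a query string.
    """
    if not languages:
        raise ValueError("need at least one language code")
    for lang in languages:
        # Guard against pasting arbitrary text into the query string.
        if not re.fullmatch(r"[A-Za-z]+(-[A-Za-z0-9]+)*", lang):
            raise ValueError(f"unexpected language code: {lang!r}")

    # SPARQL variable names cannot contain '-', so 'zh-hans' becomes ?label_zh_hans.
    var = {lang: "?label_" + lang.replace("-", "_").lower() for lang in languages}

    # One OPTIONAL block per language, same shape as in the comment above.
    optionals = "\n".join(
        f"  OPTIONAL {{ ?lifeform rdfs:label {var[lang]} "
        f'FILTER(LANG({var[lang]}) = "{lang}") }}'
        for lang in languages
    )
    select_vars = " ".join(["?lifeform", "?unicode_character", *var.values()])

    return (
        f"SELECT {select_vars} WHERE {{\n"
        "  {\n"
        "    SELECT ?lifeform ?unicode_character WHERE {\n"
        "      ?lifeform wdt:P31 wd:Q16521;\n"
        "                wdt:P487 ?unicode_character.\n"
        "    }\n"
        f"    LIMIT {int(limit)}\n"
        "  }\n"
        f"{optionals}\n"
        "}"
    )
```

Calling `print(build_label_query(["en", "de", "fr", "es"], limit=3))` should produce a query with the same structure as the one in the comment, and passing 10 language codes with `limit=10` gives the larger variant. Whether that larger query stays fast on the public endpoint is something I have not measured.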