Is it possible to index data that is external to the RDF? For example, suppose a triple's object is a link to an external file. Can the content of that file be indexed instead of the link value?
2 Answers
I suspect that the other answer misunderstood the question. The question is about external content, i.e., whether GraphDB's Lucene can index the content available at http://example.org rather than the RDF literal associated with it (and then return, in searches, the triple pointing to that content).
As far as I was able to test, no: this is not currently supported.
Absolutely. Lucene is a core part of GraphDB, and it offers the standard functionality that comes with standalone Lucene. The data will have to be represented as a string literal:
<http://www.example.org/> rdfs:label "An example webpage url."@EN .
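If the text you want to search lives in an external file, one possible workaround is to fetch it yourself and store it as a literal on the resource, so that the index has something to work with. This is only a sketch of that idea, not a GraphDB feature: `escape_literal`, `build_insert` and `fetch_text` are hypothetical helper names, and `rdfs:label` is used only to match the example triple (a dedicated predicate would probably be a better fit for long content).

```python
import urllib.request

# Predicate used for the literal; an assumption chosen to match the example triple.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text: str) -> str:
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (
        text.replace("\\", "\\\\")  # backslashes first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )


def build_insert(subject_uri: str, content: str, predicate: str = RDFS_LABEL) -> str:
    """Build a SPARQL update that attaches `content` to `subject_uri` as a literal."""
    return (
        f'INSERT DATA {{ <{subject_uri}> <{predicate}> '
        f'"{escape_literal(content)}" . }}'
    )


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download the external file and decode it as text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

You would call something like `build_insert(uri, fetch_text(uri))` and run the resulting update against your repository. Binary formats such as PDF would need a text-extraction step first, which is not shown here.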
Then you can configure a Lucene index:
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
INSERT DATA {
  luc:index luc:setParam "uris" .
  luc:include luc:setParam "literals" .
  luc:moleculeSize luc:setParam "1" .
  luc:includePredicates luc:setParam "http://www.w3.org/2000/01/rdf-schema#label" .
}
And once you have the configuration, you can create the index:
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
INSERT DATA {
  luc:myTestIndex luc:createIndex "true" .
}
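If you would rather send these updates from a script than paste them into the Workbench, here is a minimal sketch using only the Python standard library. It assumes the repository accepts SPARQL updates via a form-encoded POST to `<repository URL>/statements` (the RDF4J-style REST layout); the repository URL in the comment and the function name `update_request` are assumptions for illustration.

```python
import urllib.parse
import urllib.request


def update_request(repo_url: str, update: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a SPARQL update.

    repo_url is assumed to look like http://localhost:7200/repositories/myrepo
    """
    data = urllib.parse.urlencode({"update": update}).encode("utf-8")
    return urllib.request.Request(
        repo_url.rstrip("/") + "/statements",  # assumed update endpoint
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would actually send it; building the request separately makes it easy to inspect before anything touches the repository.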
And, given the index and your data, you can query it:
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
SELECT * {
  ?subj luc:myTestIndex "web*"
}
Since the query asks for subjects whose indexed text matches web*, you'll get <http://www.example.org/>. If other triples linked to this resource, they might also appear in the results.
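To run the query from code and pull out the matching subjects, something along these lines should work. It is a sketch under assumptions: the query is POSTed form-encoded to the repository URL, the response is requested in the standard SPARQL 1.1 JSON results format, and `query_request` and `bound_values` are made-up helper names.

```python
import json
import urllib.parse
import urllib.request
from typing import List


def query_request(repo_url: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a SPARQL query request asking for JSON results."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        repo_url,  # assumed form: http://localhost:7200/repositories/myrepo
        data=data,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def bound_values(results_json: str, var: str) -> List[str]:
    """Return the values bound to `var` in a SPARQL JSON results document."""
    doc = json.loads(results_json)
    return [
        binding[var]["value"]
        for binding in doc["results"]["bindings"]
        if var in binding  # a variable may be unbound in some rows
    ]
```

For the example data, `bound_values(response_text, "subj")` would be expected to contain `http://www.example.org/`, assuming the index was created as described.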
More information about the way in which GraphDB interacts with Lucene and its Full-Text-Search capabilities can be found within the GraphDB documentation.