In GraphDB 10.6 I need to search across English and French words while ignoring accents. In other words, I am looking for ASCII folding.
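To make the desired behavior concrete, here is a minimal Python sketch of accent-insensitive matching. It is only an approximation for illustration: the helper names `fold_accents` and `matches` are hypothetical, and it relies on Unicode decomposition rather than Lucene's ASCIIFoldingFilter, which covers more characters.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Approximate ASCII folding: decompose, drop combining marks, lowercase."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only the characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def matches(query: str, label: str) -> bool:
    """Return True if the folded query equals one whitespace-separated folded token of label."""
    return fold_accents(query) in fold_accents(label).split()
```

With this, a query typed without accents such as "eleve" would match a label containing "élève", which is the behavior I want from the index.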

I tried the following SPARQL to create the Lucene connector, but it fails with: 500: Error - Unable to create connector: Unable to init Lucene index.

PREFIX luc: <http://www.ontotext.com/connectors/lucene#>
PREFIX luc-index: <http://www.ontotext.com/connectors/lucene/instance#>

INSERT DATA {
  luc-index:myindex luc:createConnector 
  '''
{
  "fields": [
    {
      "fieldName": "label",
      "fieldNameTransform": "predicate.localName", 
      "propertyChain": ["$literal"],
      "ignoreInvalidValues": true
    }
  ],
  "languages": [],
  "types": ["$any"],
  "analyzer": {
    "tokenizer": "org.apache.lucene.analysis.standard.StandardTokenizerFactory",
    "filters": [
      "org.apache.lucene.analysis.standard.StandardFilterFactory",
      "org.apache.lucene.analysis.lowercase.LowerCaseFilterFactory",
      "org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilterFactory"
    ]
  }
}
''' .
}

I cannot find documentation on selecting among the standard analyzers other than this GraphDB page: https://graphdb.ontotext.com/documentation/10.6/lucene-graphdb-connector.html

How can I set the analyzer to one of the existing choices, or must I create a custom one? I feel like there must be an existing one that includes ASCII folding!
