Is it valid in HarfBuzz to specify repeating cluster IDs in the Unicode input?

I'm writing code which needs to find the advances for each grapheme in the original text (e.g., for cursor positioning and selection), and I would like to use the Unicode grapheme boundaries rather than HarfBuzz's shaping boundaries. It seems cluster IDs are necessary for this.
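To make the goal concrete, here is a minimal sketch of the per-cluster bookkeeping, assuming the shaped output is available as (cluster, x_advance) pairs. In the C API these would come from `hb_glyph_info_t.cluster` and `hb_glyph_position_t.x_advance`. The function name `advances_by_cluster` and the sample numbers are made up for illustration; nothing here calls HarfBuzz.

```python
def advances_by_cluster(glyphs):
    """Sum the x-advances of all glyphs that share a cluster value.

    `glyphs` is an iterable of (cluster, x_advance) pairs, standing in for
    the cluster and x_advance fields of the shaped output.
    """
    totals = {}
    for cluster, x_advance in glyphs:
        totals[cluster] = totals.get(cluster, 0) + x_advance
    return totals


# Hand-written sample data (not real shaper output): three glyphs,
# two of which carry cluster value 1.
sample = [(0, 600), (1, 550), (1, 0)]
print(advances_by_cluster(sample))
```

Grouping by cluster value rather than by position does not depend on the order in which the glyphs are returned.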

I assume that I need to manually specify cluster IDs in the hb_buffer before calling hb_shape(), and that I have a choice about how to do this. (What happens if I don't write these fields?) I think it would make the most sense to give all code points within a grapheme the same cluster ID, so that they are guaranteed to have the same cluster ID in the shaped output. If the cluster ID is just the grapheme's index in the (conceptual) list of graphemes, then I can easily go backwards from a glyph to a string position, as I'm keeping an array of all the graphemes' UTF-8 offsets.
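The scheme I have in mind can be sketched as follows, assuming the grapheme boundaries are already known as a sorted list of UTF-8 byte offsets. Each resulting (codepoint, cluster) pair is the kind of argument that `hb_buffer_add()` takes in the C API. The helper names `assign_grapheme_clusters` and `cluster_to_byte_offset` are hypothetical, and the sketch does not call HarfBuzz, so it does not show whether repeated values are accepted.

```python
from bisect import bisect_right


def assign_grapheme_clusters(text, grapheme_offsets):
    """Pair each code point with the index of the grapheme containing it.

    `grapheme_offsets` holds the UTF-8 byte offset at which each grapheme
    starts, in ascending order; the first entry is assumed to be 0.
    """
    pairs = []
    byte_pos = 0
    for ch in text:
        # Index of the last grapheme that starts at or before this byte.
        cluster = bisect_right(grapheme_offsets, byte_pos) - 1
        pairs.append((ord(ch), cluster))
        byte_pos += len(ch.encode("utf-8"))
    return pairs


def cluster_to_byte_offset(cluster, grapheme_offsets):
    """Map a cluster value from the shaped output back to a string position."""
    return grapheme_offsets[cluster]


# "e" followed by U+0301 COMBINING ACUTE ACCENT is one grapheme; "x" is another.
text = "e\u0301x"
offsets = [0, 3]  # "e" is 1 byte and U+0301 is 2 bytes, so "x" starts at byte 3
for codepoint, cluster in assign_grapheme_clusters(text, offsets):
    print(f"U+{codepoint:04X} -> cluster {cluster}")
```

Here both code points of the first grapheme receive cluster 0, which is exactly the contiguous duplication I'm asking about.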

My problem is that the documentation is not clear on whether it's acceptable to (contiguously) duplicate cluster IDs in the Unicode input. I can't find any example that does this (or really any example at all that shows setting the cluster IDs).

Is this permitted? Is this the best way to accomplish this?

asked Mar 1, 2024 at 7:37

trbabb

2,135 reputation, 1 gold badge, 23 silver badges, 41 bronze badges
