11 questions
0 votes, 1 answer, 198 views
CLR function for counting combining characters in SQL Server
The following C# method counts the characters in a string, treating each base character plus its combining characters as one unit (a grapheme cluster). Here it is:
public static class StringExtensions
{
public static SqlInt32 GetStrLength(this ...
1 vote, 2 answers, 484 views
Difficulty getting grapheme lengths with ICU in C++
Finding examples for ICU is difficult, but here is what I'm trying to do. I need to be able to carve graphemes out of strings. In order to do this, I need to get the sequence of grapheme lengths in ...
8 votes, 2 answers, 2k views
Maximum number of codepoints in a grapheme cluster
I am using the C++ ICU library. I wish to split a utf-8 string into approximately equal chunks. However, I want the chunks to be demarcated at grapheme cluster boundaries. I do not wish to convert my ...
0 votes, 1 answer, 1k views
How to convert String index to character index in Dart
If I have an arbitrary String like this:
final family = '\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}'; // 👨‍👩‍👧
final myString = 'Let me introduce my $family to you.';
And I know the String index ...
0 votes, 1 answer, 282 views
Can word2vec deal with sequence of number?
I am very new to network embedding, especially for the attributed network embedding. Currently, I am studying the node2vec algorithm. I think the process is
RandomWalk with p and q
Fed the walks to ...
8 votes, 2 answers, 2k views
Handling grapheme clusters in Dart
From what I can tell Dart does not have support for grapheme clusters, though there is talk of supporting it:
Dart Strings should support Unicode grapheme cluster operations #34
Minimal Unicode ...