Questions tagged [unicode]
Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems.
504 questions
0
votes
1
answer
113
views
Is there a list of every scancode that Linux uses?
I am making another remapper like xkb, sxhkd, xmodmap etc. because I don't like the other ones, and in this one I want a simpler, terser syntax that I find nice to use and to make an API that ...
1
vote
1
answer
174
views
Gibberish characters in EFI variables
Do gibberish characters found in EFI variables serve any purpose?
Out of curiosity, I am trying to read out EFI variables, specifically the ones related to the booting mechanism.
Under /sys/firmware/efi/...
6
votes
3
answers
568
views
How to make Perl half/full width-insensitive regular expressions?
In Perl, /a/i matches both A and a, so I don't have to write /A|a/.
What is the easy way to write /４|4/ ?
Yes, I'm talking about
$ unicode ４ 4|grep U+
U+FF14 FULLWIDTH DIGIT FOUR
U+0034 DIGIT FOUR
...
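One possible approach to the half/full-width case above, sketched in Python rather than Perl: normalize to NFKC first, which folds U+FF14 FULLWIDTH DIGIT FOUR to U+0034 DIGIT FOUR, then match with an ordinary pattern. The helper name `width_insensitive_search` is hypothetical, and NFKC also folds other compatibility characters, which may or may not be wanted.

```python
import re
import unicodedata

def width_insensitive_search(pattern: str, text: str):
    """Search after NFKC-normalizing pattern and text (hypothetical helper).

    Assumes a simple pattern; match offsets refer to the normalized text.
    """
    folded_pattern = unicodedata.normalize("NFKC", pattern)
    folded_text = unicodedata.normalize("NFKC", text)
    return re.search(folded_pattern, folded_text)

fullwidth_four = "\uff14"  # U+FF14 FULLWIDTH DIGIT FOUR
halfwidth_four = "4"       # U+0034 DIGIT FOUR

# Either form of the pattern finds either form in the text.
print(bool(width_insensitive_search(halfwidth_four, "x" + fullwidth_four)))
print(bool(width_insensitive_search(fullwidth_four, "x" + halfwidth_four)))
```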
0
votes
2
answers
208
views
How to insert text before the first line of a UTF-8 with BOM file
This question is closely related to: How to insert text before the first line of a file?. I deliberately made the title similar to that question to highlight this.
Except the target file is UTF-8 with ...
0
votes
0
answers
73
views
Cross-platform method of checking if using terminal emulator or tty
I am looking for a cross platform way to check if I am using a terminal emulator (with support for unicode characters) or a TTY session (with only support for ASCII chars). I initially tried to use if ...
2
votes
1
answer
732
views
Which interpreter for "Unicode text, UTF-8 text executable"
I'm trying to set up a keybinding for an executable which is in my home. For this, I set the command:
sh -c '\"/path/to/the/executable\" --options'
But, it does not work, and, when I'm ...
1
vote
0
answers
59
views
Ignore Accent Differences in Zsh Autocomplete
Suppose I have a directory named cálculo in the current directory. How can I autocomplete its name after typing the starting characters without the accent?
$ cd calc<tab>
$ cd cálculo/
I failed ...
2
votes
0
answers
175
views
How do I disable UTF-8 in an xterm (or X, really)?
I have a system running Debian unstable where I don't want to have UTF-8 in my xterms (or at all). But I recently discovered that somehow I now have UTF-8 in my xterms and other windows. It might have ...
2
votes
2
answers
359
views
Search and replace composed Unicode characters
I have a deep folder structure on a Debian machine
where the directory names and the filenames
contain some "special" characters (ä,ö,ü).
However, these are not in "ISO-8859-1"
...
0
votes
0
answers
91
views
Terminal: Help understanding behavior with UTF-8 text
I am trying to understand the following behavior I am observing on my Ubuntu system. Consider the following two files:
$ hexdump -C 1.txt
00000000 d9 82 d8 a8 d8 a7 d9 86 d9 8a 5e d9 84 d9 86 d8 |.....
1
vote
1
answer
81
views
Crossmark symbol (\u274c) doesn't work in debian 12
I have moved from Ubuntu 22.04 to Debian 12. I have a bash function that outputs a crossmark if a command failed and a checkmark if it succeeded. The checkmark works, but the crossmark doesn't.
Here is ...
1
vote
1
answer
76
views
How to use unix `mv` to rename files with unicode spaces (not U+20)?
$ ls cn*
cn blah blah.txt
$ ls cn\ *
ls: cannot access 'cn *': No such file or directory
$ ls cn*|hexdump -C
00000000 63 6e e2 80 85 62 6c 61 68 c2 a0 62 6c 61 68 2e |cn...blah..blah.|
00000010 74 ...
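To see which non-ASCII characters hide in that filename, the first hexdump row can be decoded directly. This is only a sketch in Python covering the 16 bytes visible above; the variable names are made up for illustration.

```python
import unicodedata

# First 16 bytes, copied from the hexdump row above.
raw = bytes.fromhex("63 6e e2 80 85 62 6c 61 68 c2 a0 62 6c 61 68 2e")
name = raw.decode("utf-8")

# List every non-ASCII character with its code point and Unicode name.
non_ascii = [
    (f"U+{ord(ch):04X}", unicodedata.name(ch))
    for ch in name
    if ord(ch) > 0x7F
]
print(non_ascii)
# [('U+2005', 'FOUR-PER-EM SPACE'), ('U+00A0', 'NO-BREAK SPACE')]
```

Neither character is U+0020, which is consistent with `ls cn\ *` finding nothing.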
2
votes
1
answer
723
views
Can awk be told to count the character string length rather than byte string length for '%10s' printf formats?
Try this for an output of |Ü| X|:
echo 'Ü X' | awk '{printf("|% 2s|% 2s|\n", $1, $2)}'
Obviously awk counts the byte length, not the character length of the Ü, so the count is 2 and no left ...
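The byte-versus-character difference described here can be reproduced outside awk. The following Python sketch pads the same two fields to width 2, once measuring UTF-8 bytes and once measuring characters; `pad_by_bytes` and `pad_by_chars` are hypothetical helpers, not awk features.

```python
fields = "\u00dc X".split()  # "Ü" (precomposed U+00DC) and "X"

def pad_by_bytes(s: str, width: int) -> str:
    """Left-pad as if the width were measured in UTF-8 bytes."""
    missing = max(0, width - len(s.encode("utf-8")))
    return " " * missing + s

def pad_by_chars(s: str, width: int) -> str:
    """Left-pad with the width measured in characters (code points)."""
    missing = max(0, width - len(s))
    return " " * missing + s

byte_row = "|" + "|".join(pad_by_bytes(f, 2) for f in fields) + "|"
char_row = "|" + "|".join(pad_by_chars(f, 2) for f in fields) + "|"
print(byte_row)  # |Ü| X|
print(char_row)  # | Ü| X|
```

Counting code points is still not the same as display width: combining marks and wide East Asian characters would need separate handling.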
0
votes
1
answer
173
views
Why is MB_CUR_MAX 6 instead of 4 for UTF-8? (Linux, glibc)
MB_CUR_MAX is defined by glibc as 'a positive integer expression that is the maximum number of bytes in a multibyte character in the current locale.'
If I print the value I get 6. I assume that this ...
0
votes
2
answers
367
views
Listing filenames with special characters
I have a zsh shell (with oh-my-zsh default config). When I ls filenames with special characters, they are printed as:
''$'\316\262''=0.35-L=32-m=10.jld2'
This should be:
β=0.35-L=32-m=10.jld2
but the ...
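As a quick check that the quoted form and the expected name really are the same bytes, the two octal escapes can be decoded as UTF-8. A small Python sketch:

```python
# The octal escapes \316\262 from the quoted filename, as raw bytes.
escaped = bytes([0o316, 0o262])
decoded = escaped.decode("utf-8")

print(escaped.hex())                     # ceb2
print(decoded, f"U+{ord(decoded):04X}")  # β U+03B2
```

So the listing shows the correct bytes; only the quoting style differs from the expected output.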
1
vote
0
answers
119
views
Ctrl-Shift-U requires *extra* U in Ubuntu 23.04 Cinnamon?
I'm running a new install of Ubuntu 23.04 with cinnamon desktop 5.6.7
Typing Ctrl-Shift-u in a terminal does nothing unless the next character is another u; then the underlined u appears and I can ...
1
vote
1
answer
106
views
Expand tabs in file with utf8 characters
I use expand to expand tabs to spaces. For utf8 files expand doesn't work correctly. E.g. in ć\ta tab is expanded to 6 spaces while in a\ta to 7 spaces.
How do I make it work for utf8 files?
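The 6-versus-7 difference matches what happens when columns are counted in bytes instead of characters. The Python sketch below reproduces both counts with the built-in `expandtabs` methods; it illustrates the effect and is not a replacement for `expand`.

```python
text = "\u0107\ta"  # "ć", a tab, then "a"

# str.expandtabs advances one column per character.
by_chars = text.expandtabs(8)

# bytes.expandtabs advances one column per byte, and "ć" is two bytes in UTF-8.
by_bytes = text.encode("utf-8").expandtabs(8).decode("utf-8")

print(by_chars.count(" "))  # 7
print(by_bytes.count(" "))  # 6
```

Character counting still assumes every character is one column wide, which does not hold for combining marks or wide characters.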
2
votes
2
answers
230
views
Unicode Supplementary Multilingual Plane (Plane 1) glyphs in xterm
I'm trying to display Unicode Supplementary Multilingual Plane (Plane 1) glyphs in xterm. Those glyphs are in the U+010000..U+01FFFF range (https://unifoundry.com/pub/unifont/unifont-15.0.01/...
4
votes
4
answers
550
views
Collect chars from strings and print their unicode
Context (skip, if you don't care; read, if you suspect I'm totally on the wrong track)
For an embedded system with small memory, I want to generate fonts which contain only those glyphs actually ...
0
votes
0
answers
273
views
Script for awscli check not working with crontab schedule
I have written a small code snippet to check the aws cli version
#!/usr/bin/env bash
if [ -e "/usr/local/bin/aws" ];
then
myAWS="/usr/local/bin/aws"
else
...
2
votes
2
answers
993
views
How to combine settings from multiple locales in Linux?
When I installed Linux I set my locale to en_US.UTF-8. However I want to override some but not all of the settings in that locale. Specifically, I would like the Measurement to be Metric instead of ...
4
votes
1
answer
318
views
How do I create a zip that preserves unicode character composition on linux?
I'm on Debian. I have a file called Sóanr.jpg. According to https://emojidissector.com/, this is made of the following code points:
S 0053 LATIN CAPITAL LETTER S
o 006F LATIN SMALL LETTER O
...
3
votes
1
answer
318
views
Different encoding/Unicode interpretation using terminal vs using shell script
I was working on a keymap script (mapping keys from one language's keyboard layout to another). After a lot of time spent trying to get everything working, I found out that different characters are ...
0
votes
0
answers
167
views
Is there a way to remove specific emoji from being rendered in any application while using Cinnamon desktop?
I am slightly annoyed with some emojis.
So I was wondering, how could I remove/prevent some emojis from being rendered at all?
Replacing them with some other emoji like cute cat face could work too.
...
1
vote
0
answers
46
views
Cannot use unicode shortcut on non-english layouts
I’m using US and RU layouts. I can use Ctrl+Shift+u when I have the US layout selected, but when I try to use it with the RU layout selected, it just doesn’t work. Didn’t find anything related to it in ...