Incorrect shell output encoding

Question

I'm not an expert in Linux, but I am following the development of software that runs on a Buildroot-based Linux system. The device can only be used through the program's graphical interface, or by accessing the shell over a serial cable or SSH. End users of the device are given the option to connect it to their Wi-Fi network.

The program is written in C++, and to perform the scanning and obtain the list of available Wi-Fi networks, a shell command is executed.

My problem lies in the encoding of the result of this operation; I can't display accented characters correctly.

In the program, I can enter a word with an accent and display it on the screen without any issues. However, when I retrieve the result from the shell command to get the Wi-Fi networks, I get an "incorrect" ESSID because if, for example, the network name contains the character è, I see its encoding as \xC3\xA8, and consequently, it is impossible to connect to the network.

I tried to encode the output of the command using a method, but it didn't work. I also ran the command I execute in the code directly in the shell, and the result was incorrectly encoded, as described above in the example. I deduced that it's a system issue, and I attempted to manually set the system encoding, but it didn't change anything.

I have noticed that some files are missing in the system I am using. Is there a solution to this problem?

Chris Davies · Accepted Answer · 2023-08-04 15:11:25Z

The pair of hex codes 0xC3 0xA8 represents è in UTF-8 encoding. I suspect you're using the ISO-8859-1 encoding in your program (or a similar variant such as ISO-8859-15), where you would expect to see the single byte 0xE8 for the same character.
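To see where the two bytes come from, here is a minimal sketch of UTF-8 encoding for a single code point. The function name `utf8_encode` is made up for illustration and is not part of any library; it does no validation (surrogates and values above U+10FFFF are not rejected).

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8.
// No validation is done; this is only meant to show the bit layout.
std::string utf8_encode(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling `utf8_encode(0x00E8)` (the code point of è) gives the two bytes 0xC3 0xA8.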

You'll need to adapt your program to handle UTF-8 characters, or tell your Linux-based system to use ISO-8859-1 encoding instead of UTF-8.
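If what the program receives is the literal four-character text `\xC3` (a backslash, an `x` and two hex digits) rather than the raw byte, one possible adaptation is to turn those escapes back into bytes before using the ESSID. This is only a sketch under that assumption: `unescape_hex` is a hypothetical helper, and it does not handle other escapes the scan tool might emit (for example an escaped backslash).

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: replace every literal "\xHH" sequence with the byte 0xHH.
// Anything that is not a well-formed "\xHH" escape is copied through unchanged.
std::string unescape_hex(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const bool is_escape =
            in[i] == '\\' && i + 3 < in.size() && in[i + 1] == 'x' &&
            std::isxdigit(static_cast<unsigned char>(in[i + 2])) &&
            std::isxdigit(static_cast<unsigned char>(in[i + 3]));
        if (is_escape) {
            // Parse the two hex digits and append the resulting byte.
            out += static_cast<char>(std::stoi(in.substr(i + 2, 2), nullptr, 16));
            i += 3;  // skip the rest of the escape; the loop adds one more
        } else {
            out += in[i];
        }
    }
    return out;
}
```

With this, the text `caff\xC3\xA8` as printed by the scan becomes the UTF-8 string caffè, which can then be displayed or passed on as the network name.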

Run this command on your Linux-based system to see what encoding it is configured to use as a default,

locale

On my system this reports that I'm using the UK (GB) English flavour of UTF-8,

LANG=en_GB.UTF-8
LANGUAGE=en_GB:en
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"

And for my particular account I can identify that I am also using this locale setting with env | grep -E 'LC_|LANG', which produces this output,

LANGUAGE=en_GB:en
LANG=en_GB.UTF-8
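If the `locale` command is not installed, the same information can usually be read from inside the program. The sketch below assumes the C library on the device provides the POSIX `nl_langinfo` function (a minimal Buildroot libc configuration may not); `current_codeset` is just an illustrative name.

```cpp
#include <clocale>
#include <langinfo.h>  // POSIX, not standard C++
#include <string>

// Hypothetical helper: report the character encoding of the locale that the
// process picks up from its environment (LANG, LC_ALL, LC_CTYPE).
std::string current_codeset() {
    // Adopt the environment's locale; if it is unsupported the "C" locale stays active.
    std::setlocale(LC_ALL, "");
    const char* codeset = nl_langinfo(CODESET);
    return codeset ? codeset : "";
}
```

On a system configured like mine this should report UTF-8; with no locale configured, an ASCII-style codeset name is the likely result.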

On a Debian-based system you should be able to change the system locale with dpkg-reconfigure locales. This command also allows you to define additional locales available for use.

edited Aug 4, 2023 at 15:11

answered Aug 4, 2023 at 14:46


But it seems like my system isn't using LANG: initially it was nowhere to be found, and even after I added it, nothing changed.

– porrokynoa, 2023-08-04 14:55:37Z

@porrokynoa answer modified for the system default setting

– Chris Davies, 2023-08-04 15:11:37Z

OK, but the locale command does not exist on my system. My problem is that some files seem to be missing from the system, or, if they are there, they are in a different location than usual.

– porrokynoa, 2023-08-04 15:20:31Z

In that case it sounds like you're going to need to adapt your program to support UTF-8.

– Chris Davies, 2023-08-04 15:52:51Z

Initially, I wanted to encode the result from the code, but once I obtained the output, the code already recognizes it as UTF-8. Isn't there a way to execute the commands and set the encoding directly from the command? This way, we could bypass the system settings.

– porrokynoa, 2023-08-04 16:47:54Z
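
For what the last comment asks, a locale can be set for a single command by prefixing it with an `LC_ALL=...` assignment, which a POSIX shell applies to that command only. The sketch below does this through `popen`. It is an assumption that the scan tool honours the locale at all, and that the chosen locale exists on the device; `run_with_locale` is a hypothetical helper, and both arguments are passed to the shell unquoted, so they must be trusted strings.

```cpp
#include <cstddef>
#include <cstdio>  // popen/pclose are POSIX extensions declared here
#include <string>

// Hypothetical helper: run `command` through the shell with LC_ALL set to
// `locale` for that command only, and return everything it wrote to stdout.
std::string run_with_locale(const std::string& locale, const std::string& command) {
    const std::string full = "LC_ALL=" + locale + " " + command;
    std::string output;
    FILE* pipe = popen(full.c_str(), "r");
    if (!pipe) {
        return output;  // could not start the shell; return empty output
    }
    char buffer[256];
    std::size_t n;
    while ((n = std::fread(buffer, 1, sizeof buffer, pipe)) > 0) {
        output.append(buffer, n);  // keep raw bytes, no decoding here
    }
    pclose(pipe);
    return output;
}
```

The returned string holds whatever bytes the command produced, so the program still has to treat them as UTF-8 when displaying them.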
