I'm not an expert in Linux, but I am following the development of software that runs on a Linux system built with Buildroot. The only ways to interact with the device are the program's graphical interface or a shell reached over a serial cable or SSH. End users are given the option to connect the device to their Wi-Fi network.

The program is written in C++, and to perform the scanning and obtain the list of available Wi-Fi networks, a shell command is executed.

My problem is the encoding of the command's output: accented characters are not displayed correctly.

In the program, I can enter a word with an accent and display it on the screen without any issues. However, the ESSIDs returned by the shell command are "incorrect": if, for example, the network name contains the character è, I see it as the escape sequence \xC3\xA8, and consequently it is impossible to connect to the network.

I tried to encode the output of the command using a method, but it didn't work. I also ran the command I execute in the code directly in the shell, and the result was incorrectly encoded, as described above in the example. I deduced that it's a system issue, and I attempted to manually set the system encoding, but it didn't change anything.

I have noticed that some files are missing from the system I am using. Is there a solution to this problem?

1 Answer

The byte pair 0xC3 0xA8 is the UTF-8 encoding of è. I suspect your program is using ISO-8859-1 (or a similar variant such as ISO-8859-15), where the same character is the single byte 0xE8.
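To make the relationship between the two encodings concrete: in ISO-8859-1, è is the single byte 0xE8, and its UTF-8 form is the two bytes 0xC3 0xA8. The following is a minimal sketch of that conversion. The function name `latin1ToUtf8` is made up for illustration and is not part of any library.

```cpp
#include <string>

// Hypothetical helper: convert an ISO-8859-1 (Latin-1) byte string to UTF-8.
// Every Latin-1 byte is the Unicode code point of the same value, so bytes
// below 0x80 are copied as-is and bytes 0x80..0xFF become two UTF-8 bytes.
std::string latin1ToUtf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char b : in) {
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));
        } else {
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));    // lead byte
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));  // continuation byte
        }
    }
    return out;
}
```

For the byte 0xE8, the lead byte is 0xC0 | 0x03 = 0xC3 and the continuation byte is 0x80 | 0x28 = 0xA8, which matches the pair seen in the question.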

You'll need to adapt your program to handle UTF-8 characters, or tell your Linux-based system to use ISO-8859-1 encoding instead of UTF-8.
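One possible way to adapt the program, assuming the scan command prints the ESSID with literal backslash escapes (the four visible characters \xC3 rather than the raw byte), is to turn those escapes back into bytes before using the name. This is only a sketch under that assumption. `decodeHexEscapes` is a hypothetical helper name, and it handles only the \xHH form, leaving any other escape the tool may emit untouched.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Value of one hexadecimal digit; the caller has already checked it is one.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return c - 'A' + 10;
}

static bool isHex(char c) {
    return std::isxdigit(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical helper: replace each literal "\xHH" sequence with the byte it
// names. Anything that is not a complete "\xHH" sequence is copied unchanged.
std::string decodeHexEscapes(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const bool isEscape = in[i] == '\\' && i + 3 < in.size() &&
                              in[i + 1] == 'x' &&
                              isHex(in[i + 2]) && isHex(in[i + 3]);
        if (isEscape) {
            const int value = hexValue(in[i + 2]) * 16 + hexValue(in[i + 3]);
            out.push_back(static_cast<char>(value));
            i += 3;  // skip the rest of the "\xHH" sequence
        } else {
            out.push_back(in[i]);
        }
    }
    return out;
}
```

With this, the text caf\xC3\xA8 as read from the command becomes the five bytes of "cafè" in UTF-8, which a UTF-8 aware program can display and pass on when connecting.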

Run this command on your Linux-based system to see which encoding it is configured to use by default:

locale

On my system this reports that I'm using the UK (GB) English flavour of UTF-8:

LANG=en_GB.UTF-8
LANGUAGE=en_GB:en
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"

For my particular account, I can confirm that I am also using this locale setting with env | grep -E 'LC_|LANG', which produces this output:

LANGUAGE=en_GB:en
LANG=en_GB.UTF-8

On a Debian-based system you should be able to change the system locale with dpkg-reconfigure locales. This command also allows you to define additional locales available for use.

  • But it seems like my system isn't using LANG: initially it was nowhere to be found, and even after I added it, nothing changed. Commented Aug 4, 2023 at 14:55
  • @porrokynoa answer modified for the system default setting Commented Aug 4, 2023 at 15:11
  • OK... But on my system the locale command does not exist. My problem is that some files seem to be missing from the system, or at least, if they are there, they are in a different location than usual. Commented Aug 4, 2023 at 15:20
  • In that case it sounds like you're going to need to adapt your program to support UTF-8 Commented Aug 4, 2023 at 15:52
  • Initially, I wanted to encode the result from the code, but once I obtained the output, the code already recognizes it as UTF-8. Isn't there a way to execute the commands and set the encoding directly from the command? This way, we could bypass the system settings. Commented Aug 4, 2023 at 16:47
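
On the last question, a locale can be set for a single command by prefixing it with an environment assignment, without touching the system settings. The sketch below does that through popen and captures the output. `runWithLocale` is a hypothetical helper name, and whether it changes anything is an assumption: the named locale must exist on the device, and a tool that always escapes non-ASCII bytes will keep doing so regardless of locale.

```cpp
#include <array>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical helper: run `command` through the shell with LC_ALL set only
// for that command, and return everything it writes to standard output.
// Both arguments are passed to the shell unescaped, so they must be trusted.
std::string runWithLocale(const std::string& locale, const std::string& command) {
    const std::string full = "LC_ALL=" + locale + " " + command;
    FILE* pipe = popen(full.c_str(), "r");
    if (pipe == nullptr) {
        throw std::runtime_error("popen failed");
    }
    std::string output;
    std::array<char, 256> buffer;
    std::size_t n = 0;
    while ((n = std::fread(buffer.data(), 1, buffer.size(), pipe)) > 0) {
        output.append(buffer.data(), n);
    }
    pclose(pipe);
    return output;
}
```

The first argument would be a locale name that is actually installed on the device, and the second would be the scan command the program already runs.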
