I have my own personal movie database system, within which context I NEVER want to see "extended" characters (with accents, umlauts, etc.) in any text fields.
MS Copilot tells me that I could use something based on...
iconv_t cd = iconv_open("ASCII//TRANSLIT", "ISO-8859-1");
...
size_t result = iconv(cd, &inptr, &inbytesleft, &outptr, &outbytesleft);
...to reliably convert anything I get back from API calls to https://www.omdbapi.com and themoviedb.org into "nearest equivalent" ASCII characters, but it also tells me there's NO STANDARD WAY of forcing "single byte in = single byte out". So if the input happens to contain the SINGLE BYTE 'ß' (Eszett or sharp S) then iconv() may convert it to TWO BYTES ("ss").
I find this hard to believe. So before I go to the trouble of writing my own logic to convert my text byte-by-byte (replacing any multi-byte outputs with some other 'special' char), I thought I'd ask here.
Is there a standard way to reduce every "extended" character (single byte with the high bit set) to its "nearest equivalent" ASCII char (i.e. - WITHOUT the high bit set)?
In my context, fixed text length is more important than "accuracy", so just "s" would be better than "ss" for 'ß'.
Comments:

What would you like ß to convert to? "man 3 iconv_open" confirms what Copilot told you.

Do you really need to keep the byte count? Or were you just hoping to avoid writing code to handle E2BIG? To keep the byte count, writing your own function that simply looks up the equivalent in a 256-byte string might be simpler than trying to bend someone else's function to your will.

Also check that the input really is ISO-8859-1, as text can often be in the similar ISO-8859-15 or CP-1252 encodings instead.