I am using the 64-bit version of PostgreSQL 9.5 on Windows Server. The character encoding of the database is set to UTF8.
I'd like to create a function that manipulates multibyte strings (e.g. cleansing, replacement, etc.).
I copied C language logic for manipulating characters from another system. The logic assumes that the character encoding is SJIS.
I do not want to change the C logic, so I want to convert from UTF8 to SJIS inside the PostgreSQL C language function, much like the convert_to function does. (However, convert_to returns the bytea type, and I want to get the result as TEXT.)
Please tell me how to convert from UTF8 to SJIS in C.
Create Function Script:
CREATE FUNCTION CLEANSING_STRING(character varying)
RETURNS character varying AS
'$libdir/MyFunc/CLEANSING_STRING.dll', 'CLEANSING_STRING'
LANGUAGE c VOLATILE STRICT;
C Source:
#include <stdio.h>
#include <string.h>
#include <postgres.h>
#include <port.h>
#include <fmgr.h>
#include <stdlib.h>
#include <builtins.h>
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif
extern PGDLLEXPORT Datum CLEANSING_STRING(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(CLEANSING_STRING);
Datum CLEANSING_STRING(PG_FUNCTION_ARGS)
{
    // Get Arg
    text *arg1 = (text *)PG_GETARG_TEXT_P(0);
    // Text to Char[]
    char *arg;
    arg = text_to_cstring(arg1);
    // UTF8 to Sjis
    //Char *sjisChar[] = foo(arg); // something like that..
    // Copied from other system.(Assumes that the character code is sjis.)
    cleansingString(sjisChar);
    replaceStrimg(sjisChar);
    // Sjis to UTF8
    //arg = bar(sjisChar); // something like that..
    //Char[] to Text and Return
    PG_RETURN_TEXT_P(cstring_to_text(arg));
}
See any_to_server and server_to_any in src/backend/utils/mb/mbutils.c, and the comments at the top of mbutils.c; the functions to call are pg_server_to_any and pg_any_to_server. And for the encoding name, see the pg_enc2name_tbl in src/backend/utils/mb/encnames.c and the pg_char_to_encoding function.
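To make that concrete, here is a rough sketch of how the function from the question might look with pg_server_to_any and pg_any_to_server plugged in. It is not tested against your build. The signatures of cleansingString and replaceStrimg are assumptions (the question does not show them): I assume they take a NUL-terminated char * and modify it in place without growing it. The encoding name "SJIS" is looked up with pg_char_to_encoding rather than hard-coded.

```c
#include "postgres.h"
#include "fmgr.h"
#include "mb/pg_wchar.h"     /* pg_server_to_any, pg_any_to_server, pg_char_to_encoding */
#include "utils/builtins.h"  /* text_to_cstring, cstring_to_text */

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

/* ASSUMPTION: signatures of the legacy SJIS routines; adjust to the real ones. */
extern void cleansingString(char *s);
extern void replaceStrimg(char *s);

extern PGDLLEXPORT Datum CLEANSING_STRING(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(CLEANSING_STRING);

Datum
CLEANSING_STRING(PG_FUNCTION_ARGS)
{
    /* Argument as a palloc'd, NUL-terminated string in the server encoding (UTF8 here) */
    char *arg = text_to_cstring(PG_GETARG_TEXT_PP(0));
    int   sjis = pg_char_to_encoding("SJIS");
    char *sjisChar;
    char *result;

    if (sjis < 0)
        elog(ERROR, "encoding \"SJIS\" is not known to this server");

    /* Server encoding -> SJIS; returns a palloc'd string, or arg itself if no conversion is needed */
    sjisChar = pg_server_to_any(arg, strlen(arg), sjis);

    /* Legacy logic that assumes SJIS input */
    cleansingString(sjisChar);
    replaceStrimg(sjisChar);

    /* SJIS -> server encoding; the length is recomputed because the legacy code may have changed it */
    result = pg_any_to_server(sjisChar, strlen(sjisChar), sjis);

    PG_RETURN_TEXT_P(cstring_to_text(result));
}
```

Two things to keep in mind: a character that has no SJIS equivalent should make the conversion raise an error rather than return, and if the legacy routines can make the string longer you would need to copy sjisChar into a larger buffer before calling them.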