%char() and unicode

Paul
Profound User
Posts: 39
Joined: Mon Aug 29, 2011 10:53 pm
First Name: Paul
Last Name: Foster
Company Name: GRI Group Ltd
Country: Hong Kong
Location: Hong Kong

%char() and unicode

Post by Paul »

I am currently using YAJL to build a JSON string to send to a web service (thanks Scott). All is well until I need to add a Unicode value for a non-English description. The prototypes for YAJL accept char, so I tried using %char() to convert the ProfoundUI screen field (Graphic, CCSID(1200)) into escaped hex (e.g. \u123a\u456b etc). However, every character is replaced by the substitution character \u001a.

Is it possible to do this using the %char() BIF? According to the RPG docs, %char() should convert from UCS-2, but the results are not what I expected. The data from the rich display field saves correctly into a DB field of CCSID(13488), so it's not a data problem. My job CCSID is 37 and I have CCSID(*CHAR:*JOBRUN) in the H-specs.

Cheers,

Paul
Scott Klement
Experienced User
Posts: 2711
Joined: Wed Aug 01, 2012 8:58 am
First Name: Scott
Last Name: Klement
Company Name: Profound Logic
City: Milwaukee
State / Province: Wisconsin

Re: %char() and unicode

Post by Scott Klement »

YAJL does everything internally in UTF-8, which, as far as I know, is the only widely used encoding for JSON.

However, my RPG wrappers are intended to make things easier for people who are working in EBCDIC. So they convert everything from your job CCSID (EBCDIC) to UTF-8 before handing it over to YAJL, and likewise convert from UTF-8 back to the job CCSID when returning stuff back to your program.

You can, of course, call the YAJL routines directly (not going through my wrapper) -- or you could extend my wrapper code by adding routines that accept UCS-2 parameters. Etc.

If you are using %CHAR() to convert from UCS-2, then you are converting your UCS-2 data into your job's EBCDIC (which you say is CCSID 37). That means you can only support characters that exist in CCSID 37... that is likely why things aren't working for you.
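The effect described above can be reproduced outside RPG. The following is only a rough Python sketch, not YAJL or the RPG wrapper code: it assumes Python's `cp037` codec is a fair stand-in for CCSID 37, and it assumes unmappable characters become the EBCDIC substitution byte x'3F', which corresponds to U+001A in Unicode. The helper names `to_ccsid37` and `ebcdic_to_json` are hypothetical.

```python
import json

EBCDIC_SUB = 0x3F  # assumed substitution byte for unmappable characters


def to_ccsid37(text: str) -> bytes:
    """Mimic a UCS-2 -> CCSID 37 conversion (cp037 used as a stand-in)."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp037")
        except UnicodeEncodeError:
            # The character does not exist in CCSID 37, so it is lost here.
            out.append(EBCDIC_SUB)
    return bytes(out)


def ebcdic_to_json(field: str, ebcdic: bytes) -> str:
    """Mimic a wrapper that converts EBCDIC to Unicode and emits JSON."""
    return json.dumps({field: ebcdic.decode("cp037")})


description = "\u65e5\u672c"  # two CJK characters, neither is in CCSID 37
as_job_ccsid = to_ccsid37(description)
result = ebcdic_to_json("description", as_job_ccsid)
print(result)  # {"description": "\u001a\u001a"}
```

The original characters are already gone by the time the JSON generator sees the data, so nothing downstream can recover them.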

Also, the \uXXXX escape codes are not valid RPG syntax. Are you coding this in a different language, or did you mean u'xxxx'?

Re: %char() and unicode

Post by Paul »

Thanks Scott. The '\u001a' is what was in the JSON after YAJL_addChar(), which I viewed in debug (e.g.: {"description":"\u001a\u001a"} ).

I suppose I was a bit ambitious thinking that %char() would convert all characters, probably because I usually use PHP's json_encode for this kind of thing, and it translates Far East language characters to '\uXXXX' values. Being restricted to the job CCSID when using %char() is not very helpful, as the source data could contain many languages in the same JSON string, so I can't set the job to any one CCSID.
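For reference, the '\uXXXX' form can be produced straight from the UCS-2 bytes without ever going through a single-byte CCSID. This is a minimal Python sketch of that idea, assuming the field holds big-endian 16-bit code units (as CCSID 13488/1200 data does); the function name `ucs2_to_json_escapes` is hypothetical and is not part of YAJL or PHP.

```python
import json


def ucs2_to_json_escapes(ucs2_be: bytes) -> str:
    """Turn big-endian UCS-2/UTF-16 bytes into an ASCII-only JSON string."""
    if len(ucs2_be) % 2:
        raise ValueError("UCS-2 data must have an even number of bytes")
    parts = []
    for i in range(0, len(ucs2_be), 2):
        unit = int.from_bytes(ucs2_be[i:i + 2], "big")
        # Printable ASCII passes through, except the quote and backslash.
        if 0x20 <= unit < 0x7F and unit not in (0x22, 0x5C):
            parts.append(chr(unit))
        else:
            # Everything else becomes a \uXXXX escape of the code unit.
            parts.append(f"\\u{unit:04x}")
    return '"' + "".join(parts) + '"'


field = "Tea \u8336".encode("utf-16-be")  # stand-in for a graphic field
escaped = ucs2_to_json_escapes(field)
print(escaped)  # "Tea \u8336"

# A standard JSON parser turns the escapes back into the original text.
assert json.loads(escaped) == "Tea \u8336"
```

Because each 16-bit unit is escaped on its own, the output stays plain ASCII and would survive a later EBCDIC-to-UTF-8 conversion unchanged.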

I totally understand that your wrappers are for EBCDIC use and they work really well. Very quick and perfect results. I'll have a go at extending them to use type "c" defined variables. As you say, JSON is UTF-8 anyway, so if I send UCS-2 data it has nothing to translate.
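One caveat on "nothing to translate": UCS-2 and UTF-8 are different encodings of the same characters, so a UCS-2 variant of the wrapper would presumably still need a UCS-2 to UTF-8 step. That step is lossless, unlike the trip through CCSID 37. A small Python sketch of the conversion, assuming big-endian UCS-2 input; `ucs2_to_utf8` is a hypothetical name, not an existing YAJL routine.

```python
def ucs2_to_utf8(ucs2_be: bytes) -> bytes:
    """Transcode big-endian UCS-2/UTF-16 bytes to UTF-8 bytes."""
    return ucs2_be.decode("utf-16-be").encode("utf-8")


ucs2_field = b"\x83\x36"          # U+8336 as one 16-bit code unit
utf8_field = ucs2_to_utf8(ucs2_field)
print(utf8_field.hex())           # e88cb6

# Round trip: the original character survives intact.
assert utf8_field.decode("utf-8") == "\u8336"
```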

Thanks,

Paul