
Conversion error using iconv..

Posted: Tue Jun 18, 2013 3:52 am
by kbaluk
We are facing an issue converting EBCDIC (CCSID 420) to UTF-8 (CCSID 1208) with the iconv() routine on V6R1M0. After conversion the text is flipped. For example:
هذا هو اختبار للعربي N in EBCDIC (420)
N هذا هو اختبار للعربي
in UTF-8 (1208).

The same tests on V5R4M0 convert correctly.

Not sure whether this is an issue with PTFs or code (doubt it). We would appreciate it if anybody can help us.

Thanks in advance.

Re: Conversion error using iconv..

Posted: Tue Jun 18, 2013 8:11 am
by David
It seems to me that if iconv() is behaving differently for you in 6.1 vs. 5.4 with the same input data and coding, then there is some issue with it.

I would recommend contacting IBM technical support.

Re: Conversion error using iconv..

Posted: Wed Jun 19, 2013 2:07 am
by kbaluk
Further investigation gave the following results for various language combinations:

english/arabic/english/arabic - Flips
arabic/english/arabic/english - No issues
english/arabic/english - No issues
arabic/english/arabic - No issues
english/arabic - Flips
arabic/english - No issues

Any kind of help will be much appreciated.

Re: Conversion error using iconv..

Posted: Wed Jun 19, 2013 9:36 am
by David
We have a customer in Israel using Hebrew that had similar issues in using CCSID 424 ('original' Hebrew CCSID).

In this case, the resulting 'flip' or 'not flip' was dependent on the data -- it would seem to get confused when Latin characters were mixed in at certain points.

I wonder if this is why you see a difference -- maybe it's not related to OS release but to the actual data?

The behavior of 'iconv()' with bidirectional text and right-to-left languages is well outside our area of expertise/experience, so we contacted IBM for assistance -- I'd suggest you do the same, I'm sure they can help.

That said, I'm happy to share my (limited) experience...

In the case of the Israeli customer, the IBM 'bidi' expert told us that conversion between CCSID 424 and UTF-8 was just limited and couldn't handle all cases of mixed Hebrew and Latin characters.

In this case, they let us know about an additional CCSID for Hebrew that works better with UTF-8 and mixed Latin characters.

Looking at the CCSID list for Arabic, I see the same sort of thing: an 'original' Arabic CCSID, and also another one that has a similar number to the Hebrew CCSID that was recommended by IBM.

See here:

http://www-03.ibm.com/systems/i/softwar ... rabic.html

Here is what the Israeli customer had to do to get good conversions to UTF-8, but I'll substitute with the Arabic CCSID numbers:

1. First, convert CCSID 420 to CCSID 62251. This is another EBCDIC CCSID which stores the data differently (I think in reverse from 420), and has a different behavior when used with UTF-8.

2. Then, convert from CCSID 62251 to 1208.
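
The two steps above could be sketched in C roughly as follows. This is only an illustration, not tested on IBM i: the helper names convert() and convert_via() are made up for this post, and the encoding names are passed in as plain strings because the strings that iconv_open() accepts are platform-specific (IBM i documents its own format for identifying CCSIDs such as 420, 62251 and 1208, so check the API reference for the exact values).

```c
#include <iconv.h>
#include <string.h>

/* Hypothetical helper: convert inlen bytes of `in` from encoding `from`
 * to encoding `to`. Returns the number of bytes written to `out`,
 * or -1 on any failure (unknown encoding, bad input, buffer too small). */
static long convert(const char *from, const char *to,
                    const char *in, size_t inlen,
                    char *out, size_t outcap)
{
    iconv_t cd = iconv_open(to, from);   /* note: target encoding comes first */
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;              /* iconv() takes non-const pointers */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);  /* flush any shift state */

    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;
    return (long)(outcap - outleft);
}

/* Hypothetical helper: convert from -> via -> to, using `tmp` as the
 * scratch buffer for the intermediate encoding (step 1, then step 2). */
static long convert_via(const char *from, const char *via, const char *to,
                        const char *in, size_t inlen,
                        char *tmp, size_t tmpcap,
                        char *out, size_t outcap)
{
    long n = convert(from, via, in, inlen, tmp, tmpcap);
    if (n < 0)
        return -1;
    return convert(via, to, tmp, (size_t)n, out, outcap);
}
```

In the scenario described here, `from` would identify CCSID 420, `via` CCSID 62251 and `to` CCSID 1208. The output buffer should be sized generously, since Arabic characters that take one byte in EBCDIC take two bytes in UTF-8.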