Re: ec_rpc in Tcl seems unable to cope with special characters like £ (the pound

From: Joachim Schimpf <j.schimpf_at_icparc.ic.ac.uk>
Date: Thu 04 Jan 2001 05:45:41 PM GMT
Message-ID: <3A54B6C5.7057ED42@icparc.ic.ac.uk>

Kish Shen wrote:
> 
> Problem is due to use of unicode in Tcl (>8.0). "£"
> is represented as "\302\243" in utf-8 unicode used by Tcl. This is
> converted correctly to EXDR format as a 2 byte string.
> 
> However, puts seems to lose a byte when outputting the EXDR string,
> even when it should treat the string as binary. The read_exdr then
> crashes because the string has lost a byte.
> 
> Fixed by forcing Tcl not to convert string with puts.


I have now fixed that properly (I hope...) on the C level.

Tcl >8.0 has a new ByteArray (binary) API which has to be used
if dealing with arbitrary bytes (the old string-API was changed
to work on utf-8 encoded strings, which broke our code).
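
At the script level, the same distinction shows up when writing bytes to
a channel. The following is only a minimal sketch in plain Tcl, not the
actual fix (which is in the C code): it assumes a scratch file name
(exdr_bytes.tmp, hypothetical) and uses the two bytes \302\243 from the
report above as stand-in data. With the channel switched to binary
translation, puts writes the bytes unchanged.

```tcl
# Two raw bytes, 0xC2 0xA3 (octal \302\243), built as a Tcl binary value.
set bytes [binary format H* c2a3]

# Hypothetical scratch file, used only for this illustration.
set path [file join [pwd] exdr_bytes.tmp]

# Write the bytes with all encoding and end-of-line conversion disabled.
set out [open $path w]
fconfigure $out -translation binary
puts -nonewline $out $bytes
close $out

# Remember the size on disk, then read the data back in binary mode.
set size [file size $path]
set in [open $path r]
fconfigure $in -translation binary
set data [read $in]
close $in
file delete $path

# Show what was read back as hexadecimal digits.
binary scan $data H* hex
puts "bytes on disk: $size, contents: $hex"
```

Without the fconfigure call, the channel's default encoding is applied on
output and the bytes that arrive at the other end may differ from the
ones that were written.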

The following commands now return Tcl "binary" (uninterpreted
8-bit bytes) objects:

ec_queue_read
ec_tcl2exdr

and the following will interpret their input strings as binary:

ec_queue_write
ec_exdr2tcl
ec_post_goal

Moreover, the exdr-decoding commands generate "binary" from
the EXDR-string (S) type:

ec_read_exdr
ec_exdr2tcl

and the exdr-encoding commands create EXDR-strings from tcl strings
by interpreting them as binary:

ec_tcl2exdr
ec_post_goal

The latter implies that one cannot directly encode a tcl unicode
string (which may contain characters >255) into an EXDR-string
(the high byte gets lost). To do so, the unicode string must be
explicitly converted to e.g. utf-8 (using encoding convertto utf-8)
and then packed into EXDR.
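
The explicit conversion step can be sketched as follows. This is only an
illustration in plain Tcl under my own assumptions: the sample text is
made up, and the final packing step is shown as a comment because it
needs the ECLiPSe Tcl interface to be loaded.

```tcl
# A Tcl string containing a character above 255 (the euro sign, U+20AC)
# as well as the pound sign (U+00A3).
set text "price: \u20ac5 or \u00a35"

# Convert explicitly to UTF-8. Every element of the result is in the
# range 0..255, so it can be treated as binary data.
set bytes [encoding convertto utf-8 $text]

puts "characters:  [string length $text]"
puts "utf-8 bytes: [string length $bytes]"

# The receiving side has to decode again to get the original string back.
set roundtrip [encoding convertfrom utf-8 $bytes]
puts "round trip ok: [string equal $text $roundtrip]"

# With the ECLiPSe interface loaded, $bytes (rather than $text) would be
# what gets handed to the exdr-encoding commands, e.g. ec_tcl2exdr.
```

The ECLiPSe side then receives the UTF-8 bytes as an ordinary EXDR
string and has to be aware of that encoding if it needs to interpret
the characters.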

-- Joachim
Received on Thu Jan 04 17:45:41 2001
