[pygtk] problem pasting clipboard content from arabic website (target text/html)

Dieter Verfaillie dieterv at optionexplicit.be
Thu Jan 13 02:00:23 WST 2011


On 12/01/2011 16:24, Giuseppe Penone wrote:
> Yes I also was thinking that, being the first two chars not valid (\0xff and
> \0xfe)

That would be the BOM (Byte Order Mark)...

> , the problem is that I cannot find a reference to understand what is
> the encoding according to those chars.

... for little-endian UTF-16 (UTF-16LE): the bytes 0xFF 0xFE at the
start of the data. You'll also want to be careful about NUL (\0)
characters in the data.
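
As a rough illustration of the two points above (this is not the attached
paste.py, just a minimal sketch), the raw bytes of a text/html paste could
be decoded along these lines. The function name decode_html_paste is made
up for this example, and the fallback to UTF-8 when no BOM is present is
an assumption, not something the clipboard guarantees. It is written for
Python 3 and uses only the standard library:

```python
import codecs


def decode_html_paste(raw):
    """Decode raw text/html clipboard bytes into a str.

    Hypothetical helper: looks for a UTF-16 byte order mark, decodes
    accordingly, and drops any NUL characters left in the result.
    """
    if raw.startswith(codecs.BOM_UTF16_LE):    # b"\xff\xfe"
        text = raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    elif raw.startswith(codecs.BOM_UTF16_BE):  # b"\xfe\xff"
        text = raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    else:
        # Assumption: without a BOM, treat the data as UTF-8.
        text = raw.decode("utf-8")
    # Remove NUL characters (for example a trailing terminator).
    return text.replace("\x00", "")


# Example input: BOM, an Arabic fragment, and a trailing NUL.
sample = codecs.BOM_UTF16_LE + "<p>\u0645\u0631\u062d\u0628\u0627</p>\x00".encode("utf-16-le")
html = decode_html_paste(sample)
```

The BOM is stripped by hand here so that it does not end up in the
decoded string as U+FEFF.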

The attached fragment accepts "html" pastes from Firefox/Thunderbird
and correctly shows the Arabic fragment from your original message
when copied from Thunderbird.

Hey, it even honors RTL, which is kinda neat :)

mvg,
Dieter
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: paste.py
URL: <http://www.daa.com.au/pipermail/pygtk/attachments/20110112/8751dbcf/attachment.ksh>

