[pygtk] Problem in fetching Unicode from URL and displaying it in PyGTK widget
Bertrand Kintanar
b3rxkintanar at gmail.com
Sat Jul 18 18:08:51 WST 2009
On 7/17/09 11:19 PM, Walter Leibbrandt wrote:
> Bertrand Kintanar wrote:
>> On 7/17/09 9:19 PM, saeed wrote:
>>> s1 = 'Guz&#xe1;n'
>>> s2 = ''
>>> n = len(s1)
>>> i = 0
>>> while i<n:
>>>     if i<n-6:
>>>         if s1[i:i+3]=='&#x' and s1[i+5]==';':
>>>             s2 += unichr(int(s1[i+3:i+5], 16)).encode('utf-8')
>>>             i += 6
>>>             continue
>>>     s2 += s1[i]
>>>     i += 1
>>> print s2
>> Now this fixes it all. Thanks a lot. I hope there is some sexier way
>> to do this, though, but this will work. Thanks again.
> import re
> htmluni = re.compile(r'&#x([\dA-Fa-f]+);')
> data = 'Guz&#xe1;n Guz&#xe1;n'
>
> match = htmluni.search(data)
> while match:
>     data = (data[:match.start()] + unichr(int(match.group(1), 16)) +
>             data[match.end():])
>     match = htmluni.search(data)
>
Thanks for this, Walter. I'm also using regex for my search, but I never
thought of using it the way you have here.
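
For reference, the search-and-splice loop quoted above can probably be collapsed into one substitution, since re.sub accepts a function as the replacement. This is only a minimal sketch, not something from the thread: it assumes Python 3 (where chr takes the place of Python 2's unichr), and the names HEX_ENTITY and decode_hex_entities are made up for illustration.

```python
import re

# Matches hexadecimal numeric character references such as &#xe1;
HEX_ENTITY = re.compile(r'&#x([0-9A-Fa-f]+);')


def decode_hex_entities(text):
    """Replace every &#xHH; reference in text with the character it names."""
    # re.sub calls the lambda once per match and splices in its return
    # value, so no manual search/slice loop is needed.
    return HEX_ENTITY.sub(lambda m: chr(int(m.group(1), 16)), text)


print(decode_hex_entities('Guz&#xe1;n Guz&#xe1;n'))
```

The result is a text string, so it would still need encoding to UTF-8 only where a byte string is required. On Python 3, html.unescape from the standard library also covers decimal and named references.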