[pygtk] Problem in fetching Unicode from URL and displaying it in PyGTK widget
Bertrand Kintanar
b3rxkintanar at gmail.com
Fri Jul 17 19:44:31 WST 2009
On 7/17/09 6:52 PM, John Finlay wrote:
> I misunderstood what you wanted. I thought you just wanted to save the
> html file contents into the DB and were having a problem with the text
> encoding between the html and the DB but it sounds like you want to do
> something different.
>
> John
OK, let me put it this way. Let's say I have a string variable whose value
I get by reading an HTML file:
data = '&#xE1;'
I want to be able to convert the above string to
data = u'\xE1'
so that I can just
print data.encode('utf-8')
and get its correct value, which is
á
If I do data.replace('&#', '\\'), it puts two backslashes into the
string instead of only one. And if I write just one backslash, Python
raises an error, since the backslash is an escape character. Why does
Python treat the backslash as an escape character in a string literal,
yet when it is used in the replace string method it doesn't escape the
other backslash?
If I run the above command and print data, I get
'\\xE1;' instead of just '\xE1'
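
To make the behaviour above concrete, here is a minimal sketch (written in Python 3 syntax, and not part of the original session) of what the replace call appears to produce. It assumes the input is the literal text &#xE1;. The point it illustrates is that '\\' in source code is a single backslash character, and that the doubled backslash shows up only in the repr of the result, not in the string itself.

```python
data = '&#xE1;'

# '\\' in source code denotes ONE backslash character.
replaced = data.replace('&#', '\\')

print(len('\\'))        # 1
print(repr(replaced))   # the repr shows the backslash doubled
print(replaced)         # \xE1;
print(len(replaced))    # 5: backslash, x, E, 1, ;

# The result is five ordinary characters, not the single character U+00E1.
print(replaced == '\xe1')   # False
```

So the replace call builds text that merely looks like an escape sequence. Escape sequences are interpreted only when Python parses a string literal in source code, not when a string is assembled at run time.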
So is there a specific way of converting this?
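
One possible way to do the conversion, sketched under the assumption that the input contains only numeric character references such as &#xE1; (hexadecimal) or &#225; (decimal). The helper name decode_numeric_refs and the regular expression are illustrative and not taken from this thread. Named entities such as &aacute; are deliberately left alone.

```python
import re

try:
    unichr          # Python 2 has unichr for code points above 255
except NameError:
    unichr = chr    # Python 3: chr already returns a Unicode character

# Matches &#xHH...; (hex) or &#DDD...; (decimal) numeric character references.
_NUMERIC_REF = re.compile(r'&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));')


def decode_numeric_refs(text):
    """Replace each numeric character reference with its Unicode character."""
    def _replace(match):
        hex_digits, dec_digits = match.groups()
        if hex_digits:
            code_point = int(hex_digits, 16)
        else:
            code_point = int(dec_digits)
        return unichr(code_point)
    return _NUMERIC_REF.sub(_replace, text)


data = decode_numeric_refs(u'&#xE1;')
print(data == u'\xe1')                     # True
print(data.encode('utf-8') == b'\xc3\xa1')  # True
```

The idea is to parse the digits and build the character from its code point, rather than trying to assemble an escape sequence as text. On Python 3.4 and later, the standard library's html.unescape covers this case and named entities as well.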