[pygtk] Gtkmozembed non-interactive use

Simon Waters simonw at zynet.net
Fri Sep 11 20:12:01 WST 2009


I found a partially written script on the web for thumbnailing websites with
gtkmozembed.

Its header reads:
# This file is released into the public domain.
# Originally Written by Andrew McCall - <andrew at textux.com>
# modified by Matt Biddulph - <matt at hackdiary.com> - to take screenshots

My first ever Python project was to hack it into a more sensible shape, and I
will probably stick it somewhere for someone else to abuse when I'm done.
I brought the library calls up to date, added command-line arguments (parsed
with getopt), a configurable output file name and thumbnail sizes, removed the
browser chrome, and added URL decoding (so I can pass in encoded URLs from web
requests). So far I like Python.
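
To give an idea of the argument handling and URL decoding, here is a minimal
sketch, not the actual script. The option letters, the default values and the
function name parse_args are made up for illustration, and it is written for
Python 3 (urllib.parse.unquote).

```python
import getopt
from urllib.parse import unquote


def parse_args(argv):
    """Parse command-line options and return (settings, decoded_url)."""
    # Hypothetical defaults; the real script's values are not shown here.
    settings = {"output": "thumbnail.png", "width": 128, "height": 96}
    opts, args = getopt.getopt(argv, "o:w:h:", ["output=", "width=", "height="])
    for name, value in opts:
        if name in ("-o", "--output"):
            settings["output"] = value
        elif name in ("-w", "--width"):
            settings["width"] = int(value)
        elif name in ("-h", "--height"):
            settings["height"] = int(value)
    if len(args) != 1:
        raise getopt.GetoptError("expected exactly one URL argument")
    # Undo percent-encoding, e.g. "http%3A%2F%2F" becomes "http://".
    return settings, unquote(args[0])
```

It would be called as parse_args(sys.argv[1:]), so a web request can hand over
the URL in its encoded form and the script decodes it before loading the page.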

It works: I've thumbnailed 18,605 web pages successfully in batch mode, most
of them whilst I was on holiday.

However, every so often the process hangs when launched under Xvfb. Sometimes
this appears to be a bug in Xvfb (I'll deal with that elsewhere).

But at least some of the time the embedded browser is opening a popup dialogue
saying the page cannot be found. I assume I should listen for a signal and
abort at that point.

Does anyone know which signal or signals I should be listening for to spot
prompts for user interaction? Or what the values mean in the API for the
"net_state" signal?

Currently a cron job runs every few minutes, spots the hung process and kills
it off, but that isn't terribly sophisticated, and the process hangs about 15%
of the time, which means we're rendering thumbnails more slowly than we should
be.
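
One stopgap, short of finding the right signal, would be to give each render
its own time limit instead of waiting for cron. This is only a sketch of that
idea: the function name run_with_timeout and the timeout value are
assumptions, and it relies on Python 3's subprocess.run, which kills the child
process when the timeout expires.

```python
import subprocess


def run_with_timeout(command, timeout_seconds):
    """Run command; return its exit code, or None if it had to be killed."""
    try:
        completed = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising TimeoutExpired.
        return None
    return completed.returncode
```

A batch loop could call this once per URL and log a None result as a hang.
Note that only the direct child is killed, so if the thumbnailer is started
through a wrapper that launches Xvfb, any grandchild processes may need
cleaning up separately.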

