[pygtk] Gtkmozembed non-interactive use
Simon Waters
simonw at zynet.net
Fri Sep 11 20:12:01 WST 2009
I found a partially written script on the web for thumbnailing websites with
gtkmozembed.
Its header says:
# This file is released into the public domain.
# Originally Written by Andrew McCall - <andrew at textux.com>
# modified by Matt Biddulph - <matt at hackdiary.com> - to take screenshots
My first ever Python project was to hack it into a more sensible shape, and I
will probably stick it somewhere for someone else to abuse when I'm done.
I brought the libraries up to date, added command-line arguments (parsed with
getopt), a configurable output file name and thumbnail sizes, removed the
browser chrome, and added URL decoding (so I can pass in encoded URLs from web
requests). So far I like Python.
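For illustration, a minimal sketch of the argument handling and URL decoding
described above. It is not the script itself: the option names and default
values are made up for the example, and it uses Python 3's
urllib.parse.unquote (the Python 2 equivalent is urllib.unquote).

```python
import getopt
from urllib.parse import unquote


def parse_args(argv):
    """Parse thumbnailer options. Option names and defaults are hypothetical."""
    opts, args = getopt.getopt(argv, "o:w:h:", ["output=", "width=", "height="])

    # Illustrative defaults, not the values used by the real script.
    settings = {"output": "thumbnail.png", "width": 128, "height": 96}

    for flag, value in opts:
        if flag in ("-o", "--output"):
            settings["output"] = value
        elif flag in ("-w", "--width"):
            settings["width"] = int(value)
        elif flag in ("-h", "--height"):
            settings["height"] = int(value)

    if len(args) != 1:
        raise getopt.GetoptError("expected exactly one URL argument")

    # The URL may arrive percent-encoded from a web request, so decode it.
    settings["url"] = unquote(args[0])
    return settings
```

Calling parse_args(sys.argv[1:]) gives a plain dict that the rest of the
script can read the URL, output name and sizes from.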
It works: I've thumbnailed 18,605 webpages successfully in batch mode, most of
them while I was on holiday.
However, every so often the process hangs when launched in Xvfb. Sometimes this
appears to be a bug in Xvfb (I'll deal with that elsewhere).
But at least some of the time the embedded browser is opening a popup dialogue
saying the page cannot be found. I assume I should listen for a signal and
abort at that point.
Does anyone know which signal or signals I should be listening for to spot
prompts for user interaction? Or what the values mean in the API for the
signal "net_state"?
Currently a cron job runs every few minutes, spots that the process has hung
and kills it off, but that isn't terribly sophisticated, and it hangs about
15% of the time, which means we're slower at rendering the thumbnails than we
should be.
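One stopgap that is tighter than the cron job would be to give each render its
own time limit in the batch driver. A minimal sketch, assuming each thumbnail
is produced by a separate child process; the function name and the timeout
values are hypothetical, and it uses Python 3's subprocess.run, which kills
the child when the timeout expires.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd; return its exit code, or None if it hung and was killed."""
    try:
        return subprocess.run(cmd, timeout=timeout_seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child here.
        return None


if __name__ == "__main__":
    # Stand-in for a hung render: a child that sleeps far longer than allowed.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    if run_with_timeout(hung, 1) is None:
        print("render hung, moving on to the next URL")
```

This only bounds the time lost per page; it does not tell a dialogue prompt
apart from any other kind of hang, so the signal question still stands.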