Making wget Work With Plone

Submitted by: dylanr
Last Edited: 2006-08-08

Category: CMF

Description:
If you want to make a static copy of your Plone site, you might try
doing something like:

$ wget -m www.mysite.com

But if you did that, you'd notice that your mirror is ugly and
incomplete. That's because Plone incorporates style sheets and
certain graphics in a way that wget just won't see. Worse yet, wget
may fail to save the output of folder_contents, retrieving only the
items inside each folder.
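
One plausible explanation, offered as an assumption rather than something
this recipe confirms: wget releases of that era followed only links found
in HTML attributes such as href and src, so a style sheet pulled in through
a CSS @import inside a <style> block would go unnoticed. The sketch below
imitates such an attribute-only link extractor. The page fragment and the
names AttributeLinkCollector and visible_links are made up for
illustration; they are not taken from Plone or wget.

```python
from html.parser import HTMLParser


class AttributeLinkCollector(HTMLParser):
    """Collect URLs found in href/src attributes only (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


# Illustrative page fragment (an assumption, not copied from a real Plone page).
SAMPLE_PAGE = """
<html><head>
<style type="text/css">@import url(http://www.mysite.com/plone.css);</style>
</head><body>
<a href="http://www.mysite.com/news">News</a>
<img src="http://www.mysite.com/photo.jpg">
</body></html>
"""


def visible_links(html):
    """Return the URLs an attribute-only crawler would find in html."""
    collector = AttributeLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    for url in visible_links(SAMPLE_PAGE):
        print(url)
```

The anchor and the image are found, but plone.css never appears in the
list, which is the kind of gap the script in this recipe fills by linking
to those resources explicitly.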

We could try to hack around with wget, but it's probably a lot easier
to cook up a Python script that gives wget enough extra information
to create a complete mirror. Create a Python Script somewhere your
Plone instance can acquire it from. Call it archive_portal (or
whatever) and give it the following definition:


Source (Text):
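# Zope Script (Python): "print" output accumulates in the special
# variable "printed", which the script returns at the end.
style_sheets = ['plone.css', 'ploneColumns.css', 'ploneCustom.css']
graphics = ['bullet.gif', 'portal_logo']
index = 'index.html'  # must match the id of the DTML Method described below
site_url = context.absolute_url()

# Entry points for wget: the site root and its index view.
print "<a href='%s'>site entry point</a><br>" % site_url
print "<a href='%s/%s'>site index</a><br>" % (site_url, index)

# Resources that wget does not discover on its own.
for item_name in style_sheets + graphics:
    print "<a href='%s/%s'>%s</a><br>" % (site_url, item_name, item_name)

# One link per published folder, pointing at that folder's index view.
for folder in context.portal_catalog(portal_type='Folder', review_state='published'):
    print "<a href='%s/%s'>%s</a><br>" % (folder.getURL(), index, folder.id)

return printed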
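
To preview what the script hands to wget without a running Zope instance,
here is a rough standalone sketch of the same link-building logic in plain
Python. The function name build_link_page and the sample URLs are
hypothetical; in the real script the folder URLs and ids come from
portal_catalog.

```python
STYLE_SHEETS = ["plone.css", "ploneColumns.css", "ploneCustom.css"]
GRAPHICS = ["bullet.gif", "portal_logo"]
INDEX = "index.html"


def build_link_page(site_url, folders):
    """Return the HTML link list for site_url.

    folders is a list of (folder_url, folder_id) pairs standing in for the
    catalog results (an assumption made for this sketch).
    """
    link = "<a href='%s'>%s</a><br>"
    lines = [
        link % (site_url, "site entry point"),
        link % ("%s/%s" % (site_url, INDEX), "site index"),
    ]
    # Style sheets and graphics are linked relative to the site root.
    for name in STYLE_SHEETS + GRAPHICS:
        lines.append(link % ("%s/%s" % (site_url, name), name))
    # Each folder gets a link to its own index view.
    for folder_url, folder_id in folders:
        lines.append(link % ("%s/%s" % (folder_url, INDEX), folder_id))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_link_page(
        "http://www.mysite.com",
        [("http://www.mysite.com/news", "news")],
    ))
```

Every line of the result is a plain anchor, so wget has an explicit URL to
follow for each style sheet, graphic, and folder view.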

Explanation:
One last step: create an acquirable DTML Method called "index.html"
(or whatever you set for the value of "index" in the script). That
method need only contain one line:

<dtml-var index_html>

By doing this, we ensure that wget saves each folder's view as a file
named index.html inside that folder's directory, which Apache serves
as the directory index by default.

Once we've done all that, we're ready to go. To mirror your Plone
site with wget, run:

$ wget -m www.mysite.com/archive_portal
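
Once the mirror finishes, a quick check can confirm that the folder views
were saved. The sketch below assumes wget wrote the mirror into a directory
named after the host (www.mysite.com here) and that the index file is
called index.html; the function name folders_missing_index is
hypothetical.

```python
import os


def folders_missing_index(mirror_root, index="index.html"):
    """Return sorted relative paths of directories that lack the index file."""
    missing = []
    for dirpath, dirnames, filenames in os.walk(mirror_root):
        if index not in filenames:
            # "." stands for the mirror root itself.
            missing.append(os.path.relpath(dirpath, mirror_root))
    return sorted(missing)


if __name__ == "__main__":
    # "www.mysite.com" is the directory name assumed for the mirror.
    for path in folders_missing_index("www.mysite.com"):
        print("no index.html in:", path)
```

Directories that hold only images or other non-folder content may show up
in the list as well, so treat it as a starting point rather than a verdict.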

