To do that, we first have to get hold of the tool: import the getToolByName function and use it to assign the portal_catalog tool to a variable. Here obj is any object inside the Plone site, such as the current context.
from Products.CMFCore.utils import getToolByName
catalog = getToolByName(obj, 'portal_catalog')
We can then use the catalog to collect the data we want. For example, to collect all the pages in the entire site (Plone's portal_type for a page is 'Document'), we would do something like this:
pages = catalog.searchResults(portal_type='Document')
We would then iterate over the results. Title is one of the indexed values stored on each result:
for page in pages:
    print(page.Title)
Or maybe we want to limit the search to pages inside a certain folder. In that case we specify the path to that folder as an additional criterion:
pages = catalog.searchResults(path='/inigo/projects', portal_type='Document')
The query above returns only the pages under the projects folder in inigo.
If the id of the wanted item is already known, we can specify that as a criterion too:
pages = catalog.searchResults(id='project_members', portal_type='Document')
A criterion can also take several values to compare against, combined with 'or' or 'and'. For example, to look for all the pages and folders under the '/inigo/projects' folder, you would do it like so:
pages = catalog.searchResults(path='/inigo/projects', portal_type={'query': ['Document', 'Folder'], 'operator': 'or'})
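When the same kind of query is needed in several places, it can help to build the criteria as a dictionary first and pass it to searchResults as keyword arguments. The following is only a minimal sketch: build_query is a hypothetical helper, not part of Plone, and it uses 'Document', the portal_type that the example script at the end also uses for pages.

```python
def build_query(path=None, portal_types=None, operator='or'):
    """Return keyword arguments suitable for catalog.searchResults().

    Hypothetical helper for illustration only; it is not part of Plone.
    """
    if operator not in ('or', 'and'):
        raise ValueError("operator must be 'or' or 'and'")
    query = {}
    if path is not None:
        # Restrict the search to items below this path.
        query['path'] = path
    if portal_types:
        # Several values for one criterion, combined with the given operator.
        query['portal_type'] = {'query': list(portal_types),
                                'operator': operator}
    return query


query = build_query('/inigo/projects', ['Document', 'Folder'])
# Inside Plone the dictionary would then be used like this:
# pages = catalog.searchResults(**query)
```

Building the dictionary separately keeps the criteria easy to inspect or log before the search is actually run.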
Note that the results we get from the portal_catalog tool are brains: lightweight records holding the indexed data of the objects we searched for. This prevents the search from 'waking up' each object, which would be quite costly in terms of processing. If we need data that is not stored in the catalog, we have to fetch the object itself like so:
pages = catalog.searchResults(portal_type='Document')
for page in pages:
    page_object = page.getObject()
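As long as the indexed data is enough, the objects never need to be woken up at all. The sketch below collects the title and URL of each result straight from the brains. list_pages is a hypothetical helper, and FakeBrain is only a stand-in with a made-up URL so that the sketch can run outside Plone; real brains come from the catalog search.

```python
def list_pages(brains):
    """Collect (title, URL) pairs from brains without calling getObject()."""
    rows = []
    for brain in brains:
        # Title is stored on the brain; getURL() is built from the indexed path.
        rows.append((brain.Title, brain.getURL()))
    return rows


class FakeBrain(object):
    """Stand-in for a catalog brain (assumption, for running outside Plone)."""

    def __init__(self, title, url):
        self.Title = title
        self._url = url

    def getURL(self):
        return self._url

    def getObject(self):
        # Fails loudly if the helper wakes up the object after all.
        raise AssertionError('object should not be woken up')


rows = list_pages([
    FakeBrain('Project Members', 'http://localhost:8080/inigo/projects/project_members'),
])
```

Inside Plone the call would simply be list_pages(catalog.searchResults(portal_type='Document')).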
Example Script
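#!/usr/bin/env python
from Products.CMFCore.utils import getToolByName
from Testing import makerequest

def runit(app):
    # Wrap the application root in a fake request so the site can be traversed.
    app = makerequest.makerequest(app)
    catalog = getToolByName(app.inigo, 'portal_catalog')
    # Skip the permission filter so that private pages are found as well.
    pages = catalog.unrestrictedSearchResults(portal_type="Document")
    print('%d pages found' % len(pages))
    for page in pages:
        # Wake up the real object without a security check.
        pageobj = page._unrestrictedGetObject()
        print('Title: %s' % pageobj.title)
        print('Description: %s' % pageobj.description)
        print('Content: %s' % pageobj.getText())

# "app" is supplied by "bin/instance run".
runit(app)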
The script above will display the pages (documents) available on the inigo site in a Plone instance. If the script is saved in the root Plone directory as showpages.py, it can be executed by running:
bin/instance run showpages.py
The unrestrictedSearchResults function is used so that even pages that are private or not published are listed. To retrieve those pages, the _unrestrictedGetObject function has to be used to get the object, since the regular getObject applies security checks for the current user.
