Importing and exporting content¶
Description
Importing and exporting content between Plone sites and other CMS systems
Introduction¶
Goal: you want to import and export content between Plone sites.
- If both sites have identical version and add-on product configuration you can use Zope Management Interface export/import
- If they don't (e.g. have different Plone version on source and target site), you need to use add-on products to export and import the content to a common format, e.g. JSON files.
Zope 2 import / export¶
Zope 2 can import/export parts of the site in .zexp format. This is basically Python pickle data of the exported objects. The data is a raw dump of Python internal data structures, which means that the source and the target Plone versions must be compatible. For example, a export from Plone 3 to Plone 4 is not possible.
To export objects from a site to another, do the following:
- In the Zope Management Interface, navigate to the Folder, which holds the object to be exported.
- Tick the checkbox for a object to be exported.
- Click Import / Export
- Export as .zexp.
- Zope 2 will tell you the path where .zexp was created on the server.
- Zope .zexp to youranothersite/var/instance/import folder
- Go to ZMI root of your another site
- Press Import / Export
- In Import from file you should see now your .zexp file
- Import it
- Go to portal_catalog -> Advanced tab. Clear and Rebuild the catalog (raw Zope pickle does not know about anything living inside the catalog)
collective.transmogrifier¶
On it's own collective.transmogrifier isn't an import tool, rather a generic framework for creating pipelines to process data. Pipeline configs are .ini-style files that join together blueprints to quickly create a tool for processing data.
The following add-ons make it useful in a Plone context:
- plone.app.transmogrifier provides integration with GenericSetup, so you can run pipelines as part of import steps, and some useful blueprints.
- quintagroup.transmogrifier also provides it's own Plone integration, and some useful blueprints. See the site for some example configs for migration.
- transmogrify.dexterity provides some blueprints relevant to Dexterity types, and has some default pipelines for you to use.
- `collective.jsonmigrator <collectivejsonmigrator> is particularly useful when the old site is not able to install collective.transmogrifier, as collective.jsonmigrator has a very low level of dependencies for that end of the migration.
transmogrify.dexterity: CSV import¶
transmogrify.dexterity
will register the pipeline
transmogrify.dexterity.csvimport
, which can be used to import from CSV to dexterity
objects.
For more information on using, see the package documentation.
transmogrify.dexterity: JSON import/export¶
transmogrify.dexterity
also contains some
quintagroup.transmogrifier
pipeline configs. To use these, first install both
quintagroup.transmogrifier
and
transmogrify.dexterity
, then add the following to your ZCML:
<include package="transmogrify.dexterity.pipelines" file="files.zcml" />
Then the "Content (transmogrifier)" generic setup import / export will import / export site content to JSON files.
For more information on using, see this transmogrify blog post.
quintagroup.transmogrifier: Exporting single folder only¶
Here is explained how to export and import Plone CMS folders between different Plonen versions, or different CMS systems, using XML based content marshalling and quintagroup.transmogrifier.
This overcomes some problems with Zope management based export/import which uses Python pickles and thus needs identical codebase on the source and target site. Exporting and importing between Plone 3 and Plone 4 is possible.
You can limit export to cover source content to with arbitrary portal_catalog conditions. If you limit source content by path you can effectively export single folder only.
The recipe described here assumes the exported and imported site have the same path for the folder. Manually rename or move the folder on source or target to change its location.
Note
The instructions here requires quintagroup.transmogrify version 0.4 or later.
Source site¶
Execute these actions on the source Plone site.
Install
quintagroup.transmogrifier
via buildout and Plone add-on control panel.
Go to Site setup > Content migration.
Edit export settings. Remove unnecessary pipeline
entries by looking the example below. Add a new
catalogsource
blueprint. The
exclude-contained
option makes sure we do not export unnecessary items
from the parent folders:
[transmogrifier]
pipeline =
catalogsource
fileexporter
marshaller
datacorrector
writer
EXPORTING
[catalogsource]
blueprint = quintagroup.transmogrifier.catalogsource
path = query= /isleofback/ohjeet
exclude-contained = true
Also we need to include some field-level excluding bits for the folders, because the target site does not necessary have the same content types available as the source site and this may prevent setting up folderish content settings:
[marshaller]
blueprint = quintagroup.transmogrifier.marshaller
exclude =
immediatelyAddableTypes
locallyAllowedTypes
You might want to remove other, unneeded blueprints
from the export
pipeline
. For example,
portletexporter
may cause problems if the source and target site do
not have the same portlet code.
Go to Zope Management Interface > portal_setup > Export tab. Check Content (transmogrifier) step. Press Export Selected Steps at the bottom of the page. Now a .tar.gz file will be downloaded.
During the export process
instance.log
file is updated with status info. You might want to
follow it in real-time from UNIX command line
tail -f var/log/instance.log
In log you should see entries running like:
2010-12-27 12:05:30 INFO EXPORTING _path=sisalto/ohjeet/yritys/yritysten-tuotetiedot/tuotekortti
2010-12-27 12:05:30 INFO EXPORTING
Pipeline processing time: 00:00:02
94 items were generated in source sections
94 went through full pipeline
0 were discarded in some section
Target site¶
Execute these actions on the target Plone site.
Install
quintagroup.transmogrifier
via buildout and Plone add-on control panel.
Open target site
instance.log
file for monitoring the import process
tail -f var/log/instance.log
Go to Zope Management Interface > portal_setup > Import tab.
Choose downloaded
setup_toolxxx.tar.gz
file at the bottom of the page, for
Import uploaded tarball input.
Run import and monitoring log file for possible errors. Note that the import completes even if the target site would not able to process incoming content. If there is a serious problem the import seems to complete successfully, but no content is created.
Note
Currently export/import is not perfect. For example, ZMI content type icons are currently lost in the process. It is recommended to do a test run on a staging server before doing this process on a production server. Also, the item order in the folder is being lost.
collective.jsonmigrator¶
collective.jsonmigrator is basically a collective.transmogrifier pipeline that pulls Plone content from to JSON views on an old site and writes it into your new site. It's major advantage is that the JSON view product: collective.jsonify is very low on dependencies (basically just simplejson), so it can be installed on very old Plone sites that would be difficult if not impossible to install collective.transmogrifier into.
See:
- <https://github.com/collective/collective.jsonmigrator>`_
- <https://github.com/collective/collective.jsonify>`_
- A basic tutorial: <http://www.jowettenterprises.com/blog/plone-content-migration-using-transmogrifier-and-collective.jsonify>`_
- <http://stackoverflow.com/questions/13721016/exporting-plone-archetypes-data-in-json>`_
Fast content import¶
For specific use-cases, you can create 'brains' first and import later * See this blog post
Simple JSON export¶
Below is a simple helper script / BrowserView for a JSON export of Plone folder content. Works Plone 3.3+. It handles also binary data and nested folders.
export.py:
"""
Export folder contents as JSON.
Can be run as a browser view or command line script.
"""
import os
import base64
try:
import json
except ImportError:
# Python 2.54 / Plone 3.3 use simplejson
# version 2.3.3
import simplejson as json
from Products.Five.browser import BrowserView
from Products.CMFCore.interfaces import IFolderish
from DateTime import DateTime
#: Private attributes we add to the export list
EXPORT_ATTRIBUTES = ["portal_type", "id"]
#: Do we dump out binary data... default we do, but can be controlled with env var
EXPORT_BINARY = os.getenv("EXPORT_BINARY", None)
if EXPORT_BINARY:
EXPORT_BINARY = EXPORT_BINARY == "true"
else:
EXPORT_BINARY = True
class ExportFolderAsJSON(BrowserView):
"""
Exports the current context folder Archetypes as JSON.
Returns downloadable JSON from the data.
"""
def convert(self, value):
"""
Convert value to more JSON friendly format.
"""
if isinstance(value, DateTime):
# Zope DateTime
# https://pypi.python.org/pypi/DateTime/3.0.2
return value.ISO8601()
elif hasattr(value, "isBinary") and value.isBinary():
if not EXPORT_BINARY:
return None
# Archetypes FileField and ImageField payloads
# are binary as OFS.Image.File object
data = getattr(value.data, "data", None)
if not data:
return None
return base64.b64encode(data)
else:
# Passthrough
return value
def grabArchetypesData(self, obj):
"""
Export Archetypes schemad data as dictionary object.
Binary fields are encoded as BASE64.
"""
data = {}
for field in obj.Schema().fields():
name = field.getName()
value = field.getRaw(obj)
print "%s" % (value.__class__)
data[name] = self.convert(value)
return data
def grabAttributes(self, obj):
data = {}
for key in EXPORT_ATTRIBUTES:
data[key] = self.convert(getattr(obj, key, None))
return data
def export(self, folder, recursive=False):
"""
Export content items.
Possible to do recursively nesting into the children.
:return: list of dictionaries
"""
array = []
for obj in folder.listFolderContents():
data = self.grabArchetypesData(obj)
data.update(self.grabAttributes(obj))
if recursive:
if IFolderish.providedBy(obj):
data["children"] = self.export(obj, True)
array.append(data)
return array
def __call__(self):
"""
"""
folder = self.context.aq_inner
data = self.export(folder)
pretty = json.dumps(data, sort_keys=True, indent=' ')
self.request.response.setHeader("Content-type", "application/json")
return pretty
def spoof_request(app):
"""
http://docs.plone.org/develop/plone/misc/commandline.html
"""
from AccessControl.SecurityManagement import newSecurityManager
from AccessControl.SecurityManager import setSecurityPolicy
from Products.CMFCore.tests.base.security import PermissiveSecurityPolicy, OmnipotentUser
_policy = PermissiveSecurityPolicy()
setSecurityPolicy(_policy)
newSecurityManager(None, OmnipotentUser().__of__(app.acl_users))
return app
def run_export_as_script(path):
""" Command line helper function.
Using from the command line::
bin/instance script export.py yoursiteid/path/to/folder
If you have a lot of binary data (images) you probably want
bin/instance script export.py yoursiteid/path/to/folder > yourdata.json
... to prevent your terminal being flooded with base64.
Or just pure data, no binary::
EXPORT_BINARY=false bin/instance run export.py yoursiteid/path/to/folder
:param path: Full ZODB path to the folder
"""
global app
secure_aware_app = spoof_request(app)
folder = secure_aware_app.unrestrictedTraverse(path)
view = ExportFolderAsJSON(folder, None)
data = view.export(folder, recursive=True)
# Pretty pony is prettttyyyyy
pretty = json.dumps(data, sort_keys=True, indent=' ')
print pretty
# Detect if run as a bin/instance run script
if "app" in globals():
run_export_as_script(sys.argv[1])