Wednesday, May 29, 2013

How to Efficiently Copy Large Amounts of Content Between CQ Repositories?

CQ offers several ways to move content from one instance to another: “replication” (author->publish), building a package, exporting it and installing it elsewhere, the vlt rcp command line, and recap.

The File Vault (vlt) tool has a Remote Copy (rcp) option - one that is especially useful if you are moving GB or TB of digital assets (JCR node type dam:Asset) from DEV to STAGING to PRODUCTION.  Run all renditions workflows once on a very powerful DEV machine with many CPU cores and high-throughput local storage.  Once done, turn off the renditions workflows on STAGING and PRODUCTION, then perform vlt rcp to those environments.

Since vlt rcp streams data between online repositories, it does not use the Durbo packaging used by replication.  Tests by Adobe Performance Architect Gardner Buchanan show that this is more storage efficient and avoids storage bloat (2:1 in the case of replication).

Gardner also recommends running multiple instances of vlt rcp against separate source tree structures to parallelize the whole operation.  To avoid unnecessary network traffic, run vlt rcp on one of the participant instances, not on a remote, third instance.
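As a rough sketch of that parallel approach, the script below starts one vlt rcp per subtree in the background and waits for all of them to finish.  The two subtree names are hypothetical, the instance URLs and flags are taken from the example command in this post, and DRY_RUN is a made-up safety switch that only prints the commands unless it is set to 0.

```shell
#!/bin/sh
# Sketch only: one vlt rcp per source subtree, run in parallel.
# SRC/DST reuse the localhost URLs from this post; adjust for real hosts.
SRC="http://admin:admin@localhost:4502/crx/-/jcr:root"
DST="http://admin:admin@localhost:4503/crx/-/jcr:root"

# Copy (or, in dry-run mode, just print the command for) one subtree.
rcp_one() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "vlt rcp -b 100 -r -u -n $SRC$1 $DST$1"
  else
    vlt rcp -b 100 -r -u -n "$SRC$1" "$DST$1"
  fi
}

# Start one background job per subtree, then wait for all of them.
copy_parallel() {
  for subtree in "$@"; do
    rcp_one "$subtree" &
  done
  wait
}

# Hypothetical, non-overlapping subtrees; replace with your own.
copy_parallel /content/dam/folder-a /content/dam/folder-b
```

Run it from one of the two participating machines, and pick subtrees that do not overlap so that no node is copied twice.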

Also, tests indicate that the default batch size of 1000 should be reduced to 100 for better throughput.

Assuming that vlt is set up and configured, the following command will copy a large content tree at /content/dam/JJK-Folder-1 on one CQ “author” instance to another “author” instance.  In this case, both are running on the local machine, but they can be remote.  The two instances also do not have to be in the same run mode: content can be remote copied from an “author” instance to a “publish” instance.

vlt rcp -b 100 -r -u -n http://admin:admin@localhost:4502/crx/-/jcr:root/content/dam/JJK-Folder-1 http://admin:admin@localhost:4503/crx/-/jcr:root/content/dam/JJK-Folder-1
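For readers unfamiliar with the options, here is the same command assembled piece by piece.  The comments reflect my reading of the vlt rcp help text, so treat them as a guide rather than a specification; the variable names SRC, DST, OPTS and CMD are only for illustration.

```shell
#!/bin/sh
# Same command as above, built up one option at a time.
SRC="http://admin:admin@localhost:4502/crx/-/jcr:root/content/dam/JJK-Folder-1"
DST="http://admin:admin@localhost:4503/crx/-/jcr:root/content/dam/JJK-Folder-1"

OPTS="-b 100"    # batch size: nodes processed before an intermediate save
OPTS="$OPTS -r"  # recursive: descend into the whole subtree
OPTS="$OPTS -u"  # update: overwrite nodes that already exist at the destination
OPTS="$OPTS -n"  # newer: with -u, respect lastModified properties when updating

CMD="vlt rcp $OPTS $SRC $DST"
echo "$CMD"      # printed only; run it yourself once the URLs are right
```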

A test with 1,000 (1 MB) 1680 x 1050 JPG images copied 20,304 (dam:Asset) nodes (2,690,706,251 bytes) in 573,117 milliseconds - a throughput of about 35 JCR nodes/second or 16 GB/hr.

In another test, I copied 44,802 (cq:Page) nodes (5,017,126 bytes) in 461,925 ms - that is a throughput of 97 JCR nodes/second or 37 MB/hr.
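The throughput numbers for both tests come from dividing the node and byte counts by the elapsed time.  The small helper below redoes that arithmetic with awk; the function name throughput is my own, and I am assuming binary units (1 MiB = 1,048,576 bytes) for the per-hour figure.

```shell
#!/bin/sh
# Usage: throughput <nodes> <bytes> <elapsed_ms>
# Prints nodes per second and MiB per hour for one vlt rcp run.
throughput() {
  awk -v n="$1" -v b="$2" -v ms="$3" 'BEGIN {
    s = ms / 1000                          # elapsed seconds
    mib_per_hr = (b / s) * 3600 / 1048576  # bytes/s -> MiB/hr
    printf "%.1f nodes/s, %.1f MiB/hr\n", n / s, mib_per_hr
  }'
}

throughput 20304 2690706251 573117   # dam:Asset test
throughput 44802 5017126 461925      # cq:Page test
```

The comparison shows that node count, not byte volume, dominates the page test: it moves nearly three times as many nodes per second while transferring only a few megabytes.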

The process is differential, meaning only changed nodes are actually copied.  However, every source JCR node still has to be checked against its counterpart on the destination, so even a run that copies nothing must walk the whole tree.

Source: http://cq-ops.tumblr.com/

3 comments:

  1. hi,

    Can we transfer content between instances from a local instance? I want to transfer content from dev to test from my local instance. How can we do it?
    I don't want to use recap in each of the instances. Is it possible?

  2. Is it possible to upload images or other files from the local file system, say C:/assets, onto my local CQ authoring instance? What's the command for doing so?

  3. This comment has been removed by the author.
