Intro
As a technical writer, one of the many tasks I handle on a regular basis is pulling content from multiple sources and compiling it into a source file for publication. Early on, my process was very hands-on: manually generate a compressed file from Confluence and other sources, manipulate the extracted content, bundle everything up, and publish the refined package.
This tutorial's goal is to get novice users familiar with one method of automating content extraction by using a simple Bash script file that pulls content from Confluence using an exporter scheme URL.
The process contains two components: an export scheme URL and a Bash script. I have written two tutorials on how to extract content using two of Scroll's exporter plugins. Refer to Part 2 and/or Part 3 in the Export Content from Confluence series for details on how to generate the export URL using either Scroll's EclipseHelp or HTML Exporter plugins. You will need to complete at least one of these tutorials in order to complete this tutorial as you will need the export scheme URL. The Bash script we will create handles the manual part of extracting content from Confluence.
This tutorial is written for Mac users. The Bash features used in this document have not been tested on Linux, but if you know your way around wget, this tutorial should work just fine.
Prerequisites
Confluence Command Line Interface
Did you know that Confluence has a CLI? Check out and install the Confluence Command Line Interface (CLI), as you may need it to access Confluence from the command line and check against extracted content. I'll leave it up to you to decide whether you want to use it.
For installation, please review Confluence CLI Installation and Use.
For getting started, reference, examples, and much more info, please review Confluence CLI User's Guide.
wget
Another component of automating the export process is a command line network transfer tool such as wget. There are a few options out there for handling CLI transfers (such as curl), but I've found wget to be rather flexible, good at handling redirects, and stable for my documentation needs. If you have arguments for or against, I'd love to hear them in the comments.
Setting up a "docbot" account for export
Prior to automating the export process from Confluence, you will want to create a non-human account that only has view and export permissions. If you don't plan on sharing or automating content extraction from Confluence, you can use your personal account, but I've seen many tech writers get bitten by using their personal accounts.
In this case, I named this new account "docbot". If you don't have the proper credentials to create Atlassian accounts, please contact your Confluence administrator and request that the account be created. Otherwise, create a new user account:
- Navigate to the Confluence Admin page.
- Click on Users under Users & Security section.
- Click Add Users.
- Enter docbot in the Username field.
- Enter docbot in the Full Name field.
- If you have a group email address that is shared with the tech writers, I recommend using that for the Email field. Otherwise, enter your email address.
- Navigate to your target space's Space Admin page and click on Permissions.
- Under the Individual Users section, click the Edit Permission button.
- Locate the docbot account and enable only the following permissions:
  - All > View
  - Space > Export
- Once those two permissions have been set, click Save all.
Your docbot account should now have the proper permissions to export content from Confluence.
Bash script
Once you have run through the process of creating and saving an export scheme from either one of the aforementioned plugins, you will need to apply the REST URL in our Bash script.
The Bash script will contain up to four lines: up to three variables and one command. You may wish to add a few additional lines before and after the export process, such as commands for setting up an export archive directory and post-processing the downloaded file (renaming the .jar file to a .zip file, unzipping it, and so on).
...
USER='docbot'
PASS='<docbot password>'
URL='<exporter scheme URL>'
wget --content-disposition "$URL&os_username=$USER&os_password=$PASS"
...
Breakdown of this script
The first three lines are just variables we will pass into the wget command. The fourth line is the backbone of the whole operation. Let's examine each component of this command:
- wget - network transfer command
- --content-disposition - flag that forces the download to preserve the file name supplied by the server. Note: This flag is still experimental, though I've never had any problems with it.
- "$URL&os_username=$USER&os_password=$PASS" - string that gets passed into the wget command. When pulling content from Confluence using the exporter scheme URL, you need to specify the exporter URL, the user requesting the export (whose credentials get checked), and that user's password. If everything checks out on Confluence's side, your command will pull down a compressed file based on the settings in your exporter scheme.
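The optional set-up and post-processing steps mentioned earlier (creating an export archive directory and renaming the downloaded .jar file to .zip before unzipping) might be sketched like this. The helper names make_archive_dir and jar_to_zip, the dated directory layout, and the $HOME/exports location are all assumptions made for illustration, and the commented usage assumes unzip is installed.

```shell
# Hypothetical helpers: the names and the dated directory layout are assumptions.

# Create a dated archive directory such as <base>/2024-01-31 and print its path.
make_archive_dir() {
  local base="$1"
  local dir
  dir="$base/$(date +%Y-%m-%d)"
  mkdir -p "$dir"
  printf '%s\n' "$dir"
}

# Rename a downloaded .jar to .zip (a .jar is a zip archive) and print the new name.
jar_to_zip() {
  local jar="$1"
  local zip="${jar%.jar}.zip"
  mv -- "$jar" "$zip"
  printf '%s\n' "$zip"
}

# Possible use around the wget line (commented out so nothing is downloaded here):
# dir=$(make_archive_dir "$HOME/exports") || exit 1
# cd "$dir" || exit 1
# wget --content-disposition "$URL&os_username=$USER&os_password=$PASS"
# for jar in *.jar; do
#   zip=$(jar_to_zip "$jar")
#   unzip -o "$zip" -d "${zip%.zip}"
# done
```

Keeping each export in its own dated directory means a new run never overwrites the previous download, which makes it easier to compare extracted content between exports.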
Note: If your Bash script will be shared with other users or run in a hosted environment, you may want to define the USER and PASS variables in your own Bash profile rather than in the script itself.
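One way to keep the credentials out of a shared script is to export them from each user's Bash profile and have the script stop early when they are missing. This is only a sketch: the variable names DOCBOT_USER and DOCBOT_PASS and the function build_export_url are hypothetical, chosen so they do not clash with the shell's own USER variable.

```shell
# In ~/.bash_profile (not in the shared script), each user would add:
#   export DOCBOT_USER='docbot'
#   export DOCBOT_PASS='<docbot password>'

# Print the export URL with the credentials appended.
# Fails with a message if either variable is unset or empty.
build_export_url() {
  local url="$1"
  : "${DOCBOT_USER:?set DOCBOT_USER in your Bash profile}"
  : "${DOCBOT_PASS:?set DOCBOT_PASS in your Bash profile}"
  printf '%s&os_username=%s&os_password=%s\n' "$url" "$DOCBOT_USER" "$DOCBOT_PASS"
}

# Possible use in the script (URL is the exporter scheme URL):
# full_url=$(build_export_url "$URL") || exit 1
# wget --content-disposition "$full_url"
```

The sketch does not URL-encode the values, so a password containing characters such as & or # would need to be encoded before it is stored in the profile.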
Happy automating!