Automated Image Optimization: The Bash Way

Matt Fiocca, Principal & Digital Craftsman

As website development contractors, we sometimes come across existing Drupal or WordPress sites that are chock full of large, under-optimized client uploads that take forever to load in the browser. We sometimes find our servers choking to death, not because of [insert disliked developer here]’s rogue PHP script, but because the server just can’t keep up with the thousands of requests for xyz image at 3000x2000 pixels, uncompressed.

Sure, we can install plugins that will re-compress images on upload, but what about all those hundreds or even thousands of images that are sitting on the server right now? Do we find another plugin to rip through the entire file base? To me, this sounds like too much work for a web stack, and really, why introduce even more load to your web server that is already choking on pixel data?

Recently, we were tasked with finding a way to automate an optimization process for one of our clients’ WordPress sites. At the time of this writing, they have around 2,500 uploads sitting on their server: high-res images that were all uploaded by the client through the WordPress media library.

Now, this normally wouldn’t be a big deal for a small number of uploads. You would just crack open an FTP client, download the suspects, open them in Photoshop, Save for Web and Devices, and re-upload them. But with the number of files that we were dealing with, FTP just wasn’t going to cut it, not for me anyway. A Photoshop action seemed like too much work too, since I would have to manually download all the files, let Photoshop run its batch (making Photoshop useless while that was running), then manually upload them later.

So, I took to bash to see what I could come up with; results below. This script is meant to be run from the command line, and relies on two things:

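  1. The existence of jpegtran on your local system. jpegtran is a command-line tool that ships with the libjpeg library (or libjpeg-turbo), which ImageMagick depends on. If you’ve installed ImageMagick for some other reason, you likely already have jpegtran. Note that some distributions package the jpegtran binary separately (libjpeg-turbo-progs on Debian/Ubuntu, libjpeg-turbo-utils on CentOS/Fedora/RHEL), so install that package if the command is still missing after the steps below. Otherwise, it’s just a matter of: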

    Ubuntu/Debian:
    $ sudo apt-get install gcc
    $ sudo apt-get install imagemagick

    CentOS/Fedora/RHEL:
    $ sudo yum install gcc php-devel php-pear
    $ sudo yum install ImageMagick ImageMagick-devel

    Mac (using homebrew):
    $ brew install imagemagick

    Windows:
    Sorry, Windows users, this is bash

  2. A Pantheon account. This script can be refactored to work with any SSH/rsync-enabled server, but it is currently set up for Pantheon and the /files directory for both WordPress and Drupal.
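
For what it’s worth, here is a minimal sketch of how a script might verify its dependencies before doing any work. The require_command helper name is made up for illustration and is not taken from the actual script.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed only if the named command is available.
require_command() {
  local cmd="$1"
  if ! command -v "$cmd" >/dev/null 2>&1; then
    echo "Error: $cmd is not installed or not on your PATH." >&2
    return 1
  fi
}

# Usage: bail out early when a dependency is missing.
#   require_command jpegtran || exit 1
#   require_command rsync || exit 1
```

Checking with command -v up front means you get one clear error message instead of a failure halfway through a long download.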

What this script is doing:

  1. First checks to see if you have jpegtran, aborts otherwise.

  2. Prompts you for the absolute or relative path to your local /files directory. This directory can be anywhere actually, and you don’t have to be cd’d into the project or anything like that. This is just what you want the working directory to be on your local system. The script checks to make sure the path actually exists, aborts otherwise.

  3. Prompts you for the Pantheon site UUID. This is that long UUID found in your Pantheon dashboard, i.e. XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX

  4. The script then remembers where you were in your terminal and moves into your working directory, and starts downloading (via rsync) the entire /files directory of the remote site recursively. This probably goes without saying, but your local disk needs to have enough available space to support all your site files.

  5. Then it builds a list of all .jpg and .jpeg images recursively and case-insensitively (.jpg or .JPG) through the entire working directory.

  6. Optimizes images. jpegtran offers two flags for losslessly shrinking a JPEG: -optimize and -progressive. We’ve chosen -progressive in this script because it’s the method of optimization we prefer for web images: the browser can show a rough version of the image almost instantly and then sharpen it as the rest of the data downloads. An additional flag we’re using in this script is -copy none, which strips the image of all metadata, adding another layer of file size reduction.

  7. After optimization, the images are then uploaded (again via rsync) back to the server recursively.

  8. When it’s all said and done, this script will ‘cd’ you back to the directory that you came from before running the script.
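
The core of steps 4 through 7 can be sketched roughly like this. Treat it as an illustration rather than the actual script: the function names pantheon_remote and optimize_jpgs, the default dev environment, and the SITE_UUID placeholder are my own assumptions. The remote host format and port 2222 follow Pantheon’s rsync convention as I understand it, so check them against the connection info in your dashboard before relying on them.

```bash
#!/usr/bin/env bash

# Build the rsync remote for a site's files directory.
# ASSUMPTION: host format per Pantheon's rsync convention; verify it
# against your dashboard's connection info.
# $1 = site UUID, $2 = environment (hypothetical default: dev)
pantheon_remote() {
  local site="$1"
  local env="${2:-dev}"
  echo "${env}.${site}@appserver.${env}.${site}.drush.in:files/"
}

# Re-encode every .jpg/.jpeg under a directory as a progressive JPEG with
# metadata stripped. Each original is replaced only if jpegtran succeeds.
optimize_jpgs() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "Error: $dir is not a directory." >&2
    return 1
  fi
  # -iname matches case-insensitively (.jpg, .JPG, .jpeg, .JPEG).
  find "$dir" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) -exec sh -c '
    for file do
      tmp="${file}.tmp.$$"
      if jpegtran -copy none -progressive -outfile "$tmp" "$file"; then
        mv "$tmp" "$file"
      else
        rm -f "$tmp"
        echo "Skipped (jpegtran failed): $file" >&2
      fi
    done
  ' sh {} +
}

# Example flow (not run here; SITE_UUID is a placeholder):
#   remote="$(pantheon_remote "$SITE_UUID")"
#   rsync -rLvz --size-only --ipv4 --progress -e 'ssh -p 2222' "$remote" ./files/
#   optimize_jpgs ./files
#   rsync -rLvz --size-only --ipv4 --progress -e 'ssh -p 2222' ./files/ "$remote"
```

Writing to a temporary file first and only then moving it over the original means a failed or interrupted jpegtran run never leaves you with a truncated image.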

Check it out on GitHub.

Bonus

You’ll notice that I’ve wrapped this functionality into a bash function, and then just call the function immediately after declaration. This is for fun things like sourcing into your ~/.bash_profile. For example, remove the function call at the end of the script and save it as ~/optimize_pantheon_jpgs.sh. Then add this to the end of your ~/.bash_profile:

. ~/optimize_pantheon_jpgs.sh

Don’t forget the period at the beginning. Save and close, then start a new terminal session. Now you can call optimize_pantheon_jpgs directly in the terminal from anywhere, and it will return you to the directory you were in before the call.
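
To illustrate the remember-and-return behavior, here is a stripped-down skeleton of what a sourced function like this might look like. The name optimize_pantheon_jpgs_skeleton is hypothetical, and the real work is left as a comment.

```bash
#!/usr/bin/env bash

# Hypothetical skeleton: enter a working directory, do the work, then
# return the caller to wherever they started.
optimize_pantheon_jpgs_skeleton() {
  local workdir="$1"
  local origin="$PWD"
  if ! cd "$workdir"; then
    echo "Error: cannot enter $workdir" >&2
    return 1
  fi
  # ... download, optimize, and upload would happen here ...
  echo "Working in: $PWD"
  cd "$origin" || return 1
}
```

Note the use of return rather than exit: because the function runs inside your interactive shell once it’s sourced, calling exit on an error would close the terminal session itself.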
