AWSNinja Code Library
Source: at GitHub
In a Sentence: A system for optimal image management and performance in CloudFront.
Not interested in a story? Go Right to the Framework.
You’re the lead developer at a hot new startup called Woofbook. Elevator pitch: “It’s like Facebook meets Flickr, but for dogs.” The founder of Woofbook has started many companies, and run them to many different levels of success and failure, by the sheer force of his outgoing personality. In other words, he is a typical web entrepreneur. One day, while your boss is getting his goatee manscaped, the barber tells him that his nephew said Cloud Computing is the next big thing. Your boss rushes back to the office and announces “We need to harness the cloud!” After searching for “Cloud” on TechCrunch, he tells you “Cloudfront is the thing we need.”
You take a moment to marvel that a pretty good idea was derived from such ridiculous reasoning. You know that using a CDN is a very good practice, and you’re pleased that you will spend the next few days working on something that will actually provide long-term value to the project.
It just so happens that you were building the photo-sharing section of Woofbook when your boss interrupted you with his Cloudfront idea. After 15 minutes of cruising Amazon’s Cloudfront website and a few blog posts, you conclude that implementing Cloudfront is going to be easier than you thought. You just need to create a Cloudfront distribution, upload your images to it, then point the HTML image tags to the Cloudfront service. Perfect.
A week later, the photo-sharing feature goes live and it’s a big hit. Before long, users have uploaded over 100,000 photos. The site loads pretty fast, and you think you’re pretty darn sophisticated for making it look so easy.
Several weeks after that, your boss comes back from a meeting with a potential investor (those meetings always tend to result in bothersome new initiatives) and exclaims, “The photo sharing is great! We need to leverage it.”
It turns out that dogs have active social lives. They go out to dog gatherings every day at specially designated meeting places (known as dog parks) to socialize and conduct their. . . ahem – business.
“It would be great if dogs could tag their friends in the photos they uploaded,” your boss says. “Also, wouldn’t it be great if one dog could use a photo uploaded by another dog as his profile pic?”
“No problem!” you say confidently. You smartly designed the database architecture to make it easy to use the photos on the site for virtually any purpose. “I’ll have it done by COB tomorrow.” You link up the Photos table to the Users table by adding a “profilePhotoId” foreign-key field to the Users table. Then you build out the functionality to allow users to choose any photo as their profile pic.
But you soon realize there is a problem. Profile pics are displayed at 200×150 pixels and the photo album photos typically display at 500×375. You’re afraid you don’t have time to properly resize the photos without missing your self-imposed deadline. To speed things up you try resizing the photos in the browser by using the WIDTH and HEIGHT attributes of the IMG tag, but that causes many images to appear warped, because not every uploaded photo has the same aspect ratio as the profile-pic frame.
So now you need to run a script to resize the photos. Or maybe you’ll just resize the subset of the album photos that users have chosen as profile pics. You realize there’s a trade-off between these two options. It’s logically simpler to store all of the photos in two sizes, but it’s more efficient and elegant to store profile-sized versions only for images that will actually be used as profile pics. Unfortunately, you promised it would be done “by COB tomorrow.” To meet your deadline you settle on resizing all of the photos and uploading them all to Cloudfront. Even though it is less efficient to do it this way, it also involves less programming, which means you’ll meet your deadline. You run a process to resize the photos overnight and finish up the next day. Everybody’s happy.
Several months later, things are continuing to go well at Woofbook Inc. Users are loving the ability to use their friends’ photos as their profile pics. Woofbook is starting to get positive coverage on TechCrunch and even the regular business press is beginning to take notice. Imitators start to appear on the scene. One of them, BarkSquare, decides to capitalize on some bad press Woofbook received about some ill-advised privacy-policy changes to promote their new “Transfer to BarkSquare” feature. The feature allows users to copy their entire Woofbook profile, including all of their friends and photos, to BarkSquare.
Predictably, this causes a significant panic in the Woofbook offices. “How will we stop this?” your boss cries with an agony not unlike the pleadings of a bullied child. After some determined and serious discussion, he decides that all of the photos should be watermarked. “At least we’ll be able to keep track of where the images go and get some free advertising,” he says.
You roll your eyes and get to work figuring out how to add watermarks to the images. Once you’ve created and tested the script that will apply the watermarks, you kick it off and leave it running as you head home for the evening. Before going to bed, you check the script’s progress, see that it’s finished, and send a quick email to your boss to let him know. You’re quite pleased that the timestamp on your email says 11:32pm. You expect that your dedication will not go unnoticed.
When you arrive at work the next morning, rather than thanking you for your after-hours efforts, your boss wants to know why he’s not seeing any watermarks on the photos on the site. That’s when you remember. Because of the chaos and urgency in the office yesterday, you forgot something important. The images on Cloudfront are cached on the edge locations. Updating the original does not automatically update the cached versions residing around the world. Worse, there is no way to manually flush the cache. You can only wait for the Cloudfront system to refresh the cache on its own. Since you followed the best practice of setting a far-future Expires header on the images, it can potentially be a very long time before the cached objects get refreshed.
Now it’s even more complicated. In order to show the new images with the watermarks, you have to upload all of them with different file names, then point to the new versions. It’s going to be a long day.
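If you want to convince yourself why renaming is the way out, here is a toy model of an edge cache in PHP. It is only a sketch: the function, the array names, and the file names are made up for illustration and have nothing to do with the real Cloudfront API. The point it shows is that the cache is keyed by the object name, so overwriting the original changes nothing at the edge, while a new name is a guaranteed cache miss.

```php
<?php
// Toy model of a CDN edge cache, for illustration only.
// All names here are hypothetical; this is not Cloudfront's API.

// The "origin": object key => current file contents.
$origin = ['photos/rex.jpg' => 'rex (no watermark)'];

// The "edge": object key => copy cached the first time it was requested.
$edgeCache = [];

function fetchViaEdge(string $key, array $origin, array &$edgeCache): string
{
    if (!isset($edgeCache[$key])) {
        // Cache miss: pull from the origin and keep the copy at the edge.
        $edgeCache[$key] = $origin[$key];
    }
    return $edgeCache[$key];
}

// The first request caches the original at the edge.
fetchViaEdge('photos/rex.jpg', $origin, $edgeCache);

// Overwriting the origin under the same key does not reach the edge copy.
$origin['photos/rex.jpg'] = 'rex (watermarked)';
$stale = fetchViaEdge('photos/rex.jpg', $origin, $edgeCache);

// Uploading under a new key is a cache miss, so the new file is served.
$origin['photos/rex_v2.jpg'] = 'rex (watermarked)';
$fresh = fetchViaEdge('photos/rex_v2.jpg', $origin, $edgeCache);

echo $stale, "\n"; // rex (no watermark)
echo $fresh, "\n"; // rex (watermarked)
```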
The CloudfrontImageService is designed to eliminate the pain of managing images on Cloudfront distributions. The framework has built-in functionality for managing:
There are three database tables that are used to manage images in the CloudfrontImageService. They are tbl_images, tbl_imageDimensions, and tbl_imageDimensionsMap.
tbl_images – This table is used to track the original image file that is stored on your server’s filesystem or on an EBS volume (not on S3).
tbl_imageDimensions – This table holds the definitions of the different dimension sizes, such as thumbnail, x-large, original, and whatever else you need.
tbl_imageDimensionsMap – This table holds the records of the actual image objects that currently reside in your Cloudfront distribution.
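To make the relationship between the three tables concrete, here is a rough in-memory sketch. The column names (id, filePath, keyName, width, height, imageId, dimensionsId), the sample rows, and the isOnCloudfront() helper are all assumptions made for this example; the framework’s real schema may look different. The idea is simply that a row in the map table means “this image, at this size, has already been uploaded.”

```php
<?php
// In-memory stand-ins for the three tables. Column names and rows are
// assumptions made for this sketch, not the framework's actual schema.
$images = [
    ['id' => 55, 'filePath' => 'flowers/flower.jpg'],
];
$imageDimensions = [
    ['id' => 1, 'keyName' => 'thumbnail', 'width' => 100, 'height' => 75],
    ['id' => 2, 'keyName' => 'original', 'width' => null, 'height' => null],
];
// One row per (image, dimension) pair that has already been uploaded.
$imageDimensionsMap = [
    ['imageId' => 55, 'dimensionsId' => 1],
];

// Hypothetical helper: is this image already on the CDN at this size?
function isOnCloudfront(int $imageId, string $keyName, array $dimensions, array $map): bool
{
    foreach ($dimensions as $dim) {
        if ($dim['keyName'] !== $keyName) {
            continue;
        }
        foreach ($map as $row) {
            if ($row['imageId'] === $imageId && $row['dimensionsId'] === $dim['id']) {
                return true;
            }
        }
    }
    return false;
}

var_dump(isOnCloudfront(55, 'thumbnail', $imageDimensions, $imageDimensionsMap)); // bool(true)
var_dump(isOnCloudfront(55, 'original', $imageDimensions, $imageDimensionsMap));  // bool(false)
```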
Before you can serve up images in your webpages, you need to add them to the framework. This is done through the createImageObjectFromFileSystemPath() method. Example:
<?php
$cfImgSvc = new CloudfrontImageService();

$imgObj = $cfImgSvc->createImageObjectFromFileSystemPath('/var/www/imagedrop/flower.jpg', 'flowers/flower.jpg');
?>
On line two we instantiate the CloudfrontImageService. On line four we use the createImageObjectFromFileSystemPath() method to take an image saved on the filesystem at /var/www/imagedrop/flower.jpg and create an Image object, which will be stored in the flowers subdirectory of the IMAGES_ROOT that is set in the config file. The createImageObjectFromFileSystemPath() method handles the following steps:
Once you have added an image to the framework, you can use the CloudfrontImageService class to insert the image URL into your HTML. CloudfrontImageService has three methods that you can use to do this:
getUrlFromFilePath($filePath, $dimensionKeyName) – Takes a $filePath (one that is found in tbl_images) and a valid $dimensionKeyName (a keyName found in tbl_imageDimensions). It returns a URL to the image on Cloudfront. If the image doesn’t exist on Cloudfront yet, it automatically uploads it to your Cloudfront distribution before returning the URL.
<html>
<body>
  <img src="<?php echo $cfImgSvc->getUrlFromFilePath('flowers/flower.jpg', 'thumbnail'); ?>"/>
</body>
</html>
This will insert the URL of the thumbnail-sized image into the HTML output. The dimension key (thumbnail) must be defined in tbl_imageDimensions. The resulting URL will look something like this:
The version, dimension key, width, and height are all part of the URL, both to ensure that the URL is unique and for your convenience.
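As a rough illustration of why putting those four values in the URL helps, here is one possible way to compose such a URL. The buildImageUrl() function, the cdn.example.com host, and the naming pattern are all hypothetical and are not the framework’s real layout. What matters is that bumping the version yields a URL the edge locations have never seen, so the new file is fetched instead of a stale cached copy.

```php
<?php
// Hypothetical URL layout; the framework's real format may differ.
function buildImageUrl(
    string $host,
    string $filePath,
    int $version,
    string $dimensionKey,
    int $width,
    int $height
): string {
    $info = pathinfo($filePath);
    // pathinfo() reports '.' as the directory for a bare file name.
    $dir = ($info['dirname'] === '.') ? '' : $info['dirname'] . '/';
    $ext = isset($info['extension']) ? '.' . $info['extension'] : '';
    $name = sprintf('%s_v%d_%s_%dx%d', $info['filename'], $version, $dimensionKey, $width, $height);
    return 'https://' . $host . '/' . $dir . $name . $ext;
}

$v1 = buildImageUrl('cdn.example.com', 'flowers/flower.jpg', 1, 'thumbnail', 100, 75);
$v2 = buildImageUrl('cdn.example.com', 'flowers/flower.jpg', 2, 'thumbnail', 100, 75);

echo $v1, "\n"; // https://cdn.example.com/flowers/flower_v1_thumbnail_100x75.jpg
echo $v2, "\n"; // https://cdn.example.com/flowers/flower_v2_thumbnail_100x75.jpg
```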
getUrlFromImageId($imageId, $dimensionKeyName) – Same as getUrlFromFilePath(), except that you provide the Image id.
<html>
<body>
  <img src="<?php echo $cfImgSvc->getUrlFromImageId(55, 'extra-large'); ?>"/>
</body>
</html>
getUrl(Image $imgObj, $dimensionsKeyName) – Same as the two above, except you can use it if you already have the Image object.
<html>
<body>
  <?php $imgObj = Image::findById(55); ?>
  <img src="<?php echo $cfImgSvc->getUrl($imgObj, 'original'); ?>"/>
</body>
</html>
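The “upload it if it isn’t there yet” behavior described for getUrlFromFilePath() can be sketched roughly like this. The LazyImageUrlResolver class, its uploader callback, and the URL it returns are stand-ins invented for this example; the real CloudfrontImageService presumably tracks uploads in tbl_imageDimensionsMap rather than in an array, and may work quite differently inside.

```php
<?php
// Sketch of "upload on first request". The class, its methods, and the URL
// layout are hypothetical stand-ins, not the framework's implementation.
class LazyImageUrlResolver
{
    /** @var array<string, string> "filePath|dimensionKey" => URL already on the CDN */
    private array $uploaded = [];

    /** @var callable Uploads one variant and returns its URL. */
    private $uploader;

    public int $uploadCount = 0;

    public function __construct(callable $uploader)
    {
        $this->uploader = $uploader;
    }

    public function getUrl(string $filePath, string $dimensionKey): string
    {
        $recordKey = $filePath . '|' . $dimensionKey;
        if (!isset($this->uploaded[$recordKey])) {
            // Not on the CDN yet: upload once, then remember the URL.
            $this->uploaded[$recordKey] = ($this->uploader)($filePath, $dimensionKey);
            $this->uploadCount++;
        }
        return $this->uploaded[$recordKey];
    }
}

// Fake uploader for the demo; a real one would resize and push the file.
$resolver = new LazyImageUrlResolver(function (string $path, string $key): string {
    return 'https://cdn.example.com/' . $key . '/' . $path;
});

$a = $resolver->getUrl('flowers/flower.jpg', 'thumbnail'); // triggers the upload
$b = $resolver->getUrl('flowers/flower.jpg', 'thumbnail'); // answered from the record

echo $resolver->uploadCount, "\n"; // 1
```

The appeal of this design is that only the sizes that actually get requested are ever resized and uploaded, which is the efficient option you didn’t have time for back in the profile-pic episode.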