Images account for a whopping 50% of the total file size of a web page. But consider this: On average, image metadata makes up 16% of a typical JPEG file on the web, according to website optimization service Dexecure. If that ratio held across all images, roughly 8% (16% of 50%) of what we download online would be metadata that visitors never see.
When it comes to website optimization, images are often considered low-hanging fruit since they’re easy to compress. But what you might not be doing is trimming excess metadata.
In this post, we’ll take a look at what image metadata is, how it affects performance, some security concerns you might not be aware of, and how to scrub metadata from your images.
What is Image Metadata?
Metadata is generally described as data about data. Image metadata, specifically, is information embedded into an image that includes details about the image itself as well as information about how it was created.
In addition to visual data, image files actually contain several different formats for metadata, which in turn store different types of information:
- EXIF (Exchangeable image file format) – Information generated automatically by the device that captured the image (e.g. cameras and smartphones), such as date and time, and camera settings (make and model, image orientation, aperture, shutter speed, focal length, metering mode, ISO speed, etc.). This specification also helps cameras use formats that can be exchanged between devices; for example, ensuring an iPhone photo appears correctly on a Samsung device.
- IPTC (International Press Telecommunications Council) – A format originally adopted by old media news agencies to streamline information, and since implemented by new media to do much the same thing. The IPTC section of an image usually contains information about the image, such as title, description, keywords, photographer’s information, copyright restrictions, and more.
- XMP (Extensible Metadata Platform) – An XML-based format created by Adobe that can carry the same information as the IPTC format, but allows for additional information to be stored within the image.
- 8BIM – A signature used by Photoshop to mark image resource blocks embedded in a file, which store some graphics-related data.
- ICC (International Color Consortium) – In color management, an ICC profile is a set of data that characterizes a color input or output device, or a color space, according to standards set by the ICC. Imaging programs like GIMP use ICC color profiles to interpret an image’s RGB values.
Image metadata allows information to be transported together with an image, in a way that can be understood by software, hardware, and humans, regardless of the format.
While some metadata is generated by manufacturers and devices that capture images, other metadata may be added manually and edited using software like GIMP and Photoshop.
A lot of information can be contained in metadata: the EXIF format alone has more than 460 metadata tags.
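To make the formats above a little more concrete, here is a minimal sketch of how you might spot them inside a JPEG file using only the Python standard library. It relies on the way JPEG files store metadata in marker segments (EXIF and XMP in APP1, ICC profiles in APP2, Photoshop/IPTC data in APP13, comments in COM). The function names are made up for illustration, and the parser is deliberately simplified: it stops at the first scan and does not handle every edge case of the JPEG format.

```python
import struct

# Payload prefixes that identify common metadata blocks inside JPEG segments.
SIGNATURES = [
    (0xE1, b"Exif\x00\x00", "EXIF"),
    (0xE1, b"http://ns.adobe.com/xap/1.0/\x00", "XMP"),
    (0xE2, b"ICC_PROFILE\x00", "ICC"),
    (0xED, b"Photoshop 3.0\x00", "Photoshop (8BIM/IPTC)"),
]


def iter_segments(data):
    """Yield (marker, payload) for each segment before the compressed image data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # start of scan or end of image: stop here
            return
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length  # the length field counts itself but not the marker


def list_metadata(data):
    """Return a list of (format name, payload size in bytes) found in the file."""
    found = []
    for marker, payload in iter_segments(data):
        name = "Comment" if marker == 0xFE else None
        for sig_marker, prefix, label in SIGNATURES:
            if marker == sig_marker and payload.startswith(prefix):
                name = label
        if name:
            found.append((name, len(payload)))
    return found
```

You could try it on a file of your own (the path here is hypothetical) with `list_metadata(open("photo.jpg", "rb").read())`. It only tells you which metadata blocks are present and how big they are; decoding the individual tags takes a dedicated library.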
Viewing Image Metadata
There are many different ways to view image metadata.
If you’re using OS X, you can simply right-click an image and click “Get Info” to see all kinds of details, such as its creation date, the make and model of the camera it was taken on, its color profile, and exposure.
Alternatively, and also on OS X, you could open the image in Preview and go to Tools > Show Inspector to see more information about the image.
If you’re using Windows or want to view other metadata contained in an image, there are free online tools such as Get-Metadata.com, a free online EXIF data viewer, and Pic2Map.com, an online photo location viewer that uses EXIF GPS coordinate data to create a map view of your photos.
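If you're curious how a map viewer gets from EXIF GPS data to a pin on a map: EXIF stores latitude and longitude as three rational numbers (degrees, minutes, seconds) plus a reference letter (N/S or E/W), and mapping tools want signed decimal degrees. The sketch below shows that conversion. The function name and the sample coordinates are invented for illustration, and it assumes the values have already been read out of the file as (numerator, denominator) pairs.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style degrees/minutes/seconds rationals to decimal degrees.

    dms is three (numerator, denominator) pairs; ref is "N", "S", "E" or "W".
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal notation.
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Made-up coordinates, in the shape EXIF uses for GPSLatitude / GPSLongitude.
latitude = dms_to_decimal(((36, 1), (50, 1), (5454, 100)), "S")
longitude = dms_to_decimal(((174, 1), (45, 1), (4860, 100)), "E")
print(latitude, longitude)
```

A few dozen bytes of rationals like these are all it takes to place a photo to within a few meters.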
It’s pretty cool seeing the location information for my photo displayed as a map, along with the time and date the image was taken. But really… who cares?
How Image Metadata Impacts Performance
There is a lot of image metadata floating around on the web. As Dexecure’s research into the impact of metadata on image performance revealed, metadata makes up around 16% of the file size of the images that carry it.
But the thing is, most people don’t even know they’re sharing image metadata—and users generally don’t know they’re downloading it.
Dexecure wanted to find out exactly how much unnecessary metadata is contained in images online, so they used the data from the HTTP Archive along with help from BigQuery to generate a list of URLs of valid JPEG images. Crawling the archive from August 1, 2016, provided 4.3 million images from the top 500,000 websites, totaling 195 GB. They downloaded all of these images and analyzed the associated metadata.
They found that 38.9% of the images had some metadata, and of those images, metadata accounted for 15.8% of their total file size. As Dexecure highlights:
“Let that sink in—if each of these top sites were just visited once, nearly 13 GB of data could have been saved on the internet if these websites were handling the metadata properly!”
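To get a feel for this kind of measurement on your own images, here is a rough sketch that estimates what share of a JPEG's bytes sits in metadata segments. It is not Dexecure's method, which the research summary above doesn't spell out. The assumption here is that "metadata" means every APP1 to APP15 segment plus comment segments, and that the APP0 (JFIF) header doesn't count. The function name is hypothetical.

```python
import struct


def metadata_share(data):
    """Return the fraction of the file's bytes stored in metadata segments.

    Assumption: APP1..APP15 (0xE1-0xEF) and COM (0xFE) count as metadata.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    metadata_bytes = 0
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # image data starts (or file ends): stop
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if 0xE1 <= marker <= 0xEF or marker == 0xFE:
            metadata_bytes += 2 + length  # marker bytes plus the segment itself
        pos += 2 + length
    return metadata_bytes / len(data)
```

Running this over a folder of images and comparing the result with the 15.8% figure would show whether your site is better or worse than the sample. Bear in mind that this count includes ICC color profiles, which you may want to keep.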
So what kind of image metadata are we talking about here? The research found it was mostly information about the creation of the image, such as the time and date the photo was taken, camera settings, etc.
These were the 50 most common attributes stored as EXIF data:
Obviously, most of this information is useless to site visitors and isn’t required by browsers to render images anyway. Unless you’re a photographer, or work for a media outlet or some other organization that requires information to be retained for copyright or other reasons, there’s simply no need to retain image metadata.
So where possible, it’s recommended you strip this information when you know you—and your users—don’t need it.
Security and Privacy Concerns Around Image Metadata
Having unnecessarily large images on your site that are potentially slowing down your page speed shouldn’t be your only concern—metadata also comes with security implications.
Back in 2016, Apple forgot to scrub the EXIF data from one of its default desktop wallpapers, leaving behind comments from Apple employees about how it was shot and edited. The slip-up was discovered by a New Zealand photographer who shared the news on Reddit.
While there are 49 photos (of 51 total wallpaper options) in El Capitan, only the image below still has its EXIF data intact, with comments like, “Please darken some of the stars that are a bit smaller and darker, so there is a little more difference in the starfield.”
While the comments, in this instance, were harmless, imagine if the Apple employees had left offensive remarks or even shared company secrets?
As well as text information, metadata can include a thumbnail of the image in question—information that American television host Catherine Schwartz no doubt wishes she knew back in 2003. After cropping and posting images of herself on her personal blog, it was soon discovered that metadata in the images included thumbnails of the original photos, which showed Schwartz topless.
Then there’s location information. One of the most prominent examples of image metadata being used to track a person’s location is the arrest of technologist John McAfee in Guatemala in 2012.
While on the run from criminal prosecution for the alleged murder of his neighbor, McAfee was interviewed by Vice for a story that bragged about the fact they were documenting his life on the run. What the publication didn’t realize was that the accompanying photo—taken on an iPhone—included a geotag that authorities used to catch and arrest McAfee.
While the McAfee case is an extreme example, geotags present an obvious privacy concern, which is why Facebook, for example, typically removes metadata from uploaded images.
According to Facebook, information including GPS data is automatically removed from photos uploaded onto the platform to protect people “from accidentally sharing private information, such as their location.”
If you’re concerned about the security of your website and company, or just mindful of privacy, be sure to check your images don’t contain metadata that could potentially open the door for liability issues.
SEO and Image Metadata
Does Google use image metadata from images as a ranking factor? The short answer: ¯\_(ツ)_/¯
Back in 2014, Matt Cutts, the former head of search quality at Google, talked about metadata in images and basically said Google “reserves the right to use it,” but didn’t confirm whether the search giant does or doesn’t use it.
Here’s the video:
John Mueller, a webmaster trends analyst at Google, was asked about this on Twitter last year and said:
So the jury’s out on whether image metadata is a ranking factor. As Cutts says in the video, if your images have metadata and the information can help other users learn more about an image or the camera that took it, by all means, keep the metadata. But only if you want to.
How to Remove Image Metadata
There are lots of different ways to remove metadata from your images:
Removing Image Metadata on a Mac
There’s no straightforward way to delete metadata from an image on a Mac, so you’ll need to download an app. I recommend ImageOptim. It’s easy to use—just drag the images you want scrubbed into ImageOptim and it will get to work. When it’s done, just drag the images back to your desktop.
Removing Image Metadata on Windows
Windows Explorer makes it easy to delete EXIF information from a single photo or a batch of photos in one go. Just follow these steps:
- Select all the files you want to delete metadata from.
- Right-click anywhere within the selected files and choose “Properties.”
- Click the “Details” tab.
- At the bottom of the “Details” tab, you’ll see a link titled “Remove Properties and Personal Information.” Click this link.
- Windows will ask whether you want to make a copy of the photo with this information removed, or if you want to remove the information from the original. Choose the option you prefer and click “OK.”
Removing Image Metadata with Photoshop
To remove image metadata in Photoshop, use the “Save for Web” option and in the drop-down next to “Metadata” select “None.”
Removing Image Metadata with a Compression Plugin
Image compression plugins for WordPress can help reduce the size of your images by removing unnecessary metadata. Imagify, for example, removes EXIF data by default, but gives you the option to keep it should you want to.
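If you'd rather see what such tools do under the hood, here is a minimal sketch of metadata stripping for JPEG files using only the Python standard library. It is not how ImageOptim, Photoshop, or Imagify are implemented; it simply copies the file while dropping the segments that hold metadata. The function name and the `keep_icc` option are invented for this example, and the choice to always keep APP0 (JFIF header) and APP14 (Adobe color information) is an assumption made so that colors still render correctly.

```python
import struct

# Segments kept regardless: APP0 (JFIF header) and APP14 (Adobe color info).
KEEP_ALWAYS = {0xE0, 0xEE}


def strip_metadata(data, keep_icc=True):
    """Return a copy of a JPEG with APPn and comment segments removed."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # image data starts (or file ends): stop
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        segment = data[pos:end]
        is_app = 0xE0 <= marker <= 0xEF
        is_icc = marker == 0xE2 and segment[4:].startswith(b"ICC_PROFILE\x00")
        drop = marker == 0xFE or (
            is_app and marker not in KEEP_ALWAYS and not (keep_icc and is_icc)
        )
        if not drop:
            out += segment  # quantization tables, Huffman tables, frame header...
        pos = end
    out += data[pos:]  # copy the compressed image data through untouched
    return bytes(out)
```

To use it on a file, read the bytes, call `strip_metadata`, and write the result to a new path (for example `photo-clean.jpg`, a hypothetical name) so the original stays intact. One caveat: dropping EXIF also drops the orientation tag, so a photo that relied on it may display rotated unless the pixels are rotated first.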
Stripping metadata from images is one of the easiest ways to reduce page weight and deliver faster, more streamlined content, while also improving the security and privacy of your site.
There are lots of tools to help you view and remove image metadata, so pick the one that works for you. If you want to automate the removal of EXIF data from your images, image optimization plugins like Imagify can handle it for you.