Amazon's Simple Storage Service (S3) is a great way to safely store loads of data in the cloud. It's highly available, simple to use, and provides good data durability by automatically copying your data across multiple regions and/or zones. With over 80 billion objects stored (at last published count), I'm clearly not alone in thinking it's a good thing.
The only problem I've had with S3 over the years is the queasy feeling I get when I think about some nefarious individual getting hold of my AWS AccessKey/SecretKey. Since all S3 capabilities are accessed via a REST API, and since that credential pair is used to authenticate all requests with S3, a bad guy/girl with my credentials (or a temporarily stupid version of me) could potentially delete all of the content I have stored in S3. That represents the "Worst Case Scenario" of S3 usage, and I've spent a considerable amount of time and effort trying to find ways to mitigate this risk.
Using multiple AWS accounts can help. The Import/Export feature is another way to mitigate your exposure. But what I've always wanted was a WORM (Write Once Read Many) bucket. Well, not always, but at least since May 6, 2007. That would give me confidence that the data I store in S3 could not be accidentally or maliciously deleted. This kind of feature would also provide some interesting functionality for certain types of compliance and regulatory solutions.
Starting today, AWS has released a couple of really useful new features in S3: Versioning and MFADelete. Together, these features provide just about everything I wanted when I asked for a WORM bucket. So, how do they work?
Versioning allows you to have multiple copies of the same object. Each version has a unique version ID and the versions are kept in ascending order by the date the version was created. Each bucket can be configured to either enable or disable versioning (only by the bucket owner) and the basic behavior is shown below in the table. The behavior of a Versioned bucket differs based on whether it is being accessed by a Version-Aware (VA) client or NonVersion-Aware (NVA) client.
| Operation | Unversioned Bucket | Versioned Bucket - NVA Client | Versioned Bucket - VA Client |
|---|---|---|---|
| GET | Retrieves the object or a 404 if the object is not found | Retrieves the latest version or a 404 if a Delete Marker is found | Retrieves the version specified by provided version ID |
| PUT | Stores the content in the bucket, overwriting any existing content | Stores content as new version | Stores content as new version |
| DELETE | Irrevocably deletes the content | Stores a Delete Marker as latest version of object | Permanently deletes version specified by provided version ID |
The above table is just a summary. You should see the S3 documentation for full details, but even this summary clearly shows the benefits of versioning. If I enable versioning on a bucket, the chance of accidentally deleting content is greatly reduced. I would have to be using a version-aware delete tool and explicitly referencing individual version IDs to permanently delete them.
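To make the table a bit more concrete, here is a toy in-memory model of a versioning-enabled bucket. It is only a sketch of the behavior summarized above, not the S3 API: the class name `VersionedBucket`, the `v1`, `v2`, ... version IDs, and the use of `KeyError` to stand in for a 404 are all my own inventions for illustration (real version IDs are opaque strings assigned by S3).

```python
import itertools


class VersionedBucket:
    """Toy model of a versioning-enabled bucket (hypothetical, not the S3 API)."""

    DELETE_MARKER = object()  # sentinel standing in for an S3 Delete Marker

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, content), oldest first
        self._ids = itertools.count(1)

    def put(self, key, content):
        # Every PUT stores a new version; nothing is overwritten.
        version_id = "v%d" % next(self._ids)
        self._versions.setdefault(key, []).append((version_id, content))
        return version_id

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            # NVA client: latest version, or a "404" if the key is missing
            # or the latest version is a Delete Marker.
            if not versions or versions[-1][1] is self.DELETE_MARKER:
                raise KeyError(key)
            return versions[-1][1]
        # VA client: fetch the exact version requested.
        for vid, content in versions:
            if vid == version_id and content is not self.DELETE_MARKER:
                return content
        raise KeyError((key, version_id))

    def delete(self, key, version_id=None):
        if version_id is None:
            # NVA client: nothing is removed, a Delete Marker becomes the latest version.
            return self.put(key, self.DELETE_MARKER)
        # VA client: permanently remove just the specified version.
        remaining = [v for v in self._versions.get(key, []) if v[0] != version_id]
        self._versions[key] = remaining
        return version_id
```

With this model, a plain `delete("k")` hides the object from a plain `get("k")`, but older versions remain reachable by version ID, and deleting the marker's own version ID brings the previous version back as the latest.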
So, accidental deletion of content is less of a risk with versioning but how about the other risk? If a bad guy/girl gets my AccessKey/SecretKey, they can still delete all of my content as long as they know how to use the versioning feature of S3. To address this threat, S3 has implemented a new feature called MFADelete.
MFADelete uses the Multi-Factor Authentication device you are already using to protect AWS Portal and Console access. What? You aren't using the MFA device? Well, you should go sign up for one right now. It's well worth the money, especially if you are storing important content in S3.
Like Versioning, MFADelete can be enabled on a bucket-by-bucket basis and only by the owner of the bucket. But, rather than just trusting that the person with the AccessKey/SecretKey is the owner, MFADelete uses the MFA device to provide an additional factor of authentication. To enable MFADelete, you send a special PUT request to S3 with an XML body that looks like this:
<?xml version="1.0" encoding="UTF-8"?>
In addition to this XML body, you also need to send a special HTTP header in the request, like this:
x-amz-mfa: <serial number of MFA device> <token from MFA device>
Once this request has been sent, all delete operations on the bucket and all requests to change the MFADelete status for the bucket will also require the special HTTP header with the MFA information. That means that even if the bad guy/girl gets your AccessKey/SecretKey combo, they still won't be able to delete anything from your MFADelete-enabled bucket unless they also have the MFA device.
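As a rough sketch of the two pieces of that request, the helpers below build the XML body and the `x-amz-mfa` header. The function names are hypothetical, and the `VersioningConfiguration`, `Status`, and `MfaDelete` element names are what I understand the S3 documentation to specify for the bucket's `versioning` subresource, so check them against the docs before relying on this. Signing and actually sending the PUT are left out.

```python
# Namespace used by S3 XML documents (assumption: taken from the S3 docs).
S3_XMLNS = "http://s3.amazonaws.com/doc/2006-03-01/"


def versioning_body(versioning=True, mfa_delete=True):
    """Build the XML body for the PUT that configures versioning and MFADelete."""
    status = "Enabled" if versioning else "Suspended"
    mfa = "Enabled" if mfa_delete else "Disabled"
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<VersioningConfiguration xmlns="%s">\n'
        "  <Status>%s</Status>\n"
        "  <MfaDelete>%s</MfaDelete>\n"
        "</VersioningConfiguration>" % (S3_XMLNS, status, mfa)
    )


def mfa_header(serial_number, token):
    """Build the x-amz-mfa header: device serial number, a space, current token."""
    return {"x-amz-mfa": "%s %s" % (serial_number.strip(), token.strip())}
```

The same `mfa_header` result would accompany any later delete of a specific version, since those requests need the header too.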
It's not exactly the WORM bucket I was originally hoping for but it's a huge improvement and greatly reduces the risk of accidental or malicious deletion of data from S3. I got my pony!
The code in the boto subversion repo has already been updated to work with the new Versioning and MFADelete features. A new release will be out in the near future. I have included a link below to a unit test script that shows most of the basic operations and should give you a good start on incorporating these great new features into your application. The script prompts you for the serial number of your MFA device once and then prompts for a new MFA code each time one is required. You can only perform one operation with each code, so you will have to wait for the device to cycle to the next code between operations.
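The prompting pattern described above could look something like the following. This is not the unit test script itself, just a minimal sketch under my own assumptions: the class name `MfaTokenSource` is hypothetical, and returning a `(serial, code)` pair is simply a convenient shape for handing both values to whatever builds the request. It asks for the serial number once, asks for a fresh code on every call, and refuses a code that matches the previous one, since each code is good for only one operation.

```python
class MfaTokenSource:
    """Prompt for the MFA serial number once and a fresh code per operation."""

    def __init__(self, prompt=input):
        # `prompt` is injectable so the class can be exercised without a terminal.
        self._prompt = prompt
        self._serial = None
        self._last_code = None

    def next_token(self):
        if self._serial is None:
            self._serial = self._prompt("MFA serial number: ").strip()
        while True:
            code = self._prompt("New MFA code: ").strip()
            # Reject empty input and a repeat of the previous code; the device
            # has to cycle to its next code before another operation.
            if code and code != self._last_code:
                break
        self._last_code = code
        return (self._serial, code)
```

Calling `next_token()` before each MFA-protected operation keeps the "one code per operation" rule in one place instead of scattering prompts through the script.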