AWS S3 Glacier Instant Retrieval for additional storage cost savings
In fall 2021, AWS introduced an extra S3 storage class - Glacier Instant Retrieval. This brings us to a total of eight S3 storage classes, each offering different upsides:

- Standard
- Intelligent-Tiering
- Standard-Infrequent Access (Standard-IA)
- One Zone-Infrequent Access (One Zone-IA)
- Glacier Instant Retrieval
- Glacier Flexible Retrieval
- Glacier Deep Archive
- S3 on Outposts
S3 Glacier Instant Retrieval
Let's say you have an object that requires real-time access. Depending on how often you access it, you would choose one of the Standard storage classes - usually Standard-IA.
Glacier storage classes are used for archival purposes. If you don't want to delete an object, but still want to keep the option of accessing it (though not in real time), you would use Glacier.
Glacier Instant Retrieval is the missing link between cheap storage and real-time object access.
Standard classes offer:

- real-time access
- cheap data retrieval

Glacier classes offer the opposite:

- no real-time access
- expensive data retrieval
Glacier Instant Retrieval offers the best of both worlds:
- cheap storage ($0.004/GB - 3x cheaper than Standard-IA)
- affordable data retrieval ($0.03/GB - 3x more expensive than Standard-IA)
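To see what those prices mean in practice, here's a quick back-of-the-envelope comparison. The per-GB figures match the article's numbers (US list prices at the time); the 10 TB / 100 GB workload is just a made-up example, so plug in your own.

```python
# Rough per-month cost comparison: Standard-IA vs Glacier Instant Retrieval.
# Check the current AWS pricing page before relying on these figures.

IA_STORAGE = 0.0125   # $/GB-month, S3 Standard-IA
GIR_STORAGE = 0.004   # $/GB-month, S3 Glacier Instant Retrieval
IA_RETRIEVAL = 0.01   # $/GB retrieved, Standard-IA
GIR_RETRIEVAL = 0.03  # $/GB retrieved, Glacier Instant Retrieval

def monthly_cost(storage_price, retrieval_price, gb_stored, gb_retrieved):
    """Storage plus retrieval cost for one month."""
    return storage_price * gb_stored + retrieval_price * gb_retrieved

# Hypothetical workload: 10 TB stored, 100 GB retrieved per month.
gb_stored, gb_retrieved = 10_000, 100
ia = monthly_cost(IA_STORAGE, IA_RETRIEVAL, gb_stored, gb_retrieved)
gir = monthly_cost(GIR_STORAGE, GIR_RETRIEVAL, gb_stored, gb_retrieved)
print(f"Standard-IA: ${ia:.2f}, Glacier IR: ${gir:.2f}")  # $126.00 vs $43.00
```

For rarely accessed data, the storage saving dwarfs the pricier retrievals; the balance only tips back toward Standard-IA when you retrieve a large share of the stored data every month.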
Adjusting lifecycle rules to Glacier Instant Retrieval
In November 2021, I wrote about optimizing S3 storage costs, where we cut $40k annually.
In bucket analytics, you can see a couple of important parameters.
By applying the same principle, but now based on December data, I've come up with the following calculation.
In this case, I am calculating the difference between Glacier Instant Retrieval and Standard-IA, by comparing storage costs and data retrieval costs.
As expected, storage costs dropped significantly, while data retrieval costs remained negligible.
The conclusion is obvious:
By moving objects to Glacier Instant Retrieval instead of Standard-Infrequent Access, we can save 60% more on top of the cost reduction we already made.
Let's do it!!
A spreadsheet with real-world data is available here. With minor adjustments, you can use it to do your own calculations.
S3 IA vs S3 GIR: Cost analysis spreadsheet
Why does it work in the case above?
We have a small number of objects (around 100k) that are long-lived and rarely accessed.
All our objects are huge - a couple of dozen GBs each.
We access these objects 2-3 times over their lifetime before they become candidates for Deep Archive transfer.
When does it not make sense to switch to Instant Retrieval?
One downside of Glacier storage classes is the minimum storage duration: 90 days for Instant and Flexible Retrieval, 180 days for Deep Archive. If you delete objects earlier, a pro-rated charge for the remaining days applies anyway.
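As a sketch, here is how that pro-rated charge works out, assuming Glacier Instant Retrieval's 90-day minimum storage duration and approximating AWS's per-GB-month billing with 30-day months; the 1 TB / 30-day example is hypothetical.

```python
# Pro-rated early-delete charge for Glacier Instant Retrieval (sketch).
# Assumes a 90-day minimum storage duration and 30-day billing months.

GIR_STORAGE = 0.004  # $/GB-month

def early_delete_charge(gb, days_stored, minimum_days=90):
    """Charge for the unused remainder of the minimum storage duration."""
    remaining_days = max(0, minimum_days - days_stored)
    return GIR_STORAGE * gb * remaining_days / 30

# Deleting 1 TB after only 30 days still bills the remaining 60 days:
print(f"${early_delete_charge(1_000, 30):.2f}")  # $8.00
```

So early deletion doesn't make the move catastrophic, but churn-heavy buckets give up part of the storage savings.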
If your storage is huge and made up of small files (a couple of hundred kB each), you'll have millions of files in the bucket. Pay attention to the lifecycle transition cost, as moving objects from one storage class to another might cost more than the savings you'd make.
- Moving 1,000 objects to Glacier Instant Retrieval costs $0.02.
- Moving 15 million objects to Glacier Deep Archive costs $750.
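The two bullet points above boil down to a one-line calculation. The per-1,000-request prices below are derived from those numbers; verify them against the current AWS pricing page before budgeting.

```python
# Lifecycle transition cost, back-of-the-envelope.
# Prices are $ per 1,000 lifecycle transition requests, per the figures above.

TRANSITION_PRICE = {
    "GLACIER_IR": 0.02,
    "DEEP_ARCHIVE": 0.05,
}

def transition_cost(num_objects, storage_class):
    """Total one-off cost of transitioning num_objects objects."""
    return num_objects / 1_000 * TRANSITION_PRICE[storage_class]

print(f"${transition_cost(1_000, 'GLACIER_IR'):.2f}")        # $0.02
print(f"${transition_cost(15_000_000, 'DEEP_ARCHIVE'):.2f}")  # $750.00
```

With few, large objects the transition cost is noise; with millions of tiny objects it can eat the savings.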
If you're accessing objects at least once a week or so (it depends on the object size), it most probably makes sense to leave them in Standard-IA.
It's impressive that AWS keeps introducing these storage options. We have reduced our S3 bill from $3,400 a month to $300 a month.
So if you have large storage and you want to keep those files, it's worth paying attention to the capabilities of the different storage classes, as we did.
If you've liked this post, make sure to subscribe down below. I mostly write about engineering management, software development, and cloud services.