We had snapshots enabled on RDS, which meant that there was supposedly a full snapshot every night and then logs of changes every 5 minutes after that. The idea behind this was that we could restore to within 5 minutes of when the database stopped logging. However, the snapshot and logs were inaccessible since EBS/EC2/RDS were down.
This has always scared the shit out of me.
The idea is that the backup system is part of what I’m paying for in a hosting provider. But then the hosting provider goes down in a bad way, and not only is my app down, but so is my backup. My approach is to be exceedingly paranoid about these providers. One example of this: I always copy important data not just to Amazon S3 for posterity, but also to a plain old NAS sitting in a physical office over which I have dominion. Yes, S3 has ridiculous durability. But S3 can (and will) go down. Or someone will one day compromise Amazon’s systems and figure out how to delete my S3 bucket and all the replicas they have of it.
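For what it's worth, here is a rough sketch of the "more than one place" habit in Python. It uses only the standard library and treats every destination as a plain directory (for example a mounted NAS share, plus a staging folder that some separate job pushes to S3). The function names and that directory layout are assumptions for illustration, not anything provider-specific. The point is that each copy is verified on its own, and one destination failing does not stop the others.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_to_all(source: Path, destinations: List[Path]) -> Dict[Path, bool]:
    """Copy `source` into every destination directory and verify each copy.

    Returns a mapping of destination directory -> True if the copy exists
    and its checksum matches the source, False otherwise. A failure at one
    destination is recorded and the remaining destinations are still tried.
    """
    expected = sha256_of(source)
    results: Dict[Path, bool] = {}
    for dest_dir in destinations:
        try:
            dest_dir.mkdir(parents=True, exist_ok=True)
            copied = Path(shutil.copy2(source, dest_dir / source.name))
            results[dest_dir] = sha256_of(copied) == expected
        except OSError:
            # Destination unreachable or unwritable: note it and move on.
            results[dest_dir] = False
    return results
```

You would call something like `backup_to_all(Path("dump.sql"), [Path("/mnt/nas/backups"), Path("/var/backups/s3-staging")])` (both paths hypothetical) and alert if any value in the result is False.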
Honestly, I don’t think the cloud has simplified any of this stuff. I think it has actually made it worse, because CTOs think they can outsource their architectural decisions to these service providers. As my Dad once said to me: “It’s not what you don’t know that kills you… it’s what you think you know that just ain’t so.”