How does Amazon RDS backup/snapshot actually work?

I am an Amazon RDS customer and am seeing daily write latency spikes that roughly coincide with the backup window. I also see spikes at the end of a snapshot (case in point: a snapshot takes approximately 1 hour, and write latency spikes in the final 5 minutes). I am running a multi-AZ m1.large deployment.
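To make "corresponding roughly to the backup window" concrete, here is a minimal sketch of how the correlation could be checked. It assumes the write latency datapoints have already been exported (for example from CloudWatch) as `(UTC timestamp, latency in ms)` pairs, and that the backup window is given as an `HH:MM-HH:MM` UTC string. The function names and the threshold are hypothetical and chosen only for illustration.

```python
from datetime import datetime, time


def parse_window(window):
    """Parse an 'HH:MM-HH:MM' UTC window string into (start, end) times."""
    start_s, end_s = window.split("-")
    start = datetime.strptime(start_s, "%H:%M").time()
    end = datetime.strptime(end_s, "%H:%M").time()
    return start, end


def in_window(ts, start, end):
    """Return True if the time of day of ts falls in [start, end)."""
    t = ts.time()
    if start <= end:
        return start <= t < end
    # The window wraps past midnight, e.g. 23:30-00:30.
    return t >= start or t < end


def spikes_by_window(datapoints, window, threshold_ms):
    """Split spike timestamps into those inside and outside the window.

    datapoints: iterable of (datetime, latency_ms) pairs (assumed UTC).
    threshold_ms: hypothetical cutoff above which a sample is a spike.
    """
    start, end = parse_window(window)
    inside, outside = [], []
    for ts, latency_ms in datapoints:
        if latency_ms >= threshold_ms:
            (inside if in_window(ts, start, end) else outside).append(ts)
    return inside, outside
```

Calling `spikes_by_window(datapoints, "09:00-09:30", 50.0)` returns two lists of timestamps. If nearly all spikes land in the first list, the backup window is the likely trigger. Spikes in the second list would point to something else, such as a manual snapshot finishing.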

Can anyone here explain how Amazon RDS backup actually works? I've read the Amazon RDS docs, and as far as I can tell, Amazon RDS is not behaving according to spec. Specifically, these backup/snapshot operations should be hitting my replica and therefore should not cause any downtime or performance hit, or so I thought.

I can distill my problem into six questions:

  • What is technically happening during a snapshot and a backup, and how are they different? (If you answer this question, please tell me whether you can empirically confirm your answer or are simply quoting documentation.)
  • Is a spike in write latency to be expected during the backup window on a multi-AZ deployment?
  • Is a spike in write latency to be expected at the end of a snapshot on a multi-AZ deployment?
  • Would my write latency spike be even higher if I were not multi-AZ?
  • Architecturally, would I be able to avoid these write latency spikes if I rolled my own database running on two m1.large EC2 instances?
  • Are there any configurations I can use that would avoid these write latency spikes while still hosting my DB with RDS, or am I effectively at the mercy of Amazon?

Bonus question: where and how do you host your MySQL database?

I can say that I have been generally happy with RDS except for these daily write latency issues. I love the built-in database monitoring, and it was fairly simple to set up and get going.

Thanks!

amazon RDS write latency

asked by esilver on 9 March 2011 at 17:37