Dynamo (storage system)


Dynamo is a set of techniques that, taken together, can form a highly available key-value storage system or distributed data store. It has properties of both databases and distributed hash tables (DHTs). It was created to help address scalability issues that Amazon.com's website experienced during the 2004 holiday season. By 2007, it was used in Amazon Web Services such as the Simple Storage Service (S3).

Relationship to DynamoDB

Amazon DynamoDB is "built on the principles of Dynamo" and is a hosted service within the AWS infrastructure. However, while Dynamo is based on leaderless replication, DynamoDB uses single-leader replication.
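Under leaderless replication any replica may accept a write, so two replicas can end up holding versions of a value where neither supersedes the other. Dynamo-style systems detect this with vector clocks. The following is a minimal sketch of that comparison, not Amazon's implementation; the function names and the node identifiers (`node_x`, `node_y`, `node_z`) are illustrative assumptions.

```python
from typing import Dict

# A vector clock maps a node identifier to a per-node update counter.
# The type alias and all names below are hypothetical, for illustration only.
VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if `a` has seen every update recorded in `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify the causal relationship between two versions."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    return "concurrent"  # neither saw the other's update: reconcile on read


# One write handled by node_x, then two independent updates of that version
# accepted by different replicas.
base = increment({}, "node_x")
left = increment(base, "node_y")
right = increment(base, "node_z")

print(compare(left, base))
print(compare(left, right))
```

In a single-leader design the second case does not arise in the same way, because one node orders all writes. In the leaderless case, versions classified as concurrent must both be kept until a reader or the application reconciles them.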

Principles

| Problem | Technique | Advantage |
|---|---|---|
| Dataset partitioning | Consistent hashing | Incremental, possibly linear scalability in proportion to the number of collaborating nodes. |
| Highly available writes | Vector clocks or dotted-version-vector sets, with reconciliation during reads | Version size is decoupled from update rates. |
| Handling temporary failures | Sloppy quorum and hinted handoff | Provides high availability and a durability guarantee when some of the replicas are not available. |
| Recovering from permanent failures | Anti-entropy using Merkle trees | Can be used to identify differences between replica owners and synchronize divergent replicas proactively. |
| Membership and failure detection | Gossip-based membership protocol and failure detection | Avoids having a centralized registry for storing membership and node liveness information, preserving symmetry. |
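To illustrate the first technique in the table, the following is a minimal sketch of a consistent-hashing ring, assuming each physical node is placed at several ring positions ("virtual nodes") and each key is stored on the first few distinct nodes found clockwise from its position. The class name `HashRing`, its methods, the node names, and the default parameters are hypothetical; MD5 is used here only as a convenient uniform hash, and the sketch is not Amazon's implementation.

```python
import bisect
import hashlib
from typing import Dict, List


def ring_position(key: str) -> int:
    """Map a string to a point on the ring using its 128-bit MD5 digest."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Illustrative consistent-hashing ring with virtual nodes."""

    def __init__(self, vnodes: int = 8) -> None:
        self.vnodes = vnodes              # ring positions per physical node
        self._points: List[int] = []      # sorted ring positions
        self._owner: Dict[int, str] = {}  # ring position -> physical node

    def add_node(self, node: str) -> None:
        """Place `node` at `vnodes` pseudo-random positions on the ring."""
        for i in range(self.vnodes):
            pos = ring_position(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._points, pos)

    def remove_node(self, node: str) -> None:
        """Drop every ring position owned by `node`."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def preference_list(self, key: str, n: int = 3) -> List[str]:
        """Return up to `n` distinct nodes walking clockwise from the key."""
        if not self._points:
            return []
        start = bisect.bisect_right(self._points, ring_position(key))
        nodes: List[str] = []
        for step in range(len(self._points)):
            pos = self._points[(start + step) % len(self._points)]
            node = self._owner[pos]
            if node not in nodes:
                nodes.append(node)
                if len(nodes) == n:
                    break
        return nodes


ring = HashRing()
for name in ("node_a", "node_b", "node_c", "node_d"):
    ring.add_node(name)

print(ring.preference_list("shopping-cart:42"))
```

The property that makes the scheme incremental is that adding a node only takes over keys that fall just before its new ring positions; every other key keeps its previous owner, so most data does not move when the cluster grows.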

Implementations

Amazon published the paper on Dynamo but never released its implementation. The index layer of Amazon S3 implements and extends many core features of Dynamo. Since the paper's publication, several implementations have been created based on it, and it has inspired many other NoSQL databases, such as Apache Cassandra, Project Voldemort and Riak.