Engine
Features
- Consistent hashing
- Replication factor: how many replicas of each piece of data are kept across the cluster
- Consistency level: can be controlled per query
- Up to 2 billion cells per partition

Cassandra replication
- Replication factor = 3
- Consistency level = QUORUM
- The client talks to any node (the coordinator); that node hashes the partition key to find the replicas that own the data
- The read is sent to the replicas and the coordinator waits for responses until a quorum (2 of the 3 replicas) has answered (keyspace and consistency-level sketch below)
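A minimal cqlsh sketch of the setup above; the keyspace name is a placeholder and SimpleStrategy is used only to keep the example short (multi-datacenter clusters normally use NetworkTopologyStrategy):

```cql
-- Hypothetical keyspace keeping 3 replicas of every partition
CREATE KEYSPACE demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- cqlsh lets us set the consistency level for the session;
-- drivers typically expose it per statement instead
CONSISTENCY QUORUM;
```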

Cassandra write
- A write is acknowledged once it has been appended to the commit log (append-only) and applied to the in-memory memtable
- When the memtable becomes full it's flushed to disk as an SSTable
- Periodically, SSTables are merged together (compaction)

Cassandra read
- Check if the key is in the in-memory row cache
- Query the Bloom filter of each SSTable; if it says the key is definitely not there, skip that SSTable
- If the Bloom filter says the key may be present, check the in-memory key cache for the key's position in the SSTable
- On a miss, read the data from the SSTable, merge it with the data in the memtable, then write the key to the key cache and the merged result to the row cache (per-table caching options are sketched below)
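The key cache is on by default, while the row cache has to be enabled both globally and per table; a hedged sketch of the per-table knob (table name and numbers are placeholders):

```cql
-- Hypothetical per-table caching settings: keep all keys in the key cache
-- and the first 100 rows of each partition in the row cache
-- (the row cache also needs row_cache_size_in_mb > 0 in cassandra.yaml)
ALTER TABLE demo.events
  WITH caching = {'keys': 'ALL', 'rows_per_partition': '100'};
```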
Data modeling
Goals
- Spread data evenly around the cluster
- Minimize the number of partitions read per query
- Keep partitions manageable in size
Process
- Identify initial entities and relationships
- Key attributes (map to PK columns)
- Equality search attributes (map to the beginning of the PK)
- Inequality search attributes (map to clustering columns)
- Other attributes
- Static attributes are shared within a given partition
primary key = partition key + clustering columns
Legend:
K Partition key column
C Clustering column and its ordering (ascending or descending)
S Static column, shared by all rows in a partition

Cassandra table structure
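The legend maps directly onto a CQL definition; a generic sketch with placeholder names:

```cql
CREATE TABLE example (
  part_key    text,           -- K: partition key, decides which nodes own the partition
  cluster_col timestamp,      -- C: clustering column, orders rows inside the partition
  shared_val  text STATIC,    -- S: static column, one value shared per partition
  regular_val text,
  PRIMARY KEY ((part_key), cluster_col)   -- primary key = partition key + clustering columns
) WITH CLUSTERING ORDER BY (cluster_col DESC);
```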
Validation
- Is data evenly spread?
- 1 partition per read?
- Are writes (overwrites) possible?
- How large are the partitions? Let's assume each partition should hold at most 1M cells (see the estimate after this list)
- How much data duplication?
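A common back-of-the-envelope estimate for the partition-size check (notation is mine, not from these notes):

cells_per_partition ≈ rows_per_partition × (columns − pk_columns − static_columns) + static_columns

The result should stay below the 1M budget; the per-user limits quoted in the examples below come from dividing 1M by the number of cells each row contributes.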
Examples
Store books by ISBN
Attribute | Special |
---|---|
isbn | K |
title | |
author | |
genre | |
publisher | |
- Is data evenly spread? Yes
- 1 partition per read? Yes
- Are writes (overwrites) possible? Yes
- How large are the partitions? One row per partition, tiny
- How much data duplication? 0
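A possible CQL definition for this table (table name and column types are assumptions):

```cql
CREATE TABLE books_by_isbn (
  isbn      text,
  title     text,
  author    text,
  genre     text,
  publisher text,
  PRIMARY KEY (isbn)          -- K: one book per partition
);

-- Single-partition read
SELECT * FROM books_by_isbn WHERE isbn = '978-0321573513';
```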
Register a user uniquely identified by an email and password; we also want their full name. Users will be looked up either by email and password, or by UUID
Attribute | Special |
---|---|
email | K |
password | C |
fullname | |
uuid | |
Q1: find users by login info
Q3: find users by email (to guarantee uniqueness)
- Is data evenly spread? Yes
- 1 partition per read? Yes
- Are writes (overwrites) possible? Yes
- How large are the partitions? One partition per email, essentially one row, tiny
- How much data duplication? 0
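A possible CQL sketch for Q1 and Q3 (table name and types are assumptions; the plain-text password column only mirrors the notes):

```cql
CREATE TABLE users_by_email (
  email    text,
  password text,
  fullname text,
  uuid     uuid,
  PRIMARY KEY ((email), password)   -- K: email, C: password
);

-- Q1: find a user by login info (single-partition read)
SELECT uuid, fullname FROM users_by_email
  WHERE email = 'alice@example.com' AND password = 'hunter2';

-- Q3: check whether the email is already taken (partition-key-only read)
SELECT email FROM users_by_email WHERE email = 'alice@example.com';
```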
Attribute | Special |
---|---|
uuid | K |
fullname | |
Q2: get users by UUID
- Is data evenly spread? Yes
- 1 partition per read? Yes
- Are writes (overwrites) possible? Yes
- How large are the partitions? One row per partition, tiny
- How much data duplication? 0
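A possible CQL sketch for Q2 (names and types are assumptions):

```cql
CREATE TABLE users_by_uuid (
  uuid     uuid PRIMARY KEY,   -- K
  fullname text
);

-- Q2: get a user by UUID
SELECT fullname FROM users_by_uuid
  WHERE uuid = 123e4567-e89b-12d3-a456-426655440000;
```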
Find the books a logged-in user has read, sorted by title and author
Attribute | Special |
---|---|
uuid | K |
title | C |
author | C |
fullname | S |
isbn | |
genre | |
publisher | |
- Is data evenly spread? Yes
- 1 partition per read? Yes
- Are writes (overwrites) possible? Yes
- How large are the partitions? Up to ~200k books read per user before hitting the 1M-cell budget
- How much data duplication? 0
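A possible CQL definition (name and types are assumptions); the static full name is stored once per user partition and the clustering columns give the title/author ordering:

```cql
CREATE TABLE books_read_by_user (
  uuid      uuid,
  title     text,
  author    text,
  fullname  text STATIC,       -- S: one value per user partition
  isbn      text,
  genre     text,
  publisher text,
  PRIMARY KEY ((uuid), title, author)   -- K: uuid, C: title, author
);

-- All books read by one user, ordered by title then author
SELECT title, author, isbn FROM books_read_by_user
  WHERE uuid = 123e4567-e89b-12d3-a456-426655440000;
```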
Record every interaction of every user on the website
Attribute | Special |
---|---|
uuid | K |
time | C (desc) |
element | |
type | |
- Is data evenly spread? Yes
- 1 partition per read? Yes
- Are writes (overwrites) possible? Yes
- How large are the partitions? Up to ~333k actions per user under the 1M-cell budget; that may be too low a cap for a user's lifetime of interactions, so actions should be bucketed by time (see the next table)
- How much data duplication? 0
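A possible CQL definition for the unbucketed version (names and types are assumptions):

```cql
CREATE TABLE user_interactions (
  uuid    uuid,
  time    timestamp,
  element text,
  type    text,
  PRIMARY KEY ((uuid), time)
) WITH CLUSTERING ORDER BY (time DESC);   -- newest interactions first
```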
Attribute | Special |
---|---|
uuid | K |
month | K |
time | C (desc) |
element | |
type | |
- Is data evenly spread? Yes
- 1 partition per read? Yes
- Are writes (overwrites) possible? Yes
- How large are the partitions? Up to ~333k actions per user per bucket; the sustained rate each bucket size can absorb before overflowing is:
  - 1-year bucket: 333k / 365 / 24 ≈ 38 actions / h
  - 1-month bucket: 333k / 30 / 24 ≈ 462 actions / h (most realistic choice)
  - 1-week bucket: 333k / 7 / 24 ≈ 1984 actions / h
- How much data duplication? 0
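A possible CQL sketch of the bucketed version (names, types and the month format are assumptions); adding the month to the partition key caps how large any single partition can grow:

```cql
CREATE TABLE user_interactions_by_month (
  uuid    uuid,
  month   text,                         -- e.g. '2016-03'
  time    timestamp,
  element text,
  type    text,
  PRIMARY KEY ((uuid, month), time)     -- composite partition key = (uuid, month)
) WITH CLUSTERING ORDER BY (time DESC);

-- One month of a user's activity = exactly one partition
SELECT time, element, type FROM user_interactions_by_month
  WHERE uuid = 123e4567-e89b-12d3-a456-426655440000 AND month = '2016-03';
```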