Sat, Feb 29, 2020
Kafka
Kafka is a distributed event streaming platform designed for building high-throughput, fault-tolerant, and scalable data streaming applications. This article covers key design decisions in Kafka, such as how the messages for a topic are split into partitions that are assigned to brokers. It then looks at the guarantees that apply to producers, consumers, and consumer groups.
Mon, Jan 8, 2018
Partitioning
Data partitioning is the process of dividing a system's data into smaller, more manageable subsets that are distributed across multiple storage locations or nodes. This article covers several partitioning strategies: random partitioning, partitioning by hash key, partitioning by range, and a hybrid approach for skewed workloads. It also discusses strategies for rebalancing partitions, both when the number of partitions is static and when it is dynamic.
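
As a rough illustration of two of the strategies named above, the sketch below maps a key to a partition by hash and by range. The function names, the choice of MD5 as the hash, and the boundary values are assumptions made for this example only; they are not taken from the article.

```python
import bisect
import hashlib


def hash_partition(key: str, num_partitions: int) -> int:
    """Assign a key to a partition using a stable hash of the key.

    MD5 is used here only because it is deterministic across processes
    (Python's built-in hash() is salted per run); it is an illustrative
    choice, not a recommendation.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def range_partition(key: str, boundaries: list[str]) -> int:
    """Assign a key to a partition using sorted split points.

    With boundaries [b0, b1], partition 0 holds keys < b0,
    partition 1 holds b0 <= key < b1, and partition 2 holds keys >= b1.
    """
    return bisect.bisect_right(boundaries, key)


if __name__ == "__main__":
    boundaries = ["h", "q"]  # hypothetical split points
    for key in ["apple", "kafka", "zebra"]:
        print(key, hash_partition(key, 4), range_partition(key, boundaries))
```

Hash partitioning tends to spread keys evenly but scatters adjacent keys, while range partitioning keeps adjacent keys together at the cost of possible hot spots when the key distribution is skewed.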