
AWS DynamoDB Part 2

Partitions and Performance

For a single partition:

- The limits apply to a single partition, and so to any single partition key
- Hard limit of around 3000 RCU (Read Capacity Units)
- Hard limit of around 1000 WCU (Write Capacity Units)
- Partition keys should be evenly distributed among partitions
- A partition that is accessed much more frequently than the others is called a "hot partition" and can cause performance issues.
- Performance also suffers when the overall number of requests you make to DynamoDB is too high
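
To make the per-partition limits concrete, here is a rough sketch that estimates the capacity units the traffic on one partition key would consume. It assumes the standard DynamoDB sizing rules (one RCU covers one strongly consistent read per second of an item up to 4 KB, an eventually consistent read costs half of that, and one WCU covers one write per second of an item up to 1 KB). The function names and the example numbers are made up for illustration.

```python
import math

# Approximate per-partition limits quoted in the notes above.
MAX_RCU_PER_PARTITION = 3000
MAX_WCU_PER_PARTITION = 1000


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=False):
    """RCUs consumed per second; item size is rounded up to 4 KB blocks."""
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    units = blocks * reads_per_second
    # Eventually consistent reads cost half as much as strongly consistent ones.
    return units if strongly_consistent else units / 2


def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs consumed per second; item size is rounded up to 1 KB blocks."""
    blocks = max(1, math.ceil(item_size_bytes / 1024))
    return blocks * writes_per_second


def is_hot(rcu, wcu):
    """True when one partition key alone would exceed a per-partition limit."""
    return rcu > MAX_RCU_PER_PARTITION or wcu > MAX_WCU_PER_PARTITION


# Hypothetical workload: 6000-byte items read 1000 times/s, 1500-byte items written 600 times/s.
rcu = read_capacity_units(6000, 1000, strongly_consistent=True)
wcu = write_capacity_units(1500, 600)
print(rcu, wcu, is_hot(rcu, wcu))
```

In this made-up workload the reads fit under the limit but the writes do not, so the key would be a candidate for the sharding techniques in the next section.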

Partition Best Practices

- Use high cardinality attributes - itemId, cardId, SessionId
- Use composite attributes - ComputerId+DisplayId
- Cache - can use DAX (DynamoDB Accelerator) for reads
- Add random numbers to the partition key to make it high cardinality
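
The composite key and random suffix ideas above could look roughly like this. It is only a sketch: the `#` separator, the shard count of 10, and the function names are assumptions, not anything DynamoDB prescribes. Note the trade-off: a reader has to query every shard and merge the results.

```python
import random

SHARD_COUNT = 10  # hypothetical number of shards per logical key


def composite_key(computer_id, display_id):
    """Combine two attributes into one higher-cardinality partition key."""
    return f"{computer_id}#{display_id}"


def sharded_key(base_key, shard_count=SHARD_COUNT):
    """Key to use on write: spreads one hot key over several partition key values."""
    return f"{base_key}#{random.randrange(shard_count)}"


def all_shard_keys(base_key, shard_count=SHARD_COUNT):
    """Keys to use on read: every shard must be queried and the results merged."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


print(composite_key("pc-1", "display-2"))
print(sharded_key("2024-01-01"))
```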

Consistency Models

- DynamoDB supports both eventually consistent and strongly consistent reads.
- Reads are eventually consistent by default.

- Eventual - A read may not return the latest data: it can miss a recent write or return stale data, and repeating the same read can give different answers. Keep repeating it and you will eventually get the latest value.
- Strong - Always returns up-to-date data, but can fail with an error 500, has higher latency, is not supported on a global secondary index (GSI), and uses more RCU throughput. To use it, pass the ConsistentRead parameter to the API call, for example GetItem.
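
As a small illustration of the ConsistentRead parameter, the sketch below only builds the GetItem request parameters as a plain dictionary, so it runs without AWS access. The table name `Users` and the key attribute `UserId` are assumptions for the example. With boto3, a dictionary like this could be passed along as `client.get_item(**request)`.

```python
def get_item_request(table_name, user_id, strongly_consistent=False):
    """Build GetItem parameters; ConsistentRead defaults to False (eventual)."""
    return {
        "TableName": table_name,
        "Key": {"UserId": {"S": user_id}},  # hypothetical key attribute
        "ConsistentRead": strongly_consistent,
    }


eventual = get_item_request("Users", "user-42")
strong = get_item_request("Users", "user-42", strongly_consistent=True)
print(eventual["ConsistentRead"], strong["ConsistentRead"])
```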

Transactions

- Supports Transactions!
- All or nothing!
- ACID - Atomicity, Consistency, Isolation, Durability
- Read and write multiple items across multiple tables
- Check prerequisite conditions before writing
- Group actions (Put, Update, Delete, ConditionCheck) in a single transaction
- TransactWriteItems API operation - the whole group succeeds or fails
- TransactGetItems API operation - the read counterpart
- Cost - nothing extra to enable transactions, but transactional reads and writes consume more capacity.
- Like a two-phase commit - DynamoDB performs 2 underlying reads or writes for every item in the transaction: 1. prepare 2. commit.
- In CloudWatch we would see both of these reads/writes.
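
A TransactWriteItems request groups the actions in one list. The sketch below builds such a request as a plain dictionary and estimates its write cost under the two-writes-per-item rule described above. The tables (`Customers`, `Inventory`, `Orders`), their attributes, and the function names are all hypothetical. With boto3 the dictionary could be passed as `client.transact_write_items(**request)`.

```python
import math


def place_order_request(order_id, customer_id, product_id):
    """All three actions succeed together or the whole transaction fails."""
    return {
        "TransactItems": [
            # Prerequisite check only; nothing is written to Customers.
            {"ConditionCheck": {
                "TableName": "Customers",
                "Key": {"CustomerId": {"S": customer_id}},
                "ConditionExpression": "attribute_exists(CustomerId)",
            }},
            # Decrement stock, but only if at least one unit is left.
            {"Update": {
                "TableName": "Inventory",
                "Key": {"ProductId": {"S": product_id}},
                "UpdateExpression": "SET Stock = Stock - :one",
                "ConditionExpression": "Stock >= :one",
                "ExpressionAttributeValues": {":one": {"N": "1"}},
            }},
            # Create the order, refusing to overwrite an existing one.
            {"Put": {
                "TableName": "Orders",
                "Item": {
                    "OrderId": {"S": order_id},
                    "CustomerId": {"S": customer_id},
                    "ProductId": {"S": product_id},
                },
                "ConditionExpression": "attribute_not_exists(OrderId)",
            }},
        ]
    }


def transactional_write_units(item_size_bytes):
    """A transactional write costs double: prepare + commit per 1 KB block."""
    return 2 * max(1, math.ceil(item_size_bytes / 1024))


request = place_order_request("order-1", "customer-7", "product-3")
print(len(request["TransactItems"]), transactional_write_units(1500))
```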

Scan / Query

- Scan - reads the full table - avoid using it
- Can consume a lot of RCU
- Use ProjectionExpression to select specific attributes (columns)
- Use a parallel scan for higher performance
- Set ConsistentRead on the scan to make it strongly consistent
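
A parallel scan splits the table into segments, and each worker scans one of them. The sketch below builds one set of Scan parameters per segment; the table and attribute names are assumptions. With boto3, each worker could pass its dictionary to `client.scan(**request)` and keep paging while the response contains `LastEvaluatedKey`.

```python
def parallel_scan_requests(table_name, total_segments, attributes, strongly_consistent=False):
    """One Scan request per segment; segments are numbered from 0."""
    return [
        {
            "TableName": table_name,
            # Attribute names that are DynamoDB reserved words would also
            # need ExpressionAttributeNames; the ones used here do not.
            "ProjectionExpression": ", ".join(attributes),
            "Segment": segment,
            "TotalSegments": total_segments,
            "ConsistentRead": strongly_consistent,
        }
        for segment in range(total_segments)
    ]


requests = parallel_scan_requests("Users", 4, ["UserId", "Height"])
print([r["Segment"] for r in requests])
```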

- Query - finds items based on the primary key; the partition key value is required
- For example UserId
- Can use the optional sort key to filter items
- Results are sorted by sort key
- Set ScanIndexForward to false on a query to reverse the order of the results
- More efficient than a scan
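
Here is a sketch of what the Query parameters might look like, again as a plain dictionary. It assumes a table keyed by `UserId` with a hypothetical sort key `CreatedAt` stored as an ISO date string. With boto3 it could be passed as `client.query(**request)`.

```python
def query_request(table_name, user_id, created_after, newest_first=True, limit=20):
    """Query one partition key and narrow the result by sort key."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "UserId = :uid AND CreatedAt > :after",
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},
            ":after": {"S": created_after},
        },
        # Results come back ordered by sort key; False means descending.
        "ScanIndexForward": not newest_first,
        "Limit": limit,
    }


request = query_request("Events", "user-42", "2024-01-01")
print(request["ScanIndexForward"])
```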

Indexes

LSI - Local Secondary Index

- Alternative sort key for use in scan and query
- Up to 5 LSIs per table
- The sort key is one scalar attribute
- Can be created only at table creation time
- Cannot add, remove, or modify an LSI afterwards
- Same partition key as the table, plus a different sort key
- Different view of the data
- Table: UserId --> BirthDate, LSI: UserId --> Height
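
Because an LSI has to be declared when the table is created, it is part of the CreateTable parameters. The sketch below builds those parameters for the UserId/BirthDate/Height example; the table name, index name, attribute types, and capacity numbers are assumptions. With boto3 the dictionary could be passed as `client.create_table(**definition)`.

```python
def users_table_definition():
    """CreateTable parameters with one LSI that re-sorts each user by Height."""
    return {
        "TableName": "Users",
        "AttributeDefinitions": [
            {"AttributeName": "UserId", "AttributeType": "S"},
            {"AttributeName": "BirthDate", "AttributeType": "S"},
            {"AttributeName": "Height", "AttributeType": "N"},
        ],
        "KeySchema": [
            {"AttributeName": "UserId", "KeyType": "HASH"},      # partition key
            {"AttributeName": "BirthDate", "KeyType": "RANGE"},  # sort key
        ],
        "LocalSecondaryIndexes": [
            {
                "IndexName": "UserId-Height-index",  # hypothetical name
                "KeySchema": [
                    # Same partition key as the table, different sort key.
                    {"AttributeName": "UserId", "KeyType": "HASH"},
                    {"AttributeName": "Height", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }


definition = users_table_definition()
print(definition["LocalSecondaryIndexes"][0]["IndexName"])
```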

GSI - Global Secondary Index

- This is like a new table
- Can be created at any time.
- Different partition key than the original table.
- Different sort key.
- Table: UserId --> BirthDate, GSI: EmailId --> LoginTime
- Can specify which attributes to project into this "new table"
- Define RCU/WCU for this GSI
- Can affect performance and throttle the main table on writes, even if the table itself has enough WCU, because every new item also has to be reflected in the GSI.
- An LSI does not cause this kind of throttling.
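
Since a GSI can be added to an existing table, it goes through UpdateTable rather than CreateTable. The sketch below builds the parameters for the EmailId/LoginTime example with its own RCU/WCU, plus a Query that targets the index. It assumes a provisioned-capacity table named `Users`; the index name and the projected attribute `DisplayName` are made up. With boto3 the dictionaries could be passed as `client.update_table(**request)` and `client.query(**request)`.

```python
INDEX_NAME = "EmailId-LoginTime-index"  # hypothetical index name


def add_email_gsi_request(read_units, write_units):
    """UpdateTable parameters that create a GSI with its own capacity."""
    return {
        "TableName": "Users",
        "AttributeDefinitions": [
            {"AttributeName": "EmailId", "AttributeType": "S"},
            {"AttributeName": "LoginTime", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {"Create": {
                "IndexName": INDEX_NAME,
                "KeySchema": [
                    {"AttributeName": "EmailId", "KeyType": "HASH"},
                    {"AttributeName": "LoginTime", "KeyType": "RANGE"},
                ],
                # Only project what queries on the index actually need.
                "Projection": {
                    "ProjectionType": "INCLUDE",
                    "NonKeyAttributes": ["DisplayName"],
                },
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": read_units,
                    "WriteCapacityUnits": write_units,
                },
            }}
        ],
    }


def query_by_email_request(email_id):
    """Query the GSI instead of the table by naming it in IndexName."""
    return {
        "TableName": "Users",
        "IndexName": INDEX_NAME,
        "KeyConditionExpression": "EmailId = :email",
        "ExpressionAttributeValues": {":email": {"S": email_id}},
    }


print(query_by_email_request("a@example.com")["IndexName"])
```

Giving the index enough write capacity matters here, because an under-provisioned GSI is what causes the write throttling on the main table mentioned above.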
