
Scalability and Performance: Split Your Data and Simplify

Introduction

Here are a few guidelines for supporting scalability and performance in your systems.
1. Simplify - Simplify your code and design. You gain a system that is easier to understand and easier to scale, and your life gets simpler too: the more complex a system is, the harder it is to scale out. Of course, if it is not possible to simplify, don't; we are sane people. But many times we only think simplifying is impossible when it is in fact possible, so do yourself a favour and put some effort into it.
2. X Axis (duplicate data) - Create multiple read-only databases, or clones, of your data to scale your reads. Read queries can then be spread across the copies, which puts less strain on any single server.
3. Y Axis (split by business function) - Split your data the way you split microservices, at the database level and not only at the service level: different roles, different databases. Do you sell underwear and also run a line of business in atomic energy manufacturing? Consider having two or more databases. You could also keep a single database, as long as the data for each line of business is clearly separated.
4. Z Axis (split similar data) - If you have multiple customers, split on customer ID. You can put several smaller customers on the same shard and each larger one on a shard of its own. Be sure to split on a key that gives an even split. Location, for example, can be a poor choice: if 80% of your customers are in one place, the split is badly unbalanced.
5. The three data centers rule - With only 2 data centers, each one must be able to serve 100% of the load alone in case the other goes down, so you need 200% capacity in total. With 3 data centers, each one only needs to serve 50%, so you need 150% in total, and if one goes down the remaining two still serve 100%.
6. Remember storage alternatives - You have different kinds of storage to choose from: file storage such as `ceph` (and don't forget the plain file system), NoSQL stores, wide-column stores such as Cassandra, and relational databases. Wide-column stores usually provide automatic row sharding and asynchronous replication with eventual consistency, while splitting by column requires more manual intervention.
7. Consistency - Stronger consistency has a cost. If you raise the consistency level on a NoSQL store, for example, an operation such as `getSomething` may have to contact all nodes to make sure it returns the latest and greatest version.
8. Firewalls are like locks - You lock your front door, but you don't lock the internal doors, right? A credit card request should go through the lock, but an image request does not need to. Don't overuse your firewall; your system is complex enough without it.
9. Really need a transaction? - When you move money from one customer to another, do you really need a transaction? Consider all the options. Usually, once you start considering event sourcing, you see that you can make do without transactions.
10. Don't read back to validate your write - Have you just written something to disk, cache or DB? Don't reread it just to validate it; your servers have more useful things to do.
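
To make the X Axis guideline concrete, here is a minimal sketch of routing writes to a primary and rotating reads across read-only copies. The `ReplicaRouter` class and the connection names are hypothetical, chosen only for illustration; a real setup would hold database connections rather than strings and would have to account for replication lag.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and spread reads over read-only replicas."""

    def __init__(self, primary, replicas):
        if not replicas:
            raise ValueError("at least one read replica is required")
        self.primary = primary
        # Round-robin iterator over the replicas (hypothetical names, not real connections).
        self._replicas = itertools.cycle(replicas)

    def target_for(self, is_write):
        """Return the node that should serve this query."""
        if is_write:
            return self.primary
        return next(self._replicas)


router = ReplicaRouter("primary-db", ["replica-1", "replica-2"])
write_target = router.target_for(is_write=True)
read_targets = [router.target_for(is_write=False) for _ in range(4)]
```

Round-robin is only one possible policy; it is used here because it is the simplest way to show reads being shared between the copies.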
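
The Z Axis guideline (split on customer ID, small customers sharing a shard, large ones getting their own) could be sketched roughly as follows. The shard names, the number of shared shards and the `DEDICATED_SHARDS` table are assumptions made up for this example, not something the guideline prescribes.

```python
import hashlib

# Assumed values for illustration only.
NUM_SHARED_SHARDS = 4
DEDICATED_SHARDS = {"big-retailer": "shard-dedicated-1"}  # hypothetical large customer


def shard_for(customer_id):
    """Pick a shard for a customer: dedicated if listed, otherwise hashed."""
    if customer_id in DEDICATED_SHARDS:
        return DEDICATED_SHARDS[customer_id]
    # A stable hash of the ID spreads small customers evenly,
    # unlike a skewed attribute such as location.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % NUM_SHARED_SHARDS
    return f"shard-{index}"


small_customer_shard = shard_for("customer-42")
large_customer_shard = shard_for("big-retailer")
```

Note that a plain modulo like this reshuffles most customers when the number of shards changes; schemes such as consistent hashing exist to limit that, but they are outside the scope of this sketch.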
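
The arithmetic behind the three data centers rule can be checked with a few lines. This is a small sketch assuming equally sized data centers and a load that must be fully served after one of them fails; the function name `total_capacity_needed` is made up for this example.

```python
from fractions import Fraction


def total_capacity_needed(data_centers, failures_tolerated=1):
    """Total capacity, as a multiple of peak load, so survivors can serve 100%."""
    surviving = data_centers - failures_tolerated
    if surviving < 1:
        raise ValueError("at least one data center must survive")
    # Each data center must carry 1/surviving of the load,
    # and you pay for that capacity in every data center.
    return Fraction(data_centers, surviving)


for n in (2, 3, 4):
    print(n, f"{float(total_capacity_needed(n)):.0%}")
# prints:
# 2 200%
# 3 150%
# 4 133%
```

Going from 2 to 3 data centers saves the most; each further data center lowers the total by a smaller amount.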
