database greenhorn

PoisonedPrisonPanda@discuss.tchncs.de · edit-2 2 months ago

database greenhorn

PoisonedPrisonPanda@discuss.tchncs.de · 2 months ago

first of all many thanks for the bullets. Good to have some guidance on where to start.

2nd level cache shared between services

I have read about this related to how FB does it. In general this means that fetching from the DB and keep it in memory to work with right? So we assume that the cached data is outdated to some extend?

faster storage/cpu/ram faster storage/cpu/ram faster storage/cpu/ram

I was able to convince management to put money into a new server (SSD thank god). So thank you for your emphasizes. We are also migrating to PostgreSQL from SQL server, and refactor the whole approach and design in general.

generate indexes

How would indeces help me when I want to ensure that no duplicate row is added? Is this some sort of internal SQL constraint or what is the difference to compare a certain list of rows with an existing table (lets say column id)?

Dunstabzugshaubitze@feddit.org · 2 months ago

I have read about this related to how FB does it. In general this means that fetching from the DB and keep it in memory to work with right? So we assume that the cached data is outdated to some extend?

correct, introducing caching can result in returning outdated data for awhile, which is usually not a huge deal. those caches can get tricky, but they should take pressure from your db, if you’re scenario is read heavy, which is often the case. Research existing caching solutions before running ahead and implementing something from scratch, especially if you need a cache distirbuted between multiple instances of your service. In the Java world that would be something like Infinispan, but your ecosystem might over better integration with other solutions.

I was able to convince management to put money into a new server (SSD thank god). So thank you for your emphasizes. We are also migrating to PostgreSQL from SQL server, and refactor the whole approach and design in general.

having management on board is great and the new hardware should help a lot, migrating to another RDBMS sounds scary, but probably worth it if your organisation has more expertise with it.

generate indexes

they won’t help you with your duplicates, they will help speed up your reads but could slow down writes. building a good index is not trivial, but nothing is when it comes to performance tuning a database, it’s tradeoff after tradeoff. The best way to handle identical rows of data is to not write them usually, but i don’t know your system nor its history, maybe there is or was a good reason for its current state.