How Is Data Corruption Handled In YugabyteDB Vs. PostgreSQL?
YugabyteDB reuses the PostgreSQL query layer. Doing so allows us to support more RDBMS features than other distributed SQL databases, even though our storage engine differs from PostgreSQL’s. With YugabyteDB, you get the consistency of PostgreSQL and gain greater resiliency against issues like server panics, node failures, and disk corruption. Because the database runs on multiple servers, it is designed to tolerate the failure (or corruption) of a single disk. If needed, you can take down a bad server and bring up a new one, all while serving active traffic. In fact, a large automotive manufacturer was recently able to keep serving 2 million operations per second despite a server panic.
Regarding disk corruption specifically, YugabyteDB’s use of an LSM tree with SST files marks a significant difference from PostgreSQL. SST files are written sequentially and are append-only. Random updates happen only in memory, in the MemTable, which is the first level of the LSM tree and is flushed to an SST file. SST files are then compacted into new SST files rather than modified in place, which minimizes the risk of corruption and ensures that new writes cannot corrupt previously written data.
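The write path described above can be sketched as a toy model. This is a minimal illustration only, not YugabyteDB or RocksDB code: the `ToyLSM` class, its `memtable_limit` parameter, and the use of in-memory tuples in place of on-disk SST files are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for an on-disk SST file: an immutable, key-sorted tuple.
SSTable = Tuple[Tuple[str, str], ...]


class ToyLSM:
    """Toy LSM tree: mutable MemTable in memory, immutable SSTables behind it."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable: Dict[str, str] = {}
        self.sstables: List[SSTable] = []  # ordered oldest to newest
        self.memtable_limit = memtable_limit

    def put(self, key: str, value: str) -> None:
        # Random updates touch only the in-memory MemTable.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        # A flush writes a brand-new sorted SSTable; existing ones are untouched.
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        if key in self.memtable:
            return self.memtable[key]
        for sst in reversed(self.sstables):  # newest SSTable wins
            for k, v in sst:
                if k == key:
                    return v
        return None

    def compact(self) -> None:
        # Compaction reads the old SSTables and emits a new one.
        merged: Dict[str, str] = {}
        for sst in self.sstables:  # oldest first, so newer values overwrite
            merged.update(sst)
        self.sstables = [tuple(sorted(merged.items()))]
```

In this sketch, updating a key never rewrites an earlier SSTable. The new value lands in the MemTable, then in a new SSTable, and the older version only disappears when compaction produces a replacement file.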
In contrast, PostgreSQL’s heap tables and B-Tree indexes are updated in place, with random writes that can go to any block. If corruption occurs at the storage level in this case, it might corrupt previously written data within the same block, which makes frequent whole-database checks necessary. If the corruption occurred longer ago than the backup retention period, no clean copy of the block remains and recovery becomes impossible. YugabyteDB’s design significantly mitigates this risk.
Dealing with PostgreSQL managed services
If you are using a managed PostgreSQL service, such as Amazon RDS, there is a chance that the same corrupt block is also present on the standby, because RDS replicates through storage-level synchronization rather than through the write-ahead log (WAL).
With YugabyteDB, once the SST files are written, they are not altered. As a result, new changes cannot corrupt past data. YugabyteDB further safeguards data through compactions that run independently on every node and verify checksums to confirm integrity. Replication also happens at a higher layer: the Raft group replicates logical key-value changes, not physical blocks. Another tablet peer therefore very likely has the right data, since the probability of the same corruption occurring in two different physical writes is very low, and the corrupt copy can be discarded.
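The idea of verifying checksums and falling back to a healthy peer can be sketched as follows. This is a simplified illustration under stated assumptions, not YugabyteDB’s implementation: the `Block` type and the `write_block`, `is_valid`, and `read_from_peers` helpers are hypothetical names, and CRC32 from the Python standard library is used only for demonstration.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Block:
    """Hypothetical stored block: payload plus the checksum taken at write time."""
    payload: bytes
    checksum: int


def write_block(payload: bytes) -> Block:
    # The checksum is computed once, when the block is written.
    return Block(payload, zlib.crc32(payload))


def is_valid(block: Block) -> bool:
    # On read or compaction, recompute the checksum and compare.
    return zlib.crc32(block.payload) == block.checksum


def read_from_peers(replicas: List[Block]) -> Optional[bytes]:
    # Each peer wrote its own physical copy of the same logical change,
    # so a copy that fails verification can be skipped in favor of another.
    for block in replicas:
        if is_valid(block):
            return block.payload
    return None  # every copy failed verification


healthy = write_block(b"k1=v1")
# Simulate a storage-level corruption on one peer: the payload changed on disk,
# but the stored checksum still describes the original bytes.
corrupt = Block(b"k1=vX", healthy.checksum)
recovered = read_from_peers([corrupt, healthy, healthy])
```

Here the corrupt copy fails verification and the read is served from a healthy peer. Because each peer performs its own physical write, the same block would have to be damaged independently on several disks for no valid copy to remain.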