Follow-up: Evaluating CockroachDB vs YugabyteDB Webinar
Earlier this week, Yugabyte CTO Karthik Ranganathan presented the live webinar: Evaluating CockroachDB vs YugabyteDB, with a spotlight on comparing PostgreSQL features, architecture, and the latest performance benchmarks between the two databases. We were delighted to see such interest in the topic, to dive deeper into some of the questions we raised in parts 1 and 2 of the blog series, Bringing Truth to Competitive Benchmark Claims – YugabyteDB vs CockroachDB, and to answer questions from the audience.
In this blog post, we provide the playback recording and slides, recap some highlights from the presentation, and summarize the questions and answers.
If you have additional questions about evaluating YugabyteDB vs CockroachDB, or any questions about distributed SQL databases or YugabyteDB in general, you can ask them on our YugabyteDB Slack channel, Forum, GitHub, or Stack Overflow, or contact us on our website.
Webinar recording and slides
Please find a copy of the webinar recording here, and the slides here.
Evaluating CockroachDB vs YugabyteDB presentation highlights
Introduction to Distributed SQL databases
Karthik began by defining what a distributed SQL database is, adding that distributed SQL databases:
- Support SQL features, though each distributed SQL database differs in the depth of the SQL features it supports.
- Are resilient to failures, which means if you have deployed in the cloud and you lose a node or a zone, your distributed SQL database should be able to survive that failure automatically without manual intervention.
- Are horizontally scalable, which means if you need to expand the number of nodes that you have for processing power or storage, you should be able to do so by simply adding nodes.
- Are geographically distributed, which means you should be able to deploy the database in a multi-zone, multi-region, or multi-cloud configuration.
YugabyteDB was architected to support all of the above. Additionally, YugabyteDB was specifically architected to be:
- High performance in terms of providing both massive throughput and extremely low latency
- Cloud native, which means it can run on Kubernetes, VMs, or bare metal, in any cloud, container environment, or data center
- Open source – YugabyteDB is one hundred percent open source. The entire database, including all the enterprise features such as backups, encryption, security, and more, is made available under the Apache 2.0 license, one of the most permissive open source licenses out there.
Evaluating RDBMS feature support
YugabyteDB reuses the PostgreSQL native query layer, whereas CockroachDB has rewritten the entire SQL layer in Go. We actually started with a rewrite in C++, and what we found was that PostgreSQL, developed over 20+ years, was so rich in its feature set that it would have been nearly impossible to rebuild all of those features by hand. So we scrapped about four to five months of effort, went back to the drawing board, and decided it was much better to reuse the PostgreSQL query layer code. We’re very happy we did, because YugabyteDB is able to support many relatively complex and advanced RDBMS features, and offer the most complete set of PostgreSQL-compatible features available in a distributed SQL database.
You can learn more about why we built YugabyteDB by reusing the PostgreSQL query layer here, and see all of the PostgreSQL features supported in YugabyteDB vs CockroachDB here.
What YugabyteDB’s reuse of the PostgreSQL codebase means for developers is that, if they already rely on PostgreSQL to get their applications to perform the way they want, they can continue to do so with YugabyteDB while gaining scalability, high availability, and geographic distribution.
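To make this concrete, here is a minimal sketch of connecting to YugabyteDB with the stock PostgreSQL driver (psycopg2) and exercising a familiar PostgreSQL feature, jsonb. The host, port, credentials, and table are illustrative assumptions based on a local cluster's defaults, not a prescribed setup:

```python
# Minimal sketch: because YugabyteDB reuses the PostgreSQL query layer,
# the standard PostgreSQL driver works unchanged. Connection details below
# are illustrative local-cluster defaults.
import psycopg2

conn = psycopg2.connect(host="127.0.0.1", port=5433,  # YSQL listens on 5433 by default
                        dbname="yugabyte", user="yugabyte")
cur = conn.cursor()

# A familiar PostgreSQL feature, jsonb documents, used as-is.
cur.execute("CREATE TABLE IF NOT EXISTS orders (id serial PRIMARY KEY, doc jsonb)")
cur.execute("INSERT INTO orders (doc) VALUES (%s::jsonb)",
            ('{"status": "shipped", "total": 42}',))
conn.commit()

cur.execute("SELECT doc->>'status' FROM orders WHERE (doc->>'total')::int > 10")
print(cur.fetchall())  # [('shipped',)]
```

The same script runs unmodified against vanilla PostgreSQL, which is exactly the point: the application-facing surface stays PostgreSQL.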
Visit the webinar playback from ~6:25 to ~13:32 for a deep dive into how YugabyteDB is architected to extend PostgreSQL into a distributed architecture across three phases:
- SQL layer on distributed DB
- Perform more SQL pushdowns
- Enhance PostgreSQL optimizer
The result of these three phases is that much of query execution is pushed down closer to where the data lives, reducing the amount of data sent over the network to the query coordinator node and yielding a truly distributed SQL database.
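One way to observe this in practice is to inspect a query plan. The sketch below reuses the psycopg2 connection and illustrative `orders` table from the earlier example; the exact plan nodes that EXPLAIN prints vary by YugabyteDB version, so treat this as a way to check behavior rather than a guaranteed output:

```python
# A query whose filter and aggregate are candidates for pushdown: with the
# enhancements described above, the WHERE clause (and, increasingly, the
# aggregation) can execute on the nodes holding the data rather than on the
# query coordinator. EXPLAIN shows what actually gets pushed down.
cur.execute("""
    EXPLAIN SELECT count(*)
    FROM   orders
    WHERE  doc->>'status' = 'shipped'
""")
for (plan_line,) in cur.fetchall():
    print(plan_line)
```

Whether a given predicate or aggregate is pushed down depends on the YugabyteDB version and on which of the three phases described above it falls under.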
Evaluating performance, especially performance at scale
Developers are often attracted to a distributed SQL database because they have large amounts of data. As such, we decided to run the YCSB benchmark at scale, using the standard YCSB driver and the standard JDBC binding, so there would be no difference in the benchmark itself and no bias in the setup. We ran the YCSB performance benchmark with 450 million rows against both YugabyteDB and CockroachDB and compared the results.
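For readers who want to reproduce a setup along these lines, here is a hedged sketch of driving the stock YCSB tool with its standard JDBC binding from Python. The workload file, row count, endpoint, credentials, and jar path are placeholders, not the exact benchmark configuration:

```python
# Hedged sketch: invoking the standard YCSB tool with the same JDBC binding
# for both databases. Paths, row count, and the JDBC URL are placeholders;
# the PostgreSQL driver jar must be on the classpath.
import subprocess

common = [
    "-P", "workloads/workloada",                 # stock CoreWorkload definition
    "-p", "recordcount=450000000",               # 450M rows, as in the benchmark
    "-p", "db.driver=org.postgresql.Driver",     # stock PostgreSQL JDBC driver
    "-p", "db.url=jdbc:postgresql://10.0.0.1:5433/ycsb",  # placeholder endpoint
    "-p", "db.user=yugabyte",
    "-cp", "postgresql.jar",                     # driver jar (placeholder path)
]
subprocess.run(["bin/ycsb", "load", "jdbc"] + common, check=True)  # load phase
subprocess.run(["bin/ycsb", "run", "jdbc"] + common, check=True)   # run phase
```

The key point is that the binding and driver are stock components, identical for both databases under test.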
You can read the deep dive into the performance of YugabyteDB vs CockroachDB at large data sizes here.
Some of the highlights:
- YugabyteDB achieved on average 3x higher throughput than CockroachDB, outperforming it on every YCSB workload whether using hash or range sharding.
- On average, YugabyteDB delivered 7x better write/update latency and 2x better read latency than CockroachDB.
You can listen to Karthik describe the performance comparisons from ~13:32 to ~26:50 in the webinar recording.
Architectural takeaways
When we first attempted the performance benchmark at scale, we tried to do it with 1 billion rows. But we ran into issues loading data into CockroachDB in a reasonable amount of time, which we defined as roughly a 24-hour window. In that window, we were able to load only 450 million rows into CockroachDB, so that is what we used as the basis for the performance-at-scale evaluation described above. We had wanted to load 1 billion rows because that would have represented about a terabyte of data per node including the replication factor, or a three-terabyte total data set across three nodes.
In the webinar, Karthik walked through the architectural takeaways we observed by attempting to load a billion rows into each database, highlighting three main issues:
- Use of RocksDB: CockroachDB spreads data unevenly across multiple disks, whereas YugabyteDB splits data evenly across the two disks on each node.
- Compactions affect performance: CockroachDB’s lack of timely compactions has a huge negative impact on read performance, whereas compactions in YugabyteDB were able to keep pace with the data load, thereby maintaining a low SSTable file count and low read amplification.
- Backpressure vs no backpressure: CockroachDB was not able to throttle write requests, which hurt read performance, and the cluster had to be left idle for many hours to recover. By contrast, YugabyteDB’s load throughput graph shows the database repeatedly applying backpressure on the loader to make sure read queries would always perform well.
You can read about the architectural differences observed between YugabyteDB and CockroachDB while trying to load 1 billion rows with 64 threads here.
You can listen to Karthik’s explanation of the architectural takeaways from ~26:50 to ~37:00 in the webinar recording.
Q&A Summary
Q: Did the benchmarks require the use of GPS clocks?
A: You do not need GPS clocks to run YugabyteDB, and none of the benchmarks here required synchronized clocks. In the specific case of the YCSB benchmarks, you don’t need a very high level of clock accuracy, because YugabyteDB internally uses hybrid logical clocks (and we believe CockroachDB does the same) to synchronize time across nodes. It’s entirely possible to run this with NTP on commodity hardware, without GPS clocks.
Q: Follow-up: How does YugabyteDB manage time within the system without the aid of atomic clocks?
A: Time synchronization across nodes comes into play primarily in the context of multi-row/distributed ACID transactions. YugabyteDB’s distributed transactions architecture is inspired by Google Spanner, which uses a globally-synchronized clock service called TrueTime to manage time across the nodes of a cluster. However, TrueTime is proprietary to Google and hence is not available as a commodity infrastructure component that an open source database like YugabyteDB can depend on. Instead of TrueTime, YugabyteDB uses a combination of Raft operation IDs and Hybrid Logical Clocks (HLC). HLC combines physical time clocks that are coarsely synchronized (using NTP) with Lamport clocks that track causal relationships. Each node in the cluster first computes its HLC, represented as a (physical time component, logical component) tuple, as described in detail in the post Distributed PostgreSQL on a Google Spanner Architecture – Storage Layer.
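To illustrate the idea, here is a minimal, self-contained sketch of a hybrid logical clock in the spirit of the description above. It is not YugabyteDB’s implementation; it simply shows how a (physical, logical) tuple advances on local events and merges with remote timestamps so that causality is preserved even when physical clocks are only coarsely synchronized:

```python
import time

class HybridLogicalClock:
    """Illustrative HLC sketch: a (physical time, logical counter) tuple.
    Not YugabyteDB's implementation, just the textbook algorithm."""

    def __init__(self):
        self.pt = 0   # highest physical component seen (microseconds, NTP-synced)
        self.lc = 0   # logical component, a Lamport-style counter

    def now(self):
        """Advance the clock for a local or send event."""
        wall = int(time.time() * 1e6)
        if wall > self.pt:
            self.pt, self.lc = wall, 0   # physical clock moved forward
        else:
            self.lc += 1                 # same instant: bump logical counter
        return (self.pt, self.lc)

    def update(self, remote):
        """Merge a timestamp received from another node, preserving causality."""
        wall = int(time.time() * 1e6)
        rpt, rlc = remote
        m = max(wall, self.pt, rpt)
        if m == self.pt == rpt:
            self.lc = max(self.lc, rlc) + 1
        elif m == self.pt:
            self.lc += 1
        elif m == rpt:
            self.lc = rlc + 1
        else:
            self.lc = 0
        self.pt = m
        return (self.pt, self.lc)

if __name__ == "__main__":
    a, b = HybridLogicalClock(), HybridLogicalClock()
    t1 = a.now()        # event on node A
    t2 = b.update(t1)   # node B receives A's timestamp
    assert t2 > t1      # causal order holds regardless of clock skew
```

Comparing two timestamps is then ordinary tuple comparison, so events are ordered consistently with causality even when the physical clocks drift within their NTP error bounds.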
Q: YugabyteDB appears to perform better than CockroachDB with large data sets. What about small datasets like 100 GB?
A: The amount of data in our evaluation, 450 million to 1 billion rows, is not a large amount of data in real-world enterprise scenarios. Organizations often run a terabyte of data even on a single-node RDBMS. If the data set is 100 GB or less, the difference between the two databases becomes marginal. And if your data set is small, you don’t expect it to grow, and you don’t need to be future-proofed for scale, then a single-node RDBMS like native PostgreSQL would be good enough.
Q: Why did Yugabyte not design their database as an extension to PostgreSQL?
A: We tried our best to do that, but ran into two issues:
1) YugabyteDB makes the system catalog highly available, whereas a PostgreSQL extension can only operate on user tables. There is no extension mechanism that would make the system catalog highly available, which means you would still have a single point of failure in the system catalog. If it goes down, you cannot work with the database.
2) PostgreSQL recently introduced pluggable storage, but it does not operate at the level at which we intercept: we are not changing the way data is laid out on a single node, but the way queries are executed across nodes. We were not able to find hooks in PostgreSQL that would let us integrate in a clean fashion.
That said, this is something we would love to do in the longer term: see whether we can get PostgreSQL and YugabyteDB to work together as an extension or snap-on. It was just not feasible given where things stand today. So, we started out with PostgreSQL 10.4, rebased onto PostgreSQL 11.2, and we plan to keep bringing in newer PostgreSQL features over time to make our database as feature-rich and functional as PostgreSQL itself.
Q: With CockroachDB recently announcing support for HASH partitioning, will it have a noticeable effect on the benchmark results that were presented?
A: The way the YCSB benchmark generates data is not going to introduce hot spots, whether sharding is range- or hash-based. CockroachDB will start with a single range or shard and quickly split into a few shards; once it splits, its performance should match whatever hash sharding would provide. For YugabyteDB, loading data showed a minor 2-hour advantage for hash vs range. Looking at the workloads themselves, the difference between range and hash is marginal. Workload E (the range scan workload) works with range sharding, not hash. Range and hash keep their performance characteristics very close; the latencies across workloads are well matched.
The takeaway is that range vs. hash makes little practical difference: the minute you get a few splits, range and hash performance stabilize to the same level. For YCSB, we found it immaterial whether we ran on hash or range. Changing the key to hash should not affect the performance of the database once there are enough splits, which happens very quickly; after about the first 10 minutes of running the workload, you shouldn’t see a difference. Note that hash sharding was not generally available in CockroachDB when we ran the test.
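For reference, here is a hedged sketch of how the two sharding choices are expressed in YugabyteDB’s YSQL DDL, reusing the psycopg2 cursor from the earlier sketches. The tables and columns are made up for illustration:

```python
# Hypothetical tables illustrating the sharding choice discussed above.
cur.execute("""
    CREATE TABLE events_by_hash (
        user_id  text,
        event_ts timestamptz,
        payload  text,
        PRIMARY KEY (user_id HASH, event_ts ASC)  -- hash on user_id spreads writes
    )
""")
cur.execute("""
    CREATE TABLE events_by_range (
        user_id  text,
        event_ts timestamptz,
        payload  text,
        PRIMARY KEY (user_id ASC, event_ts ASC)   -- range keeps keys ordered,
    )                                             -- which suits scans like Workload E
""")
conn.commit()
```

Vanilla PostgreSQL would reject the HASH/ASC annotations in the primary key; they are YugabyteDB-specific syntax for choosing how the primary key is sharded.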
What’s Next?
If you are interested in learning more about comparing CockroachDB vs YugabyteDB, you can check out these resources: