Securing YugabyteDB: The SIEM/SOAR Quest

Securing the Infrastructure Behind Our Distributed DBaaS

Bharat Kumar Mukheja

At Yugabyte, our mission is to build the most secure DBaaS* available. So we began researching how best to secure the infrastructure supporting YugabyteDB Aeon (formerly YugabyteDB Managed), our fully managed version of YugabyteDB. We started evaluating SIEM/SOAR (Security Information and Event Management / Security Orchestration, Automation, and Response) solutions and quickly concluded that external, third-party solutions would not meet our needs.

SIEM, which stands for Security Information and Event Management, provides real-time analysis of security alerts generated by applications and network hardware. Source: Security Information and Event Management (SEIM) Solution and It’s Importance

Understanding Our Requirements

A SIEM/SOAR solution may be one of the most critical security infrastructure tools for a company (of any size). Event monitoring, access audits, log analysis, and file integrity monitoring are the fundamental security mechanisms the cloud is built upon.

So we began by outlining our essential requirements (on paper, believe it or not). We had two primary objectives with the SIEM/SOAR infrastructure.

  1. To deploy a security layer over our critical infrastructure
  2. To align with industry best practices and comply with top certification frameworks (ISO, SOC, etc.)

We wanted to cover our most critical pieces of infrastructure: the SaaS infrastructure behind the fully managed deployment of YugabyteDB, the build and test pipeline, and others.

We wrote down an exhaustive list of requirements, along with a few nice-to-haves. Our top functional requirements were:

  • Intrusion detection system
  • File integrity monitoring
  • Malware detection
  • Log retention and search

Our non-functional requirements included:

  • Long-term log retention
  • Easy integration with notification and alerting systems
  • Ability to support custom log sources

Cost Estimation

For any large solution, purchase and implementation costs can be substantial, and the solution can become cost-prohibitive if they balloon. For our use cases, we factored the following costs into our estimate:

  1. Subscription costs (for a SaaS vendor) or the cost of a license plus hosting (for a self-hosted solution)
  2. Storage costs
  3. Data transfer costs

We expected the first two (subscription and storage) to be the most expensive. Data transfer costs, although not trivial on their own, would be minor in the context of a project this large.
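As a rough illustration, the three cost components can be added up in a few lines of Python. This is only a sketch: the function name, the parameters, and every number in the example call are hypothetical placeholders, not Yugabyte's figures or any vendor's pricing.

```python
def estimate_monthly_cost(
    platform_cost: float,          # subscription, or license plus hosting
    stored_gb: float,              # total log volume kept in storage
    storage_price_per_gb: float,   # placeholder unit price
    transferred_gb: float,         # log volume moved between networks/clouds
    transfer_price_per_gb: float,  # placeholder unit price
) -> dict:
    """Return a per-component breakdown and the total monthly cost."""
    storage_cost = stored_gb * storage_price_per_gb
    transfer_cost = transferred_gb * transfer_price_per_gb
    return {
        "platform": platform_cost,
        "storage": storage_cost,
        "transfer": transfer_cost,
        "total": platform_cost + storage_cost + transfer_cost,
    }


# Example with arbitrary placeholder numbers (not real pricing).
breakdown = estimate_monthly_cost(
    platform_cost=1000.0,
    stored_gb=2000.0,
    storage_price_per_gb=0.25,
    transferred_gb=100.0,
    transfer_price_per_gb=0.5,
)
for component, cost in breakdown.items():
    print(f"{component}: {cost:.2f}")
```

Keeping the breakdown per component makes it easy to see which item dominates as the inputs change.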

Storage Estimation

Calculating pricing requires a rough estimate of storage, which is heavily influenced by the type of solution used. A storage estimate also helps determine which storage type can be used, which in turn could affect the choice of underlying cloud (AWS, GCP, or other) and pricing. Let’s create a quick formula for easy reference later.

We have two kinds of data sources:

  1. Cloud-native audit logs (AWS CloudTrail, GCP Cloud Logging, etc.)
  2. SIEM agents installed on machines

For each log source, we need the average log size per day, S_i, where i is the source name (e.g. S_gcp).

For cloud-native sources, we need the number of cloud accounts/projects, A_i.

For agent sources, we need the number of agents, N.

Finally, we need the number of days of log retention, D.

The formula for storage calculation is:

((S_aws * A_aws) + (S_gcp * A_gcp) + (S_azure * A_azure) + (S_agent * N)) * D

NOTE: This formula assumes that every account within a given cloud has the same average log size, which is good enough for a rough storage estimate. For more precise calculations, each S_i * A_i term splits into S_account1 + S_account2 + …
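The formula is straightforward to turn into a small calculator. The Python sketch below is one possible way to do it. The class and function names are hypothetical, the unit (GB per day) is an assumption, and the sample values are arbitrary placeholders rather than measurements from our environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudSource:
    name: str                # source name i, e.g. "gcp"
    avg_daily_log_gb: float  # S_i: average log size per day, per account
    accounts: int            # A_i: number of cloud accounts/projects


def estimate_storage_gb(cloud_sources, agent_daily_log_gb, agent_count, retention_days):
    """Rough estimate: (sum(S_i * A_i) + S_agent * N) * D."""
    cloud_daily = sum(s.avg_daily_log_gb * s.accounts for s in cloud_sources)
    agent_daily = agent_daily_log_gb * agent_count
    return (cloud_daily + agent_daily) * retention_days


def estimate_storage_gb_per_account(account_daily_log_gb, agent_daily_log_gb,
                                    agent_count, retention_days):
    """More precise variant: one measured daily log size per cloud account."""
    cloud_daily = sum(account_daily_log_gb)
    agent_daily = agent_daily_log_gb * agent_count
    return (cloud_daily + agent_daily) * retention_days


# Placeholder inputs for illustration only.
sources = [
    CloudSource("aws", avg_daily_log_gb=2.0, accounts=4),
    CloudSource("gcp", avg_daily_log_gb=1.0, accounts=3),
    CloudSource("azure", avg_daily_log_gb=0.5, accounts=2),
]
total_gb = estimate_storage_gb(
    sources, agent_daily_log_gb=0.25, agent_count=100, retention_days=365
)
print(f"Estimated storage: {total_gb:.0f} GB")  # 13505 GB for these inputs
```

With these placeholder inputs, the daily volume is 12 GB from cloud sources plus 25 GB from agents, so a 365-day retention period needs roughly 13.5 TB. The per-account variant gives the same result when every account in a cloud logs the same amount, and diverges as soon as the measured sizes differ.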

Challenges in Finding the Right SIEM Tool

Data Sources ingested into the YugabyteDB SIEM (i.e. the Yugabyte Security Center).

The next step involved locating this ideal tool. However, it proved difficult to find a single solution for all our security needs. No one product was tailor-made to our requirements. Our challenges included:

  • Most SIEM tools are not cloud native or lack deep integration with public clouds
  • There’s a significant gap in Kubernetes support among SIEM solutions
  • Adapting and customizing to the evolving software landscape is an uphill battle

After evaluating many solutions, we finalized our SIEM/SOAR tool of choice. I’ll detail the tool and its implementation in the next blog post.

*NOTE: A few abbreviations

  • SIEM – Security Information and Event Management
  • SOAR – Security Orchestration, Automation, and Response
  • DBaaS – Database as a Service
