Database Performance
Proven techniques to prevent your database from becoming your application’s bottleneck
The database is the most critical component and potentially the biggest bottleneck of any web application. A poorly optimised query can block the entire application under load; a missing index can multiply an endpoint’s response time by 100. Database performance directly impacts user experience and system scalability.
This guide covers the fundamental techniques for database optimisation: effective indexing, query tuning, caching strategies, connection pooling, monitoring and scaling options. These concepts apply to relational databases (PostgreSQL, MySQL) and NoSQL stores (MongoDB, Redis) alike.
Indexing: the biggest impact with the least effort
Indexes are the most powerful tool for improving query performance. Without an appropriate index, the database performs a full table scan (traversing all rows) for each query. With the right index, the same query can be resolved in microseconds.
- Create indexes on columns used in WHERE, JOIN, ORDER BY and GROUP BY
- Use composite indexes when your queries filter by multiple columns simultaneously
- Avoid over-indexing: each index consumes space and slows down writes (INSERT/UPDATE/DELETE)
- Use EXPLAIN/EXPLAIN ANALYZE to verify that your queries use the expected indexes
- Partial indexes (PostgreSQL): index only a subset of rows to save space and improve performance
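The effect of an index is easy to see with EXPLAIN. A minimal sketch, using SQLite (whose EXPLAIN QUERY PLAN plays the role of PostgreSQL's EXPLAIN) and a hypothetical `users` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)

def plan(sql):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail); join the details.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT id FROM users WHERE email = 'user42@example.com'"
before = plan(query)  # full table scan: "SCAN users"
conn.execute("CREATE INDEX idx_users_email ON users (email)")  # index the WHERE column
after = plan(query)   # index lookup: "SEARCH users USING INDEX idx_users_email"
print(before)
print(after)
```

The same before/after check with `EXPLAIN ANALYZE` on a production PostgreSQL table is the quickest way to confirm an index is actually being used.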
Query optimisation
A poorly written query can be efficient with 1,000 records but catastrophic with 1 million. Query optimisation is an ongoing process that requires understanding execution plans and your application’s data access patterns.
- Avoid SELECT *: select only the columns you need to reduce I/O
- Use cursor-based (keyset) pagination instead of OFFSET for large datasets: with OFFSET the database still scans and discards every skipped row
- Replace correlated subqueries with JOINs when possible: the query optimiser works better with JOINs
- Identify N+1 queries: a classic ORM problem where one query is executed per record in the relationship
- Use batch operations (INSERT INTO ... multiple VALUES, bulk updates) instead of row-by-row operations
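Two of the points above, batch inserts and keyset pagination, can be sketched together. SQLite and the table name `events` are illustrative stand-ins for a real database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
# Batch operation: one executemany instead of 10,000 single INSERTs.
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [(f"e{i}",) for i in range(10_000)],
)

def page_after(cursor_id, size=100):
    # Keyset pagination: seek past the last seen id instead of using OFFSET,
    # so the database never scans and discards the skipped rows.
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (cursor_id, size),
    ).fetchall()
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor

rows, cursor = page_after(0)       # first page: ids 1..100
rows, cursor = page_after(cursor)  # second page: ids 101..200
print(rows[0][0])  # 101
```

The cursor value (here the last `id` seen) is what the API returns to the client instead of a page number.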
Database caching strategies
Caching reduces database load by storing results of frequent queries in memory. Redis and Memcached are the most popular options as a cache layer between the application and the database.
Implement caching for data that is read far more often than written: product catalogues, settings, category lists. Use TTL (Time To Live) invalidation for periodically changing data and write-through or write-behind for data that changes in real time.
- Cache-aside (lazy loading): the application queries the cache first; if not present, queries the DB and stores the result
- Write-through: every DB write simultaneously updates the cache. Keeps cache and database consistent, at the cost of slower writes.
- Write-behind: writes go to the cache first and sync to the DB asynchronously. Better performance, risk of data loss.
- Native query caching: MySQL's Query Cache was deprecated in 5.7 and removed in 8.0; in PostgreSQL, use pg_stat_statements to identify the slow queries worth caching at the application layer.
Connection pooling
Each database connection consumes server resources: memory, file descriptors and CPU for the backend process. Without connection pooling, a web application can exhaust available connections under moderate load, causing errors that look like database outages.
PgBouncer is the most widely used pooler for PostgreSQL: it manages a pool of connections that are reused across application requests. For MySQL, ProxySQL offers similar functionality. In Node.js applications, libraries like pg-pool or mysql2 include built-in pooling.
- Size the pool with the rule of thumb: connections = (cores * 2) + effective spindle count (the number of disks that can serve I/O concurrently), then adjust based on measured pool usage
- Configure idle connection timeouts to free up resources
- Monitor pool usage: if it is constantly at 100%, you need to scale
- Use serverless connection pooling (Neon, PlanetScale, Supabase) for serverless environments where connections are ephemeral
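A pool is conceptually simple: a fixed set of connections handed out and returned. This illustrative sketch (not PgBouncer, which works at the protocol level) uses a blocking queue and the sizing rule of thumb above, assuming 4 cores and 1 SSD:

```python
import queue
import sqlite3

POOL_SIZE = (4 * 2) + 1  # (cores * 2) + effective spindles = 9

class ConnectionPool:
    def __init__(self, size):
        # Pre-open all connections; they are reused, never re-created per request.
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(sqlite3.connect(":memory:", check_same_thread=False))

    def acquire(self, timeout=5.0):
        # Blocks when the pool is exhausted instead of opening a new connection;
        # raises queue.Empty on timeout rather than overwhelming the server.
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(POOL_SIZE)
conn = pool.acquire()
print(conn.execute("SELECT 1").fetchone()[0])  # 1
pool.release(conn)
```

Production poolers add health checks, idle timeouts and statement-level reuse on top of this basic acquire/release cycle.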
Performance monitoring
You cannot optimise what you do not measure. Continuous database monitoring allows you to identify slow queries, problematic usage patterns and bottlenecks before they impact users.
- pg_stat_statements (PostgreSQL) / Performance Schema (MySQL): identify slow queries with execution statistics
- Datadog Database Monitoring: query traces with execution plans, waits and resource usage
- pganalyze / Percona Monitoring: specialised tools for PostgreSQL and MySQL respectively
- Key metrics: queries per second, p95/p99 latency, active connections, buffer pool / cache hit ratio, disk IOPS
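To make the p95/p99 metric concrete, here is a sketch of the nearest-rank percentile over in-application latency samples (in production, an APM tool or pg_stat_statements computes this for you); the sample values are invented:

```python
import math

def percentile(samples, pct):
    # Nearest-rank method: the smallest value such that at least pct% of
    # samples are less than or equal to it.
    ordered = sorted(samples)
    rank = max(0, math.ceil(pct / 100 * len(ordered)) - 1)
    return ordered[rank]

latencies_ms = [12, 15, 11, 300, 14, 13, 16, 12, 250, 14]  # 2 slow outliers
print(percentile(latencies_ms, 50))  # 14: the median looks healthy
print(percentile(latencies_ms, 95))  # 300: the tail tells the real story
```

This is why the guide tracks p95/p99 rather than averages: a mean of these samples (~65 ms) hides the 300 ms requests your slowest users actually experience.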
Scaling strategies
When query optimisation and caching are not enough, the database needs to scale. There are two main strategies, each with its own trade-offs.
- Vertical scaling: more CPU, RAM and SSD on the existing server. Simple but with a physical limit. Sufficient for most applications.
- Read replicas: read-only replicas that distribute SELECT queries. Reduces primary load. Native support in PostgreSQL, MySQL and managed services.
- Sharding: splitting data horizontally across multiple instances. High complexity, only justified at massive scale.
- Managed databases: Amazon RDS, Google Cloud SQL, PlanetScale or Neon handle backups, replicas and scaling automatically.
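The core of sharding is a stable routing function from a shard key to an instance. A minimal sketch, with hypothetical shard names:

```python
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(user_id: str) -> str:
    # md5 is used for stable distribution, not security; Python's built-in
    # hash() is randomised per process and would break routing across restarts.
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for("user-42") == shard_for("user-42"))  # True: routing is stable
```

The modulo approach shown here is the simplest scheme; note that it reshuffles most keys when a shard is added, which is why real deployments often use consistent hashing or range-based shard maps instead. This operational complexity is exactly why sharding is a last resort.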
Key Takeaways
- Indexes are the highest-impact optimisation: use EXPLAIN to verify their usage
- Identify and eliminate N+1 queries, especially when using an ORM
- Redis as a cache dramatically reduces the load from repetitive queries
- Connection pooling is essential in production to prevent connection exhaustion
- Continuously monitor slow queries: pg_stat_statements and APM tools are your allies
Is your database a bottleneck?
We analyse your database performance, optimise queries and indexes and design an architecture that scales with your business.