Efficient SQL

Published: 2012-02-25

As databases are the backend, beating heart of many applications (especially on the web) efficiently querying them via SQL is a very important aspect of scalability. The most basic ones you should come up with are prepared statements, minimizing (multi-layered) loops around your queries and GROUP-ing and paging via SQL. You may also come up with indexing specific table columns which is no longer code-based, but database administration – and of course we are in both fields if we want to optimize database query performance. But how do you efficiently index? (You can indeed over-index, decreasing performance again.)

A great and extensive resource on database querying efficiency is the website Use The Index Luke. I strongly encourage you to check it out, if you care about these issues.

Some notes and reference for myself (very incomplete) follow:

Querying For Top-N Results

FETCH FIRST <x> / LIMIT <x> usefulness is pretty limited if column is not indexed (non-pipelined order_by), as entire table has to be sorted. Although the returned result can be selected faster from the sorted table, the big time-eater is the sorting.

SQL-standard / PostgreSQL: FETCH FIRST <x> ROWS ONLY;

MySQL: LIMIT <x>;

Offsetting Top-N Results

By Number/Index: Specifying the offset by index is the most common one. As entries are numbered though, the farther you offset the longer it takes (has to count to your offset (+ limit)).

PostgreSQL: OFFSET <y> FETCH NEXT <x> ROWS ONLY;

MySQL: LIMIT <x> OFFSET <y>;

By Value: Using a WHERE clause with an indexed column evaluation (WHERE <x> > <y>). See more details and heads-ups on fetching by value.