In databases, how do you determine whether making an index on one particular column is good, useless, or bad?

541 views

In databases, how do you determine whether making an index on one particular column is good, useless, or bad?

In: Technology

4 Answers

Anonymous 0 Comments

The easiest way is to run whatever query your doing a few times without an index and a few times with an index and see if it makes a difference to the average speed.

Anonymous 0 Comments

A good index should be on a column that is often used in the “where” clause of queries or the “on” clause of a join. It should be on a column that has a wide range of possible values (e.g. nearly every row has a different value), rather than only a few possible values (like “active”, “inactive”, “expired”).

An index will very slightly slow down inserts, updates and deletes that affect the indexed column, but this is rarely a problem. It’s one reason why you don’t want too many indexes though.

An index will take up some disk space. Again, this is rarely a problem. It’s another reason why you don’t want too many indexes though.

There are special types of indexes you can use for special purposes, like finding words in columns containing text or columns with geographical data.

Anonymous 0 Comments

Are you sorting or looking up data by said column very often? If so it may be a good candidate for an index.

Indexes help the database search for values in the specified column. If the column needs to be searched often it makes sense to index it to speed that search up. If you only use it every once in a while or if the values don’t help with lookup that much (for instance columns that only have a few distinct values) then it may not be worth the time spent keeping the index correct.

Anonymous 0 Comments

Generally you can tell if an index is required on a column/set of columns because the queries frequently make use of those columns as criteria for SELECT/UPDATE/DELETE, JOINs, GROUPing, ORDERing or PARTITIONing, and it is a column with diverse values (eg. a boolean flag column doesn’t need indexes). In those cases, indexes are almost certainly going to be good. If the columns aren’t used that way, then the index is useless – which also means bad, because it means unnecessarily wasting resources (storage, RAM, CPU). Indexes are very rarely directly make query performance worse, if one isn’t necessary it simply won’t be used.

If you need more quantitative data than ‘educated guess’, then the query execution plan will give some idea. If your query is doing full table scans at some point, then it probably needs indexes there. The planner will tell you what the cost of each operation is, and if it decreases after you’ve added the index, then congrats you’ve improved performance. You will also want to do some real-world query executions and measure the actual impact to execution time and resource utilisation before and after adding the index.

Of course you can only prove a performance increase after adding the index, which may not be an easy option in a live production environment if you don’t have a suitable test environment that closely resembles prod. In that case you just need to rely on experience and probability. DB tuning often requires some trial and error.