Database Optimization: A Practical Guide to Indexing and Table Management

Atik Bin Mustafij (Sobuj)
5 min readDec 23, 2024

--

Efficient database management is the cornerstone of high-performance applications. Whether you’re a developer, database administrator, or tech enthusiast, understanding indexing and table management can significantly enhance the speed and efficiency of your queries.

This guide offers a comprehensive exploration of database indexing and table management. By breaking down complex concepts into practical steps, it equips you with the tools to optimize data retrieval, streamline table operations, and balance performance with storage efficiency. From creating indexes to analyzing query execution plans, this guide ensures you’re well-prepared to manage databases effectively.

1. What is Database Indexing?

Database indexing is a technique used to improve the speed of data retrieval operations in a database table. It works like an index at the back of a book, where you can quickly find the page number of a topic without scanning the entire book.

Indexes are special lookup tables that the database uses to find rows efficiently. Instead of searching the entire table row by row, the database can jump directly to the relevant rows, saving time and resources.

Real-Life Example of Database Indexing

Scenario: Finding a Specific Contact in a Phone Book

Imagine you have a physical phone book (database table) containing 1,000,000 names (rows) listed alphabetically by last name (indexed column). You need to find the phone number of “John Smith.”

Without an Index:

If there is no alphabetical order (no index), you’d have to scan each name in the phone book, one by one, from the beginning to the end. This is like performing a full table scan in a database, which is time-consuming.

  • Steps: Start from the first name and check each one until you find “John Smith.”
  • Time Taken: Proportional to the number of names in the book (linear search).

With an Index:

If the phone book is organized alphabetically (indexed by last name), you can directly jump to the “S” section, then narrow down to “Smith,” and finally find “John.” This is akin to the binary search algorithm used in database indexing.

Steps:

  1. Open the book to the section for “S” (jump).
  2. Narrow down to “Smith.”
  3. Find “John Smith” under “Smith.”

Time Taken: Logarithmic time (much faster for large data sets).

How Indexing Works in Databases

How Indexing Works

1. Creating an Index:

  • When you create an index on a column (e.g., CREATE INDEX idx_name ON contacts(last_name)), the database builds a sorted data structure (like a B-tree or hash table) for that column.

2. Using the Index:

  • When you query the database (e.g., SELECT phone FROM contacts WHERE last_name = 'Smith';), the database uses the index to locate the relevant rows instead of scanning the entire table.

3. Trade-Off:

  • While indexes speed up reads, they slightly slow down writes (INSERT, UPDATE, DELETE) because the index must be updated whenever the data changes.
  • Indexes also consume additional storage space.

2. Creating and Managing a Table

Creating the contacts Table

CREATE TABLE contacts (
id INT AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
phone VARCHAR(20) NOT NULL,
email VARCHAR(100) NOT NULL,
city VARCHAR(50)
);
  • Primary Key: id is the unique identifier for each row.
  • Columns:
  • first_name and last_name: Store the contact’s name.
  • phone: Stores the contact’s phone number.
  • email: Stores the contact’s email address.
  • city: Stores the contact’s city.

3. Adding Indexes

Creating an Index on last_name

CREATE INDEX idx_last_name ON contacts(last_name);
  • Index Name: idx_last_name
  • Column Indexed: last_name
  • Purpose: Speeds up queries that filter, sort, or group by the last_name column.

Composite Index (Optional)

If you often query by both last_name and city together:

CREATE INDEX idx_last_name_city ON contacts(last_name, city);

4. Inserting Dummy Data

Insert sample data into the contacts table for testing:

INSERT INTO contacts (first_name, last_name, phone, email, city) VALUES
('John', 'Smith', '123-456-7890', 'john.smith@example.com', 'New York'),
('Jane', 'Doe', '987-654-3210', 'jane.doe@example.com', 'Los Angeles'),
('Michael', 'Brown', '555-666-7777', 'michael.brown@example.com', 'Chicago'),
('Emily', 'Davis', '111-222-3333', 'emily.davis@example.com', 'Houston'),
('David', 'Wilson', '444-555-6666', 'david.wilson@example.com', 'Phoenix'),
('Sarah', 'Johnson', '777-888-9999', 'sarah.johnson@example.com', 'Philadelphia'),
('Chris', 'Lee', '222-333-4444', 'chris.lee@example.com', 'San Diego'),
('Anna', 'White', '888-999-0000', 'anna.white@example.com', 'Dallas'),
('James', 'Taylor', '123-789-4560', 'james.taylor@example.com', 'Austin'),
('Linda', 'Martinez', '321-654-9870', 'linda.martinez@example.com', 'San Jose');

5. Querying the Table

Basic Queries

Retrieve All Data:

SELECT * FROM contacts;

Filter by Last Name:

SELECT * FROM contacts ORDER BY last_name;

Group by Last Name:

SELECT last_name, COUNT(*) AS count FROM contacts GROUP BY last_name;

6. Analyzing Index Usage

View Index Details

SHOW INDEX FROM contacts;
  • Displays the list of indexes, their names, columns, and types.

Check Query Execution Plan

Use EXPLAIN to analyze how MySQL uses the index:

EXPLAIN SELECT * FROM contacts WHERE last_name = 'Smith';

Look at the key column in the output to confirm index usage.

7. Dropping Indexes and Tables

Drop an Index

If the index is no longer needed:

DROP INDEX idx_last_name ON contacts;

Check Existence Before Dropping

To avoid errors if the table does not exist:

DROP TABLE IF EXISTS contacts;

8. Measuring Index Storage

Query to Check Index Sizes:

SELECT
TABLE_SCHEMA AS database_name,
TABLE_NAME AS table_name,
INDEX_LENGTH AS index_size_in_bytes,
ROUND(INDEX_LENGTH / 1024, 2) AS index_size_in_kb,
ROUND(INDEX_LENGTH / (1024 * 1024), 2) AS index_size_in_mb
FROM
information_schema.TABLES
WHERE
TABLE_SCHEMA = 'your_database_name'
AND TABLE_NAME = 'contacts';

Replace your_database_name with the name of your database.

9. Performance Considerations

Advantages of Indexing

  • Faster Query Performance: Indexes speed up searches, sorting, and grouping.
  • Efficient Data Retrieval: Improves performance for large datasets.

Disadvantages of Indexing

  • Storage Overhead: Indexes consume additional disk space.
  • Write Performance: Slows down INSERT, UPDATE, and DELETE operations since the index must be updated.

Summary

  • Indexes are crucial for optimizing query performance, especially for large datasets.
  • Adding indexes requires balancing query speed with storage and write performance.
  • Use tools like SHOW INDEX and EXPLAIN to analyze and verify index usage.
  • Manage your database tables and indexes carefully to maintain efficient performance.

Sign up to discover human stories that deepen your understanding of the world.

--

--

Atik Bin Mustafij (Sobuj)
Atik Bin Mustafij (Sobuj)

Written by Atik Bin Mustafij (Sobuj)

Expert Software Developer with a knack for mobile/web applications, database design, and architecture. Proficient in project management and business analysis.

No responses yet

Write a response