What is SQL?

SQL (Structured Query Language) is the standard language for communicating with relational databases. You use it to create tables, insert data, ask questions, update records, and control who has access to what.

It’s not a general-purpose programming language — SQL doesn’t have loops or conditionals in the traditional sense. It’s a declarative language: you describe what you want, and the database figures out how to retrieve it.

Why SQL Still Matters

SQL was created in the early 1970s at IBM, standardized by ISO and ANSI, and has been the foundation of data management ever since. In 2025, it’s still required for:

Data analysis: Nearly every BI tool (Tableau, Looker, Power BI, Metabase) executes SQL under the hood
Data engineering: ETL/ELT pipelines, dbt models, warehouse transformations
Backend development: Application databases (PostgreSQL, MySQL, SQLite)
Data science: Pulling data for analysis, cleaning datasets, feature engineering
Cloud data warehouses: Snowflake, BigQuery, Redshift, Databricks all use SQL as the primary query interface

It’s one of the few skills that transfers nearly everywhere in the data ecosystem.

What SQL Can Do

SQL is organized into sublanguages by purpose:

DDL (Data Definition Language)  — CREATE, ALTER, DROP
  Define and modify table structure

DML (Data Manipulation Language) — INSERT, UPDATE, DELETE
  Add, change, and remove data

DQL (Data Query Language)       — SELECT
  Retrieve and transform data

DCL (Data Control Language)     — GRANT, REVOKE
  Manage permissions

TCL (Transaction Control Language) — BEGIN, COMMIT, ROLLBACK
  Control multi-statement transactions

In practice, most SQL work involves SELECT (reading data) and DML (writing data).

How Relational Databases Work

A relational database stores data in tables — a structure with named columns and typed rows. Tables are related to each other through shared columns called foreign keys.

customers table:
  customer_id | name         | email              | city
  ──────────────────────────────────────────────────────
  1           | Alice Smith  | alice@example.com  | London
  2           | Bob Chen     | bob@example.com    | New York

orders table:
  order_id | customer_id | amount | status
  ──────────────────────────────────────────
  1001     | 1           | 149.99 | completed
  1002     | 1           | 79.50  | completed
  1003     | 2           | 249.00 | pending

The customer_id column in orders is a foreign key that references customers.customer_id. This relationship lets you ask: “What are all of Alice’s orders?” without duplicating her contact information in every order row.

Your First SQL Queries

Retrieve all rows from a table:

SELECT * FROM customers;

Retrieve specific columns:

SELECT name, email FROM customers;

Filter rows with WHERE:

SELECT name, email
FROM customers
WHERE city = 'London';

Join two tables:

SELECT c.name, o.amount, o.status
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'completed';

Aggregate data:

SELECT customer_id, COUNT(*) AS order_count, SUM(amount) AS total_spent
FROM orders
GROUP BY customer_id
ORDER BY total_spent DESC;

SQL Dialects

The SQL standard defines core behavior, but each database adds its own extensions. The major databases and their common contexts:

Database	Typical Use
PostgreSQL	Production apps, data warehousing, analytics
MySQL / MariaDB	Web applications, LAMP stack
SQLite	Mobile apps, embedded databases, local development
SQL Server	Microsoft ecosystem, enterprise
BigQuery	Google Cloud data warehouse
Snowflake	Cloud data warehouse
Redshift	AWS data warehouse
DuckDB	In-process analytics, local analysis

Core SQL — SELECT, JOIN, GROUP BY, WHERE — is nearly identical across all of them. Edge cases (date functions, string functions, window function syntax, pagination) differ.

Relational vs NoSQL

SQL databases enforce a schema — you define column names and types upfront, and the database ensures data conforms to that structure. This is called relational or structured data.

NoSQL databases (MongoDB, Redis, Cassandra) store data without a fixed schema. They’re optimized for different access patterns and scale differently. Neither is universally better — they solve different problems.

Most production data systems use SQL for analytical workloads regardless of what the operational database is, because SQL’s expressive power for aggregation, joins, and complex filtering is hard to match.

How SQL Gets Executed

When you run a SQL query, the database goes through several steps:

1. Parse   — check the SQL syntax is valid
2. Analyze — resolve table and column names
3. Plan    — determine the most efficient execution strategy
4. Execute — actually retrieve or modify the data
5. Return  — send results back to the client

The “plan” step (query optimization) is where the database decides whether to use an index, which join algorithm to apply, and in what order to process tables. Understanding this helps you write faster queries.