What is SQL?
SQL (Structured Query Language) is the standard language for communicating with relational databases. You use it to create tables, insert data, ask questions, update records, and control who has access to what.
Itβs not a general-purpose programming language β SQL doesnβt have loops or conditionals in the traditional sense. Itβs a declarative language: you describe what you want, and the database figures out how to retrieve it.
Why SQL Still Matters
SQL was created in the early 1970s at IBM, standardized by ISO and ANSI, and has been the foundation of data management ever since. In 2025, itβs still required for:
- Data analysis: Nearly every BI tool (Tableau, Looker, Power BI, Metabase) executes SQL under the hood
- Data engineering: ETL/ELT pipelines, dbt models, warehouse transformations
- Backend development: Application databases (PostgreSQL, MySQL, SQLite)
- Data science: Pulling data for analysis, cleaning datasets, feature engineering
- Cloud data warehouses: Snowflake, BigQuery, Redshift, Databricks all use SQL as the primary query interface
Itβs one of the few skills that transfers nearly everywhere in the data ecosystem.
What SQL Can Do
SQL is organized into sublanguages by purpose:
DDL (Data Definition Language) β CREATE, ALTER, DROP Define and modify table structure
DML (Data Manipulation Language) β INSERT, UPDATE, DELETE Add, change, and remove data
DQL (Data Query Language) β SELECT Retrieve and transform data
DCL (Data Control Language) β GRANT, REVOKE Manage permissions
TCL (Transaction Control Language) β BEGIN, COMMIT, ROLLBACK Control multi-statement transactionsIn practice, most SQL work involves SELECT (reading data) and DML (writing data).
How Relational Databases Work
A relational database stores data in tables β a structure with named columns and typed rows. Tables are related to each other through shared columns called foreign keys.
customers table: customer_id | name | email | city ββββββββββββββββββββββββββββββββββββββββββββββββββββββ 1 | Alice Smith | alice@example.com | London 2 | Bob Chen | bob@example.com | New York
orders table: order_id | customer_id | amount | status ββββββββββββββββββββββββββββββββββββββββββ 1001 | 1 | 149.99 | completed 1002 | 1 | 79.50 | completed 1003 | 2 | 249.00 | pendingThe customer_id column in orders is a foreign key that references customers.customer_id. This relationship lets you ask: βWhat are all of Aliceβs orders?β without duplicating her contact information in every order row.
Your First SQL Queries
Retrieve all rows from a table:
SELECT * FROM customers;Retrieve specific columns:
SELECT name, email FROM customers;Filter rows with WHERE:
SELECT name, emailFROM customersWHERE city = 'London';Join two tables:
SELECT c.name, o.amount, o.statusFROM customers cJOIN orders o ON c.customer_id = o.customer_idWHERE o.status = 'completed';Aggregate data:
SELECT customer_id, COUNT(*) AS order_count, SUM(amount) AS total_spentFROM ordersGROUP BY customer_idORDER BY total_spent DESC;SQL Dialects
The SQL standard defines core behavior, but each database adds its own extensions. The major databases and their common contexts:
| Database | Typical Use |
|---|---|
| PostgreSQL | Production apps, data warehousing, analytics |
| MySQL / MariaDB | Web applications, LAMP stack |
| SQLite | Mobile apps, embedded databases, local development |
| SQL Server | Microsoft ecosystem, enterprise |
| BigQuery | Google Cloud data warehouse |
| Snowflake | Cloud data warehouse |
| Redshift | AWS data warehouse |
| DuckDB | In-process analytics, local analysis |
Core SQL β SELECT, JOIN, GROUP BY, WHERE β is nearly identical across all of them. Edge cases (date functions, string functions, window function syntax, pagination) differ.
Relational vs NoSQL
SQL databases enforce a schema β you define column names and types upfront, and the database ensures data conforms to that structure. This is called relational or structured data.
NoSQL databases (MongoDB, Redis, Cassandra) store data without a fixed schema. Theyβre optimized for different access patterns and scale differently. Neither is universally better β they solve different problems.
Most production data systems use SQL for analytical workloads regardless of what the operational database is, because SQLβs expressive power for aggregation, joins, and complex filtering is hard to match.
How SQL Gets Executed
When you run a SQL query, the database goes through several steps:
1. Parse β check the SQL syntax is valid2. Analyze β resolve table and column names3. Plan β determine the most efficient execution strategy4. Execute β actually retrieve or modify the data5. Return β send results back to the clientThe βplanβ step (query optimization) is where the database decides whether to use an index, which join algorithm to apply, and in what order to process tables. Understanding this helps you write faster queries.