SQL Data Types
Every column in a SQL table has a data type that defines what kind of value it can hold and how the database stores and processes it. Choosing the right data type matters for correctness (you can’t store a date in a text column without losing semantics), storage efficiency, and query performance.
Numeric Types
Integers
For whole numbers (no decimal places):
SMALLINT -- 2 bytes, range: -32,768 to 32,767INTEGER / INT -- 4 bytes, range: -2,147,483,648 to 2,147,483,647BIGINT -- 8 bytes, range: ~-9.2 trillion to 9.2 trillionAuto-incrementing integer primary keys:
-- PostgreSQLid SERIAL PRIMARY KEY -- 4-byte auto-incrementid BIGSERIAL PRIMARY KEY -- 8-byte auto-increment
-- MySQLid INT AUTO_INCREMENT PRIMARY KEY
-- SQL Standard (PostgreSQL 10+, SQL Server)id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEYDecimal / Fixed-Point Numbers
For exact decimal arithmetic (money, tax rates, measurements):
NUMERIC(precision, scale) -- exact; NUMERIC(10, 2) = up to 8 digits + 2 decimal placesDECIMAL(precision, scale) -- alias for NUMERIC in most databases
-- Examplesprice NUMERIC(10, 2) -- 99999999.99 maxtax_rate NUMERIC(5, 4) -- 0.0825 (8.25%)weight_kg NUMERIC(8, 3) -- 12345.678Always use NUMERIC/DECIMAL for money. Never use FLOAT or REAL for currency — floating-point representation causes rounding errors.
Floating-Point Numbers
For scientific measurements where approximate precision is acceptable:
REAL / FLOAT4 -- 4 bytes, ~6 decimal digits of precisionDOUBLE PRECISION / FLOAT8 -- 8 bytes, ~15 decimal digits
-- Fine for: latitude/longitude, scientific measurements, ML featureslat DOUBLE PRECISIONlon DOUBLE PRECISIONscore REALCharacter Types
CHAR(n) -- Fixed-length: always stores exactly n characters (padded with spaces)VARCHAR(n) -- Variable-length: stores up to n charactersTEXT -- Unlimited length (PostgreSQL, MySQL); use instead of VARCHAR(MAX)When to use each:
VARCHAR(n)with a meaningful limit: usernames (50), email (320), phone (20)TEXTwhen length is truly unbounded: blog content, descriptions, JSON stored as textCHAR(n)rarely: ISO country codes (CHAR(2)), fixed-format codes where exact length matters
username VARCHAR(50) NOT NULLemail VARCHAR(320) UNIQUE NOT NULLphone VARCHAR(20)bio TEXTcountry_code CHAR(2) -- 'US', 'UK', 'DE'Date and Time Types
DATE -- Calendar date: 2025-06-15TIME -- Time of day: 14:30:00TIMESTAMP -- Date + time, no timezone: 2025-06-15 14:30:00TIMESTAMPTZ -- Date + time WITH timezone (UTC internally)INTERVAL -- Duration: '3 days', '2 hours 30 minutes'Always prefer TIMESTAMPTZ over TIMESTAMP for timestamps in application data. It stores UTC and converts to local time on output — you never get ambiguous times during DST transitions.
created_at TIMESTAMPTZ DEFAULT NOW()updated_at TIMESTAMPTZ DEFAULT NOW()order_date DATEevent_time TIMEduration INTERVALDate arithmetic:
-- PostgreSQLSELECT NOW() + INTERVAL '30 days';SELECT created_at + INTERVAL '7 days' AS expires_at FROM trials;SELECT AGE(NOW(), birth_date) AS age FROM customers;SELECT EXTRACT(YEAR FROM created_at) AS year FROM orders;SELECT DATE_TRUNC('month', created_at) AS month FROM orders;
-- MySQLSELECT DATE_ADD(created_at, INTERVAL 30 DAY) FROM trials;SELECT YEAR(created_at) FROM orders;Boolean Type
BOOLEAN -- TRUE, FALSE, or NULL
is_active BOOLEAN DEFAULT TRUEis_verified BOOLEAN DEFAULT FALSEis_deleted BOOLEAN DEFAULT FALSEIn PostgreSQL, you can write TRUE/FALSE or 't'/'f' or 1/0. In MySQL, BOOLEAN is an alias for TINYINT(1) — store 1 for true, 0 for false.
WHERE is_active = TRUEWHERE is_deleted = FALSEWHERE NOT is_verified -- negationJSON and JSONB (PostgreSQL)
PostgreSQL supports JSON natively, making it a hybrid relational + document store:
-- JSON: stores text, no indexingdata JSON
-- JSONB: binary storage, indexable, faster for queries (preferred)metadata JSONB-- Insert JSONINSERT INTO products (name, metadata)VALUES ('Keyboard', '{"color": "black", "switch": "blue", "wireless": true}');
-- Query JSON fieldsSELECT name, metadata->>'color' AS color FROM products;SELECT * FROM products WHERE metadata->>'wireless' = 'true';SELECT * FROM products WHERE metadata @> '{"switch": "blue"}';
-- Index a JSON fieldCREATE INDEX idx_products_color ON products((metadata->>'color'));
-- GIN index for fast containment queriesCREATE INDEX idx_products_meta ON products USING GIN (metadata);Arrays (PostgreSQL)
PostgreSQL supports arrays as column values:
-- Array column definitiontags TEXT[]scores INTEGER[]
-- InsertINSERT INTO articles (title, tags) VALUES ('SQL Guide', ARRAY['sql', 'tutorial', 'database']);
-- QuerySELECT title FROM articles WHERE 'sql' = ANY(tags);SELECT title FROM articles WHERE tags @> ARRAY['sql', 'tutorial'];UUID
Universally Unique Identifiers — useful as primary keys for distributed systems where you need unique IDs without a central sequence:
id UUID DEFAULT gen_random_uuid() PRIMARY KEY -- PostgreSQL 13+id UUID DEFAULT uuid_generate_v4() PRIMARY KEY -- requires uuid-ossp extensionUUIDs are 128-bit values (16 bytes vs 4-8 bytes for integers). They’re larger and slightly slower to index, but useful when:
- Merging data from multiple sources
- Exposing IDs in public APIs (sequential integers expose row counts)
- Generating IDs client-side before database insert
Type Casting
Convert between types explicitly:
-- PostgreSQL cast syntaxSELECT '2025-06-15'::DATESELECT '149.99'::NUMERICSELECT 42::TEXT
-- ANSI SQL CAST function (works everywhere)SELECT CAST('2025-06-15' AS DATE)SELECT CAST(price AS TEXT)SELECT CAST('42' AS INTEGER)
-- COALESCE with type consistencySELECT COALESCE(amount, 0::NUMERIC) FROM orders;Choosing the Right Type
Whole numbers (IDs, counts): INTEGER or BIGINTMoney, prices, rates: NUMERIC(p, s)Scientific values (approx OK): DOUBLE PRECISIONFixed-length codes ('US', 'EUR'): CHAR(n)Variable-length strings: VARCHAR(n) or TEXTDates: DATETimestamps with timezone: TIMESTAMPTZFlags, booleans: BOOLEANSemi-structured data: JSONB (PostgreSQL) or JSONUnique IDs across systems: UUIDThe most common mistakes: using FLOAT for money, using TEXT for everything when a specific type would add validation, and using TIMESTAMP instead of TIMESTAMPTZ for application data that spans timezones.