SQL Notebook
Basic Query & Sorting
SELECT
SELECT license_plate, entry_time
FROM parking_history;
* returns every column.
Column Aliases
SELECT parking_id,
parked_hours * 40 AS amount
FROM parking_history;
When you compute something, use AS to give it a clear name.
Removing Duplicates
SELECT DISTINCT parking_type
FROM parking_history;
DISTINCT makes sure the same row does not appear more than once.
ORDER BY Clause
SELECT *
FROM parking_history
ORDER BY entry_time ASC;
ORDER BY sorts the results. ASC means earliest first.
Filtering & Conditions
WHERE Clause
SELECT *
FROM parking_history
WHERE license_plate = 'AAA-1111';
WHERE filters the rows. Here, it returns the details for the license plate AAA-1111.
Conditions with AND, OR, NOT
SELECT license_plate, payment_method, amount, is_electric
FROM payment_history
WHERE (payment_method = 'Credit_Card' AND amount > 5000)
OR (is_electric = 1);
- AND: all conditions must be true.
- OR: at least one condition must be true.
- NOT: flips a condition.
SQL checks these in this order: NOT → AND → OR.
Always use brackets () for complex logic so you don't make a mistake.
Not Equal (<>)
SELECT parking_id, parking_type, entry_time, exit_time
FROM parking_history
WHERE parking_type <> 'monthly';
<> means "not equal". This returns every non-monthly parking record. Same as WHERE NOT parking_type = 'monthly'.
<> is shorter and very common.
Check for Missing Values (IS NULL)
SELECT station_code, license_plate, entry_time
FROM parking_history
WHERE exit_time IS NULL;
- IS NULL finds rows with no value (the car is still parked).
- Never use
= NULLor<> NULL— they don't work. - Use
IS NULLorIS NOT NULLonly.
Date Range (BETWEEN or >= / <)
SELECT COUNT(*) AS total_cars
FROM parking_history
WHERE entry_time >= '2025-01-01'
AND entry_time < '2026-01-01';
- The best way to cover a full year:
>= start AND < next_year_start. - Safe even with timestamps (hours, minutes, seconds).
- BETWEEN also works, but it can miss records on Dec 31 if the time is not exactly 00:00:00.
IN (Match a List)
SELECT station_code, station_name, open_date
FROM management_table
WHERE strftime('%Y', open_date) IN ('2025', '2024', '2023');
IN checks if the value is in the list. Cleaner than many ORs.
Pattern Matching (LIKE)
SELECT station_code, license_plate
FROM parking_history
WHERE license_plate LIKE 'E%';
LIKE finds text patterns.
- % = any characters (zero or more)
- _ = exactly one character
Aggregation & Grouping
Aggregate Functions
SELECT station_code,
COUNT(*) AS vehicle_count,
AVG(TIMESTAMPDIFF(MINUTE, entry_time, exit_time)) AS avg_duration_mins,
MAX(TIMESTAMPDIFF(MINUTE, entry_time, exit_time)) AS max_duration_mins,
MIN(TIMESTAMPDIFF(MINUTE, entry_time, exit_time)) AS min_duration_mins
FROM parking_history
WHERE exit_time IS NOT NULL
GROUP BY station_code;
- Per group:
COUNT(*)→ row count,AVG()→ average,MAX/MIN→ highest / lowest value. - Use WHERE first to filter rows before grouping.
Basic Grouping (GROUP BY)
SELECT station_code, COUNT(*) AS total_entries
FROM parking_history
GROUP BY station_code
ORDER BY total_entries DESC;
Group by Several Columns
SELECT city, station_type, COUNT(*) AS station_count
FROM management_table
GROUP BY city, station_type
ORDER BY city, station_type;
GROUP BY city, then station_type.
Grouping via Expressions
SELECT HOUR(entry_time) AS entry_hour,
COUNT(*) AS traffic_volume
FROM parking_history
GROUP BY HOUR(entry_time)
ORDER BY entry_hour;
GROUP BY HOUR(entry_time) → hourly buckets (0–23). You can analyze trends without adding extra columns.
Filter Groups (HAVING)
SELECT station_code, COUNT(*) AS total_visits
FROM parking_history
GROUP BY station_code
HAVING COUNT(*) > 1000;
- HAVING filters AFTER grouping.
- WHERE filters BEFORE (raw rows).
- You cannot use HAVING on non-aggregates.
Add Totals (ROLLUP)
SELECT payment_method, COUNT(*) AS tx_count
FROM payment_history
GROUP BY payment_method WITH ROLLUP;
WITH ROLLUP adds subtotal rows + a grand total (NULL row). Quick reports without extra queries.
Table Joins
FROM Clause & Table Aliases
SELECT ph.station_code, ph.station_name, py.amount
FROM parking_history ph
JOIN payment_history py
ON py.parking_id = ph.parking_id;
FROM tells the query which table to read. Short aliases (like ph, py) make the query shorter and easier to read, especially when you join tables.
Inner Join (Most Common)
SELECT ph.license_plate, ph.entry_time, mt.bd_name, mt.city
FROM parking_history ph
INNER JOIN management_table mt
ON ph.station_code = mt.station_code;
INNER JOIN keeps only rows that have a match in both tables.
Left Join (Keep All from the Left Table)
SELECT mt.station_name,
COALESCE(SUM(pay.amount_received), 0) AS total_revenue
FROM management_table mt
LEFT JOIN payment_history pay
ON mt.station_code = pay.station_code
AND DATE(pay.paid_time) = CURRENT_DATE
GROUP BY mt.station_name;
Joining Three Tables
SELECT ph.license_plate, mt.area, pay.payment_method, pay.paid_time
FROM parking_history ph
INNER JOIN management_table mt
ON ph.station_code = mt.station_code
INNER JOIN payment_history pay
ON ph.station_code = pay.station_code
AND ph.entry_time = pay.entry_time;
Start with parking_history → add station info → add payment info.
Chain Several Outer Joins
SELECT mt.station_name, ph.entry_time, pay.amount_received
FROM management_table mt
LEFT JOIN parking_history ph
ON mt.station_code = ph.station_code
LEFT JOIN payment_history pay
ON ph.station_code = pay.station_code
AND ph.entry_time = pay.entry_time;
Self-Join
SELECT t1.station_name AS station_a, t1.bd_team AS team_a,
t2.station_name AS station_b, t2.bd_team AS team_b, t1.area
FROM management_table t1
INNER JOIN management_table t2
ON t1.area = t2.area
AND t1.bd_team <> t2.bd_team;
Use two aliases (t1, t2) for the same table. Find pairs of stations in the same area but different teams.
Cross Join (Every Combo)
SELECT mt.station_name, d.day_num
FROM management_table mt
CROSS JOIN (
SELECT 1 AS day_num UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
) d
ORDER BY mt.station_name, d.day_num;
Set Operations
The UNION Operator
SELECT station_code FROM Parking_history
UNION
SELECT station_code FROM Payment_history;
UNION stacks two query results and removes duplicates. Rules: same number of columns, compatible data types, column names from the first query.
UNION ALL (Combine + Keep All Rows)
SELECT *
FROM (
SELECT entry_time AS event_time, 'Vehicle Entry' AS event_type
FROM parking_history
WHERE station_code = 'ST001'
UNION ALL
SELECT paid_time AS event_time, 'Payment Made' AS event_type
FROM payment_history
WHERE station_code = 'ST001'
)
ORDER BY event_time;
UNION ALL keeps every row (faster, no de-duplication).
INTERSECT (Rows in Both Queries)
SELECT station_code, entry_time
FROM parking_history
INTERSECT
SELECT station_code, entry_time
FROM payment_history;
INTERSECT returns only rows that appear in BOTH results.
EXCEPT (Rows in First but Not Second)
SELECT station_code, entry_time
FROM parking_history
EXCEPT
SELECT station_code, entry_time
FROM payment_history;
EXCEPT returns rows that are only in the FIRST query.
Subqueries
Basic Subquery
SELECT station_code, station_name, open_date
FROM management_table
WHERE open_date = (
SELECT MAX(open_date)
FROM management_table
);
- The inner query runs first to find the latest open_date.
- The outer query gets the stations that opened on that date.
Scalar Subquery in SELECT
SELECT mt.station_name,
mt.city,
(SELECT COUNT(*)
FROM parking_history ph
WHERE ph.station_code = mt.station_code) AS total_visits
FROM management_table mt;
The subquery returns one value per row.
IN Subquery (Filter by List)
SELECT mt.station_name, ph.license_plate, ph.entry_time
FROM parking_history ph
JOIN management_table mt
ON mt.station_code = ph.station_code
WHERE ph.station_code IN (
SELECT station_code
FROM management_table
WHERE area = 'Xinyi Dist'
);
Non-correlated (the subquery runs once). The subquery → list of station_codes in Xinyi.
Derived Table (Subquery in FROM)
SELECT station_code, day_of_record, daily_total
FROM (
SELECT station_code,
DATE(paid_time) AS day_of_record,
SUM(amount_received) AS daily_total
FROM payment_history
GROUP BY station_code, DATE(paid_time)
) AS daily_stats
WHERE daily_total > 10000;
- Inner → daily revenue per station.
- Outer → keep only the high-revenue days (> 10,000).
NOT EXISTS (Find Unmatched Rows)
SELECT ph.license_plate, ph.station_code, ph.entry_time
FROM parking_history ph
WHERE NOT EXISTS (
SELECT 1
FROM payment_history pay
WHERE pay.station_code = ph.station_code
AND pay.entry_time = ph.entry_time
);
- Correlated subquery: it runs once for every row in the outer query (
parking_history). - If NO matching record is found in the inner query (
payment_history) → it returns TRUE and the row is kept. - Use case: finding unpaid parking sessions.
SELECT 1is used because the actual data does not matter — only whether the row exists.
Conditional Logic
Simple CASE (Exact Match Mapping)
SELECT station_name,
CASE station_type
WHEN 1 THEN 'Indoor Garage'
WHEN 2 THEN 'Outdoor Lot'
WHEN 3 THEN 'Street Parking'
ELSE 'Unknown'
END AS type_description
FROM management_table;
Match exact values only.
Searched CASE (Ranges & Complex Rules)
SELECT ticket_id,
TIMESTAMPDIFF(HOUR, entry_time, exit_time) AS duration_hours,
CASE
WHEN TIMESTAMPDIFF(HOUR, entry_time, exit_time) < 2 THEN 'Short-term'
WHEN TIMESTAMPDIFF(HOUR, entry_time, exit_time) BETWEEN 2 AND 8 THEN 'Medium'
ELSE 'Long-term'
END AS duration_category
FROM parking_history
WHERE exit_time IS NOT NULL;
You can check any condition (ranges, >, <, AND/OR).
LIMIT & OFFSET
LIMIT (Top N Rows)
SELECT license_plate, entry_time
FROM parking_history
ORDER BY entry_time DESC
LIMIT 10;
LIMIT returns only the first N rows. Always pair it with ORDER BY, otherwise the "top" rows are random.
OFFSET (Pagination)
SELECT license_plate, entry_time
FROM parking_history
ORDER BY entry_time DESC
LIMIT 10 OFFSET 20;
OFFSET skips the first N rows. Page 1 → OFFSET 0, Page 2 → OFFSET 10, Page 3 → OFFSET 20.
SQL Execution Order
SQL clauses are written in one order but run in a different order:
| Step | Clause | What it does |
|---|---|---|
| 1 | FROM / JOIN | Load tables and combine rows |
| 2 | WHERE | Filter raw rows |
| 3 | GROUP BY | Group the remaining rows |
| 4 | HAVING | Filter groups |
| 5 | SELECT | Pick columns, run expressions |
| 6 | DISTINCT | Remove duplicate rows |
| 7 | ORDER BY | Sort results |
| 8 | LIMIT / OFFSET | Cut to the final row count |
This is why: - You cannot use a SELECT alias in WHERE (the alias does not exist yet at step 2). - You cannot use aggregate functions in WHERE — use HAVING instead. - You can use a SELECT alias in ORDER BY (the alias exists by then).
CTEs (Common Table Expressions)
Basic CTE
WITH station_revenue AS (
SELECT station_code,
SUM(amount_received) AS total_revenue
FROM payment_history
GROUP BY station_code
)
SELECT mt.station_name, sr.total_revenue
FROM management_table mt
JOIN station_revenue sr
ON mt.station_code = sr.station_code
ORDER BY sr.total_revenue DESC;
WITH names a temporary result set. The outer query reads it like a normal table. Prefer CTEs over nested subqueries — same logic, much easier to read.
Several CTEs
WITH active_stations AS (
SELECT DISTINCT station_code
FROM parking_history
WHERE entry_time >= '2025-01-01'
),
station_revenue AS (
SELECT station_code,
SUM(amount_received) AS total_revenue
FROM payment_history
GROUP BY station_code
)
SELECT mt.station_name, sr.total_revenue
FROM management_table mt
JOIN active_stations a ON mt.station_code = a.station_code
JOIN station_revenue sr ON mt.station_code = sr.station_code;
Chain several CTEs with commas. Each one can use the ones before it.
Window Functions
Window functions compute values across a set of rows without collapsing them into one row (unlike GROUP BY).
ROW_NUMBER (Rank Within a Group)
SELECT station_code, license_plate, entry_time,
ROW_NUMBER() OVER (
PARTITION BY station_code
ORDER BY entry_time DESC
) AS visit_rank
FROM parking_history;
- PARTITION BY: reset the counter for each group (like GROUP BY, but you keep all the rows).
- ORDER BY: sets the order inside each partition.
- ROW_NUMBER always gives a unique number, even on ties.
RANK vs DENSE_RANK
SELECT station_code,
SUM(amount_received) AS revenue,
RANK() OVER (ORDER BY SUM(amount_received) DESC) AS rank,
DENSE_RANK() OVER (ORDER BY SUM(amount_received) DESC) AS dense_rank
FROM payment_history
GROUP BY station_code;
- RANK: ties share the same rank, the next rank skips numbers (1, 2, 2, 4).
- DENSE_RANK: ties share the same rank, the next rank does not skip (1, 2, 2, 3).
LAG / LEAD (Compare to Previous or Next Row)
SELECT station_code,
DATE(paid_time) AS day,
SUM(amount_received) AS daily_revenue,
LAG(SUM(amount_received)) OVER (
PARTITION BY station_code
ORDER BY DATE(paid_time)
) AS prev_day_revenue,
LEAD(SUM(amount_received)) OVER (
PARTITION BY station_code
ORDER BY DATE(paid_time)
) AS next_day_revenue
FROM payment_history
GROUP BY station_code, DATE(paid_time);
- LAG: looks at the previous row's value.
- LEAD: looks at the next row's value.
- Useful for day-over-day or month-over-month comparisons.
Running Total (SUM OVER)
SELECT DATE(paid_time) AS day,
SUM(amount_received) AS daily_revenue,
SUM(SUM(amount_received)) OVER (
ORDER BY DATE(paid_time)
) AS running_total
FROM payment_history
GROUP BY DATE(paid_time)
ORDER BY day;
OVER (ORDER BY ...) with no PARTITION BY runs across the whole result set.
NULL Handling
COALESCE (First Non-NULL Value)
SELECT station_code,
license_plate,
COALESCE(exit_time, 'Still Parked') AS status
FROM parking_history;
COALESCE returns the first argument that is not NULL. Useful for replacing NULLs with a default.
-- Several fallbacks
SELECT COALESCE(phone_number, email, 'No contact') AS contact
FROM customer_table;
It checks left to right and returns the first value that is not NULL.
NULLIF (Return NULL on a Match)
SELECT station_code,
total_visits,
total_revenue / NULLIF(total_visits, 0) AS revenue_per_visit
FROM station_summary;
NULLIF(a, b) returns NULL if a = b, otherwise it returns a.
Classic use: prevent divide-by-zero errors.
String & Date Functions
String Functions
SELECT UPPER(station_name) AS name_upper, -- 'north lot' → 'NORTH LOT'
LOWER(station_name) AS name_lower, -- 'NORTH LOT' → 'north lot'
TRIM(station_name) AS name_trimmed, -- removes spaces at start and end
LENGTH(station_name) AS name_length, -- number of characters
SUBSTRING(license_plate, 1, 3) AS plate_prefix, -- first 3 chars: 'AAA'
CONCAT(city, ' - ', station_name) AS full_name -- join strings together
FROM management_table;
Date Functions
SELECT entry_time,
DATE(entry_time) AS date_only, -- '2025-06-15'
DATE_FORMAT(entry_time, '%Y-%m') AS year_month, -- '2025-06'
DATEDIFF(exit_time, entry_time) AS days_parked, -- difference in days
DATE_ADD(entry_time, INTERVAL 2 HOUR) AS two_hours_later,
DATE_SUB(entry_time, INTERVAL 1 DAY) AS day_before,
NOW() AS current_datetime,
CURDATE() AS today
FROM parking_history;
DATE()drops the time part.DATE_FORMAT()converts to any string format (%Y= year,%m= month,%d= day,%H= hour).DATEDIFF(end, start)returns the number of days between two dates.DATE_ADD / DATE_SUBshift a date by an interval (HOUR, DAY, MONTH, YEAR).
Type Conversion
CAST (Convert Data Types)
SELECT license_plate,
CAST(parked_hours AS DECIMAL(10, 2)) AS hours_decimal,
CAST(entry_time AS DATE) AS entry_date,
CAST(amount AS CHAR) AS amount_text
FROM parking_history;
CAST(value AS type) converts a value to the target type. Common target types: INT, DECIMAL(p, s), CHAR, DATE, DATETIME.
Practical Use: Fixing Messy Data
-- A string stored as a number → you cannot do math without CAST
SELECT station_code,
SUM(CAST(revenue_text AS DECIMAL(10, 2))) AS total_revenue
FROM raw_import_table
GROUP BY station_code;
Real-world data often comes in the wrong type (for example, numbers imported as text). CAST fixes this before you aggregate.
-- Safe date filtering when the date column is stored as text
SELECT *
FROM parking_history
WHERE CAST(entry_time AS DATE) = '2025-06-15';
CONVERT (MySQL Alternative)
SELECT CONVERT(amount, DECIMAL(10,2)) AS amount_decimal,
CONVERT(entry_time, DATE) AS entry_date
FROM parking_history;
CONVERT(value, type) is MySQL-specific. CAST is the SQL standard and works in most databases.
Pivot (Rows to Columns)
A pivot turns row values into column headers — the most common business report format.
Monthly Revenue by Station (CASE + GROUP BY)
SELECT station_code,
SUM(CASE WHEN MONTH(paid_time) = 1 THEN amount_received ELSE 0 END) AS jan,
SUM(CASE WHEN MONTH(paid_time) = 2 THEN amount_received ELSE 0 END) AS feb,
SUM(CASE WHEN MONTH(paid_time) = 3 THEN amount_received ELSE 0 END) AS mar,
SUM(CASE WHEN MONTH(paid_time) = 4 THEN amount_received ELSE 0 END) AS apr,
SUM(CASE WHEN MONTH(paid_time) = 5 THEN amount_received ELSE 0 END) AS may,
SUM(CASE WHEN MONTH(paid_time) = 6 THEN amount_received ELSE 0 END) AS jun,
SUM(CASE WHEN MONTH(paid_time) = 7 THEN amount_received ELSE 0 END) AS jul,
SUM(CASE WHEN MONTH(paid_time) = 8 THEN amount_received ELSE 0 END) AS aug,
SUM(CASE WHEN MONTH(paid_time) = 9 THEN amount_received ELSE 0 END) AS sep,
SUM(CASE WHEN MONTH(paid_time) = 10 THEN amount_received ELSE 0 END) AS oct,
SUM(CASE WHEN MONTH(paid_time) = 11 THEN amount_received ELSE 0 END) AS nov,
SUM(CASE WHEN MONTH(paid_time) = 12 THEN amount_received ELSE 0 END) AS dec
FROM payment_history
WHERE YEAR(paid_time) = 2025
GROUP BY station_code
ORDER BY station_code;
Pattern: one SUM(CASE WHEN ... THEN value ELSE 0 END) per column you want to create.
Payment Method Breakdown per Station
SELECT station_code,
SUM(CASE WHEN payment_method = 'Credit_Card' THEN 1 ELSE 0 END) AS credit_card_count,
SUM(CASE WHEN payment_method = 'Cash' THEN 1 ELSE 0 END) AS cash_count,
SUM(CASE WHEN payment_method = 'Mobile_Pay' THEN 1 ELSE 0 END) AS mobile_pay_count,
COUNT(*) AS total_count
FROM payment_history
GROUP BY station_code;
Use SUM(CASE WHEN ... THEN 1 ELSE 0 END) to count occurrences, or SUM(... THEN amount ...) to add up values.
% of Total (Pivot with Share)
SELECT station_code,
COUNT(*) AS total,
SUM(CASE WHEN payment_method = 'Credit_Card' THEN 1 ELSE 0 END) AS credit_card_count,
ROUND(
SUM(CASE WHEN payment_method = 'Credit_Card' THEN 1 ELSE 0 END)
/ COUNT(*) * 100, 1
) AS credit_card_pct
FROM payment_history
GROUP BY station_code;
Divide the pivoted count by COUNT(*) to get the percentage share — a standard business reporting pattern.
Ready to apply these building blocks to real problems? See SQL Patterns for reusable solutions to common analysis and interview questions. For window frames, recursive CTEs, grouping sets, and query performance, see SQL Advanced Notebook.