Excel Notebook
Excel is the tool most Taiwan offices use every day. Even if you mainly work in Python or SQL, Excel is the shared language between you and your non-technical colleagues. This notebook covers the functions and workflows that show up most often in DA work.
For the same things in code, see Pandas Notebook and SQL Notes.
All examples use a parking system dataset with these columns: station_code, station_name, city, license_plate, entry_time, exit_time, parking_type, amount, payment_method.
Keyboard Shortcuts Worth Knowing
| Shortcut | Action |
|---|---|
Ctrl + T |
Turn a range into a Table (formulas auto-extend) |
Ctrl + Shift + L |
Turn filters on / off |
| `Ctrl + `` | Show every formula on the sheet |
Alt + = |
Auto-sum the selected range |
F4 |
Switch between absolute and relative reference (A1 → $A$1) |
Ctrl + D |
Fill down |
Ctrl + Shift + End |
Select to the last used cell |
Tables vs Ranges
Always turn raw data into a Table (Ctrl + T) before you work on it.
| Range | Table | |
|---|---|---|
| Formula fill | Manual | Auto-extends to new rows |
| Column reference | B2:B1000 |
Table1[amount] — readable |
| Filter | Set up by hand | Built in |
| Pivot Table source | Fixed range | Auto-updates |
Refer to a table column in formulas: =SUM(Table1[amount]) instead of =SUM(B2:B1000).
Lookup Functions
XLOOKUP — Modern Standard
=XLOOKUP(lookup_value, lookup_array, return_array, [if_not_found])
-- find station name by station code
=XLOOKUP(A2, station_table[station_code], station_table[station_name], "Not Found")
-- return several columns at once (spills automatically)
=XLOOKUP(A2, station_table[station_code], station_table[[station_name]:[city]])
XLOOKUP replaces VLOOKUP entirely. It searches in any direction, handles missing values cleanly, and can return several columns. Use it by default.
VLOOKUP — Legacy (Still Common)
=VLOOKUP(lookup_value, table_array, col_index_num, [exact_match])
-- find city for a station code (city is the 3rd column of the lookup table)
=VLOOKUP(A2, $E$2:$G$100, 3, FALSE)
Always use FALSE (exact match) for DA work. The TRUE (approximate match) is for sorted ranges only and causes hidden bugs.
VLOOKUP limits — use XLOOKUP instead when you can:
- The lookup column has to be the leftmost column of the table.
- No built-in "not found" fallback (you have to wrap it in IFERROR).
- Returns only one column.
INDEX / MATCH — Flexible Lookup
=INDEX(return_range, MATCH(lookup_value, lookup_range, 0))
-- same as the VLOOKUP example above, but the lookup column can sit anywhere
=INDEX(station_table[city], MATCH(A2, station_table[station_code], 0))
Use INDEX/MATCH when the column you want is to the left of the lookup column. VLOOKUP cannot do this.
Conditional Aggregation
SUMIF / SUMIFS
=SUMIF(range, criteria, sum_range)
=SUMIFS(sum_range, criteria_range1, criteria1, criteria_range2, criteria2, ...)
-- total revenue for station S001
=SUMIF(Table1[station_code], "S001", Table1[amount])
-- total revenue for station S001, hourly parking only
=SUMIFS(Table1[amount], Table1[station_code], "S001", Table1[parking_type], "hourly")
-- revenue for a station coming from cell A2
=SUMIFS(Table1[amount], Table1[station_code], A2, Table1[parking_type], "hourly")
-- revenue greater than 500
=SUMIFS(Table1[amount], Table1[amount], ">500")
SUMIF takes one condition. SUMIFS takes several. When in doubt, use SUMIFS — it works for the one-condition case too.
COUNTIF / COUNTIFS
-- count visits to station S001
=COUNTIF(Table1[station_code], "S001")
-- count hourly visits to station S001
=COUNTIFS(Table1[station_code], "S001", Table1[parking_type], "hourly")
-- count amounts above 500
=COUNTIF(Table1[amount], ">500")
AVERAGEIF / AVERAGEIFS
-- average amount for credit card payments
=AVERAGEIF(Table1[payment_method], "Credit_Card", Table1[amount])
-- average amount for credit card, hourly only
=AVERAGEIFS(Table1[amount], Table1[payment_method], "Credit_Card", Table1[parking_type], "hourly")
Logical Functions
IF
=IF(logical_test, value_if_true, value_if_false)
-- flag high-value transactions
=IF(B2>500, "High", "Normal")
-- nested IF: three tiers
=IF(B2>1000, "High", IF(B2>500, "Medium", "Low"))
Avoid deep IF nesting (more than 2 levels). Use IFS or SWITCH instead.
IFS — Many Conditions, No Nesting
=IFS(condition1, value1, condition2, value2, ..., TRUE, default)
=IFS(
B2 > 1000, "High",
B2 > 500, "Medium",
B2 > 100, "Low",
TRUE, "Minimal"
)
The final TRUE, default is the else clause. Always include it. Without it, IFS returns an error when nothing matches.
SWITCH — Match Against Fixed Values
=SWITCH(expression, value1, result1, value2, result2, ..., default)
-- translate payment method codes
=SWITCH(C2,
"CC", "Credit Card",
"MP", "Mobile Pay",
"CASH","Cash",
"Other"
)
Use SWITCH when you check a value against a list of fixed values. Use IFS when you check different conditions.
Text Functions
=LEFT(text, n) -- first n characters
=RIGHT(text, n) -- last n characters
=MID(text, start, n) -- n characters starting at position start
=LEN(text) -- character count
=TRIM(text) -- remove spaces at the start and end
=UPPER(text) -- uppercase
=LOWER(text) -- lowercase
=PROPER(text) -- Title Case
=SUBSTITUTE(text, old, new) -- replace every match
=FIND(find_text, within) -- position of a substring (case-sensitive)
=CONCAT(text1, text2) -- join strings (or use &)
=TEXT(value, format) -- format a number or date as text
Common Patterns
-- get the first 3 characters of a license plate
=LEFT(A2, 3)
-- remove dashes from license plate "ABC-1234" → "ABC1234"
=SUBSTITUTE(A2, "-", "")
-- combine station code and name: "S001 - North Station"
=A2 & " - " & B2
-- format a date as "2025-06"
=TEXT(C2, "YYYY-MM")
-- check if station name contains "North"
=ISNUMBER(FIND("North", B2))
Date Functions
=TODAY() -- today's date
=NOW() -- the current date and time
=YEAR(date) -- pull out the year
=MONTH(date) -- pull out the month (1–12)
=DAY(date) -- pull out the day
=WEEKDAY(date, 2) -- day of the week (2 = Mon=1, Sun=7)
=TEXT(date, "DDDD") -- weekday name: "Monday"
=EOMONTH(date, 0) -- last day of the same month
=EOMONTH(date, -1) + 1 -- first day of the same month
=DATEDIF(start, end, "D") -- days between two dates
=DATEDIF(start, end, "M") -- complete months between two dates
=NETWORKDAYS(start, end) -- working days between two dates (no weekends)
Common Patterns
-- parking duration in minutes
=(D2 - C2) * 24 * 60
-- days since entry
=TODAY() - C2
-- format as a year-month label for grouping
=TEXT(C2, "YYYY-MM")
-- which quarter
="Q" & ROUNDUP(MONTH(C2) / 3, 0)
-- first day of the month (for monthly grouping)
=DATE(YEAR(C2), MONTH(C2), 1)
Error Handling
=IFERROR(formula, value_if_error) -- catch any error
=IFNA(formula, value_if_na) -- catch only #N/A (lookup not found)
=ISBLANK(cell) -- TRUE if the cell is empty
=ISNUMBER(value) -- TRUE if the value is numeric
-- VLOOKUP with a fallback instead of #N/A
=IFERROR(VLOOKUP(A2, $E$2:$G$100, 3, FALSE), "Not Found")
-- safer: use XLOOKUP with a built-in fallback
=XLOOKUP(A2, station_table[station_code], station_table[city], "Not Found")
Use IFNA instead of IFERROR when you only want to catch missing lookups. IFERROR quietly hides every error, including formula mistakes.
Pivot Tables
Pivot tables are the fastest way to summarize tabular data. Click any cell in your Table, then: Insert → Pivot Table.
Basic Setup
Drag fields into the four areas:
| Area | Purpose | Example |
|---|---|---|
| Rows | Group by | station_code |
| Columns | Cross-tabulate | payment_method |
| Values | Aggregate | amount → Sum |
| Filters | Slice the whole table | city |
Value Field Settings
Right-click a value → Value Field Settings to change the aggregation:
| Setting | Equivalent |
|---|---|
| Sum | SUM() |
| Count | COUNT(*) |
| Average | AVG() |
| % of Grand Total | Share of the total |
| % of Row Total | Share within the row |
| Running Total In | Running sum |
Grouping Dates
Right-click a date field in Rows → Group → pick Year / Quarter / Month. This creates a date hierarchy without needing helper columns.
Slicers
Insert → Slicer adds a panel of filter buttons. One slicer can drive several pivot tables on the same sheet — right-click the slicer → Report Connections.
Conditional Formatting
Select a range → Home → Conditional Formatting.
| Use case | Rule type |
|---|---|
| Highlight top 10 values | Top/Bottom Rules → Top 10 Items |
| Color scale by value | Color Scales |
| In-cell bar chart | Data Bars |
| Flag values above a threshold | Highlight Cell Rules → Greater Than |
| Duplicate values | Highlight Cell Rules → Duplicate Values |
For DA dashboards: use Data Bars on amount columns and Color Scales on percentage columns. Both are faster to read than raw numbers.
Common DA Workflows
1. Monthly Revenue Summary
1. Turn the data into a Table (Ctrl+T)
2. Insert a Pivot Table
3. Rows: entry_time (grouped by Month)
4. Values: amount → Sum, parking_id → Count
5. Add a slicer on city or parking_type
2. VLOOKUP / XLOOKUP to Add Information
Scenario: the parking transaction table has station_code but not city.
Reference table: station_code | station_name | city
=XLOOKUP([@station_code], station_table[station_code], station_table[city], "Unknown")
Add this as a calculated column in the transactions table. The @ prefix means "this row's value", and the formula auto-fills down the whole table.
3. KPI Dashboard with SUMIFS
Build a summary table where each row is a station and each column is a metric:
-- in cell B2 (station S001 total revenue)
=SUMIFS(transactions[amount], transactions[station_code], $A2)
-- in cell C2 (station S001 hourly revenue)
=SUMIFS(transactions[amount], transactions[station_code], $A2, transactions[parking_type], "hourly")
-- in cell D2 (visit count)
=COUNTIFS(transactions[station_code], $A2)
Lock the station column ($A2) so you can copy the formula across the metric columns.
4. Period-over-Period Comparison
-- current month revenue (where the helper column "YM" = TEXT(entry_time,"YYYY-MM"))
=SUMIF(Table1[YM], TEXT(TODAY(),"YYYY-MM"), Table1[amount])
-- previous month
=SUMIF(Table1[YM], TEXT(EDATE(TODAY(),-1),"YYYY-MM"), Table1[amount])
-- MoM change %
=(current - previous) / previous
5. De-duplication Check
-- flag duplicate license plates
=COUNTIF($A$2:$A$1000, A2)
-- rows where the count > 1 are duplicates
-- filter on this column and check
Or use: Data → Remove Duplicates to delete duplicates in place (destructive — work on a copy).
6. Data Validation
Data → Data Validation controls what can go in a cell.
Allow: List
Source: hourly,monthly,daily
This adds a dropdown, which prevents typos in category columns. Useful when you build input templates for non-technical colleagues.
Excel vs Pandas — Quick Reference
| Task | Excel | Pandas |
|---|---|---|
| Filter rows | AutoFilter or a boolean formula | df[df['col'] == val] |
| VLOOKUP / XLOOKUP | =XLOOKUP(...) |
pd.merge() |
| SUMIFS | =SUMIFS(...) |
df.groupby().agg() |
| Pivot table | Insert → Pivot Table | pd.pivot_table() |
| Remove duplicates | Data → Remove Duplicates | df.drop_duplicates() |
| Date format | TEXT(date, "YYYY-MM") |
df['col'].dt.to_period('M') |
| Conditional column | =IFS(...) |
np.select(...) |
| Running total | Value Field → Running Total | df['col'].cumsum() |
Use Excel when you share with non-technical colleagues or build input templates. Use Pandas when the dataset is large, the logic is complex, or the process has to run on its own.