Excel Notebook
Excel is the most widely used data tool in Taiwan's workplace. Even if you primarily work in Python or SQL, Excel is the common language between analysts and non-technical stakeholders. This notebook covers the functions and workflows that appear most often in DA work.
For programmatic equivalents, see Pandas Notebook and SQL Notes.
All examples use a parking system dataset with columns: station_code, station_name, city, license_plate, entry_time, exit_time, parking_type, amount, payment_method.
Keyboard Shortcuts Worth Knowing
| Shortcut | Action |
|---|---|
Ctrl + T |
Convert range to Table (auto-expands formulas) |
Ctrl + Shift + L |
Toggle filters |
| `Ctrl + `` | Show all formulas |
Alt + = |
Auto-sum selected range |
F4 |
Toggle absolute/relative reference (A1 → $A$1) |
Ctrl + D |
Fill down |
Ctrl + Shift + End |
Select to last used cell |
Tables vs Ranges
Always convert raw data to a Table (Ctrl + T) before working with it.
| Range | Table | |
|---|---|---|
| Formula fill | Manual | Auto-extends to new rows |
| Column reference | B2:B1000 |
Table1[amount] — readable |
| Filter | Manual setup | Built-in |
| Pivot Table source | Fixed range | Auto-updates |
Reference a table column in formulas: =SUM(Table1[amount]) instead of =SUM(B2:B1000).
Lookup Functions
XLOOKUP — Modern Standard
=XLOOKUP(lookup_value, lookup_array, return_array, [if_not_found])
-- find station name by station code
=XLOOKUP(A2, station_table[station_code], station_table[station_name], "Not Found")
-- return multiple columns at once (spills automatically)
=XLOOKUP(A2, station_table[station_code], station_table[[station_name]:[city]])
XLOOKUP replaces VLOOKUP entirely — it searches any direction, handles missing values cleanly, and can return multiple columns. Use it by default.
VLOOKUP — Legacy (Still Common)
=VLOOKUP(lookup_value, table_array, col_index_num, [exact_match])
-- find city for a station code (city is the 3rd column of the lookup table)
=VLOOKUP(A2, $E$2:$G$100, 3, FALSE)
Always use FALSE (exact match) for DA work. The TRUE (approximate match) is for sorted ranges only and causes subtle bugs.
VLOOKUP limitations — use XLOOKUP instead when possible:
- Lookup column must be the leftmost column of the table
- No built-in "not found" fallback (requires wrapping in IFERROR)
- Returns only one column
INDEX / MATCH — Flexible Lookup
=INDEX(return_range, MATCH(lookup_value, lookup_range, 0))
-- same as VLOOKUP example above, but lookup column can be anywhere
=INDEX(station_table[city], MATCH(A2, station_table[station_code], 0))
Use INDEX/MATCH when the return column is to the left of the lookup column — VLOOKUP cannot do this.
Conditional Aggregation
SUMIF / SUMIFS
=SUMIF(range, criteria, sum_range)
=SUMIFS(sum_range, criteria_range1, criteria1, criteria_range2, criteria2, ...)
-- total revenue for station S001
=SUMIF(Table1[station_code], "S001", Table1[amount])
-- total revenue for station S001, hourly parking only
=SUMIFS(Table1[amount], Table1[station_code], "S001", Table1[parking_type], "hourly")
-- revenue for a station referenced by cell A2
=SUMIFS(Table1[amount], Table1[station_code], A2, Table1[parking_type], "hourly")
-- revenue greater than 500
=SUMIFS(Table1[amount], Table1[amount], ">500")
SUMIF is single-condition; SUMIFS handles multiple conditions. When in doubt, just use SUMIFS — it handles the single-condition case too.
COUNTIF / COUNTIFS
-- count visits to station S001
=COUNTIF(Table1[station_code], "S001")
-- count hourly visits to station S001
=COUNTIFS(Table1[station_code], "S001", Table1[parking_type], "hourly")
-- count amounts above 500
=COUNTIF(Table1[amount], ">500")
AVERAGEIF / AVERAGEIFS
-- average amount for credit card payments
=AVERAGEIF(Table1[payment_method], "Credit_Card", Table1[amount])
-- average amount for credit card, hourly only
=AVERAGEIFS(Table1[amount], Table1[payment_method], "Credit_Card", Table1[parking_type], "hourly")
Logical Functions
IF
=IF(logical_test, value_if_true, value_if_false)
-- flag high-value transactions
=IF(B2>500, "High", "Normal")
-- nested IF: three tiers
=IF(B2>1000, "High", IF(B2>500, "Medium", "Low"))
Avoid deeply nested IFs (more than 2 levels) — use IFS or SWITCH instead.
IFS — Multiple Conditions (No Nesting)
=IFS(condition1, value1, condition2, value2, ..., TRUE, default)
=IFS(
B2 > 1000, "High",
B2 > 500, "Medium",
B2 > 100, "Low",
TRUE, "Minimal"
)
The final TRUE, default acts as the else clause. Always include it — without it, IFS returns an error when no condition matches.
SWITCH — Match Against Fixed Values
=SWITCH(expression, value1, result1, value2, result2, ..., default)
-- translate payment method codes
=SWITCH(C2,
"CC", "Credit Card",
"MP", "Mobile Pay",
"CASH","Cash",
"Other"
)
Use SWITCH when matching a value against a list; use IFS when evaluating different conditions.
Text Functions
=LEFT(text, n) -- first n characters
=RIGHT(text, n) -- last n characters
=MID(text, start, n) -- n characters starting at position start
=LEN(text) -- character count
=TRIM(text) -- remove leading/trailing spaces
=UPPER(text) -- uppercase
=LOWER(text) -- lowercase
=PROPER(text) -- Title Case
=SUBSTITUTE(text, old, new) -- replace all occurrences
=FIND(find_text, within) -- position of substring (case-sensitive)
=CONCAT(text1, text2) -- join strings (or use &)
=TEXT(value, format) -- format a number or date as text
Common Patterns
-- extract first 3 characters of license plate
=LEFT(A2, 3)
-- remove dashes from license plate "ABC-1234" → "ABC1234"
=SUBSTITUTE(A2, "-", "")
-- combine station code and name: "S001 - North Station"
=A2 & " - " & B2
-- format date as "2025-06"
=TEXT(C2, "YYYY-MM")
-- check if station name contains "North"
=ISNUMBER(FIND("North", B2))
Date Functions
=TODAY() -- today's date
=NOW() -- current date and time
=YEAR(date) -- extract year
=MONTH(date) -- extract month (1–12)
=DAY(date) -- extract day
=WEEKDAY(date, 2) -- day of week (2 = Mon=1, Sun=7)
=TEXT(date, "DDDD") -- weekday name: "Monday"
=EOMONTH(date, 0) -- last day of the same month
=EOMONTH(date, -1) + 1 -- first day of the same month
=DATEDIF(start, end, "D") -- days between two dates
=DATEDIF(start, end, "M") -- complete months between dates
=NETWORKDAYS(start, end) -- working days between dates (excludes weekends)
Common Patterns
-- parking duration in minutes
=(D2 - C2) * 24 * 60
-- days since entry
=TODAY() - C2
-- format as year-month label for grouping
=TEXT(C2, "YYYY-MM")
-- which quarter
="Q" & ROUNDUP(MONTH(C2) / 3, 0)
-- first day of month (for monthly grouping)
=DATE(YEAR(C2), MONTH(C2), 1)
Error Handling
=IFERROR(formula, value_if_error) -- catch any error
=IFNA(formula, value_if_na) -- catch #N/A only (lookup not found)
=ISBLANK(cell) -- TRUE if cell is empty
=ISNUMBER(value) -- TRUE if value is numeric
-- VLOOKUP with fallback instead of #N/A
=IFERROR(VLOOKUP(A2, $E$2:$G$100, 3, FALSE), "Not Found")
-- safer: use XLOOKUP with built-in fallback
=XLOOKUP(A2, station_table[station_code], station_table[city], "Not Found")
Use IFNA instead of IFERROR when you only want to catch missing lookups — IFERROR silently swallows all errors including formula mistakes.
Pivot Tables
Pivot tables are the fastest way to summarize tabular data. Select any cell in your Table, then: Insert → Pivot Table.
Basic Setup
Drag fields to the four areas:
| Area | Purpose | Example |
|---|---|---|
| Rows | Group by | station_code |
| Columns | Cross-tabulate | payment_method |
| Values | Aggregate | amount → Sum |
| Filters | Slice entire table | city |
Value Field Settings
Right-click a value → Value Field Settings to change the aggregation:
| Setting | Equivalent |
|---|---|
| Sum | SUM() |
| Count | COUNT(*) |
| Average | AVG() |
| % of Grand Total | Share of total |
| % of Row Total | Row-level share |
| Running Total In | Cumulative sum |
Grouping Dates
Right-click a date field in Rows → Group → select Year / Quarter / Month. This creates a hierarchy without adding helper columns.
Slicers
Insert → Slicer adds a visual filter button panel. Slicers can control multiple pivot tables on the same sheet — right-click the slicer → Report Connections.
Conditional Formatting
Select a range → Home → Conditional Formatting.
| Use case | Rule type |
|---|---|
| Highlight top 10 values | Top/Bottom Rules → Top 10 Items |
| Color scale by value | Color Scales |
| In-cell bar chart | Data Bars |
| Flag values above threshold | Highlight Cell Rules → Greater Than |
| Duplicate values | Highlight Cell Rules → Duplicate Values |
For DA dashboards: use Data Bars on amount columns and Color Scales on percentage columns — faster to read than raw numbers.
Common DA Workflows
1. Monthly Revenue Summary
1. Convert data to Table (Ctrl+T)
2. Insert Pivot Table
3. Rows: entry_time (grouped by Month)
4. Values: amount → Sum, parking_id → Count
5. Add slicer on city or parking_type
2. VLOOKUP / XLOOKUP to Enrich Data
Scenario: parking transaction table has station_code but not city.
Reference table: station_code | station_name | city
=XLOOKUP([@station_code], station_table[station_code], station_table[city], "Unknown")
Add this as a calculated column in the transactions table. The @ prefix means "this row's value" — it auto-fills down the entire table.
3. KPI Dashboard with SUMIFS
Build a summary table where each row is a station and each column is a metric:
-- in cell B2 (station S001 total revenue)
=SUMIFS(transactions[amount], transactions[station_code], $A2)
-- in cell C2 (station S001 hourly revenue)
=SUMIFS(transactions[amount], transactions[station_code], $A2, transactions[parking_type], "hourly")
-- in cell D2 (visit count)
=COUNTIFS(transactions[station_code], $A2)
Lock the station column ($A2) so you can copy the formula across multiple metric columns.
4. Period-over-Period Comparison
-- current month revenue (where helper column "YM" = TEXT(entry_time,"YYYY-MM"))
=SUMIF(Table1[YM], TEXT(TODAY(),"YYYY-MM"), Table1[amount])
-- previous month
=SUMIF(Table1[YM], TEXT(EDATE(TODAY(),-1),"YYYY-MM"), Table1[amount])
-- MoM change %
=(current - previous) / previous
5. De-duplication Check
-- flag duplicate license plates
=COUNTIF($A$2:$A$1000, A2)
-- rows where count > 1 are duplicates
-- filter on this column and review
Or use: Data → Remove Duplicates to delete duplicates in place (destructive — work on a copy).
6. Data Validation
Data → Data Validation restricts what can be entered in a cell.
Allow: List
Source: hourly,monthly,daily
This creates a dropdown, preventing typos in categorical columns. Useful when building input templates for non-technical colleagues.
Excel vs Pandas — Quick Reference
| Task | Excel | Pandas |
|---|---|---|
| Filter rows | AutoFilter or boolean formula | df[df['col'] == val] |
| VLOOKUP / XLOOKUP | =XLOOKUP(...) |
pd.merge() |
| SUMIFS | =SUMIFS(...) |
df.groupby().agg() |
| Pivot table | Insert → Pivot Table | pd.pivot_table() |
| Remove duplicates | Data → Remove Duplicates | df.drop_duplicates() |
| Date format | TEXT(date, "YYYY-MM") |
df['col'].dt.to_period('M') |
| Conditional column | =IFS(...) |
np.select(...) |
| Running total | Value Field → Running Total | df['col'].cumsum() |
Use Excel when sharing with non-technical stakeholders or building input templates. Use Pandas when the dataset is large, the logic is complex, or the process needs to be automated.