Excel Notebook

Excel is the most widely used data tool in Taiwan's workplace. Even if you primarily work in Python or SQL, Excel is the common language between analysts and non-technical stakeholders. This notebook covers the functions and workflows that appear most often in DA work.

For programmatic equivalents, see Pandas Notebook and SQL Notes.

All examples use a parking system dataset with columns: station_code, station_name, city, license_plate, entry_time, exit_time, parking_type, amount, payment_method.


Keyboard Shortcuts Worth Knowing

Shortcut Action
Ctrl + T Convert range to Table (auto-expands formulas)
Ctrl + Shift + L Toggle filters
`Ctrl + `` Show all formulas
Alt + = Auto-sum selected range
F4 Toggle absolute/relative reference (A1$A$1)
Ctrl + D Fill down
Ctrl + Shift + End Select to last used cell

Tables vs Ranges

Always convert raw data to a Table (Ctrl + T) before working with it.

Range Table
Formula fill Manual Auto-extends to new rows
Column reference B2:B1000 Table1[amount] — readable
Filter Manual setup Built-in
Pivot Table source Fixed range Auto-updates

Reference a table column in formulas: =SUM(Table1[amount]) instead of =SUM(B2:B1000).


Lookup Functions

XLOOKUP — Modern Standard

=XLOOKUP(lookup_value, lookup_array, return_array, [if_not_found])
-- find station name by station code
=XLOOKUP(A2, station_table[station_code], station_table[station_name], "Not Found")

-- return multiple columns at once (spills automatically)
=XLOOKUP(A2, station_table[station_code], station_table[[station_name]:[city]])

XLOOKUP replaces VLOOKUP entirely — it searches any direction, handles missing values cleanly, and can return multiple columns. Use it by default.

VLOOKUP — Legacy (Still Common)

=VLOOKUP(lookup_value, table_array, col_index_num, [exact_match])
-- find city for a station code (city is the 3rd column of the lookup table)
=VLOOKUP(A2, $E$2:$G$100, 3, FALSE)

Always use FALSE (exact match) for DA work. The TRUE (approximate match) is for sorted ranges only and causes subtle bugs.

VLOOKUP limitations — use XLOOKUP instead when possible: - Lookup column must be the leftmost column of the table - No built-in "not found" fallback (requires wrapping in IFERROR) - Returns only one column

INDEX / MATCH — Flexible Lookup

=INDEX(return_range, MATCH(lookup_value, lookup_range, 0))
-- same as VLOOKUP example above, but lookup column can be anywhere
=INDEX(station_table[city], MATCH(A2, station_table[station_code], 0))

Use INDEX/MATCH when the return column is to the left of the lookup column — VLOOKUP cannot do this.


Conditional Aggregation

SUMIF / SUMIFS

=SUMIF(range, criteria, sum_range)
=SUMIFS(sum_range, criteria_range1, criteria1, criteria_range2, criteria2, ...)
-- total revenue for station S001
=SUMIF(Table1[station_code], "S001", Table1[amount])

-- total revenue for station S001, hourly parking only
=SUMIFS(Table1[amount], Table1[station_code], "S001", Table1[parking_type], "hourly")

-- revenue for a station referenced by cell A2
=SUMIFS(Table1[amount], Table1[station_code], A2, Table1[parking_type], "hourly")

-- revenue greater than 500
=SUMIFS(Table1[amount], Table1[amount], ">500")

SUMIF is single-condition; SUMIFS handles multiple conditions. When in doubt, just use SUMIFS — it handles the single-condition case too.

COUNTIF / COUNTIFS

-- count visits to station S001
=COUNTIF(Table1[station_code], "S001")

-- count hourly visits to station S001
=COUNTIFS(Table1[station_code], "S001", Table1[parking_type], "hourly")

-- count amounts above 500
=COUNTIF(Table1[amount], ">500")

AVERAGEIF / AVERAGEIFS

-- average amount for credit card payments
=AVERAGEIF(Table1[payment_method], "Credit_Card", Table1[amount])

-- average amount for credit card, hourly only
=AVERAGEIFS(Table1[amount], Table1[payment_method], "Credit_Card", Table1[parking_type], "hourly")

Logical Functions

IF

=IF(logical_test, value_if_true, value_if_false)
-- flag high-value transactions
=IF(B2>500, "High", "Normal")

-- nested IF: three tiers
=IF(B2>1000, "High", IF(B2>500, "Medium", "Low"))

Avoid deeply nested IFs (more than 2 levels) — use IFS or SWITCH instead.

IFS — Multiple Conditions (No Nesting)

=IFS(condition1, value1, condition2, value2, ..., TRUE, default)
=IFS(
    B2 > 1000, "High",
    B2 > 500,  "Medium",
    B2 > 100,  "Low",
    TRUE,      "Minimal"
)

The final TRUE, default acts as the else clause. Always include it — without it, IFS returns an error when no condition matches.

SWITCH — Match Against Fixed Values

=SWITCH(expression, value1, result1, value2, result2, ..., default)
-- translate payment method codes
=SWITCH(C2,
    "CC",  "Credit Card",
    "MP",  "Mobile Pay",
    "CASH","Cash",
    "Other"
)

Use SWITCH when matching a value against a list; use IFS when evaluating different conditions.


Text Functions

=LEFT(text, n)           -- first n characters
=RIGHT(text, n)          -- last n characters
=MID(text, start, n)     -- n characters starting at position start
=LEN(text)               -- character count
=TRIM(text)              -- remove leading/trailing spaces
=UPPER(text)             -- uppercase
=LOWER(text)             -- lowercase
=PROPER(text)            -- Title Case
=SUBSTITUTE(text, old, new)  -- replace all occurrences
=FIND(find_text, within) -- position of substring (case-sensitive)
=CONCAT(text1, text2)    -- join strings (or use &)
=TEXT(value, format)     -- format a number or date as text

Common Patterns

-- extract first 3 characters of license plate
=LEFT(A2, 3)

-- remove dashes from license plate "ABC-1234" → "ABC1234"
=SUBSTITUTE(A2, "-", "")

-- combine station code and name: "S001 - North Station"
=A2 & " - " & B2

-- format date as "2025-06"
=TEXT(C2, "YYYY-MM")

-- check if station name contains "North"
=ISNUMBER(FIND("North", B2))

Date Functions

=TODAY()                          -- today's date
=NOW()                            -- current date and time
=YEAR(date)                       -- extract year
=MONTH(date)                      -- extract month (1–12)
=DAY(date)                        -- extract day
=WEEKDAY(date, 2)                 -- day of week (2 = Mon=1, Sun=7)
=TEXT(date, "DDDD")               -- weekday name: "Monday"
=EOMONTH(date, 0)                 -- last day of the same month
=EOMONTH(date, -1) + 1            -- first day of the same month
=DATEDIF(start, end, "D")        -- days between two dates
=DATEDIF(start, end, "M")        -- complete months between dates
=NETWORKDAYS(start, end)          -- working days between dates (excludes weekends)

Common Patterns

-- parking duration in minutes
=(D2 - C2) * 24 * 60

-- days since entry
=TODAY() - C2

-- format as year-month label for grouping
=TEXT(C2, "YYYY-MM")

-- which quarter
="Q" & ROUNDUP(MONTH(C2) / 3, 0)

-- first day of month (for monthly grouping)
=DATE(YEAR(C2), MONTH(C2), 1)

Error Handling

=IFERROR(formula, value_if_error)   -- catch any error
=IFNA(formula, value_if_na)         -- catch #N/A only (lookup not found)
=ISBLANK(cell)                      -- TRUE if cell is empty
=ISNUMBER(value)                    -- TRUE if value is numeric
-- VLOOKUP with fallback instead of #N/A
=IFERROR(VLOOKUP(A2, $E$2:$G$100, 3, FALSE), "Not Found")

-- safer: use XLOOKUP with built-in fallback
=XLOOKUP(A2, station_table[station_code], station_table[city], "Not Found")

Use IFNA instead of IFERROR when you only want to catch missing lookups — IFERROR silently swallows all errors including formula mistakes.


Pivot Tables

Pivot tables are the fastest way to summarize tabular data. Select any cell in your Table, then: Insert → Pivot Table.

Basic Setup

Drag fields to the four areas:

Area Purpose Example
Rows Group by station_code
Columns Cross-tabulate payment_method
Values Aggregate amount → Sum
Filters Slice entire table city

Value Field Settings

Right-click a value → Value Field Settings to change the aggregation:

Setting Equivalent
Sum SUM()
Count COUNT(*)
Average AVG()
% of Grand Total Share of total
% of Row Total Row-level share
Running Total In Cumulative sum

Grouping Dates

Right-click a date field in Rows → Group → select Year / Quarter / Month. This creates a hierarchy without adding helper columns.

Slicers

Insert → Slicer adds a visual filter button panel. Slicers can control multiple pivot tables on the same sheet — right-click the slicer → Report Connections.


Conditional Formatting

Select a range → Home → Conditional Formatting.

Use case Rule type
Highlight top 10 values Top/Bottom Rules → Top 10 Items
Color scale by value Color Scales
In-cell bar chart Data Bars
Flag values above threshold Highlight Cell Rules → Greater Than
Duplicate values Highlight Cell Rules → Duplicate Values

For DA dashboards: use Data Bars on amount columns and Color Scales on percentage columns — faster to read than raw numbers.


Common DA Workflows

1. Monthly Revenue Summary

1. Convert data to Table (Ctrl+T)
2. Insert Pivot Table
3. Rows: entry_time (grouped by Month)
4. Values: amount → Sum, parking_id → Count
5. Add slicer on city or parking_type

2. VLOOKUP / XLOOKUP to Enrich Data

Scenario: parking transaction table has station_code but not city.
Reference table: station_code | station_name | city

=XLOOKUP([@station_code], station_table[station_code], station_table[city], "Unknown")

Add this as a calculated column in the transactions table. The @ prefix means "this row's value" — it auto-fills down the entire table.

3. KPI Dashboard with SUMIFS

Build a summary table where each row is a station and each column is a metric:

-- in cell B2 (station S001 total revenue)
=SUMIFS(transactions[amount], transactions[station_code], $A2)

-- in cell C2 (station S001 hourly revenue)
=SUMIFS(transactions[amount], transactions[station_code], $A2, transactions[parking_type], "hourly")

-- in cell D2 (visit count)
=COUNTIFS(transactions[station_code], $A2)

Lock the station column ($A2) so you can copy the formula across multiple metric columns.

4. Period-over-Period Comparison

-- current month revenue (where helper column "YM" = TEXT(entry_time,"YYYY-MM"))
=SUMIF(Table1[YM], TEXT(TODAY(),"YYYY-MM"), Table1[amount])

-- previous month
=SUMIF(Table1[YM], TEXT(EDATE(TODAY(),-1),"YYYY-MM"), Table1[amount])

-- MoM change %
=(current - previous) / previous

5. De-duplication Check

-- flag duplicate license plates
=COUNTIF($A$2:$A$1000, A2)

-- rows where count > 1 are duplicates
-- filter on this column and review

Or use: Data → Remove Duplicates to delete duplicates in place (destructive — work on a copy).

6. Data Validation

Data → Data Validation restricts what can be entered in a cell.

Allow: List
Source: hourly,monthly,daily

This creates a dropdown, preventing typos in categorical columns. Useful when building input templates for non-technical colleagues.


Excel vs Pandas — Quick Reference

Task Excel Pandas
Filter rows AutoFilter or boolean formula df[df['col'] == val]
VLOOKUP / XLOOKUP =XLOOKUP(...) pd.merge()
SUMIFS =SUMIFS(...) df.groupby().agg()
Pivot table Insert → Pivot Table pd.pivot_table()
Remove duplicates Data → Remove Duplicates df.drop_duplicates()
Date format TEXT(date, "YYYY-MM") df['col'].dt.to_period('M')
Conditional column =IFS(...) np.select(...)
Running total Value Field → Running Total df['col'].cumsum()

Use Excel when sharing with non-technical stakeholders or building input templates. Use Pandas when the dataset is large, the logic is complex, or the process needs to be automated.