Excel Notebook

Excel is the tool most Taiwan offices use every day. Even if you mainly work in Python or SQL, Excel is the shared language between you and your non-technical colleagues. This notebook covers the functions and workflows that show up most often in DA work.

For the same things in code, see Pandas Notebook and SQL Notes.

All examples use a parking system dataset with these columns: station_code, station_name, city, license_plate, entry_time, exit_time, parking_type, amount, payment_method.


Keyboard Shortcuts Worth Knowing

Shortcut Action
Ctrl + T Turn a range into a Table (formulas auto-extend)
Ctrl + Shift + L Turn filters on / off
`Ctrl + `` Show every formula on the sheet
Alt + = Auto-sum the selected range
F4 Switch between absolute and relative reference (A1$A$1)
Ctrl + D Fill down
Ctrl + Shift + End Select to the last used cell

Tables vs Ranges

Always turn raw data into a Table (Ctrl + T) before you work on it.

Range Table
Formula fill Manual Auto-extends to new rows
Column reference B2:B1000 Table1[amount] — readable
Filter Set up by hand Built in
Pivot Table source Fixed range Auto-updates

Refer to a table column in formulas: =SUM(Table1[amount]) instead of =SUM(B2:B1000).


Lookup Functions

XLOOKUP — Modern Standard

=XLOOKUP(lookup_value, lookup_array, return_array, [if_not_found])
-- find station name by station code
=XLOOKUP(A2, station_table[station_code], station_table[station_name], "Not Found")

-- return several columns at once (spills automatically)
=XLOOKUP(A2, station_table[station_code], station_table[[station_name]:[city]])

XLOOKUP replaces VLOOKUP entirely. It searches in any direction, handles missing values cleanly, and can return several columns. Use it by default.

VLOOKUP — Legacy (Still Common)

=VLOOKUP(lookup_value, table_array, col_index_num, [exact_match])
-- find city for a station code (city is the 3rd column of the lookup table)
=VLOOKUP(A2, $E$2:$G$100, 3, FALSE)

Always use FALSE (exact match) for DA work. The TRUE (approximate match) is for sorted ranges only and causes hidden bugs.

VLOOKUP limits — use XLOOKUP instead when you can: - The lookup column has to be the leftmost column of the table. - No built-in "not found" fallback (you have to wrap it in IFERROR). - Returns only one column.

INDEX / MATCH — Flexible Lookup

=INDEX(return_range, MATCH(lookup_value, lookup_range, 0))
-- same as the VLOOKUP example above, but the lookup column can sit anywhere
=INDEX(station_table[city], MATCH(A2, station_table[station_code], 0))

Use INDEX/MATCH when the column you want is to the left of the lookup column. VLOOKUP cannot do this.


Conditional Aggregation

SUMIF / SUMIFS

=SUMIF(range, criteria, sum_range)
=SUMIFS(sum_range, criteria_range1, criteria1, criteria_range2, criteria2, ...)
-- total revenue for station S001
=SUMIF(Table1[station_code], "S001", Table1[amount])

-- total revenue for station S001, hourly parking only
=SUMIFS(Table1[amount], Table1[station_code], "S001", Table1[parking_type], "hourly")

-- revenue for a station coming from cell A2
=SUMIFS(Table1[amount], Table1[station_code], A2, Table1[parking_type], "hourly")

-- revenue greater than 500
=SUMIFS(Table1[amount], Table1[amount], ">500")

SUMIF takes one condition. SUMIFS takes several. When in doubt, use SUMIFS — it works for the one-condition case too.

COUNTIF / COUNTIFS

-- count visits to station S001
=COUNTIF(Table1[station_code], "S001")

-- count hourly visits to station S001
=COUNTIFS(Table1[station_code], "S001", Table1[parking_type], "hourly")

-- count amounts above 500
=COUNTIF(Table1[amount], ">500")

AVERAGEIF / AVERAGEIFS

-- average amount for credit card payments
=AVERAGEIF(Table1[payment_method], "Credit_Card", Table1[amount])

-- average amount for credit card, hourly only
=AVERAGEIFS(Table1[amount], Table1[payment_method], "Credit_Card", Table1[parking_type], "hourly")

Logical Functions

IF

=IF(logical_test, value_if_true, value_if_false)
-- flag high-value transactions
=IF(B2>500, "High", "Normal")

-- nested IF: three tiers
=IF(B2>1000, "High", IF(B2>500, "Medium", "Low"))

Avoid deep IF nesting (more than 2 levels). Use IFS or SWITCH instead.

IFS — Many Conditions, No Nesting

=IFS(condition1, value1, condition2, value2, ..., TRUE, default)
=IFS(
    B2 > 1000, "High",
    B2 > 500,  "Medium",
    B2 > 100,  "Low",
    TRUE,      "Minimal"
)

The final TRUE, default is the else clause. Always include it. Without it, IFS returns an error when nothing matches.

SWITCH — Match Against Fixed Values

=SWITCH(expression, value1, result1, value2, result2, ..., default)
-- translate payment method codes
=SWITCH(C2,
    "CC",  "Credit Card",
    "MP",  "Mobile Pay",
    "CASH","Cash",
    "Other"
)

Use SWITCH when you check a value against a list of fixed values. Use IFS when you check different conditions.


Text Functions

=LEFT(text, n)           -- first n characters
=RIGHT(text, n)          -- last n characters
=MID(text, start, n)     -- n characters starting at position start
=LEN(text)               -- character count
=TRIM(text)              -- remove spaces at the start and end
=UPPER(text)             -- uppercase
=LOWER(text)             -- lowercase
=PROPER(text)            -- Title Case
=SUBSTITUTE(text, old, new)  -- replace every match
=FIND(find_text, within) -- position of a substring (case-sensitive)
=CONCAT(text1, text2)    -- join strings (or use &)
=TEXT(value, format)     -- format a number or date as text

Common Patterns

-- get the first 3 characters of a license plate
=LEFT(A2, 3)

-- remove dashes from license plate "ABC-1234" → "ABC1234"
=SUBSTITUTE(A2, "-", "")

-- combine station code and name: "S001 - North Station"
=A2 & " - " & B2

-- format a date as "2025-06"
=TEXT(C2, "YYYY-MM")

-- check if station name contains "North"
=ISNUMBER(FIND("North", B2))

Date Functions

=TODAY()                          -- today's date
=NOW()                            -- the current date and time
=YEAR(date)                       -- pull out the year
=MONTH(date)                      -- pull out the month (1–12)
=DAY(date)                        -- pull out the day
=WEEKDAY(date, 2)                 -- day of the week (2 = Mon=1, Sun=7)
=TEXT(date, "DDDD")               -- weekday name: "Monday"
=EOMONTH(date, 0)                 -- last day of the same month
=EOMONTH(date, -1) + 1            -- first day of the same month
=DATEDIF(start, end, "D")        -- days between two dates
=DATEDIF(start, end, "M")        -- complete months between two dates
=NETWORKDAYS(start, end)          -- working days between two dates (no weekends)

Common Patterns

-- parking duration in minutes
=(D2 - C2) * 24 * 60

-- days since entry
=TODAY() - C2

-- format as a year-month label for grouping
=TEXT(C2, "YYYY-MM")

-- which quarter
="Q" & ROUNDUP(MONTH(C2) / 3, 0)

-- first day of the month (for monthly grouping)
=DATE(YEAR(C2), MONTH(C2), 1)

Error Handling

=IFERROR(formula, value_if_error)   -- catch any error
=IFNA(formula, value_if_na)         -- catch only #N/A (lookup not found)
=ISBLANK(cell)                      -- TRUE if the cell is empty
=ISNUMBER(value)                    -- TRUE if the value is numeric
-- VLOOKUP with a fallback instead of #N/A
=IFERROR(VLOOKUP(A2, $E$2:$G$100, 3, FALSE), "Not Found")

-- safer: use XLOOKUP with a built-in fallback
=XLOOKUP(A2, station_table[station_code], station_table[city], "Not Found")

Use IFNA instead of IFERROR when you only want to catch missing lookups. IFERROR quietly hides every error, including formula mistakes.


Pivot Tables

Pivot tables are the fastest way to summarize tabular data. Click any cell in your Table, then: Insert → Pivot Table.

Basic Setup

Drag fields into the four areas:

Area Purpose Example
Rows Group by station_code
Columns Cross-tabulate payment_method
Values Aggregate amount → Sum
Filters Slice the whole table city

Value Field Settings

Right-click a value → Value Field Settings to change the aggregation:

Setting Equivalent
Sum SUM()
Count COUNT(*)
Average AVG()
% of Grand Total Share of the total
% of Row Total Share within the row
Running Total In Running sum

Grouping Dates

Right-click a date field in Rows → Group → pick Year / Quarter / Month. This creates a date hierarchy without needing helper columns.

Slicers

Insert → Slicer adds a panel of filter buttons. One slicer can drive several pivot tables on the same sheet — right-click the slicer → Report Connections.


Conditional Formatting

Select a range → Home → Conditional Formatting.

Use case Rule type
Highlight top 10 values Top/Bottom Rules → Top 10 Items
Color scale by value Color Scales
In-cell bar chart Data Bars
Flag values above a threshold Highlight Cell Rules → Greater Than
Duplicate values Highlight Cell Rules → Duplicate Values

For DA dashboards: use Data Bars on amount columns and Color Scales on percentage columns. Both are faster to read than raw numbers.


Common DA Workflows

1. Monthly Revenue Summary

1. Turn the data into a Table (Ctrl+T)
2. Insert a Pivot Table
3. Rows: entry_time (grouped by Month)
4. Values: amount → Sum, parking_id → Count
5. Add a slicer on city or parking_type

2. VLOOKUP / XLOOKUP to Add Information

Scenario: the parking transaction table has station_code but not city.
Reference table: station_code | station_name | city

=XLOOKUP([@station_code], station_table[station_code], station_table[city], "Unknown")

Add this as a calculated column in the transactions table. The @ prefix means "this row's value", and the formula auto-fills down the whole table.

3. KPI Dashboard with SUMIFS

Build a summary table where each row is a station and each column is a metric:

-- in cell B2 (station S001 total revenue)
=SUMIFS(transactions[amount], transactions[station_code], $A2)

-- in cell C2 (station S001 hourly revenue)
=SUMIFS(transactions[amount], transactions[station_code], $A2, transactions[parking_type], "hourly")

-- in cell D2 (visit count)
=COUNTIFS(transactions[station_code], $A2)

Lock the station column ($A2) so you can copy the formula across the metric columns.

4. Period-over-Period Comparison

-- current month revenue (where the helper column "YM" = TEXT(entry_time,"YYYY-MM"))
=SUMIF(Table1[YM], TEXT(TODAY(),"YYYY-MM"), Table1[amount])

-- previous month
=SUMIF(Table1[YM], TEXT(EDATE(TODAY(),-1),"YYYY-MM"), Table1[amount])

-- MoM change %
=(current - previous) / previous

5. De-duplication Check

-- flag duplicate license plates
=COUNTIF($A$2:$A$1000, A2)

-- rows where the count > 1 are duplicates
-- filter on this column and check

Or use: Data → Remove Duplicates to delete duplicates in place (destructive — work on a copy).

6. Data Validation

Data → Data Validation controls what can go in a cell.

Allow: List
Source: hourly,monthly,daily

This adds a dropdown, which prevents typos in category columns. Useful when you build input templates for non-technical colleagues.


Excel vs Pandas — Quick Reference

Task Excel Pandas
Filter rows AutoFilter or a boolean formula df[df['col'] == val]
VLOOKUP / XLOOKUP =XLOOKUP(...) pd.merge()
SUMIFS =SUMIFS(...) df.groupby().agg()
Pivot table Insert → Pivot Table pd.pivot_table()
Remove duplicates Data → Remove Duplicates df.drop_duplicates()
Date format TEXT(date, "YYYY-MM") df['col'].dt.to_period('M')
Conditional column =IFS(...) np.select(...)
Running total Value Field → Running Total df['col'].cumsum()

Use Excel when you share with non-technical colleagues or build input templates. Use Pandas when the dataset is large, the logic is complex, or the process has to run on its own.