Python Environment Notebook

Every Python project should run inside its own virtual environment — an isolated copy of Python with its own installed packages. Without one, every project shares one global package pool: upgrading pandas for project B silently breaks project A, and nobody else can reproduce your setup.

This notebook covers the standard tooling: venv for isolation, pip for packages, requirements.txt for reproducibility, plus the Jupyter setup and secrets handling that data analysis (DA) work needs.

For keeping these files in version control (and which ones to ignore), see Git Notebook.

venv — One Environment per Project

cd revenue-analysis

# create it (a folder named .venv inside the project — the convention)
python3 -m venv .venv

# activate it
source .venv/bin/activate          # macOS / Linux
.venv\Scripts\activate             # Windows

# your prompt now shows (.venv) — pip and python now point INSIDE the project

deactivate                         # leave the environment
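
The same environment can also be created from Python through the standard-library venv module, which is handy in setup scripts. This is a minimal sketch, not a replacement for the commands above; create_env is a hypothetical helper name, and the symlink choice is an assumption meant to mirror what the command-line tool does by default.

```python
import os
import venv
from pathlib import Path


def create_env(project_dir, with_pip=True):
    """Create <project_dir>/.venv, roughly what `python3 -m venv .venv` does."""
    env_dir = Path(project_dir) / ".venv"
    # Assumption: mirror the command-line default (symlinks except on Windows).
    venv.create(env_dir, with_pip=with_pip, symlinks=(os.name != "nt"))
    return env_dir
```

Calling create_env(".") from the project root should leave a .venv folder next to your code; activation still happens in the shell as shown above.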

Activation only lasts for the current terminal session — re-activate whenever you open a new terminal. Two checks when something feels off:

which python        # should end in .venv/bin/python
pip --version       # should show a path inside .venv
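
The same check can be made from inside Python, which helps when a script or notebook is the thing misbehaving. A small sketch: in_virtualenv is a hypothetical helper name, and it relies on the fact that sys.prefix and sys.base_prefix differ inside an environment created by venv.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("inside a venv:", in_virtualenv())
```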

Add .venv/ to .gitignore. The environment is rebuildable from requirements.txt; the folder itself is hundreds of MB of machine-specific files.

pip Essentials

pip install pandas                     # latest version
pip install pandas==2.2.3             # exact version
pip install "pandas>=2.0,<3.0"        # bounded range
pip install pandas numpy matplotlib    # several at once

pip install --upgrade pandas           # upgrade one package
pip uninstall pandas

pip list                               # everything installed in this environment
pip show pandas                        # version, dependencies, install location

Always run these with the environment activated; otherwise you are installing into the global Python (see Common Mistakes).
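
To confirm from Python which versions the current environment actually contains, the standard-library importlib.metadata module can be queried directly. A minimal sketch; installed_version is a hypothetical helper, and the package names in the loop are only examples.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Example package names; swap in whatever your project depends on.
for name in ("pandas", "numpy", "matplotlib"):
    print(name, installed_version(name) or "not installed")
```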

requirements.txt — Reproducibility

requirements.txt lists the packages a project needs, so any machine can rebuild the environment:

# write the current environment's packages into the file
pip freeze > requirements.txt

# rebuild from the file (on a new machine / fresh clone)
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Two styles of writing it:

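Style                  | Looks like                                                          | Best for
-----------------------|---------------------------------------------------------------------|--------------------------------------------
Frozen (pip freeze)    | Every package, exact pins: pandas==2.2.3, plus all sub-dependencies | Deployments: identical installs every time
Hand-written top-level | Only what you import: pandas, matplotlib, requests                  | Analysis projects: readable, flexible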

For a deployed app, freeze. For an exploratory analysis repo, a short hand-written list is easier to read and maintain — sub-dependencies resolve themselves.
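
To see which style a given file follows, you can scan it for lines that lack an exact == pin. This is a deliberately simplified sketch: unpinned is a hypothetical helper, the sample text is made up, and URL or editable requirements are not handled.

```python
def unpinned(requirements_text):
    """Return requirement lines that are not pinned with '=='."""
    loose = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options such as -r
            continue
        if "==" not in line:
            loose.append(line)
    return loose


# Made-up sample mixing both styles.
sample = """\
pandas==2.2.3
matplotlib
requests>=2.0   # bounded, but not exact
"""
print(unpinned(sample))   # ['matplotlib', 'requests>=2.0']
```

An empty result suggests a frozen file; a non-empty one is normal for the hand-written style.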

Commit requirements.txt to Git. It is the partner of the ignored .venv/: one is the recipe, the other is the rebuildable result.

Which Python Am I Running?

Most "it works in the terminal but not in my editor" problems are two Pythons disagreeing.

python3 --version          # version of the default python3
which python3              # where it lives
python3 -m pip install x   # guarantee pip belongs to THIS python

The python3 -m pip / python3 -m venv pattern sidesteps PATH confusion entirely: whatever python3 is, you are using its pip and its venv module.
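
The same idea carries over to scripts that need to call pip: building the command from sys.executable ties pip to the interpreter that is running the script. A sketch with hypothetical helper names (pip_command, run_pip); nothing is installed unless run_pip is actually called.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run pip via the current interpreter; raises CalledProcessError on failure."""
    return subprocess.run(pip_command(*args), check=True)


# Only prints the command; call run_pip("install", "pandas") to execute it.
print(pip_command("install", "pandas"))
```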

In VS Code: Cmd+Shift+P (Ctrl+Shift+P on Windows / Linux) → "Python: Select Interpreter" → pick .venv/bin/python inside your project (.venv\Scripts\python.exe on Windows). The integrated terminal and the run button then use the same interpreter.

Jupyter and Virtual Environments

Jupyter runs code through a kernel, and the kernel must point at your project's environment — otherwise the notebook can't see the packages you installed.

# inside the activated .venv:
pip install ipykernel
python -m ipykernel install --user --name revenue-analysis --display-name "Python (revenue-analysis)"

Then pick "Python (revenue-analysis)" from the kernel menu (top-right in Jupyter / VS Code). The quick sanity check when imports mysteriously fail:

import sys
print(sys.executable)     # should point inside your project's .venv
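
A slightly stricter variant compares the kernel's environment with the project's .venv folder instead of reading the path by eye. This is a sketch under one loud assumption: the notebook's working directory is the project root. uses_project_env is a hypothetical helper name.

```python
import sys
from pathlib import Path


def uses_project_env(project_dir, prefix=sys.prefix):
    """Return True when `prefix` is the .venv folder inside project_dir."""
    expected = (Path(project_dir) / ".venv").resolve()
    return Path(prefix).resolve() == expected


# Assumption: the notebook was started from the project root.
print("kernel uses project .venv:", uses_project_env(Path.cwd()))
```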

Secrets with .env

Database URLs and API keys do not belong in code or in Git. The standard pattern is a .env file plus the python-dotenv package (pip install python-dotenv):

# .env  (gitignored — see Git Notebook)
DATABASE_URL=postgresql://user:pass@host:5432/mydb
API_KEY=sk-xxxx

# in your Python code
import os
from dotenv import load_dotenv

load_dotenv()                          # reads .env into environment variables
db_url = os.getenv("DATABASE_URL")
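
Because os.getenv quietly returns None for a missing key, it can be worth failing early with a clear message. A minimal sketch using only the standard library; require_env is a hypothetical helper, meant to be called after load_dotenv() has run.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail with a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing environment variable {name}; check your .env file")
    return value
```

With this in place, db_url = require_env("DATABASE_URL") stops at startup if the key is missing, instead of failing later with a confusing connection error.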

Commit a .env.example with the same keys but placeholder values — it documents what configuration the project needs without leaking anything:

# .env.example  (committed)
DATABASE_URL=postgresql://user:password@localhost:5432/dbname
API_KEY=your-key-here
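
One way to keep the two files in step is to compare their keys. The sketch below is a simplified parser that only understands plain KEY=value lines (no export prefixes or multi-line values); env_keys and missing_keys are hypothetical helpers, and the sample strings reuse the placeholder values from above.

```python
def env_keys(text):
    """Collect the KEY part of every KEY=value line, ignoring comments and blanks."""
    keys = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_keys(example_text, env_text):
    """Keys documented in .env.example that are absent from .env."""
    return sorted(env_keys(example_text) - env_keys(env_text))


example = "DATABASE_URL=postgresql://user:password@localhost:5432/dbname\nAPI_KEY=your-key-here\n"
actual = "DATABASE_URL=postgresql://user:pass@host:5432/mydb\n"
print(missing_keys(example, actual))   # ['API_KEY']
```

In a real project you would read both files with pathlib.Path(...).read_text() before passing their contents in.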

Alternatives in Brief

Tool       | What it adds                                              | When to care
-----------|-----------------------------------------------------------|--------------------------------------
venv + pip | Nothing extra: built into Python                          | Default. Fine for nearly all DA work
conda      | Manages Python itself + non-Python libs                   | Heavy scientific stacks, GPU setups
uv         | Drop-in pip/venv replacement, advertised as 10–100× faster | Same workflow, much faster installs

Learn the venv + pip workflow first — the others are variations on the same ideas, and every tutorial assumes you know it.

Common Workflows

1. New Project Checklist

mkdir new-analysis && cd new-analysis
python3 -m venv .venv
source .venv/bin/activate
pip install pandas matplotlib
pip freeze > requirements.txt        # or hand-write the top-level list
# .gitignore with .venv/ and .env, then git init (see Git Notebook)

2. Reproduce Someone Else's Project

git clone https://github.com/user/their-analysis.git
cd their-analysis
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

3. Add a Dependency Properly

pip install seaborn
# then record it — the step everyone forgets:
pip freeze > requirements.txt        # frozen style
# or add one line by hand            # top-level style

4. Upgrade Safely

pip list --outdated                  # what has new versions
pip install --upgrade pandas         # upgrade one thing
# run your scripts/tests, THEN update requirements.txt

Upgrade one package at a time. Upgrading everything at once means that when something breaks, you don't know which upgrade did it.
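
If you save pip freeze output before and after an upgrade, a short comparison shows exactly which pins moved, including sub-dependencies pulled along. A sketch with hypothetical helpers (parse_pins, changed_pins) and made-up sample versions; lines without an == pin are ignored.

```python
def parse_pins(freeze_text):
    """Map package name -> version for every 'name==version' line."""
    pins = {}
    for line in freeze_text.splitlines():
        name, sep, ver = line.strip().partition("==")
        if sep:
            pins[name.lower()] = ver
    return pins


def changed_pins(before, after):
    """Return {package: (old_version, new_version)} for every pin that differs."""
    old, new = parse_pins(before), parse_pins(after)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }


# Made-up sample snapshots taken before and after one upgrade.
before = "numpy==1.26.4\npandas==2.1.0\n"
after = "numpy==1.26.4\npandas==2.2.3\n"
print(changed_pins(before, after))   # {'pandas': ('2.1.0', '2.2.3')}
```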

Common Mistakes

1. Installing into the Global Python

The classic symptom: pip install succeeded but import fails. The install went to a different Python than the one running your code. Activate first, and verify with which python. Using python3 -m pip also helps: it guarantees that pip installs into whichever interpreter python3 resolves to, so pip and python cannot disagree.
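
When the symptom shows up, it helps to ask Python directly which interpreter is running and where a module would be loaded from. A small diagnostic sketch; where_is is a hypothetical helper that works for top-level module names, and pandas is only an example.

```python
import importlib.util
import sys


def where_is(module_name):
    """Return the file a top-level module would load from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec else None


print("running:", sys.executable)
print("json from:", where_is("json"))
print("pandas from:", where_is("pandas"))   # None means this Python cannot see it
```

If the path printed for a package sits outside your project's .venv, or comes back as None, the install went to a different interpreter.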

2. Forgetting the Environment Exists

New terminal, no (.venv) in the prompt, commands half-work. Make activation the first thing you type in a project directory — or let VS Code do it automatically once the interpreter is selected.

3. Freezing a Polluted Global Environment

pip freeze run outside a venv dumps every package installed in the global Python, often hundreds of irrelevant lines. Only freeze from inside the project's activated environment.

4. Committing .venv/ to Git

Hundreds of MB, machine-specific, fully rebuildable. Ignore the folder, commit requirements.txt. If it is already tracked: git rm -r --cached .venv, commit, done.

5. sudo pip install

Installing into the system Python with root can break OS tools that depend on specific package versions. If you ever feel you need sudo to install a package, the real answer is a virtual environment.

6. Jupyter Kernel Pointing at the Wrong Python

If a notebook can't import what you just installed, the kernel is not running in your venv. Register the venv as a kernel (see Jupyter and Virtual Environments) and check sys.executable.


Together with Git Notebook, this is the project hygiene layer: requirements.txt and .env.example in the repo, .venv/ and .env on the machine only — anyone (including future you) can clone and rebuild the whole setup in two minutes.