Python Environment Notebook
Every Python project should run inside its own virtual environment, a self-contained directory with its own python executable and its own installed packages. Without one, all projects share a single global package pool: upgrading pandas for project B can silently break project A, and nobody else can reproduce your setup.
This notebook covers the standard tooling: venv for isolation, pip for packages, and requirements.txt for reproducibility, plus the Jupyter setup and secrets handling that data analysis (DA) work needs.
For keeping these files in version control (and which ones to ignore), see Git Notebook.
venv — One Environment per Project
cd revenue-analysis
# create it (a folder named .venv inside the project — the convention)
python3 -m venv .venv
# activate it
source .venv/bin/activate # macOS / Linux
.venv\Scripts\activate # Windows
# your prompt now shows (.venv) — pip and python now point INSIDE the project
deactivate # leave the environment
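The same environment can also be created from Python through the standard-library venv module. This is a minimal sketch of roughly what `python3 -m venv .venv` does; the helper name `create_project_env` is made up for illustration, and the function-level defaults (for example around symlinks) may differ slightly from the command line.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir, with_pip=True):
    """Create <project_dir>/.venv and return its path.

    Roughly equivalent to running `python3 -m venv .venv` in project_dir.
    """
    env_dir = Path(project_dir) / ".venv"
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


if __name__ == "__main__":
    # Throwaway demo in a temporary directory; with_pip=False keeps it fast.
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_project_env(tmp, with_pip=False)
        print(sorted(p.name for p in env_dir.iterdir()))
```

The pyvenv.cfg file at the top of the created folder is what marks it as a virtual environment.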
Activation only lasts for the current terminal session — re-activate whenever you open a new terminal. Two checks when something feels off:
which python # should end in .venv/bin/python
pip --version # should show a path inside .venv
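The same check can be made from inside Python, which helps when a script or editor launches the interpreter for you. A small sketch, assuming the environment was created with venv (conda environments are not detected this way); `in_virtualenv` is an illustrative name.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a venv:", in_virtualenv())
```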
Add .venv/ to .gitignore. The environment is rebuildable from requirements.txt; the folder itself is hundreds of MB of machine-specific files.
pip Essentials
pip install pandas # latest version
pip install pandas==2.2.3 # exact version
pip install "pandas>=2.0,<3.0" # bounded range
pip install pandas numpy matplotlib # several at once
pip install --upgrade pandas # upgrade one package
pip uninstall pandas
pip list # everything installed in this environment
pip show pandas # version, dependencies, install location
Always run these with the environment activated. Otherwise you are installing into the global Python (see Common Mistakes).
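To confirm from Python what pip show reports, the standard library's importlib.metadata can look up an installed version. A minimal sketch; `installed_version` is a hypothetical helper name, and the package names in the demo are just the ones used above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("pandas", "numpy", "matplotlib"):
        print(name, installed_version(name) or "not installed")
```

Because it asks the interpreter that is actually running, the answer reflects the active environment rather than whatever pip happens to be first on PATH.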
requirements.txt — Reproducibility
requirements.txt lists the packages a project needs, so any machine can rebuild the environment:
# write the current environment's packages into the file
pip freeze > requirements.txt
# rebuild from the file (on a new machine / fresh clone)
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Two styles of writing it:
| Style | Looks like | Best for |
|---|---|---|
| Frozen (pip freeze) | Every package, exact pins: pandas==2.2.3, plus all sub-dependencies | Deployments: identical installs every time |
| Hand-written top-level | Only what you import: pandas, matplotlib, requests | Analysis projects: readable, flexible |
For a deployed app, freeze. For an exploratory analysis repo, a short hand-written list is easier to read and maintain — sub-dependencies resolve themselves.
Commit requirements.txt to Git. It is the partner of the ignored .venv/: one is the recipe, the other is the rebuildable result.
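One way to see whether an environment still matches a frozen requirements.txt is to compare the pins against what is installed. The sketch below is deliberately simplified: it only understands plain `name==version` lines and skips everything else (ranges, options, URLs), and both function names are illustrative.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(requirements_text):
    """Map package name -> pinned version for simple `name==version` lines."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-") or "==" not in line:
            continue  # options, unpinned, or more complex lines are skipped
        name, pinned = line.split("==", 1)
        pinned = pinned.split(";", 1)[0]  # ignore environment markers
        pins[name.strip().lower()] = pinned.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed_or_None)} where the environment differs."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Typical use would be `find_mismatches(parse_pins(open("requirements.txt").read()))`; an empty result means every pinned package is installed at the pinned version.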
Which Python Am I Running?
Most "it works in the terminal but not in my editor" problems are two Pythons disagreeing.
python3 --version # version of the default python3
which python3 # where it lives
python3 -m pip install x # guarantee pip belongs to THIS python
The python3 -m pip / python3 -m venv pattern sidesteps most PATH confusion: whatever python3 resolves to, you are using that interpreter's pip and venv module, never a stray pip from another installation.
In VS Code: Cmd+Shift+P (Ctrl+Shift+P on Windows/Linux) → "Python: Select Interpreter" → pick .venv/bin/python inside your project (.venv\Scripts\python.exe on Windows). The integrated terminal and the run button then agree with each other.
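When two Pythons seem to disagree, it can help to print what each command name resolves to next to the interpreter that is actually running. A small sketch using only the standard library; `resolve_commands` is an illustrative name.

```python
import shutil
import sys


def resolve_commands(names=("python", "python3", "pip")):
    """Map each command name to the executable PATH resolves it to (or None)."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("running interpreter:", sys.executable)
    for name, path in resolve_commands().items():
        print(f"{name:8} -> {path or 'not found on PATH'}")
```

Run it once from the terminal and once from the editor's run button: if the "running interpreter" lines differ, the two are using different Pythons.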
Jupyter and Virtual Environments
Jupyter runs code through a kernel, and the kernel must point at your project's environment — otherwise the notebook can't see the packages you installed.
# inside the activated .venv:
pip install ipykernel
python -m ipykernel install --user --name revenue-analysis --display-name "Python (revenue-analysis)"
Then pick "Python (revenue-analysis)" from the kernel menu (top-right in Jupyter / VS Code). The quick sanity check when imports mysteriously fail:
import sys
print(sys.executable) # should point inside your project's .venv
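A registered kernel is just a folder containing a kernel.json file, and the first entry of its argv list is the interpreter the kernel launches. The sketch below reads that entry; the JSON shown is an illustrative example of what ipykernel typically writes, and the path in it is made up. The folders themselves can be found with `jupyter kernelspec list`.

```python
import json


def kernel_interpreter(kernel_json_text):
    """Return the interpreter path a kernel spec launches (first item of argv)."""
    spec = json.loads(kernel_json_text)
    return spec["argv"][0]


# Illustrative kernel.json content; the path is a made-up example.
EXAMPLE_SPEC = """
{
  "argv": ["/home/me/revenue-analysis/.venv/bin/python",
           "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "Python (revenue-analysis)",
  "language": "python"
}
"""

if __name__ == "__main__":
    print(kernel_interpreter(EXAMPLE_SPEC))
```

If that path is not inside the project's .venv, the kernel is running a different Python than the one you installed packages into.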
Secrets with .env
Database URLs and API keys do not belong in code or in Git. The standard pattern is a .env file plus python-dotenv:
# .env (gitignored — see Git Notebook)
DATABASE_URL=postgresql://user:pass@host:5432/mydb
API_KEY=sk-xxxx
import os
from dotenv import load_dotenv
load_dotenv() # reads .env into environment variables
db_url = os.getenv("DATABASE_URL")
Commit a .env.example with the same keys but placeholder values — it documents what configuration the project needs without leaking anything:
# .env.example (committed)
DATABASE_URL=postgresql://user:password@localhost:5432/dbname
API_KEY=your-key-here
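Since os.getenv returns None for a missing variable, a forgotten .env entry often surfaces later as a confusing error. One possible safeguard is to treat .env.example as the list of required keys and check them at startup. This is a sketch under simple assumptions: it only handles plain KEY=value lines, and both function names are hypothetical.

```python
import os


def keys_in_env_file(text):
    """Return the variable names declared in simple KEY=value lines."""
    keys = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and anything without '='
        keys.append(line.split("=", 1)[0].strip())
    return keys


def missing_settings(example_text, environ=os.environ):
    """List keys named in .env.example that are unset or empty in environ."""
    return [key for key in keys_in_env_file(example_text) if not environ.get(key)]
```

A script could call `missing_settings(open(".env.example").read())` right after `load_dotenv()` and stop with a clear message if the list is not empty.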
Alternatives in Brief
| Tool | What it adds | When to care |
|---|---|---|
| venv + pip | Nothing extra; built into Python | Default. Fine for nearly all DA work |
| conda | Manages Python itself + non-Python libs | Heavy scientific stacks, GPU setups |
| uv | A drop-in pip/venv replacement, 10–100× faster | Same workflow, much faster installs |
Learn the venv + pip workflow first. The others are variations on the same ideas, and most tutorials assume you know it.
Common Workflows
1. New Project Checklist
mkdir new-analysis && cd new-analysis
python3 -m venv .venv
source .venv/bin/activate
pip install pandas matplotlib
pip freeze > requirements.txt # or hand-write the top-level list
# .gitignore with .venv/ and .env, then git init (see Git Notebook)
2. Reproduce Someone Else's Project
git clone https://github.com/user/their-analysis.git
cd their-analysis
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
3. Add a Dependency Properly
pip install seaborn
# then record it — the step everyone forgets:
pip freeze > requirements.txt # frozen style
# or add one line by hand # top-level style
4. Upgrade Safely
pip list --outdated # what has new versions
pip install --upgrade pandas # upgrade one thing
# run your scripts/tests, THEN update requirements.txt
Upgrade one package at a time. Upgrading everything at once means that when something breaks, you don't know which upgrade did it.
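Even a single upgrade can pull in new versions of sub-dependencies, so it can be useful to record what was installed before and after. The sketch below snapshots the environment with importlib.metadata and diffs two snapshots; the function names are illustrative, and saving the snapshot as JSON is just one convenient option.

```python
import json
from importlib.metadata import distributions


def snapshot():
    """Return {package name: version} for the current environment."""
    versions = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            versions[name.lower()] = dist.version
    return versions


def diff_snapshots(before, after):
    """Return {name: (old, new)} for packages added, removed, or changed."""
    changes = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be redirected to a file.
    print(json.dumps(snapshot(), indent=2, sort_keys=True))
```

Run the script before and after the upgrade, redirect each run to a file, then load both files with json.load and pass them to `diff_snapshots` to see exactly which packages moved.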
Common Mistakes
1. Installing into the Global Python
The classic symptom: pip install succeeded but import fails. The install went to a different Python than the one running your code. Activate first and verify with which python. Using python3 -m pip instead of bare pip also guarantees the package lands in the same interpreter that python3 runs.
2. Forgetting the Environment Exists
New terminal, no (.venv) in the prompt, commands half-work. Make activation the first thing you type in a project directory — or let VS Code do it automatically once the interpreter is selected.
3. Freezing a Polluted Global Environment
pip freeze run outside a venv dumps every package installed in the global Python, often hundreds of lines irrelevant to the project. Only freeze from inside the project's activated environment.
4. Committing .venv/ to Git
Hundreds of MB, machine-specific, fully rebuildable. Ignore the folder, commit requirements.txt. If it is already tracked: git rm -r --cached .venv, commit, done.
5. sudo pip install
Installing into the system Python with root can break OS tools that depend on specific package versions. If you ever feel you need sudo to install a package, the real answer is a virtual environment.
6. Jupyter Kernel Pointing at the Wrong Python
If a notebook can't import what you just installed, the kernel is not running your venv's Python. Register the venv as a kernel (see above) and check sys.executable.
Together with Git Notebook, this is the project hygiene layer: requirements.txt and .env.example live in the repo, .venv/ and .env stay on the machine only, and anyone (including future you) can clone and rebuild the whole setup in a few minutes.