ATOC 4815/5815

Final Project Guide

Will Chapman

CU Boulder ATOC

Spring 2026

Final Project Clinic

Today’s Objectives

  • See a complete, minimal final project you can use as a reference
  • Know the fastest path from messy project folder → submission-ready
  • Understand what a PR contribution looks like in practice
  • Leave with a clear list of what to work on today

Reminders

Final Project Presentations: April 27

  • Late work accepted through April 23
  • FCQs are open
  • Check your Canvas grade

Office Hours:

Will: Tu 11:15–12:15p, Th 9–10a, Aerospace Cafe

Aiden: M / W 3:30–4:30p, DUAN D319

Deliverables checklist:

The Minimum Viable Final Project

What “Done” Looks Like

A finished final project is not a massive codebase. It is a focused, working tool that someone else can install and use.

The minimum bar:

Component            What it means
Installable package  pip install -e . works in a fresh environment
3–4 core functions   Each does one thing, documented, tested by hand
README               ≤ 3 commands to go from zero → first result
Example              A notebook or script that runs top-to-bottom
Commit history       ≥ 5 commits with messages that explain why
PR contribution      One merged PR to a classmate’s repo

The one-sentence test

Can a classmate clone your repo, install it, and produce a result in under 5 minutes? If yes, you are done.

The Repo Anatomy

Every strong final project has the same shape:

your-package-name/          ← repo root
├── pyproject.toml          ← makes it pip-installable
├── README.md               ← the first thing reviewers read
├── .gitignore              ← never commit .DS_Store or __pycache__
├── your_package/           ← the actual Python code (underscores!)
│   ├── __init__.py         ← exports your public API
│   └── core.py             ← 3–4 functions here is enough
└── examples/
    └── quickstart.ipynb    ← or quickstart.py

That is it. No test suite, no CI/CD, no docs site. Just this shape.

Common mistake

Do not put pyproject.toml inside your package folder. It belongs at the repo root, one level above the Python package directory.
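
The layout above can be checked mechanically. The following is a minimal sketch, not part of the reference repo: the function name layout_problems and the package folder name passed to it are hypothetical, and the list of expected files simply mirrors the tree shown earlier.

```python
from pathlib import Path


def layout_problems(repo_root, package_dir):
    """Return a list of layout problems found in a repo checkout."""
    root = Path(repo_root)
    expected = [
        root / "pyproject.toml",             # must sit at the repo root
        root / "README.md",
        root / ".gitignore",
        root / package_dir / "__init__.py",  # makes the folder a package
    ]
    problems = [
        f"missing {p.relative_to(root).as_posix()}"
        for p in expected
        if not p.is_file()
    ]
    # The common mistake: pyproject.toml placed inside the package folder
    if (root / package_dir / "pyproject.toml").exists():
        problems.append(f"move {package_dir}/pyproject.toml to the repo root")
    return problems
```

Calling layout_problems(".", "your_package") from the repo root returns an empty list when the shape is right.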

A Real Example: geowind-era5

The Reference Repo

github.com/WillyChap/geowind_era5 — a complete, minimal final-project-scale package.

What it does: computes geostrophic wind from ERA5 reanalysis data with zero credentials required — just pip install and run.

geowind-era5/
├── pyproject.toml          ← pip-installable, declares all dependencies
├── README.md               ← install + quickstart + full API reference
├── .gitignore
├── geowind_era5/           ← the whole package: 3 files
│   ├── __init__.py         ← exports the 3 public functions
│   ├── core.py             ← open_geopotential(), geostrophic_wind(), load()
│   └── cli.py              ← geowind command-line tool
└── examples/
    ├── geowind_500hPa.py   ← runnable script
    └── geowind_500hPa.ipynb

Three source files. That is enough for a strong final project.

The pyproject.toml

[build-system]
requires = ["setuptools >= 61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "geowind-era5"
version = "0.0.1"
description = "Geostrophic wind from ERA5 reanalysis via ARCO-ERA5 on Google Cloud Storage"
requires-python = ">=3.10"
dependencies = [
    "xarray>=2022.6",
    "zarr",
    "gcsfs",
    "dask",
    "numpy",
    "netCDF4",
]

[project.optional-dependencies]
plot = ["matplotlib"]

[project.scripts]
geowind = "geowind_era5.cli:main"

Two things to notice:

  1. dependencies: every third-party package your code imports is listed. Users do not install them manually; pip does it for them.
  2. [project.scripts]: geowind becomes a real terminal command after install. Optional but impressive.

The __init__.py — Your Public API

from .core import open_geopotential, geostrophic_wind, load

__all__ = ["open_geopotential", "geostrophic_wind", "load"]

This is the entire __init__.py. It says: “these three functions are what users get when they import geowind_era5.”

After install, users write:

from geowind_era5 import open_geopotential, geostrophic_wind, load

Instead of the ugly version:

from geowind_era5.core import open_geopotential  # works, but makes users dig into your internals

Keep __init__.py short

It is just a re-export list. Put all logic in core.py (or multiple module files). The __init__.py is your public-facing menu.
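
A re-export list is easy to get out of sync with the code. The sketch below checks that every name in __all__ actually exists on the module. The helper undefined_exports and the throwaway demo module are illustrative assumptions, not part of geowind_era5.

```python
import types


def undefined_exports(module):
    """Return names listed in module.__all__ that the module does not define."""
    exported = getattr(module, "__all__", [])
    return [name for name in exported if not hasattr(module, name)]


# Demonstration on a throwaway module object (a stand-in for your package)
demo = types.ModuleType("demo_package")
demo.load = lambda x: x
demo.__all__ = ["load", "geostrophic_wind"]   # second name was never defined
print(undefined_exports(demo))                # ['geostrophic_wind']
```

After installing your own package, you could run the same check with import your_package and undefined_exports(your_package).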

The README — First 30 Seconds Matter

The README is your sales pitch. A grader decides in 30 seconds whether your project is usable.

The geowind-era5 README structure (copy this pattern):

# your-package-name
One sentence: what does it do and for whom?

## Installation
pip install -e .

## Quick start
< 10 lines of code that produce a real result >

## API
One paragraph per public function: parameters, return value.

Write the README before you write the code

If you cannot write three lines of quickstart, your API is not designed yet. The README forces clarity. The code follows naturally.

The Example — Make It Runnable

From examples/geowind_500hPa.py:

from geowind_era5 import geostrophic_wind, load, open_geopotential

# Open ERA5 500 hPa geopotential over CONUS for one day
phi = open_geopotential(
    time_slice=("2010-01-01", "2010-01-01"),
    level=500,
    lat=(20.0, 60.0),
    lon=(-135.0, -60.0),
)

phi = load(phi)             # download with progress bar
ug, vg = geostrophic_wind(phi)
ug.isel(time=0).plot()

Requirements for your example:

  • Runs top-to-bottom with no edits
  • Uses only your installed package — no sys.path hacks
  • Produces visible output (a plot, a printed result, a saved file)

What Makes Commits “Meaningful”

Your grade requires ≥ 5 commits. Quality matters more than count.

Bad commit messages:

fix stuff
update
asdfgh
more changes
final version
actually final

Good commit messages:

add open_geopotential() with lazy zarr load
implement geostrophic_wind using centered diff
add pyproject.toml and make package installable
write quickstart example for 500 hPa CONUS
add README with install and API docs

The rule: a commit message should complete the sentence “This commit will…”

One logical change per commit

If you are writing a commit message and struggling to describe it, your commit is probably too large. Split it.
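
As a rough illustration of the difference between the two lists above, here is a small heuristic that flags messages that are very short or made up only of filler words. The word list and the three-word threshold are arbitrary assumptions for this sketch; a message can pass this check and still be unhelpful.

```python
VAGUE_WORDS = {"fix", "fixes", "update", "updates", "changes",
               "stuff", "final", "wip", "misc"}


def looks_vague(message):
    """Heuristic: flag subjects that are too short or contain only filler words."""
    words = message.lower().split()
    if len(words) < 3:
        return True
    return all(word in VAGUE_WORDS for word in words)


print(looks_vague("fix stuff"))                                     # True
print(looks_vague("add open_geopotential() with lazy zarr load"))   # False
```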

From Messy to Submission-Ready

The 8-Step Checklist

Work through this in order. Do not jump ahead — each step depends on the one before it.

  1. One core function works end-to-end — pick your most important function. Make it produce a correct result on real input. Everything else depends on this.
  2. Add input checks — what happens if someone passes a string where you expect a number? An empty array? Add at least one if / raise ValueError.
  3. Write docstrings — one per public function: parameters, return value, one-sentence description.
  4. Write one working example — a script or notebook that imports your package and produces output. Run it. Fix it until it passes.
  5. Make pip install -e . work — write pyproject.toml, list your dependencies. Test it.
  6. Run the fresh-clone test — create a fresh conda env, clone your repo, install, run your example. Fix what breaks.
  7. Write your README — paste your quickstart code in. Add install instructions.
  8. Upload to TestPyPI — three commands. You get a public URL to submit on Canvas.
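
Step 2 (input checks) can be sketched on a small helper. The function coriolis_parameter below is hypothetical and is not part of geowind_era5. It only illustrates the pattern of rejecting a wrong type and an out-of-range value before doing any computation.

```python
import math

OMEGA = 7.2921e-5  # Earth's rotation rate in rad/s


def coriolis_parameter(lat_deg):
    """Return the Coriolis parameter f = 2*Omega*sin(lat) in s^-1.

    Hypothetical helper used only to illustrate input checks.
    """
    # bool is a subclass of int, so reject it explicitly
    if isinstance(lat_deg, bool) or not isinstance(lat_deg, (int, float)):
        raise TypeError(f"lat_deg must be a number, got {type(lat_deg).__name__}")
    # This comparison is also False for NaN, so NaN is rejected here too
    if not -90.0 <= lat_deg <= 90.0:
        raise ValueError(f"lat_deg must be between -90 and 90, got {lat_deg}")
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))
```

With these checks, coriolis_parameter("45") fails immediately with a readable message instead of an obscure error deep inside math.radians.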

Step 1 in Detail: One Core Function

Do not write five half-finished functions. Write one function that actually works.

What “works” means:

  • Takes real input (not hardcoded test data inside the function)
  • Produces correct output you can verify by hand or plot
  • Has a docstring
  • Raises a clear error on bad input

def geostrophic_wind(phi):
    """
    Compute geostrophic wind components from geopotential.

    Parameters
    ----------
    phi : xr.DataArray
        Geopotential in m2 s-2 with dimensions (lat, lon) or (time, lat, lon).

    Returns
    -------
    ug, vg : xr.DataArray
        Zonal and meridional geostrophic wind in m s-1.
    """
    if "lat" not in phi.dims or "lon" not in phi.dims:
        raise ValueError("phi must have 'lat' and 'lon' dimensions")
    # ... computation ...
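
For reference, the elided computation is typically based on the standard geostrophic relations on a sphere, given here as a sketch. The symbols are notation introduced for this note: Φ is geopotential, φ latitude and λ longitude (both in radians), a Earth's radius, and Ω Earth's rotation rate. The reference repo's actual discretization may differ.

```latex
f = 2\Omega \sin\varphi, \qquad
u_g = -\frac{1}{f\,a}\,\frac{\partial \Phi}{\partial \varphi}, \qquad
v_g = \frac{1}{f\,a\cos\varphi}\,\frac{\partial \Phi}{\partial \lambda}
```

Because f goes to zero at the equator, these expressions are not meaningful at very low latitudes. That is another case an input check could guard against.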

Step 6: The Fresh-Clone Test

This is the most important test you will run before submitting.

# Create a completely fresh environment
conda create -n test_install python=3.11 -y
conda activate test_install

# Clone as if you were a stranger seeing this repo for the first time
git clone https://github.com/yourusername/your-package.git
cd your-package

# Install
pip install -e .

# Run your example
python examples/quickstart.py

What commonly breaks:

  • You forgot to list a dependency in pyproject.toml
  • Your example uses a hardcoded path that only works on your machine
  • You imported a helper file that is not part of the installed package
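
The first failure on this list can be caught before the fresh-clone test. The sketch below uses the standard-library ast module to list the top-level modules a source file imports, so you can compare them against pyproject.toml by eye. The function name top_level_imports is an assumption for this example.

```python
import ast


def top_level_imports(source):
    """Return the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import such as "from .core import x"
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


example = (
    "import numpy as np\n"
    "import xarray as xr\n"
    "from .core import load\n"
    "import os.path\n"
)
print(sorted(top_level_imports(example)))   # ['numpy', 'os', 'xarray']
```

Two caveats: standard-library modules such as os do not belong in dependencies, and an import name does not always match the name you install with pip.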

If this breaks, the grader’s install will break too

The fresh-clone test is the grading environment. Please run it before you submit.

Upload to TestPyPI

Why TestPyPI?

TestPyPI is a separate instance of the Python Package Index meant for practice. You can upload and delete as often as you want without affecting the real index; the only catch is that each upload needs a version number you have not used before.

What you are submitting on Canvas:

  • Your GitHub repo URL
  • Your TestPyPI package URL — e.g. https://test.pypi.org/project/geowind-era5-yourCUusername/

Avoid name collisions — append your CU username

TestPyPI is a shared namespace. Change your package name in pyproject.toml before uploading:

[project]
name = "your-package-yourCUusername"   # e.g. geowind-era5-wchap

Otherwise you will get a 403 Forbidden if someone already claimed that name.
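
Package indexes compare names in normalized form (PEP 503): runs of hyphens, underscores, and dots collapse to a single hyphen, and case is ignored. So switching from a hyphen to an underscore does not avoid a collision. A minimal sketch of that rule, with normalize_name as a name chosen for this example:

```python
import re


def normalize_name(name):
    """Normalize a project name the way package indexes compare names (PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


print(normalize_name("geowind_era5"))         # geowind-era5
print(normalize_name("Geowind.ERA5-wchap"))   # geowind-era5-wchap
```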

The Three Commands

Prerequisites: a TestPyPI account at test.pypi.org and an API token.

One-time setup — save your API token to ~/.pypirc:

[testpypi]
  username = __token__
  password = pypi-your-token-here

Every time you publish:

pip install build twine          # install tools (one time)

python -m build                  # creates dist/*.whl and dist/*.tar.gz

twine upload --repository testpypi dist/*
# → view at https://test.pypi.org/project/your-package-yourCUusername/

That is the whole workflow. Three commands after the one-time setup.

Verify It Installed

After uploading, confirm it installs from TestPyPI in a fresh environment:

conda create -n testpypi_check python=3.11 -y
conda activate testpypi_check

pip install \
  --index-url https://test.pypi.org/simple/ \
  --extra-index-url https://pypi.org/simple/ \
  your-package-yourCUusername

Two URLs are required:

  • --index-url → fetch your package from TestPyPI
  • --extra-index-url → fetch dependencies (numpy, xarray, etc.) from real PyPI

Common error: “No matching distribution found for numpy”

You forgot --extra-index-url. TestPyPI only hosts packages uploaded there — not the full PyPI catalog.

What to Submit on Canvas

Two links + your presentation slides + PR link

1. Your GitHub repo URL

https://github.com/yourCUusername/your-package

The grader will clone this, run pip install -e ., and run your example.

2. Your TestPyPI package URL

https://test.pypi.org/project/your-package-yourCUusername/

The grader will confirm the package is live and the version number matches your repo.

If you update your package after uploading

Bump the version in pyproject.toml (0.0.1 → 0.0.2) before re-running python -m build and twine upload. TestPyPI will reject an upload if that version already exists.
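
The version bump itself is simple arithmetic on the last component. Here is a small sketch, assuming a plain MAJOR.MINOR.PATCH version string; the helper bump_patch is hypothetical, and editing the number by hand in pyproject.toml works just as well.

```python
def bump_patch(version):
    """Return the version string with its PATCH component increased by one."""
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    major, minor, patch = (int(p) for p in parts)
    return f"{major}.{minor}.{patch + 1}"


print(bump_patch("0.0.1"))   # 0.0.2
print(bump_patch("0.0.9"))   # 0.0.10
```

Note that 0.0.10 comes after 0.0.9: components are compared as integers, not as decimal fractions.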

Stretch Goal: Add a CLI

What a CLI Entry Point Is

After pip install, your package can register a terminal command — no python script.py, just a short command name.

geowind-era5 already does this:

# after pip install -e .
geowind --level 500 --lat 20 60 --lon -135 -60 --time 2010-01-01 2010-01-01

That geowind command is a Python function wired up in pyproject.toml. You can do the same thing for your package in about 30 lines.

Why bother?

  • Makes your package feel like a real tool, not just a library
  • Demonstrates you understand entry points (a real packaging concept)
  • Your classmates and I can use it without writing any Python
  • Directly mirrors how most scientific CLI tools (nco, cdo, cfgrib) work

Write a cli.py

Create your_package/cli.py and add an argparse-based main() function:

import argparse
from .core import open_geopotential, geostrophic_wind, load   # your imports

def main():
    parser = argparse.ArgumentParser(
        description="Compute geostrophic wind from ERA5 geopotential"
    )
    parser.add_argument("--level",  type=int,   default=500,
                        help="Pressure level in hPa (default: 500)")
    parser.add_argument("--lat",    type=float, nargs=2, default=[20.0, 60.0],
                        metavar=("SOUTH", "NORTH"))
    parser.add_argument("--lon",    type=float, nargs=2, default=[-135.0, -60.0],
                        metavar=("WEST", "EAST"))
    parser.add_argument("--time",   type=str,   nargs=2, required=True,
                        metavar=("START", "END"))
    parser.add_argument("-o", "--output", default="geowind_out.nc",
                        help="Output NetCDF filename")
    args = parser.parse_args()

    phi = open_geopotential(
        time_slice=tuple(args.time),
        level=args.level,
        lat=tuple(args.lat),
        lon=tuple(args.lon),
    )
    phi = load(phi)
    ug, vg = geostrophic_wind(phi)
    # Note: only the zonal component (ug) is written; vg is computed but not saved
    ug.to_netcdf(args.output)
    print(f"Saved to {args.output}")

if __name__ == "__main__":
    main()
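
One optional refinement, sketched here as a design suggestion rather than as what the reference repo does: build the parser in its own function and let main() accept an argument list. You can then check the defaults and parsing from Python without opening a terminal. The names build_parser and the argv parameter are assumptions for this example, and only three of the options are reproduced.

```python
import argparse


def build_parser():
    """Build the argument parser separately so it can be exercised from Python."""
    parser = argparse.ArgumentParser(description="Demo parser (hypothetical)")
    parser.add_argument("--level", type=int, default=500)
    parser.add_argument("--lat", type=float, nargs=2, default=[20.0, 60.0],
                        metavar=("SOUTH", "NORTH"))
    parser.add_argument("--time", type=str, nargs=2, required=True,
                        metavar=("START", "END"))
    return parser


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:], so the installed command
    # behaves as before when called with no arguments from the entry point
    args = build_parser().parse_args(argv)
    return args


args = main(["--time", "2010-01-01", "2010-01-01", "--lat", "30", "50"])
print(args.level, args.lat, args.time)
# 500 [30.0, 50.0] ['2010-01-01', '2010-01-01']
```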

Wire It Up in pyproject.toml

Add one section to your existing pyproject.toml:

[project.scripts]
your-command = "your_package.cli:main"

For example, geowind-era5 uses:

[project.scripts]
geowind = "geowind_era5.cli:main"

The format is always: command-name = "package.module:function"

After pip install -e ., the command is live:

your-command --help
your-command --time 2010-01-01 2010-01-01 --level 500 -o result.nc
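
To see what the "package.module:function" string means, the sketch below resolves one by hand: import the module named before the colon, then look up the attribute named after it. This is a simplified picture of what the generated launcher does, not its actual code. It uses json:dumps from the standard library as a stand-in because your own package may not be installed where you try this.

```python
import importlib


def resolve_entry_point(spec):
    """Turn a "package.module:function" string into the function object."""
    module_name, _, attr = spec.partition(":")
    if not module_name or not attr:
        raise ValueError(f"expected 'package.module:function', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Standard-library stand-in for something like "your_package.cli:main"
dumps = resolve_entry_point("json:dumps")
print(dumps({"level": 500}))   # {"level": 500}
```

If the command fails right after install with an import or attribute error, a typo in this string is a likely cause.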

We already covered this in Week 11 (argparse)

If you completed the argparse lab, you already know everything you need. This is just connecting that main() function to the installer.

The PR Contribution

What a PR Contribution Is

You must make at least one merged Pull Request to a classmate’s repository.

It does not have to be big. All of these count:

Type            Example
Bug fix         Fix a function that crashes on certain inputs
Documentation   Improve their README or add a docstring
New example     Add a usage example they did not have
Dependency fix  Add a missing package to their pyproject.toml
Code quality    Simplify logic, fix a typo, remove duplication

The easiest PR: run the fresh-clone test on a classmate’s repo

Clone their repo. Try to install and run it. Something will break. Fix it. Submit a PR. You just did the whole thing.

How to Submit a PR

# 1. Fork their repo on GitHub  (click the Fork button, top-right)

# 2. Clone YOUR fork (not theirs)
git clone https://github.com/yourusername/their-repo.git
cd their-repo

# 3. Create a branch — never commit directly to main
git checkout -b fix/missing-netcdf4-dependency

# 4. Make your change
#    e.g. add netCDF4 to pyproject.toml dependencies

# 5. Commit with a clear message
git add pyproject.toml
git commit -m "add netCDF4 to dependencies (missing from fresh-clone install)"

# 6. Push to YOUR fork
git push origin fix/missing-netcdf4-dependency

# 7. On GitHub: your fork → "Compare & pull request"

In the PR description: one sentence explaining what you changed and why. That is enough.

What Makes a Good PR Description

Minimal but complete:

Add netCDF4 to dependencies

netCDF4 is imported in core.py but was
not listed in pyproject.toml. Fresh-clone
install failed with ImportError.

Also fine:

Fix README quickstart example

The lon argument in the quickstart used
positive values for W longitudes. Changed
to negative values to match the docstring.

Heads-up is good practice

Email or message your classmate: “I am going to look at your repo and submit a PR. Is that OK?” This is what real open-source contribution looks like.

What to Do Right Now

Your Priority List for the Next Two Weeks

Work through these in order. If you are stuck on one, move to the next and come back.

  1. Do you have one working function? → If not, write one today. It does not need to be polished.
  2. Does pip install -e . work? → If not, write pyproject.toml today. It is 15 lines.
  3. Does your example run? → Write one script, top-to-bottom. Run it. Fix it.
  4. Is your README useful? → Paste your quickstart code into a README section. You are done.
  5. Run the fresh-clone test → Before April 23.
  6. Upload to TestPyPI → Append your CU username to the package name. python -m build && twine upload --repository testpypi dist/*. Copy the URL.
  7. Submit a PR to a classmate’s repo → Fork and clone their repo, run the fresh-clone test, fix what breaks, open a PR.

The Final Project Mindset

Your project is not graded on scientific novelty. It is graded on execution:

Criterion       What the grader actually checks
Installability  Does pip install -e . work in a fresh env?
Functionality   Do the functions run and produce output?
Documentation   README? Docstring on each public function?
Workflow        ≥ 5 meaningful commits with clear messages?
TestPyPI        Is the package live? Does it install from TestPyPI?
Collaboration   A merged PR with a description?

None of these require a large codebase. geowind-era5 is 3 source files and passes every criterion.

When in doubt: do less, better

Three well-documented functions beat ten broken ones. One clean example beats five notebooks with TODO cells.

Questions?

Reference repo: github.com/WillyChap/geowind_era5

Final project guidelines: Canvas → Assignments → Final Project

Deadline: Presentations April 27 · Late work April 23

Office Hours:

Will: Tu 11:15–12:15p, Th 9–10a — Aerospace Cafe

Aiden: M / W 3:30–4:30p — DUAN D319