README
The course website is located here. Lecture materials, assignments, quizzes, and other course content can be accessed at that link. You will need an API key to submit notebooks; it will be provided to you via email.
This site contains Jupyter notebooks, data, and other code artifacts associated with this course.
Choosing a Notebook Environment
Most work will not require GPUs. You can probably avoid them entirely unless you specifically want to use one.
Google Colab - single notebook experience
If you:
- prefer working within a single notebook
- are already comfortable with Google Colab
- don’t mind re-installing dependencies on restart
- need access to GPUs
you may prefer Google Colab.
Deepnote
If you prefer:
- easy install, more persistence of dependencies
- large number of system integrations
- Dataframe charts, interactive widgets, dashboards, app deployment
- realtime collaboration
you may prefer Deepnote.
You will need to create a free account and then request an education plan. To use GPUs or higher performance machines, you must add a payment method - but you do not need to upgrade the plan.
All students will be given links to Deepnote for labs.
Local JupyterLab / Notebook
If you are already comfortable with Jupyter in your local environment and:
- want full control of your machine and environment
- want persistence of dependencies
- don’t mind managing your environment yourself
you may prefer local Jupyter. The downside is that there is no GPU access unless you know how to set up something like a remote Modal function that uses a GPU.
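Whichever environment you choose, you can check from inside a notebook whether a GPU is actually visible. The following is a minimal sketch, assuming PyTorch is the library you care about; the function name gpu_available is made up for illustration and is not part of the course package.

```python
def gpu_available() -> bool:
    """Return True if PyTorch is installed and can see a CUDA GPU."""
    try:
        import torch  # optional dependency; may be absent in a minimal install
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("GPU available:", gpu_available())
```

A result of False is fine for most of the course work, since GPUs are rarely required.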
Installation
For Students (Google Colab)
To use Colab and submit for credit:
- Download a notebook from GitHub
- Upload a local copy of the notebook to Colab
- Save a copy in Drive
- Ensure the file name matches the variable NOTEBOOK_NAME in the section “Submit Notebook for Credit”.
Saving to Drive and matching the filename are only required if you are submitting for credit.
You will need to add SUBMIT_API_KEY to your environment variables.
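Before submitting, it can help to confirm that both requirements above are met. This is only a sketch of such a check: SUBMIT_API_KEY and NOTEBOOK_NAME come from the steps above, while the function check_submission_setup and the file name lab01.ipynb are hypothetical. The course's own submission helper may perform different checks.

```python
import os
from pathlib import Path


def check_submission_setup(notebook_path, notebook_name, environ=None):
    """Return a list of problems; an empty list means the setup looks ready."""
    env = os.environ if environ is None else environ
    problems = []
    # The key must be present and non-empty in the environment
    if not env.get("SUBMIT_API_KEY", "").strip():
        problems.append("SUBMIT_API_KEY is not set")
    # Only the file name is compared, not the folder it lives in
    actual = Path(notebook_path).name
    if actual != notebook_name:
        problems.append(
            f"file name {actual!r} does not match NOTEBOOK_NAME {notebook_name!r}"
        )
    return problems


if __name__ == "__main__":
    # "lab01.ipynb" is a made-up name used for illustration only
    for problem in check_submission_setup("lab01.ipynb", "lab01.ipynb"):
        print("Problem:", problem)
```

A common cause of a mismatch is a renamed copy such as "lab01 (1).ipynb" created when the same file is uploaded twice.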
For Deepnote
Every week, a new link will be posted for a Deepnote project. At least the first time you click the link, you will be asked to log in or sign up to see the project. If you sign up, you’ll get a free 14-day trial of the Team plan, and from there you can request the education plan.
- When the project opens, click Duplicate (top right).
- This creates your own private copy of the lab.
- You will need to add SUBMIT_API_KEY to your environment variables.
For Local Development
If you want to run notebooks locally:
# Clone the repository
git clone https://github.com/su-dataAI/data401-nlp.git
cd data401-nlp
# If you don't have uv, install it first:
# curl -LsSf https://astral.sh/uv/install.sh | sh   (macOS/Linux)
# or, as a fallback: pip install uv
# Create a virtual environment using uv (requires Python 3.11+)
# If you want to use Python 3.13+, you will need to upgrade torch to torch>=2.1,<2.6
uv venv --python 3.11
# Activate the virtual environment
# On macOS/Linux:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate
# Install with all dependencies
uv pip install -e ".[dev,all]"
# Download spaCy model
python -m spacy download en_core_web_sm
# Start Jupyter Lab
jupyter lab
# Add .env file (root or nbs folder)
You will need to run git pull each time a new lab is posted.
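The .env file mentioned in the setup steps holds KEY=VALUE lines such as your SUBMIT_API_KEY. The sketch below shows one way such a file could be found and loaded using only the standard library. It assumes the file lives in the repository root or the nbs folder, as noted above. The function load_env_file is hypothetical and is not the implementation used by data401_nlp.helpers.env.

```python
import os
from pathlib import Path


def load_env_file(start_dir=".", candidates=(".env", "nbs/.env"), environ=None):
    """Load KEY=VALUE pairs from the first .env file found.

    Returns the path that was loaded, or None if no file exists.
    """
    env = os.environ if environ is None else environ
    for rel in candidates:
        path = Path(start_dir) / rel
        if not path.is_file():
            continue
        for raw in path.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            # Skip blank lines, comments, and anything without an "="
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Variables that are already set take precedence over the file
            env.setdefault(key.strip(), value.strip().strip("'\""))
        return path
    return None
```

Keep the .env file out of version control so your API key is not committed by accident.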
Installation Options
The package supports flexible installation based on your needs:
# Minimal installation (core utilities only)
pip install data401-nlp
# With NLP tools (spaCy, NLTK)
pip install "data401-nlp[nlp]"
# With transformers and PyTorch
pip install "data401-nlp[transformers]"
# With API support (FastAPI, Pydantic)
pip install "data401-nlp[api]"
# Everything (recommended for students)
pip install "data401-nlp[all]"
Platform Support
✅ Google Colab
✅ Deepnote
✅ Jupyter Lab
✅ Local Python 3.11+
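To see which parts of the installation are in place on your platform, a quick check along the following lines may help. It is a sketch under stated assumptions: the mapping from each extra to import names (spacy, nltk, transformers, torch, fastapi, pydantic) is inferred from the descriptions under Installation Options, and the function names are made up for illustration.

```python
import importlib.util
import sys

# Import names assumed from the packages named under Installation Options
EXTRAS = {
    "nlp": ["spacy", "nltk"],
    "transformers": ["transformers", "torch"],
    "api": ["fastapi", "pydantic"],
}


def python_ok(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.11 or newer."""
    return tuple(version_info[:2]) >= (3, 11)


def is_installed(name):
    """Return True if a top-level module can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def missing_packages(extras=EXTRAS):
    """Map each extra to the import names that cannot be found."""
    return {
        extra: [name for name in names if not is_installed(name)]
        for extra, names in extras.items()
    }


if __name__ == "__main__":
    print("Python 3.11+:", python_ok())
    for extra, missing in missing_packages().items():
        print(f"{extra}: {'ok' if not missing else 'missing ' + ', '.join(missing)}")
```

If an extra shows missing packages, re-run the matching install command from Installation Options.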
Helper Modules
The package includes several helper modules to make your NLP work easier:
- data401_nlp.helpers.env - Environment detection and API key loading
- data401_nlp.helpers.spacy - Automatic spaCy model management
- data401_nlp.helpers.submit - Assignment submission utilities
- data401_nlp.helpers.llm - LLM integration helpers
The helper libraries may be updated as the course proceeds.