Examples
MLOps Pokemon Model
This example shows how to build a complete MLOps workflow using NoBS Python. The workflow includes:
- Fetching and storing data
- Training and logging models into MLflow
- Exposing a Streamlit application for interactive prediction
- Running everything as a repeatable and deployable NoBS Python project
The project is structured in three main parts:
- Project configuration (project.py)
- Data loading and model training logic (src/pokemon.py)
- User-facing prediction interface (src/pokemon_app.py)
Project Overview
The project file defines:
- A shared MLflow server
- Two jobs: one to load Pokémon data, and one to train a classifier
- A Streamlit app for real-time predictions
project.py:
from nobs.models import Project, MlflowServer, Job, StreamlitApp, Worker
from nobs.secrets import MlflowConfig, S3StorageConfig

from src.pokemon_app import main
from src.pokemon import LoadData, TrainConfig, load_all_data, train_model

project = Project(
    name="mlops-example",
    shared_secrets=[S3StorageConfig],
    mlflow_server=MlflowServer(
        domain_names=[
            "example.aligned.codes"
        ]
    ),
    load_pokemon_data=Job(
        load_all_data,
        arguments=LoadData()
    ),
    train_pokemon_model=Job(
        train_model,
        arguments=TrainConfig(),
        secrets=[MlflowConfig]
    ),
    is_legendary_app=StreamlitApp(
        main,
        secrets=[MlflowConfig],
    )
)
This configuration does the following:
- The MLflow server is deployed for logging and tracking experiments.
- load_pokemon_data loads the Pokémon dataset and stores it in S3.
- train_pokemon_model trains a classifier and registers it in MLflow.
- The Streamlit application provides a UI for making predictions with the trained models.
Data Loading and Model Training Logic
The logic for data ingestion, dataset definition, and model training is located in src/pokemon.py.
Key steps:
- Data is retrieved using the AiopokeClient.
- The retrieved data is validated and stored using aligned data contracts.
- A classification model is defined through a model contract.
- The model is trained and logged to MLflow.
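To make the validation step more concrete, here is a minimal standard-library sketch of what checking rows against a schema can look like. It does not use the aligned API, which the real Pokemon contract is built on. The PokemonRecord class, its fields, and validate_rows are hypothetical names introduced only for illustration, and the sample values in the usage note are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PokemonRecord:
    """Hypothetical schema; the real contract is defined with aligned."""

    pokemon_id: int
    name: str
    hp: int
    is_legendary: bool

    def __post_init__(self) -> None:
        # Reject rows that cannot be valid before they are stored.
        if self.pokemon_id <= 0:
            raise ValueError("pokemon_id must be positive")
        if not self.name:
            raise ValueError("name must not be empty")
        if self.hp <= 0:
            raise ValueError("hp must be positive")


def validate_rows(rows: list[dict]) -> list[PokemonRecord]:
    """Turn raw dicts into validated records.

    Missing or unexpected keys raise TypeError, and invalid values raise
    ValueError, so a bad row fails before anything is written to storage.
    """
    return [PokemonRecord(**row) for row in rows]
```

Calling validate_rows([{"pokemon_id": 150, "name": "mewtwo", "hp": 106, "is_legendary": True}]) returns one PokemonRecord, while a row with pokemon_id set to 0 raises ValueError.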
src/pokemon.py:
# (full code omitted for brevity)
Important points:
- LoadData defines the configuration for data ingestion.
- TrainConfig defines adjustable training parameters.
- Pokemon is a data contract describing the dataset schema.
- IsLegendaryPokemon defines the model contract for predicting whether a Pokémon is legendary.
- load_all_data loads and stores the dataset.
- train_model prepares the dataset, trains a RandomForest model, logs metrics, and registers the model in MLflow.
After the train_pokemon_model job runs, MLflow contains the run, its metrics, and a registered model version that can be compared with other runs or promoted.
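The text does not say which metrics train_model logs. Because legendary Pokémon are a small minority class, a model that never predicts "legendary" can still score high accuracy, so precision and recall are natural candidates to log alongside it. The sketch below computes all three in plain Python. The function name binary_metrics is hypothetical and is not taken from src/pokemon.py. The returned dictionary has the shape accepted by mlflow.log_metrics.

```python
def binary_metrics(y_true: list[bool], y_pred: list[bool]) -> dict[str, float]:
    """Accuracy, precision and recall for the positive ("legendary") class."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t and p)
    false_pos = sum(1 for t, p in pairs if not t and p)
    false_neg = sum(1 for t, p in pairs if t and not p)
    correct = sum(1 for t, p in pairs if t == p)

    # Define precision and recall as 0.0 when their denominator is zero.
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    return {
        "accuracy": correct / len(pairs),
        "precision": true_pos / predicted_pos if predicted_pos else 0.0,
        "recall": true_pos / actual_pos if actual_pos else 0.0,
    }
```

For example, with one legendary Pokémon among four and a model that predicts "not legendary" every time, accuracy is 0.75 while precision and recall are both 0.0.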
User-Facing Prediction App
The Streamlit application provides a simple UI that allows a user to input Pokémon data and view predictions from the trained model.
src/pokemon_app.py:
# (full code omitted for brevity)
Key points:
- The app loads both the data contract and the model contract.
- The UI allows input by Pokémon ID or raw feature inputs.
- Predictions are displayed directly in the interface.
Running this app in NoBS Python provides a hosted interactive model demonstration.
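The two input modes (Pokémon ID or raw features) can be reduced to a single feature dictionary before the model is called. The sketch below shows one way to do that without Streamlit. The function resolve_features, the known lookup table, and the feature names are assumptions made for illustration and are not taken from src/pokemon_app.py.

```python
from typing import Mapping, Optional

Features = dict[str, float]


def resolve_features(
    pokemon_id: Optional[int],
    raw_features: Optional[Mapping[str, float]],
    known: Mapping[int, Features],
) -> Features:
    """Return model input from exactly one of the two input modes."""
    # Exactly one of the two inputs must be supplied.
    if (pokemon_id is None) == (raw_features is None):
        raise ValueError("provide exactly one of pokemon_id or raw_features")

    if pokemon_id is not None:
        # ID mode: look the features up in previously stored data.
        if pokemon_id not in known:
            raise ValueError(f"unknown Pokémon ID: {pokemon_id}")
        return dict(known[pokemon_id])

    # Raw mode: use the values the user typed in, copied into a plain dict.
    return dict(raw_features)
```

Keeping this step as a plain function separates input handling from the UI code, which makes it straightforward to unit test.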
Running the Workflow
Install dependencies using uv:
uv sync
source .venv/bin/activate
Load the data:
nobs run load_pokemon_data
Train the model:
nobs run train_pokemon_model
Deploy everything:
nobs deploy
After deployment:
- The MLflow UI will be available at the configured domain.
- The Streamlit application will be accessible via the deployment’s public URL.
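The load, train, and deploy commands listed under Running the Workflow can also be chained in a small script, for example in CI. This is a sketch that assumes the nobs CLI is on the PATH and the environment is already set up with uv sync. The commands come from the steps above, while WORKFLOW_STEPS and run_workflow are hypothetical names.

```python
import subprocess
from typing import Callable, Sequence

# Commands copied from the workflow steps, in the order they are listed.
WORKFLOW_STEPS: list[list[str]] = [
    ["nobs", "run", "load_pokemon_data"],
    ["nobs", "run", "train_pokemon_model"],
    ["nobs", "deploy"],
]


def run_workflow(
    steps: Sequence[Sequence[str]] = WORKFLOW_STEPS,
    runner: Callable[..., object] = subprocess.run,
) -> None:
    """Run each command in order and stop at the first failure.

    With the default runner, check=True makes subprocess.run raise
    CalledProcessError on a non-zero exit code, so a failed data load
    prevents training, and a failed training run prevents deployment.
    """
    for command in steps:
        print("running:", " ".join(command))
        runner(list(command), check=True)
```

Calling run_workflow() executes the three steps in sequence. The runner parameter exists so the ordering logic can be tested without invoking the real CLI.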