Examples

MLOps Pokemon Model

This example shows how to build a complete MLOps workflow using NoBS Python. The workflow includes:

Fetching and storing data
Training and logging models into MLflow
Exposing a Streamlit application for interactive prediction
Running everything as a repeatable and deployable NoBS Python project

The project is structured in three main parts:

Project configuration (project.py)
Data loading and model training logic (src/pokemon.py)
User-facing prediction interface (src/pokemon_app.py)

Project Overview

The project file defines:

A shared MLflow server
Two jobs: one to load Pokémon data, and one to train a classifier
A Streamlit app for real-time predictions

project.py:

python

from nobs.models import Project, MlflowServer, Job, StreamlitApp
from nobs.secrets import MlflowConfig, S3StorageConfig

from src.pokemon_app import main
from src.pokemon import LoadData, TrainConfig, load_all_data, train_model


project = Project(
    name="mlops-example",
    shared_secrets=[S3StorageConfig],

    mlflow_server=MlflowServer(
        domain_names=[
            "example.aligned.codes"
        ]
    ),

    load_pokemon_data=Job(
        load_all_data,
        arguments=LoadData()
    ),

    train_pokemon_model=Job(
        train_model,
        arguments=TrainConfig(),
        secrets=[MlflowConfig]
    ),

    is_legendary_app=StreamlitApp(
        main,
        secrets=[MlflowConfig],
    )
)

This configuration does the following:

The MLflow server is deployed for logging and tracking experiments.
load_pokemon_data loads the Pokémon dataset and stores it in S3.
train_pokemon_model trains a classifier and registers it in MLflow.
The Streamlit application provides a UI for making predictions with the trained model.

Data Loading and Model Training Logic

The logic for data ingestion, dataset definition, and model training is located in src/pokemon.py.

Key steps:

Data is retrieved using the AiopokeClient.
The retrieved data is validated and stored using aligned data contracts.
A classification model is defined through a model contract.
The model is trained and logged to MLflow.
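
The validation step can be pictured with a plain-Python stand-in. This is only a sketch and does not use the aligned API: `PokemonRecord`, `EXPECTED_TYPES`, `validate_record`, and the chosen columns are hypothetical names for illustration, and the real contract in src/pokemon.py may define different fields.

```python
from dataclasses import dataclass

# Hypothetical columns and types; the real data contract may differ.
EXPECTED_TYPES = {
    "id": int,
    "name": str,
    "hp": int,
    "attack": int,
    "defense": int,
    "is_legendary": bool,
}


@dataclass(frozen=True)
class PokemonRecord:
    """Illustrative schema for one validated row."""
    id: int
    name: str
    hp: int
    attack: int
    defense: int
    is_legendary: bool


def validate_record(raw: dict) -> PokemonRecord:
    """Reject rows with missing columns or unexpected types."""
    for column, expected in EXPECTED_TYPES.items():
        if column not in raw:
            raise ValueError(f"missing column: {column}")
        # Exact type match, so a bool is not silently accepted as an int.
        if type(raw[column]) is not expected:
            raise TypeError(f"{column} should be {expected.__name__}")
    # Extra keys in the raw payload are dropped.
    return PokemonRecord(**{column: raw[column] for column in EXPECTED_TYPES})
```

A row fetched from the API would pass through `validate_record` before being written to storage, so malformed records fail early instead of reaching the training job.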

src/pokemon.py:

python

# Full source omitted here for brevity; see src/pokemon.py in the project.

Important points:

LoadData defines the configuration for data ingestion.
TrainConfig defines adjustable training parameters.
Pokemon is a data contract describing the dataset schema.
IsLegendaryPokemon defines the model contract for predicting whether a Pokémon is legendary.
load_all_data loads and stores the dataset.
train_model prepares the dataset, trains a RandomForest model, logs metrics, and registers the model in MLflow.

After the train_pokemon_model job has run, MLflow contains the resulting runs, metrics, and registered models, which can then be compared or promoted.
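
As a rough illustration of the training step, the sketch below trains a RandomForest classifier with scikit-learn and logs it to MLflow. It is not the project's actual train_model: `TrainSettings`, its fields, the registered model name, and the helper functions are assumptions, and it expects the MLflow tracking URI to be configured already (for example through the `MLFLOW_TRACKING_URI` environment variable).

```python
from dataclasses import dataclass


@dataclass
class TrainSettings:
    """Hypothetical stand-in for TrainConfig; the real fields may differ."""
    n_estimators: int = 100
    test_size: float = 0.2
    random_state: int = 42
    registered_model_name: str = "is_legendary_pokemon"  # assumed name


def split_features_and_target(rows, feature_names, target_name):
    """Turn a list of dict rows into a feature matrix and a 0/1 label list."""
    features = [[row[name] for name in feature_names] for row in rows]
    target = [int(row[target_name]) for row in rows]
    return features, target


def train_and_log(rows, feature_names, target_name, settings=None):
    """Train a RandomForest model, then log params, accuracy, and the model."""
    # Imported lazily so the helpers above work without these packages.
    import mlflow
    import mlflow.sklearn
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    settings = settings or TrainSettings()
    X, y = split_features_and_target(rows, feature_names, target_name)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=settings.test_size, random_state=settings.random_state
    )

    with mlflow.start_run():
        model = RandomForestClassifier(
            n_estimators=settings.n_estimators,
            random_state=settings.random_state,
        )
        model.fit(X_train, y_train)
        accuracy = accuracy_score(y_test, model.predict(X_test))

        mlflow.log_params(
            {"n_estimators": settings.n_estimators, "test_size": settings.test_size}
        )
        mlflow.log_metric("accuracy", accuracy)
        # Registering under a name makes the model show up in the model registry.
        mlflow.sklearn.log_model(
            model, "model", registered_model_name=settings.registered_model_name
        )
    return accuracy
```

Keeping the tunable values in one settings object mirrors the role TrainConfig plays in the project: the job can be rerun with different parameters without touching the training code.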

User-Facing Prediction App

The Streamlit application provides a simple UI that allows a user to input Pokémon data and view predictions from the trained model.

src/pokemon_app.py:

python

# Full source omitted here for brevity; see src/pokemon_app.py in the project.

Key points:

The app loads both the data contract and the model contract.
The UI allows input by Pokémon ID or raw feature inputs.
Predictions are displayed directly in the interface.
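
For orientation, a stripped-down app along these lines could look like the sketch below. It covers only the raw-feature input path and is not the project's actual src/pokemon_app.py: the feature names, the model URI, and `build_feature_row` are assumptions made for illustration.

```python
FEATURE_NAMES = ["hp", "attack", "defense"]  # hypothetical feature columns
MODEL_URI = "models:/is_legendary_pokemon/latest"  # assumed registered model name


def build_feature_row(values: dict) -> list:
    """Order the raw inputs the way the model expects them."""
    missing = [name for name in FEATURE_NAMES if name not in values]
    if missing:
        raise ValueError(f"missing features: {missing}")
    return [float(values[name]) for name in FEATURE_NAMES]


def main() -> None:
    """Streamlit entry point: collect features, load the model, show a prediction."""
    # Imported lazily so build_feature_row can be used without these packages.
    import mlflow.pyfunc
    import pandas as pd
    import streamlit as st

    st.title("Is this Pokémon legendary?")
    values = {
        name: st.number_input(name, min_value=0, value=50) for name in FEATURE_NAMES
    }

    if st.button("Predict"):
        model = mlflow.pyfunc.load_model(MODEL_URI)
        frame = pd.DataFrame([build_feature_row(values)], columns=FEATURE_NAMES)
        prediction = model.predict(frame)[0]
        st.write("Legendary" if prediction else "Not legendary")
```

A `main` function of this shape is what project.py passes to StreamlitApp, so the same entry point serves both local runs and the deployed app.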

Deployed through NoBS Python, the app serves as a hosted, interactive demonstration of the model.

Running the Workflow

Install dependencies using uv and activate the virtual environment:

uv sync
source .venv/bin/activate

Load the data:

nobs run load_pokemon_data

Train the model:

nobs run train_pokemon_model

Deploy everything:

nobs deploy

After deployment:

The MLflow UI will be available at the configured domain.
The Streamlit application will be accessible via the deployment’s public URL.
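
A quick way to confirm that the MLflow server came up is to call its health endpoint. The sketch below assumes the server is served over HTTPS at the configured domain and that nothing (such as an authentication proxy) sits in front of the `/health` route that the MLflow tracking server exposes; `health_url` and `mlflow_is_healthy` are hypothetical helper names.

```python
from urllib.request import urlopen


def health_url(domain: str) -> str:
    """Build the health-check URL for a deployed MLflow server (HTTPS assumed)."""
    return f"https://{domain}/health"


def mlflow_is_healthy(domain: str, timeout: float = 5.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urlopen(health_url(domain), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error responses.
        return False
```

For the configuration shown in project.py, the call would be `mlflow_is_healthy("example.aligned.codes")`.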