NoBS Python

Examples

MLOps Pokemon Model

This example shows how to build a complete MLOps workflow using NoBS Python. The workflow includes:

  • Fetching and storing data
  • Training and logging models into MLflow
  • Exposing a Streamlit application for interactive prediction
  • Running everything as a repeatable and deployable NoBS Python project

The project is structured in three main parts:

  • Project configuration (project.py)
  • Data loading and model training logic (src/pokemon.py)
  • User-facing prediction interface (src/pokemon_app.py)

Project Overview

The project file defines:

  • A shared MLflow server
  • Two jobs: one to load Pokémon data, and one to train a classifier
  • A Streamlit app for real-time predictions

project.py:

python
from nobs.models import Project, MlflowServer, Job, StreamlitApp
from nobs.secrets import MlflowConfig, S3StorageConfig

from src.pokemon_app import main
from src.pokemon import LoadData, TrainConfig, load_all_data, train_model


project = Project(
    name="mlops-example",
    shared_secrets=[S3StorageConfig],

    mlflow_server=MlflowServer(
        domain_names=[
            "example.aligned.codes"
        ]
    ),

    load_pokemon_data=Job(
        load_all_data,
        arguments=LoadData()
    ),

    train_pokemon_model=Job(
        train_model,
        arguments=TrainConfig(),
        secrets=[MlflowConfig]
    ),

    is_legendary_app=StreamlitApp(
        main,
        secrets=[MlflowConfig],
    )
)

This configuration does the following:

  • The MLflow server is deployed for logging and tracking experiments.
  • load_pokemon_data loads the Pokémon dataset and stores it in S3.
  • train_pokemon_model trains a classifier and registers it in MLflow.
  • The Streamlit application provides a UI for making predictions with the trained model.

Data Loading and Model Training Logic

The logic for data ingestion, dataset definition, and model training is located in src/pokemon.py.

Key steps:

  • Data is retrieved using the AiopokeClient.
  • The retrieved data is validated and stored using aligned data contracts.
  • A classification model is defined through a model contract.
  • The model is trained and logged to MLflow.
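The first two steps above (retrieve, then validate and store) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in src/pokemon.py: PokemonRecord, fetch_pokemon, load_records, and store_records are hypothetical names, the canned rows stand in for a real API client such as AiopokeClient, and a local JSON file stands in for S3.

```python
import asyncio
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class PokemonRecord:
    """Hypothetical stand-in for a data contract; fields are illustrative."""

    id: int
    name: str
    hp: int
    attack: int
    is_legendary: bool

    def __post_init__(self) -> None:
        # Reject rows that break the schema before they are stored.
        if self.id <= 0:
            raise ValueError(f"id must be positive, got {self.id}")
        if not self.name:
            raise ValueError("name must not be empty")
        if self.hp < 0 or self.attack < 0:
            raise ValueError("stats must be non-negative")


async def fetch_pokemon(pokemon_id: int) -> dict:
    """Stand-in for an API call: returns canned rows, no network access."""
    canned = {
        1: {"id": 1, "name": "bulbasaur", "hp": 45, "attack": 49,
            "is_legendary": False},
        150: {"id": 150, "name": "mewtwo", "hp": 106, "attack": 110,
              "is_legendary": True},
    }
    await asyncio.sleep(0)  # yield control, as a real request would
    return canned[pokemon_id]


async def load_records(ids: list[int]) -> list[PokemonRecord]:
    """Fetch all ids concurrently, then validate each row via the dataclass."""
    raw_rows = await asyncio.gather(*(fetch_pokemon(i) for i in ids))
    return [PokemonRecord(**row) for row in raw_rows]


def store_records(records: list[PokemonRecord], path: Path) -> None:
    """Write validated records as JSON (a local file stands in for S3)."""
    path.write_text(json.dumps([asdict(r) for r in records]))


if __name__ == "__main__":
    loaded = asyncio.run(load_records([1, 150]))
    store_records(loaded, Path("pokemon.json"))
```

Validating at construction time means a malformed row fails during loading rather than later during training.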

src/pokemon.py:

python
# (full code omitted for brevity)

Important points:

  • LoadData defines the configuration for data ingestion.
  • TrainConfig defines adjustable training parameters.
  • Pokemon is a data contract describing the dataset schema.
  • IsLegendaryPokemon defines the model contract for predicting whether a Pokémon is legendary.
  • load_all_data loads and stores the dataset.
  • train_model prepares the dataset, trains a RandomForest model, logs metrics, and registers the model in MLflow.
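To make "adjustable training parameters" concrete, here is a minimal sketch of how a config object can drive dataset preparation. The field names (test_fraction, n_estimators, random_seed) and the split_rows helper are assumptions made for illustration; the real TrainConfig may expose different parameters and may not be a dataclass.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # All fields are hypothetical examples of tunable parameters.
    test_fraction: float = 0.25
    n_estimators: int = 100
    random_seed: int = 42


def split_rows(rows: list, config: TrainConfig) -> tuple[list, list]:
    """Shuffle deterministically, then split into (train, test)."""
    shuffled = rows[:]  # copy so the caller's list is left untouched
    random.Random(config.random_seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * config.test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = split_rows(list(range(20)), TrainConfig(test_fraction=0.5))
    print(len(train), len(test))
```

Seeding the shuffle from the config keeps a run reproducible, which makes metrics from different runs easier to compare.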

After the train_pokemon_model job runs, MLflow contains the runs, their metrics, and the registered model versions, which can then be compared or promoted.

User-Facing Prediction App

The Streamlit application provides a simple UI that allows a user to input Pokémon data and view predictions from the trained model.

src/pokemon_app.py:

python
# (full code omitted for brevity)

Key points:

  • The app loads both the data contract and the model contract.
  • The UI allows input by Pokémon ID or raw feature inputs.
  • Predictions are displayed directly in the interface.
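The two input modes described above can be sketched without Streamlit. Everything here is hypothetical: Features, resolve_features, toy_score, and the 0.5 threshold are illustrative names and values, and toy_score is a hand-written heuristic standing in for the trained model, not the model itself.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Features:
    hp: int
    attack: int


# Small lookup table standing in for the stored dataset.
KNOWN_POKEMON = {
    1: Features(hp=45, attack=49),      # bulbasaur
    150: Features(hp=106, attack=110),  # mewtwo
}


def resolve_features(pokemon_id: Optional[int],
                     raw: Optional[Features]) -> Features:
    """Prefer raw feature input; otherwise look the Pokémon up by ID."""
    if raw is not None:
        return raw
    if pokemon_id is None:
        raise ValueError("provide either a Pokémon ID or raw features")
    if pokemon_id not in KNOWN_POKEMON:
        raise KeyError(f"unknown Pokémon ID: {pokemon_id}")
    return KNOWN_POKEMON[pokemon_id]


def toy_score(features: Features) -> float:
    """Toy heuristic in [0, 1]; a real app would call the trained model."""
    return min(1.0, (features.hp + features.attack) / 250)


def predict_is_legendary(features: Features,
                         model: Callable[[Features], float] = toy_score,
                         threshold: float = 0.5) -> bool:
    """Turn a model score into a yes/no answer for display."""
    return model(features) >= threshold


if __name__ == "__main__":
    print(predict_is_legendary(resolve_features(150, None)))
```

Keeping input resolution separate from prediction lets both paths share one prediction function, whichever widget the user fills in.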

Deploying this app with NoBS Python gives you a hosted, interactive demo of the model.

Running the Workflow

Install dependencies using uv:

uv sync
source .venv/bin/activate

Load the data:

nobs run load_pokemon_data

Train the model:

nobs run train_pokemon_model

Deploy everything:

nobs deploy
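
If you want to run the three commands above as one repeatable step, a small wrapper script is one option. This is a sketch, not part of NoBS Python: run_workflow and WORKFLOW_STEPS are hypothetical names, and the script only assumes that the nobs commands shown above are available on your PATH. It defaults to a dry run that returns the command lines without executing them.

```python
import shlex
import subprocess

# The commands from this section, in the order they should run.
WORKFLOW_STEPS = [
    ["nobs", "run", "load_pokemon_data"],
    ["nobs", "run", "train_pokemon_model"],
    ["nobs", "deploy"],
]


def run_workflow(steps: list[list[str]] = WORKFLOW_STEPS,
                 dry_run: bool = True) -> list[str]:
    """Run each step in order and return the command lines that were handled.

    With dry_run=True nothing is executed. With dry_run=False, check=True
    makes a failing step raise CalledProcessError, so later steps are skipped.
    """
    handled = []
    for argv in steps:
        handled.append(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
    return handled


if __name__ == "__main__":
    for line in run_workflow(dry_run=True):
        print(line)
```

Stopping at the first failure matters here because training depends on the data that the load step stores.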

After deployment:

  • The MLflow UI will be available at the configured domain.
  • The Streamlit application will be accessible via the deployment’s public URL.