```mermaid
flowchart TD
    subgraph "Traditional ML Development Cycle"
        direction LR
        A[Idea] --> B(Write DataLoaders) --> C(Write Model) --> D(Write Training Loop) --> E(Write Logging/Metrics) --> F(Debug & Run) --> G{Evaluate}
        G -- "Hypothesis Fails?" --> B
        G -- "Works!" --> H[Deploy]
    end
    subgraph "cmn_ai Accelerated Cycle"
        direction LR
        I[Idea] --> J[Configure Data & Model] --> K["learner = Learner(...)"] --> L["learner.fit()"] --> M{Evaluate}
        M -- "Hypothesis Fails?" --> J
        M -- "Works!" --> N[Deploy]
    end
    style D stroke-width:2px,stroke-dasharray: 5 5,stroke:red
    style E stroke-width:2px,stroke-dasharray: 5 5,stroke:red
    style K stroke-width:2px,stroke:green
    style L stroke-width:2px,stroke:green
```
In the world of machine learning, the biggest bottleneck isn’t always model performance—it’s the time it takes to get there. cmn_ai is a high-performance Python library designed to break that bottleneck. Built for AI, Deep Learning, and Data Science, it provides a robust toolkit of reusable components for PyTorch and scikit-learn that eliminates boilerplate and lets you focus on what truly matters: rapid experimentation and faster delivery.
The Problem: Why Is ML Development So Slow?
If you’ve ever built a machine learning model, you know the routine. You spend hours, if not days, writing and rewriting the same boilerplate code: custom training loops, logging mechanisms, data loading pipelines, and metric calculations. While essential, this repetitive work slows down the cycle of experimentation, which is the very heart of machine learning. Every moment spent on boilerplate is a moment not spent testing a new hypothesis, tuning a hyperparameter, or analyzing a result.
This is the problem `cmn_ai` was built to solve. It is a comprehensive library born from years of real-world ML engineering experience, designed to abstract away the repetitive tasks and provide a solid foundation for your projects.
Takeaway: The overhead of writing boilerplate code for training loops, data handling, and logging is a major obstacle to rapid experimentation in machine learning.
Our Guiding Principle: Boyd’s Law in Machine Learning
The philosophy behind `cmn_ai` is directly inspired by Boyd’s Law of Iteration: Speed of iteration beats quality of iteration.
Fighter pilot John Boyd argued that in combat, the side able to observe, orient, decide, and act (the OODA loop) fastest would win, even if its individual actions weren’t perfect. The same is true in machine learning: the ability to quickly run an experiment, get feedback, and start the next cycle is more valuable than spending weeks perfecting a single, monolithic training script.
`cmn_ai` is designed to accelerate your OODA loop, letting you test ideas faster and arrive at a better solution sooner.
Diagram 1: Workflow Comparison. The `cmn_ai` workflow (bottom) significantly reduces the repetitive, time-consuming steps (red dashed boxes) inherent in the traditional approach (top) by encapsulating them within the `Learner` class.
Takeaway: By prioritizing iteration speed, `cmn_ai` helps you learn from more experiments in less time, leading to better models, faster.
The Core Engine: A Flexible `Learner` Architecture
The heart of `cmn_ai`’s deep learning toolkit is the `Learner` class. It serves as a powerful, flexible orchestrator for the entire training process, handling everything from device placement to mixed-precision training and metric tracking.
Let’s compare a standard PyTorch training snippet with the `cmn_ai` approach.
Before: A Manual PyTorch Training Loop
```python
# A lot of manual steps...
model.to(device)
for epoch in range(epochs):
    for xb, yb in dl:
        xb, yb = xb.to(device), yb.to(device)
        optimizer.zero_grad()
        pred = model(xb)
        loss = loss_func(pred, yb)
        loss.backward()
        optimizer.step()
# ... and you still need to add logging, metrics, etc.
```
After: Using the `cmn_ai` `Learner`
```python
from cmn_ai.learner import Learner
from cmn_ai.callbacks.training import DeviceCallBack, Recorder

# Create a learner with data, model, and callbacks
learner = Learner(model, dls, loss_func, opt_func, callbacks=[Recorder("lr")])
learner.add_callback(DeviceCallBack("cuda:0"))  # Handles moving data to the GPU

# Train your model with one line
learner.fit(epochs=10, lr=1e-3)
```
As you can see, the `Learner` API is clean and concise, yet highly extensible through its powerful callback system.
Takeaway: The `Learner` class replaces manual training loops with a clean, high-level API, letting you focus on the model and data, not the plumbing.
Fine-Grained Control with an Exception-Based Callback System
While `Learner` provides simplicity, callbacks provide power. `cmn_ai` uses a unique, exception-based callback system that gives you precise control over every stage of the training process.
Callbacks are small, self-contained classes that can be “hooked” into the `Learner` to perform actions at specific moments (e.g., `after_batch`, `before_epoch`). By raising a specific `Cancel...Exception`, a callback can gracefully interrupt and modify the training flow on the fly.
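To make the wiring concrete, here is an illustrative sketch of how an exception-based batch step can be structured. This is not `cmn_ai`’s actual source: the hook names follow the descriptions in this article, while `run_callbacks` and the `learn` attributes are stand-in assumptions.

```python
# Hypothetical sketch of an exception-based training step. Each stage is
# guarded by its own try/except, so a callback exception cancels only the
# stage it targets.
class CancelBackwardException(Exception): ...
class CancelStepException(Exception): ...


def one_batch(learn, xb, yb):
    learn.run_callbacks("before_batch")        # stand-in callback dispatcher
    preds = learn.model(xb)
    learn.loss = learn.loss_func(preds, yb)
    try:
        learn.run_callbacks("after_loss")      # may raise CancelBackwardException
        learn.loss.backward()
    except CancelBackwardException:
        pass                                   # loss.backward() skipped
    try:
        learn.run_callbacks("after_backward")  # may raise CancelStepException
        learn.opt.step()
        learn.opt.zero_grad()
    except CancelStepException:
        pass                                   # step() and zero_grad() skipped,
                                               # so gradients keep accumulating
    learn.run_callbacks("after_batch")
```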
```mermaid
sequenceDiagram
    participant Learner
    participant Callback
    loop Training Loop
        Learner->>Callback: before_batch()
        Note over Learner: Forward pass, calculate loss...
        Learner->>Callback: after_loss()
        opt CancelBackwardException thrown?
            Learner->>Learner: Skip backward()
        end
        Learner->>Learner: loss.backward()
        Learner->>Callback: after_backward()
        opt CancelStepException thrown?
            Learner->>Learner: Skip optimizer.step()
        end
        Learner->>Learner: optimizer.step()
        Learner->>Callback: after_step()
    end
```
Diagram 2: Callback Exception Flow. This diagram shows how a callback can throw an exception (e.g., `CancelStepException`) after the backward pass to prevent the optimizer from updating the model weights for a specific batch, giving you ultimate control.
The main exceptions include:
- `CancelBatchException`: Skips the remainder of the current batch.
- `CancelBackwardException`: Skips the `loss.backward()` call.
- `CancelStepException`: Skips the `optimizer.step()` call.
- `CancelEpochException`: Skips the remainder of the current epoch.
- `CancelFitException`: Stops the entire training process immediately.
This system enables sophisticated training techniques, such as gradient accumulation (sketched below) or freezing layers, without complicating your main training logic.
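For example, gradient accumulation reduces to a callback that vetoes the optimizer step on most batches. The sketch below is a hypothetical illustration: the base `Callback` class, the exception import path, and the hook signature are assumptions based on the patterns described above, not the library’s confirmed API.

```python
# Hypothetical gradient-accumulation callback; import paths are assumptions.
from cmn_ai.callbacks.training import Callback, CancelStepException


class GradientAccumulation(Callback):
    """Only let optimizer.step() run every `n_batches` batches."""

    def __init__(self, n_batches: int = 4):
        self.n_batches = n_batches
        self.count = 0

    def after_backward(self):
        self.count += 1
        if self.count % self.n_batches != 0:
            # Skipping the step (and the zero_grad that follows it) leaves
            # this batch's gradients accumulated in the .grad buffers.
            raise CancelStepException()
```

Because the logic lives entirely in the callback, the training loop itself never changes.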
Takeaway: The exception-based callback framework offers a powerful and clean way to customize training behavior without rewriting the `Learner` or creating complex stateful logic.
Key Features at a Glance
| Feature | Description |
|---|---|
| 🚀 Accelerated Development | Pre-built modules and a flexible Learner eliminate boilerplate, enabling rapid prototyping. |
| 🎯 Best Practices Built-In | The library distills years of ML engineering experience into robust, reusable components with consistent APIs. |
| 🔧 Framework Integration | Built on PyTorch for deep learning and fully compatible with scikit-learn Pipeline and ColumnTransformer for tabular data. |
| 📊 Domain-Specific Tools | Specialized utilities for Vision, Text, and Tabular machine learning, including EDA tools and data visualizers. |
Getting Started: A Quick Tour
Getting started with `cmn_ai` is simple.
Important Note: `cmn_ai` requires Python 3.13+ and depends on PyTorch, scikit-learn, NumPy, and pandas.
Installation
The recommended way to install is directly from PyPI:
```bash
pip install cmn-ai
```
Quick Examples
Here’s how you can use `cmn_ai` for different tasks:
1. General Deep Learning

Customize your training loop with powerful callbacks for scheduling learning rates and tracking metrics.
```python
from cmn_ai.learner import Learner
from cmn_ai.callbacks.schedule import BatchScheduler
from cmn_ai.callbacks.training import MetricsCallback, ProgressCallback
from torcheval.metrics import MulticlassAccuracy
import torch.optim as opt
from functools import partial

# Schedule learning rate over all batches
sched = partial(opt.lr_scheduler.OneCycleLR, max_lr=6e-2, total_steps=100)
learner = Learner(model, dls, loss_func, opt_func)
learner.add_callbacks([
    ProgressCallback(),
    BatchScheduler(sched),
    MetricsCallback(accuracy=MulticlassAccuracy(num_classes=10)),
])
learner.fit(epochs=50, lr=1e-3)
```
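One detail worth noting in this example: `OneCycleLR` is wrapped in `functools.partial` rather than instantiated directly. A PyTorch scheduler requires a live optimizer as its first argument, and that optimizer presumably does not exist until `fit()` builds it from `opt_func`, so `BatchScheduler` is handed a factory it can call at that point.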
2. Computer Vision

The `VisionLearner` provides handy utilities like `show_batch` to quickly visualize your data.
```python
from cmn_ai.vision import VisionLearner

# Vision-specific learner with built-in utilities
vision_learner = VisionLearner(model, dls, loss_func)

# Visualize a batch of training data
vision_learner.show_batch()

vision_learner.fit(epochs=20, lr=1e-4)
```
3. Tabular Data Processing

`cmn_ai`’s tabular tools are fully compatible with scikit-learn, so they can be dropped directly into your existing pipelines.
```python
import pandas as pd
from cmn_ai.tabular.preprocessing import DateTransformer
from sklearn.pipeline import Pipeline

# Create sample time-series data
x = pd.DataFrame(
    pd.date_range(start=pd.to_datetime("1/1/2018"), end=pd.to_datetime("1/08/2018"))
)

# This transformer automatically extracts date features like Day, Month, Year, etc.
tfm = DateTransformer(drop=False)
transformed_data = tfm.fit_transform(x)
```
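Because `DateTransformer` follows the scikit-learn fit/transform protocol, it should drop into a `Pipeline` like any other step. The following is a minimal sketch under assumptions: the `StandardScaler` stage is illustrative, and `drop=True` is assumed to remove the raw datetime column so that only the extracted numeric features flow downstream.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Minimal sketch: chain the date-feature extraction with a standard
# scikit-learn step (assumes drop=True removes the raw datetime column).
pipe = Pipeline([
    ("dates", DateTransformer(drop=True)),
    ("scale", StandardScaler()),
])
features = pipe.fit_transform(x)  # x is the DataFrame created above
```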
Takeaway: `cmn_ai` provides a simple installation and a consistent API across different ML domains, making it easy to integrate into new or existing projects.
Under the Hood: A Modular Design
`cmn_ai` is designed to be modular, so you can use as much or as little of the library as you need. The architecture is organized logically by function.
```text
cmn_ai/
├── learner.py   # Core Learner class
├── callbacks/   # Training callbacks
├── vision/      # Computer vision utilities
├── text/        # NLP processing tools
├── tabular/     # Traditional ML tools
├── utils/       # Core utilities
├── plot.py      # Visualization tools
└── losses.py    # Custom loss functions
```
Source: cmn_ai GitHub Repository
This structure separates the core training engine from the domain-specific tools, making the library easy to maintain and extend.
```mermaid
graph TD
    subgraph "cmn_ai Architecture"
        L[Learner]
        CB[Callbacks]
        U[Utils]
        P[Plot]
        Loss[Losses]
        L -- "Uses" --> CB
        L -- "Uses" --> Loss
        L -- "Uses" --> U
        L -- "Uses" --> P
        subgraph "Domain Layers"
            V[VisionLearner]
            T[TextList]
            Tab[Tabular Transformers]
        end
        V -- "Extends" --> L
        T -- "Built for" --> L
        Tab -- "Integrates with" --> Sklearn[scikit-learn]
        U -- "Supports" --> V
        U -- "Supports" --> T
        U -- "Supports" --> Tab
    end
```
Diagram 3: High-Level Architecture. The core `Learner` is extended by domain-specific modules like `VisionLearner`, while tabular tools integrate directly with the scikit-learn ecosystem. Common utilities support all parts of the library.
Takeaway: The modular design allows you to adopt `cmn_ai` incrementally and ensures that the library remains organized and scalable.
Conclusion: Build Faster, Iterate Smarter
`cmn_ai` is more than a collection of tools; it embodies a workflow philosophy designed to make you a more effective and efficient machine learning practitioner. By handling the boilerplate and providing a flexible, powerful framework for experimentation, it allows you to accelerate your development cycles and deliver robust solutions faster.
Ready to speed up your workflow?
- Install the library: `pip install cmn-ai`
- Explore the code: Visit the GitHub Repository
- Read the docs: Full Documentation
License and Citation
`cmn_ai` is licensed under the Apache License 2.0. If you use this library in your research, please consider citing it:
```bibtex
@software{cmn_ai,
  title={cmn_ai: A Machine Learning Library for Accelerated AI Workflows},
  author={Imad Dabbura},
  url={https://github.com/ImadDabbura/cmn_ai},
  year={2024}
}
```