# Optuna — Automatic Hyperparameter Optimization Framework

> Optuna is an automatic hyperparameter optimization framework for machine learning. It provides an imperative define-by-run API that lets you construct search spaces dynamically, with built-in pruning, visualization, and distributed optimization across multiple workers.

## Install

```bash
pip install optuna
```

## Quick Use

```bash
python -c "
import optuna

def objective(trial):
    x = trial.suggest_float('x', -10, 10)
    return (x - 2) ** 2

study = optuna.create_study()
study.optimize(objective, n_trials=100, show_progress_bar=False)
print(f'Best value: {study.best_value:.4f}, Best params: {study.best_params}')
"
```

## Introduction

Optuna is a next-generation hyperparameter optimization framework that automates the tedious process of tuning ML model configurations. Its define-by-run API lets you build search spaces programmatically inside the objective function, making it more flexible than config-file-based alternatives. It is framework-agnostic and works with PyTorch, TensorFlow, scikit-learn, XGBoost, and LightGBM.

## What Optuna Does

- Searches hyperparameter spaces using efficient algorithms such as TPE, CMA-ES, and random sampling
- Prunes unpromising trials early using median, percentile, or Hyperband-based strategies
- Supports multi-objective optimization for balancing competing metrics (e.g., accuracy vs. latency)
- Distributes trials across multiple workers using shared storage backends (RDB, Redis)
- Provides built-in visualization for optimization history, parameter importance, and parallel coordinates

## Architecture Overview

Optuna uses a study/trial abstraction. A Study manages a collection of Trials, each representing one hyperparameter evaluation. The sampler proposes parameter values using Bayesian optimization (Tree-structured Parzen Estimator by default), informed by past trials.
A pruner monitors intermediate values during training to stop bad trials early. All trial data is stored in a configurable backend (SQLite, MySQL, PostgreSQL, or in-memory), enabling parallel distributed optimization.

## Self-Hosting & Configuration

- Install via pip: `pip install optuna`, with optional extras: `optuna[visualization]`, `optuna[redis]`
- Create a persistent study: `optuna.create_study(storage='sqlite:///optuna.db', study_name='my_study')`
- Use the Optuna Dashboard for a web UI: `pip install optuna-dashboard && optuna-dashboard sqlite:///optuna.db`
- Distribute optimization by pointing multiple workers at the same storage backend
- Integrate with ML frameworks via callbacks, e.g. `optuna.integration.PyTorchLightningPruningCallback`

## Key Features

- Define-by-run API enables conditional and dynamic search spaces that config files cannot express
- TPE sampler efficiently explores high-dimensional spaces with fewer trials than grid or random search
- Pruning saves compute by stopping unpromising trials after a few epochs
- Multi-objective optimization with Pareto-front visualization for tradeoff analysis
- Framework integrations for PyTorch Lightning, Keras, XGBoost, LightGBM, and FastAI

## Comparison with Similar Tools

- **Ray Tune** — distributed HPO with more scheduling backends but a heavier dependency footprint
- **Hyperopt** — pioneered TPE but has a less intuitive API and no built-in pruning
- **Weights & Biases Sweeps** — cloud-integrated HPO but requires a W&B account
- **Keras Tuner** — Keras-specific; Optuna is framework-agnostic
- **scikit-optimize** — Bayesian optimization library but lacks pruning, multi-objective, and distributed support

## FAQ

**Q: How many trials do I need for good results?**
A: It depends on dimensionality, but TPE typically finds good regions in 50-200 trials for 5-10 hyperparameters. Pruning reduces wall-clock time significantly.
**Q: Can Optuna optimize non-ML objectives?**
A: Yes, Optuna can optimize any black-box function. It has been used for database query tuning, simulation parameters, and even recipe optimization.

**Q: How does pruning work?**
A: You report intermediate values (e.g., validation loss at each epoch) via `trial.report()`. The pruner compares them against other trials and raises `TrialPruned` if the trial is underperforming.

**Q: Can I resume a study after interruption?**
A: Yes, if you use a persistent storage backend like SQLite or PostgreSQL. Calling `optuna.load_study()` picks up where you left off.

## Sources

- https://github.com/optuna/optuna
- https://optuna.readthedocs.io

---

Source: https://tokrepo.com/en/workflows/ad894074-3d9d-11f1-9bc6-00163e2b0d79
Author: Script Depot