API Reference¶
This section contains the auto-generated documentation for Octuner's classes, methods, and functions extracted from the source code docstrings.
Core Components¶
AutoTuner¶
The main entry point for auto-tuning LLM components. Orchestrates discovery, execution, and optimization.
octuner.optimization.auto.AutoTuner(component, entrypoint=None, dataset=None, metric=None, entrypoint_function=None, metric_function=None, max_workers=1, optimization_mode='pareto', n_trials=120, constraints=None, scalarization_weights=None) ¶
This is the main orchestrator for auto-tuning LLM components. As the central class of the Octuner optimization system, it coordinates the entire parameter optimization workflow. It automatically discovers tunable parameters in complex component hierarchies, builds search spaces, and runs optimization algorithms to find the best configuration.
How it works:
- Component Discovery: Automatically finds all tunable components in the object hierarchy using ComponentDiscovery, identifying parameters that can be optimized (temperature, model selection, provider choices, etc.)
- Search Space Construction: Builds a multi-dimensional search space from discovered parameters, defining the optimization landscape with parameter types, ranges, and constraints.
- Optimization Execution: Runs search algorithms (Pareto, constrained, or scalarized optimization) to explore the search space and find optimal parameter combinations.
- Result Analysis: Returns the best parameters, performance metrics, and detailed trial information for analysis and application.
Key Features:
- Multi-Objective Optimization: Supports Pareto optimization for balancing quality, cost, and latency objectives
- Constraint Handling: Supports hard constraints for real-world deployment requirements
- Scalarization: Converts multi-objective problems to single-objective optimization with custom weights
- Parallel Execution: Supports concurrent trial execution for faster optimization
- Flexible Filtering: Include/exclude specific parameters to focus optimization
- Reproducible Results: Seed support for consistent optimization runs
Example
from octuner import AutoTuner, MultiProviderTunableLLM

# Create a tunable component
llm = MultiProviderTunableLLM(config_file="config.yaml")

# Define evaluation function and dataset
def evaluate(component, input_data):
    result = component.call(input_data["text"])
    return {"quality": compute_quality(result, input_data["target"])}

dataset = [{"text": "Hello", "target": "Hi there"}]

# Create and configure tuner
tuner = AutoTuner(
    component=llm,
    entrypoint=evaluate,
    dataset=dataset,
    metric=lambda output, target: output["quality"]
)

# Focus on specific parameters
tuner.include(["*.temperature", "*.provider_model"])

# Run optimization
result = tuner.search(max_trials=50, mode="pareto")

# Apply the best parameters
from octuner import apply_best
apply_best(llm, result.best_parameters)
Optimization Modes:
- **"pareto"**: Multi-objective optimization finding Pareto-optimal solutions
- **"constrained"**: Single-objective optimization with hard constraints
- **"scalarized"**: Multi-objective converted to single-objective with weights
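To make the difference between the three modes concrete, here is a minimal, library-independent sketch. It does not use Octuner's internals: the trial dictionaries, the helper names (`dominates`, `pareto_front`, `best_constrained`, `best_scalarized`), and the sign convention in the scalarized score are assumptions chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical trial record: quality is maximized, cost and latency_ms are minimized.
Trial = Dict[str, float]


def dominates(a: Trial, b: Trial) -> bool:
    """Return True if a is no worse than b on every objective and better on one."""
    no_worse = (
        a["quality"] >= b["quality"]
        and a["cost"] <= b["cost"]
        and a["latency_ms"] <= b["latency_ms"]
    )
    strictly_better = (
        a["quality"] > b["quality"]
        or a["cost"] < b["cost"]
        or a["latency_ms"] < b["latency_ms"]
    )
    return no_worse and strictly_better


def pareto_front(trials: List[Trial]) -> List[Trial]:
    """'pareto' mode: keep every trial that no other trial dominates."""
    return [t for t in trials if not any(dominates(o, t) for o in trials)]


def best_constrained(trials: List[Trial], max_cost: float, max_latency_ms: float) -> Optional[Trial]:
    """'constrained' mode: best quality among trials that satisfy the hard limits."""
    feasible = [t for t in trials if t["cost"] <= max_cost and t["latency_ms"] <= max_latency_ms]
    return max(feasible, key=lambda t: t["quality"], default=None)


def best_scalarized(trials: List[Trial], weights: Dict[str, float]) -> Trial:
    """'scalarized' mode: collapse the objectives into one weighted score (assumed form)."""
    return max(trials, key=lambda t: weights["quality"] * t["quality"] - weights["cost"] * t["cost"])


trials = [
    {"quality": 0.90, "cost": 0.020, "latency_ms": 800.0},
    {"quality": 0.85, "cost": 0.005, "latency_ms": 400.0},
    {"quality": 0.80, "cost": 0.010, "latency_ms": 600.0},  # dominated by the second trial
]

print(pareto_front(trials))
print(best_constrained(trials, max_cost=0.01, max_latency_ms=500))
print(best_scalarized(trials, {"quality": 0.8, "cost": 0.2}))
```

In this sketch the Pareto mode returns a set of trade-offs, while the other two modes each return a single trial. A real scalarization would usually normalize the objectives first, since quality and cost are on different scales.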
Initialize the AutoTuner with a component and evaluation setup.
This constructor sets up the AutoTuner with all necessary components for parameter optimization. It validates the inputs and prepares the internal state for discovery and optimization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
component | Any | The component to optimize. Must contain tunable parameters (implement TunableMixin or be registered as tunable). Can be a single component or a complex hierarchy of components. | required |
entrypoint | EntrypointFunction | Function that evaluates the component with input data. Called as | None |
dataset | Dataset | List of input/target pairs for evaluation. Each item should contain the input data and expected output for evaluation. | None |
metric | MetricFunction | Function that computes quality scores from evaluation results. Called as | None |
entrypoint_function | EntrypointFunction | Same as entrypoint but with clearer naming. | None |
metric_function | MetricFunction | Same as metric but with clearer naming. | None |
max_workers | int | Maximum number of concurrent workers for parallel evaluation during optimization trials. Higher values speed up optimization but use more resources. | 1 |
optimization_mode | str | Optimization strategy to use. Options: - "pareto": Multi-objective optimization (default) - "constrained": Single-objective with constraints - "scalarized": Multi-objective with custom weights | 'pareto' |
n_trials | int | Default number of optimization trials to run. Can be overridden in search() calls. | 120 |
constraints | Optional[Constraints] | Hard constraints for constrained optimization mode. Dictionary with constraint names and values. | None |
scalarization_weights | Optional[ScalarizationWeights] | Weights for scalarized optimization mode. Dictionary mapping objective names to weights. | None |
from_component(*, component, entrypoint, dataset, metric, max_workers=1) classmethod ¶
Create an AutoTuner instance using the factory pattern. It's the recommended way to create AutoTuner instances for most use cases.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
component | Any | The component to optimize. Must contain tunable parameters (implement TunableMixin or be registered as tunable). | required |
entrypoint | EntrypointFunction | Function that evaluates the component with input data. Called as | required |
dataset | Dataset | List of input/target pairs for evaluation. Each item should contain the input data and expected output for evaluation. | required |
metric | MetricFunction | Function that computes quality scores from evaluation results. Called as | required |
max_workers | int | Maximum number of concurrent workers for parallel evaluation during optimization trials. Default is 1. | 1 |
Returns:
| Type | Description |
|---|---|
AutoTuner | Configured AutoTuner instance ready for optimization. |
Example
# Create tuner using factory method
tuner = AutoTuner.from_component(
component=my_llm,
entrypoint=lambda comp, data: comp.call(data["text"]),
dataset=test_dataset,
metric=lambda output, target: compute_quality(output, target),
max_workers=4
)
# Run optimization
result = tuner.search(max_trials=100)
include(patterns) ¶
This method narrows the optimization to specific parameters or components, which reduces the search space and focuses it on the parameters that matter most for your use case.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
patterns | List[str] | List of glob patterns to include in optimization. Only parameters matching at least one pattern will be included in the search space. Examples: - ["*.temperature"]: Include all temperature parameters - ["*.provider_model"]: Include all provider/model selections - ["classifier_llm.*"]: Include all parameters in classifier_llm - ["*.max_tokens", "*.top_p"]: Include max_tokens and top_p | required |
Returns:
| Type | Description |
|---|---|
AutoTuner | Self for method chaining, allowing fluent interface. |
Example
Note
Include patterns are applied before exclude patterns. If no include patterns are set, all discovered parameters are considered for inclusion.
exclude(patterns) ¶
This method excludes parameters from the optimization process. It is typically used to remove debug parameters, verbose settings, or other parameters that shouldn't be optimized.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
patterns | List[str] | List of glob patterns to exclude from optimization. Parameters matching any pattern will be removed from the search space. Examples: - ["*.verbose"]: Exclude all verbose parameters - ["*.debug", "*.log_level"]: Exclude debug and logging parameters - ["*.frequency_penalty"]: Exclude specific parameters - ["nested.component.*"]: Exclude all parameters in nested.component | required |
Returns:
| Type | Description |
|---|---|
AutoTuner | Self for method chaining |
Example
Note
Exclude patterns are applied after include patterns. If a parameter matches both include and exclude patterns, it will be excluded.
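The include-then-exclude ordering described in the two notes can be sketched with the standard library's `fnmatch` module. This is an illustration only: whether Octuner itself uses `fnmatch` is an assumption, and `filter_parameters` and the sample parameter paths are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional


def filter_parameters(
    paths: Iterable[str],
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> List[str]:
    """Apply include patterns first, then exclude patterns (exclude wins on conflict)."""
    selected = []
    for path in paths:
        # With no include patterns, every discovered parameter is a candidate.
        if include and not any(fnmatchcase(path, pattern) for pattern in include):
            continue
        # A parameter that matches any exclude pattern is dropped, even if included.
        if exclude and any(fnmatchcase(path, pattern) for pattern in exclude):
            continue
        selected.append(path)
    return selected


# Hypothetical parameter paths, in the "component.parameter" form used in this reference.
paths = [
    "llm.temperature",
    "llm.verbose",
    "classifier_llm.temperature",
    "classifier_llm.max_tokens",
]

print(filter_parameters(paths, include=["*.temperature", "classifier_llm.*"], exclude=["*.max_tokens"]))
# → ['llm.temperature', 'classifier_llm.temperature']
```

Here `classifier_llm.max_tokens` matches an include pattern and an exclude pattern, so it is removed, which mirrors the rule stated in the note.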
build_search_space() ¶
Performs the component discovery process and constructs the search space that defines the optimization landscape. It's automatically called by search() if not already built, but can be called manually to inspect the discovered parameters.
The discovery process:
- Component Traversal: Recursively explores the component hierarchy
- Parameter Detection: Identifies all tunable parameters using TunableMixin protocol or registry-based detection
- Search Space Construction: Builds a flat dictionary mapping parameter paths to their definitions and constraints
- Validation: Ensures at least one tunable parameter is found
The resulting search space contains:
- Parameter paths (e.g., "llm.temperature", "classifier.max_tokens")
- Parameter types ("float", "int", "choice", "bool")
- Value ranges or choices for each parameter
- Default values where applicable
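As a rough picture of that structure, the sketch below builds a flat dictionary by hand and derives a summary with the keys documented for get_search_space_summary(). The tuple layout follows the (type, min, max) and (type, choices) forms described for LLMOptimizer; the concrete paths, ranges, and the `summarize` helper are assumptions, not Octuner's actual output.

```python
from collections import Counter
from typing import Any, Dict, Tuple

# Hypothetical search space: parameter path -> (type, min, max) or (type, choices).
search_space: Dict[str, Tuple[Any, ...]] = {
    "llm.temperature": ("float", 0.0, 2.0),
    "llm.max_tokens": ("int", 64, 4096),
    "llm.provider_model": ("choice", ["openai:model-a", "gemini:model-b"]),
    "classifier.temperature": ("float", 0.0, 1.0),
}


def summarize(space: Dict[str, Tuple[Any, ...]]) -> Dict[str, Any]:
    """Count parameters overall, per type, and per owning component."""
    return {
        "total_parameters": len(space),
        "parameter_types": dict(Counter(definition[0] for definition in space.values())),
        # The component is everything before the last dot in the path.
        "components": dict(Counter(path.rsplit(".", 1)[0] for path in space)),
    }


summary = summarize(search_space)
print(summary["total_parameters"])  # → 4
print(summary["parameter_types"])   # → {'float': 2, 'int': 1, 'choice': 1}
print(summary["components"])        # → {'llm': 3, 'classifier': 1}
```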
Returns:
| Type | Description |
|---|---|
None | None. The search space is stored in self.search_space. |
Raises:
| Type | Description |
|---|---|
ValueError | If no tunable components are found in the component hierarchy. This usually means the component doesn't implement TunableMixin or isn't registered as tunable. |
Example
# Build search space manually
tuner.build_search_space()
# Inspect discovered parameters
summary = tuner.get_search_space_summary()
print(f"Found {summary['total_parameters']} tunable parameters")
print(f"Parameter types: {summary['parameter_types']}")
# View specific parameters
for param_path, param_def in tuner.search_space.items():
print(f"{param_path}: {param_def}")
Note
This method is idempotent - calling it multiple times has no effect after the first successful call. The search space is cached until the component structure changes.
search(*, max_trials=120, mode='pareto', constraints=None, scalarization_weights=None, replicates=1, timeout=None, seed=None) ¶
Run the optimization search to find the best parameter configuration.
This is the main method that orchestrates the entire optimization process. It automatically discovers tunable parameters, sets up the optimization environment, and runs intelligent search algorithms to find optimal parameter combinations.
The optimization process:
- Discovery: Finds all tunable parameters in the component hierarchy
- Search Space Setup: Builds the multi-dimensional search space
- Optimization: Runs the specified optimization algorithm
- Result Analysis: Analyzes results and returns comprehensive findings
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
max_trials | int | Maximum number of optimization trials to run. More trials generally lead to better results but take longer. Typical values range from 50-500 depending on search space size. | 120 |
mode | OptimizationMode | Optimization strategy to use: - "pareto": Multi-objective optimization finding Pareto-optimal solutions that balance multiple objectives (quality, cost, latency) - "constrained": Single-objective optimization with hard constraints for real-world deployment requirements - "scalarized": Multi-objective converted to single-objective using custom weights for different objectives | 'pareto' |
constraints | Optional[Constraints] | Hard constraints for constrained optimization mode. Dictionary with constraint names and maximum values. Example: {"max_cost": 0.01, "max_latency_ms": 1000} | None |
scalarization_weights | Optional[ScalarizationWeights] | Weights for scalarized optimization mode. Dictionary mapping objective names to weights. Weights should sum to 1.0 for best results. Example: {"quality": 0.7, "cost": 0.3} | None |
replicates | int | Number of replicates per trial for statistical robustness. Higher values reduce noise but increase computation time. Default is 1 for faster optimization. | 1 |
timeout | Optional[float] | Maximum time in seconds for the entire optimization process. If None, optimization runs until max_trials is reached. | None |
seed | Optional[int] | Random seed for reproducible optimization results. Use the same seed to get identical results across runs. | None |
Returns:
| Type | Description |
|---|---|
SearchResult | Result of the search, including the best trial (best_trial) and the best parameters (best_parameters). |
Example
# Basic optimization
result = tuner.search()
# Advanced optimization with constraints
result = tuner.search(
max_trials=200,
mode="constrained",
constraints={"max_cost": 0.01, "max_latency_ms": 500},
replicates=3,
timeout=3600, # 1 hour timeout
seed=42
)
# Multi-objective optimization with custom weights
result = tuner.search(
mode="scalarized",
scalarization_weights={"quality": 0.8, "cost": 0.2},
max_trials=100
)
# Access results
print(f"Best quality: {result.best_trial.metrics.quality}")
print(f"Best parameters: {result.best_parameters}")
Note
The first call to search() will automatically discover tunable components and build the search space. Subsequent calls reuse the existing search space unless the component structure changes.
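The effect of `replicates` and `seed` can be illustrated without Octuner. In the sketch below, `noisy_quality` is a hypothetical stand-in for one evaluation pass over the dataset, and averaging with `statistics.mean` is an assumed aggregation; the library may combine replicates differently.

```python
import random
from statistics import mean


def noisy_quality(temperature: float, rng: random.Random) -> float:
    """Hypothetical evaluation: a 'true' score plus Gaussian noise from the LLM's randomness."""
    true_score = 1.0 - abs(temperature - 0.7)
    return true_score + rng.gauss(0.0, 0.05)


def evaluate_trial(temperature: float, replicates: int, seed: int) -> float:
    """Average several noisy evaluations; a fixed seed makes the result repeatable."""
    rng = random.Random(seed)
    return mean(noisy_quality(temperature, rng) for _ in range(replicates))


single = evaluate_trial(0.7, replicates=1, seed=42)
averaged = evaluate_trial(0.7, replicates=30, seed=42)

# Same seed and settings give the same number on every run.
print(single == evaluate_trial(0.7, replicates=1, seed=42))
print(single, averaged)
```

With more replicates, the averaged score tends to sit closer to the underlying value, at the price of proportionally more evaluations per trial.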
get_search_space_summary() ¶
Provides detailed information about the tunable parameters discovered in the component hierarchy, including counts, types, and component distribution. Useful for understanding the optimization landscape before running optimization.
Returns:
| Type | Description |
|---|---|
Dict[str, Any] | Dictionary containing search space summary with keys: "total_parameters", "parameter_types", "components" |
Example
# Build search space and get summary
tuner.build_search_space()
summary = tuner.get_search_space_summary()
print(f"Total parameters: {summary['total_parameters']}")
print(f"Parameter types: {summary['parameter_types']}")
print(f"Components: {summary['components']}")
# Output might be:
# Total parameters: 12
# Parameter types: {'float': 6, 'choice': 4, 'int': 2}
# Components: {'llm': 8, 'classifier_llm': 4}
Note
This method requires the search space to be built first. build_search_space() is called automatically by search() if the search space has not been built yet.
get_current_parameters() ¶
Get the current parameter values on the component.
Returns:
| Type | Description |
|---|---|
Dict[str, Any] | Dictionary of current parameter values |
Raises:
| Type | Description |
|---|---|
ValueError | If search space has not been built yet |
Tunable LLM¶
MultiProviderTunableLLM¶
A tunable LLM wrapper that optimizes provider, model, and parameter selection across multiple LLM providers.
octuner.tunable.tunable_llm.MultiProviderTunableLLM(config_file, default_provider='openai', default_model=None, provider_configs=None) ¶
Bases: TunableMixin
A tunable LLM wrapper that optimizes provider, model, and parameter selection.
This class allows the optimization system to discover the best combination of:
- LLM provider (OpenAI, Gemini, etc.)
- Model within that provider
- Model-specific parameters (temperature, max_tokens, etc.)
Configuration is defined explicitly via YAML files.
Example
llm = MultiProviderTunableLLM(config_file="my_llm_config.yaml")
response = llm.call("What is the capital of France?")
print(response.text)
Initialize the tunable LLM with explicit configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config_file | str | Path to YAML configuration file (required) | required |
default_provider | str | Default provider to use ('openai', 'gemini') | 'openai' |
default_model | Optional[str] | Default model to use (if None, uses provider's default) | None |
provider_configs | Optional[Dict[str, Dict[str, Any]]] | Configuration for each provider (API keys, etc.) | None |
call(prompt, system_prompt=None, **kwargs) ¶
Make an LLM call using the current provider and parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
prompt | str | The user prompt | required |
system_prompt | Optional[str] | Optional system prompt | None |
**kwargs | | Additional parameters that override instance settings | {} |
Returns:
| Type | Description |
|---|---|
LLMResponse | LLMResponse with the result |
llm_eq_cost(*, input_tokens=None, output_tokens=None, metadata=None) ¶
Calculate the cost of an LLM call based on current provider and model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
input_tokens | | Number of input tokens | None |
output_tokens | | Number of output tokens | None |
metadata | | Additional metadata from the LLM call | None |
Returns:
| Type | Description |
|---|---|
| Cost in USD, or None if tokens are not available |
get_current_provider_info() ¶
Get information about the current provider and model.
Returns:
| Type | Description |
|---|---|
Dict[str, Any] | Dictionary with provider info |
set_provider_configs(configs) ¶
Update provider configurations (API keys, base URLs, etc.).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
configs | Dict[str, Dict[str, Any]] | Dictionary mapping provider names to configuration dicts | required |
TunableMixin¶
Base mixin class that makes any component tunable by the optimization system.
octuner.tunable.mixin.TunableMixin() ¶
Mixin class for LLM components that can be auto-tuned.
Components can be made tunable by either:
- Using instance methods like mark_as_tunable() (legacy approach)
- Programmatic registration using register_tunable_class() (recommended)
Example
class MyLLM(TunableMixin):
    def __init__(self):
        super().__init__()
        # Legacy approach
        self.mark_as_tunable("temperature", "float", (0.0, 1.0), 0.7)

        # Or programmatic registration (recommended)
        from octuner.tunable.registry import register_tunable_class
        register_tunable_class(
            self.__class__,
            params={
                "temperature": ("float", 0.0, 1.0),
                "max_tokens": ("int", 64, 4096),
            },
            call_method="send_prompt"
        )
Initialize the tunable mixin.
mark_as_tunable(param_name, param_type, range_vals, default=None) ¶
Mark a parameter as tunable.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
param_name | str | Name of the parameter | required |
param_type | str | Type of parameter ("float", "int", "choice", "bool") | required |
range_vals | Tuple[Any, Any] | Range tuple (min, max) for numeric types, choices for choice type | required |
default | Any | Default value for the parameter | None |
get_tunable_parameters() ¶
Get all tunable parameters.
Returns:
| Type | Description |
|---|---|
Dict[str, Dict[str, Any]] | Dictionary of tunable parameter definitions |
is_tunable(param_name) ¶
Check if a parameter is tunable.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
param_name | str | Name of the parameter | required |
Returns:
| Type | Description |
|---|---|
bool | True if the parameter is tunable |
get_param_info(param_name) ¶
Get information about a tunable parameter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
param_name | str | Name of the parameter | required |
Returns:
| Type | Description |
|---|---|
Optional[Dict[str, Any]] | Parameter info dictionary or None if not found |
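The four methods above fit together as a small per-instance registry. The following stand-in shows one way such a mixin could be written. It is not Octuner's implementation: `SketchTunableMixin` and the stored keys ("type", "range", "default") are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


class SketchTunableMixin:
    """Hypothetical re-creation of the documented interface, for illustration only."""

    def __init__(self) -> None:
        self._tunable: Dict[str, Dict[str, Any]] = {}

    def mark_as_tunable(self, param_name: str, param_type: str, range_vals: Any, default: Any = None) -> None:
        # Store the definition under the parameter name (key names are assumed).
        self._tunable[param_name] = {"type": param_type, "range": range_vals, "default": default}

    def get_tunable_parameters(self) -> Dict[str, Dict[str, Any]]:
        # Return a copy so callers cannot mutate the registry by accident.
        return dict(self._tunable)

    def is_tunable(self, param_name: str) -> bool:
        return param_name in self._tunable

    def get_param_info(self, param_name: str) -> Optional[Dict[str, Any]]:
        return self._tunable.get(param_name)


class MyLLM(SketchTunableMixin):
    def __init__(self) -> None:
        super().__init__()
        self.temperature = 0.7
        self.mark_as_tunable("temperature", "float", (0.0, 1.0), 0.7)


llm = MyLLM()
print(llm.is_tunable("temperature"))     # → True
print(llm.get_param_info("max_tokens"))  # → None
```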
llm_eq_cost(*, input_tokens=None, output_tokens=None, metadata=None) ¶
Calculate the cost of an LLM call (optional).
Override this method to enable cost tracking during optimization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
input_tokens | Optional[int] | Number of input tokens | None |
output_tokens | Optional[int] | Number of output tokens | None |
metadata | Optional[Dict[str, Any]] | Additional metadata from the LLM call | None |
Returns:
| Type | Description |
|---|---|
Optional[float] | Cost in your preferred currency, or None to disable cost tracking |
Providers¶
Base Provider¶
Abstract base class for LLM providers and standard response format.
octuner.providers.base.BaseLLMProvider(config_loader, **kwargs) ¶
Bases: ABC
Abstract base class for implementing custom LLM providers in Octuner.
This class serves as the foundation for creating custom LLM provider implementations, enabling you to integrate your own self-hosted models, proprietary APIs, or any other LLM service.
Key Features:
- Configuration-driven: Integrates with YAML-based configuration system
- Parameter optimization: Supports automatic parameter tuning through the config loader
- Type conversion: Automatic parameter type conversion based on configuration
- Cost tracking: Built-in cost calculation and token usage tracking
- Standardized responses: Returns consistent LLMResponse objects across all providers
To create a custom provider, you must:
- Inherit from BaseLLMProvider and set the provider_name attribute
- Implement abstract methods:
  - call(): Main interface for making LLM requests
  - _make_request(): Low-level API communication
  - _parse_response(): Convert raw API response to LLMResponse
  - _calculate_cost(): Calculate cost based on token usage
- Create a configuration file (YAML) defining:
  - Available models and their parameters
  - Parameter types, ranges, and defaults
  - Pricing information for cost calculation
  - Provider-specific settings
Example Usage:
from typing import Optional

from octuner.providers.base import BaseLLMProvider, LLMResponse
from octuner.config.loader import ConfigLoader

class CustomProvider(BaseLLMProvider):
    def __init__(self, config_loader, **kwargs):
        super().__init__(config_loader, **kwargs)
        self.provider_name = "custom"
        # Initialize your API client here

    def call(self, prompt: str, system_prompt: Optional[str] = None, **kwargs) -> LLMResponse:
        # Implementation details...
        pass

    # Implement other abstract methods...

# Usage with configuration
config_loader = ConfigLoader("my_custom_config.yaml")
provider = CustomProvider(config_loader, api_key="your-key")
response = provider.call("Hello, world!")
Configuration File Structure:
providers:
  custom:
    default_model: "my-model-v1"
    available_models: ["my-model-v1", "my-model-v2"]
    pricing_usd_per_1m_tokens:
      my-model-v1: [0.5, 1.0]  # [input_cost, output_cost]
    model_capabilities:
      my-model-v1:
        supported_parameters: ["temperature", "max_tokens"]
        parameters:
          temperature:
            type: float
            range: [0.0, 2.0]
            default: 0.7
          max_tokens:
            type: int
            range: [1, 4000]
            default: 1000
This constructor sets up the provider with access to the configuration system and stores any provider-specific parameters. The config_loader is mandatory as it provides access to model capabilities, parameter definitions, and pricing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config_loader | ConfigLoader | Configuration loader instance that provides access to YAML configuration files. This is mandatory. | required |
**kwargs | | Provider-specific configuration parameters. These can override default values from the configuration file. | {} |
call(prompt, system_prompt=None, **kwargs) abstractmethod ¶
This is the main interface method that users will call to interact with the LLM provider. It handles parameter resolution, type conversion, API communication, and response parsing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
prompt | str | The user prompt/query to send to the LLM | required |
system_prompt | Optional[str] | Optional system prompt that sets the context or behavior for the LLM. If None, no system prompt is used. | None |
**kwargs | | Additional parameters that can override configuration defaults. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
LLMResponse | LLMResponse | Standardized response object with the fields text, provider, model, cost, input_tokens, output_tokens, latency_ms, and metadata. |
Example Implementation
def call(self, prompt: str, system_prompt: Optional[str] = None, **kwargs) -> LLMResponse:
    import time
    start_time = time.time()

    # Get model and parameters
    model = self._get_parameter("model", kwargs, "default-model")
    temperature = self._get_parameter("temperature", kwargs, model)
    max_tokens = self._get_parameter("max_tokens", kwargs, model)

    # Convert types
    temperature = self._convert_parameter_type("temperature", temperature, model)
    max_tokens = self._convert_parameter_type("max_tokens", max_tokens, model)

    # Prepare API request
    api_params = {
        "model": model,
        "prompt": prompt,
        "system_prompt": system_prompt,
        "temperature": temperature,
        "max_tokens": max_tokens
    }

    # Make API call
    response = self._make_request(**api_params)

    # Parse and return response
    result = self._parse_response(response)
    result.latency_ms = (time.time() - start_time) * 1000
    return result
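The parameter resolution and type conversion steps in this example can be pictured with a small standalone sketch. It mirrors the parameter definitions from the configuration file structure shown earlier, but `resolve_parameter`, `PARAMETER_DEFS`, and the range check are hypothetical and are not the behavior of `_get_parameter` or `_convert_parameter_type`.

```python
from typing import Any, Callable, Dict

# Parameter definitions copied from the example configuration (temperature, max_tokens).
PARAMETER_DEFS: Dict[str, Dict[str, Any]] = {
    "temperature": {"type": "float", "range": [0.0, 2.0], "default": 0.7},
    "max_tokens": {"type": "int", "range": [1, 4000], "default": 1000},
}

# Assumed mapping from configured type names to Python casts.
CASTS: Dict[str, Callable[[Any], Any]] = {"float": float, "int": int}


def resolve_parameter(name: str, overrides: Dict[str, Any]) -> Any:
    """Prefer a call-time override, fall back to the configured default, then cast."""
    spec = PARAMETER_DEFS[name]
    raw = overrides.get(name, spec["default"])
    value = CASTS[spec["type"]](raw)
    low, high = spec["range"]
    # Range validation is an addition of this sketch, not documented behavior.
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return value


print(resolve_parameter("temperature", {"temperature": "0.2"}))  # → 0.2
print(resolve_parameter("max_tokens", {}))                       # → 1000
```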
get_cost_per_token(model) ¶
Get the input and output token prices for a model, expressed per 1M tokens.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | str | Model identifier | required |
Returns:
| Type | Description |
|---|---|
Tuple[float, float] | Tuple of (input_cost_per_1M_tokens, output_cost_per_1M_tokens) |
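Given a price tuple in this per-1M-token form, the cost of a single call follows from simple arithmetic. The helper below is a sketch under that assumption: `estimate_cost` is a hypothetical name, and the prices are the ones from the example configuration, not real provider pricing.

```python
from typing import Optional, Tuple


def estimate_cost(
    pricing_per_1m: Tuple[float, float],
    input_tokens: Optional[int],
    output_tokens: Optional[int],
) -> Optional[float]:
    """Return the call cost in USD, or None when token counts are unavailable."""
    if input_tokens is None or output_tokens is None:
        return None
    input_price, output_price = pricing_per_1m
    # Prices are quoted per one million tokens, so scale the token counts down.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 2,000 input tokens at $0.50/1M plus 500 output tokens at $1.00/1M.
cost = estimate_cost((0.5, 1.0), input_tokens=2_000, output_tokens=500)
print(cost)  # → 0.0015
print(estimate_cost((0.5, 1.0), input_tokens=None, output_tokens=500))  # → None
```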
get_available_models() ¶
Get list of available models for this provider.
Returns:
| Type | Description |
|---|---|
List[str] | List of model identifiers |
octuner.providers.base.LLMResponse(text, provider=None, model=None, cost=None, input_tokens=None, output_tokens=None, latency_ms=None, metadata=None) dataclass ¶
Standard response format from all LLM providers.
OpenAI Provider¶
OpenAI provider implementation using the OpenAI API.
octuner.providers.openai.OpenAIProvider(config_loader, **kwargs) ¶
Bases: BaseLLMProvider
OpenAI provider implementation for Octuner.
This module contains the OpenAI provider implementation using the Responses API.
call(prompt, system_prompt=None, use_websearch=False, **kwargs) ¶
Make a call to the OpenAI API, with optional websearch support.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
prompt | str | The user prompt | required |
system_prompt | Optional[str] | Optional system prompt | None |
use_websearch | bool | Whether to enable websearch tool | False |
**kwargs | | Additional parameters to override defaults (e.g., model, temperature, max_tokens, etc.) | {} |
Gemini Provider¶
Google Gemini provider implementation using the google-generativeai SDK.
octuner.providers.gemini.GeminiProvider(config_loader, **kwargs) ¶
Bases: BaseLLMProvider
Gemini provider implementation using the google-generativeai SDK.
call(prompt, system_prompt=None, use_websearch=False, **kwargs) ¶
Make a call to Gemini API, with optional websearch support.
Provider Registry¶
Functions for registering and retrieving LLM providers.
octuner.providers.registry.register_provider(name, provider_class) ¶
Register a custom LLM provider.
This allows you to add custom providers for self-hosted LLMs or other services.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
name | str | Provider name (e.g., 'ollama', 'vllm', 'custom') | required |
provider_class | Type[BaseLLMProvider] | Provider class that inherits from BaseLLMProvider | required |
Raises:
| Type | Description |
|---|---|
ValueError | If provider_class doesn't inherit from BaseLLMProvider |
octuner.providers.registry.get_provider(provider_name, config_loader, **kwargs) ¶
Get a provider instance by name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
provider_name | str | Name of the provider ('openai', 'gemini', or custom) | required |
config_loader | | ConfigLoader for configuration-driven behavior (mandatory) | required |
**kwargs | | Provider-specific configuration | {} |
Returns:
| Type | Description |
|---|---|
BaseLLMProvider | Provider instance |
Raises:
| Type | Description |
|---|---|
KeyError | If provider is not supported |
octuner.providers.registry.list_providers() ¶
Get list of all registered provider names.
Returns:
| Type | Description |
|---|---|
List[str] | List of provider names |
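The three registry functions behave like a name-to-class lookup table. The sketch below reproduces that pattern in isolation, including the documented ValueError and KeyError cases. It is a simplified stand-in: `SketchProvider` replaces BaseLLMProvider, the `config_loader` argument is omitted, and `EchoProvider` is hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Type


class SketchProvider(ABC):
    """Stand-in for BaseLLMProvider, reduced to a single abstract method."""

    @abstractmethod
    def call(self, prompt: str) -> str: ...


_PROVIDERS: Dict[str, Type[SketchProvider]] = {}


def register_provider(name: str, provider_class: Any) -> None:
    """Reject anything that is not a subclass of the base provider."""
    if not (isinstance(provider_class, type) and issubclass(provider_class, SketchProvider)):
        raise ValueError(f"{provider_class!r} must inherit from SketchProvider")
    _PROVIDERS[name] = provider_class


def get_provider(name: str, **kwargs: Any) -> SketchProvider:
    """Instantiate a registered provider; unknown names raise KeyError."""
    if name not in _PROVIDERS:
        raise KeyError(f"Unsupported provider: {name}")
    return _PROVIDERS[name](**kwargs)


def list_providers() -> List[str]:
    return sorted(_PROVIDERS)


class EchoProvider(SketchProvider):
    """Hypothetical provider that returns the prompt unchanged."""

    def call(self, prompt: str) -> str:
        return f"echo: {prompt}"


register_provider("echo", EchoProvider)
print(list_providers())                 # → ['echo']
print(get_provider("echo").call("hi"))  # → echo: hi
```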
Optimization¶
LLMOptimizer¶
Main optimizer class that uses Optuna to find optimal parameter configurations.
octuner.optimization.optimizer.LLMOptimizer(search_space, mode='pareto', constraints=None, scalarization_weights=None, seed=None) ¶
Core optimization engine that uses Optuna to find optimal parameter configurations.
LLMOptimizer is the main optimization engine that coordinates the search for optimal parameter values using Optuna's sophisticated optimization algorithms. It integrates with DatasetExecutor to evaluate parameter configurations and uses optimization strategies to determine the best solutions.
The optimization process:
- Parameter Suggestion: Uses Optuna to suggest parameter values from the search space
- Trial Execution: Evaluates suggested parameters using DatasetExecutor
- Objective Computation: Converts evaluation results to objective values using the strategy
- Optimization: Uses Optuna's TPE sampler to explore the parameter space
- Result Analysis: Extracts best parameters and trial results from completed optimization
Key features:
- Supports multiple optimization strategies (Pareto, constrained, scalarized)
- Parameter suggestion using TPE (Tree-structured Parzen Estimator)
- Error handling with fallback objective values for failed trials
- Trial result tracking and analysis
- Integration with Optuna's pruning mechanism for constraint handling
Initialize the LLMOptimizer with search space and optimization configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
search_space | Dict[str, Tuple[ParamType, Any, Any]] | Dictionary mapping parameter paths to their definitions. Each definition is a tuple of (type, min_value, max_value) for numeric parameters or (type, choices) for categorical parameters. | required |
mode | OptimizationMode | Optimization strategy to use: - "pareto": Multi-objective Pareto optimization (default) - "constrained": Single-objective with hard constraints - "scalarized": Multi-objective converted to single-objective | 'pareto' |
constraints | Optional[Constraints] | Hard constraints for constrained mode. Dictionary with constraint names and maximum values (e.g., {"cost_total": 0.01}). | None |
scalarization_weights | Optional[ScalarizationWeights] | Weights for scalarized mode. Defines relative importance of quality, cost, and latency objectives. | None |
seed | Optional[int] | Random seed for reproducible optimization results. Use the same seed to get identical optimization runs. | None |
suggest_parameters(trial) ¶
Suggest parameter values for an Optuna trial.
This method uses Optuna's intelligent parameter suggestion to generate parameter values from the search space. It handles different parameter types (float, int, choice, bool, list) and converts them to appropriate Optuna suggestion calls.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
trial | Trial | Optuna trial object for parameter suggestion | required |
Returns:
| Type | Description |
|---|---|
Dict[str, Any] | Dictionary mapping parameter paths to suggested values |
Raises:
| Type | Description |
|---|---|
TypeError | If parameter ranges are invalid for numeric types |
ValueError | If an unknown parameter type is encountered |
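One way to picture this mapping is a function that dispatches on the parameter type and calls the matching method on the trial. `suggest_float`, `suggest_int`, and `suggest_categorical` are real methods of Optuna's `Trial`; everything else here is an assumption, including the standalone `suggest_parameters` function and the `MidpointTrial` stub that lets the sketch run without Optuna installed.

```python
from typing import Any, Dict, Tuple


def suggest_parameters(trial: Any, search_space: Dict[str, Tuple[Any, ...]]) -> Dict[str, Any]:
    """Translate (type, ...) definitions into the corresponding trial.suggest_* calls."""
    params: Dict[str, Any] = {}
    for path, definition in search_space.items():
        kind = definition[0]
        if kind == "float":
            params[path] = trial.suggest_float(path, definition[1], definition[2])
        elif kind == "int":
            params[path] = trial.suggest_int(path, definition[1], definition[2])
        elif kind == "choice":
            params[path] = trial.suggest_categorical(path, definition[1])
        elif kind == "bool":
            params[path] = trial.suggest_categorical(path, [True, False])
        else:
            raise ValueError(f"Unknown parameter type: {kind}")
    return params


class MidpointTrial:
    """Stub with the same method names as an Optuna trial; always picks a fixed value."""

    def suggest_float(self, name: str, low: float, high: float) -> float:
        return (low + high) / 2

    def suggest_int(self, name: str, low: int, high: int) -> int:
        return (low + high) // 2

    def suggest_categorical(self, name: str, choices: list) -> Any:
        return choices[0]


space = {
    "llm.temperature": ("float", 0.0, 2.0),
    "llm.max_tokens": ("int", 64, 4096),
    "llm.model": ("choice", ["a", "b"]),
}
print(suggest_parameters(MidpointTrial(), space))
# → {'llm.temperature': 1.0, 'llm.max_tokens': 2080, 'llm.model': 'a'}
```

Passing a real Optuna trial instead of the stub would let the sampler choose the values.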
objective_function(trial, executor, replicates=1) ¶
Objective function for Optuna optimization.
This method serves as the objective function that Optuna optimizes. It suggests parameters, executes the trial using the DatasetExecutor, and converts the results to objective values using the optimization strategy.
The function handles errors gracefully by returning fallback objective values for failed trials, ensuring the optimization process continues even when individual trials fail.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
trial | Trial | Optuna trial object for parameter suggestion | required |
executor | Any | DatasetExecutor instance for trial evaluation | required |
replicates | int | Number of replicates to run for statistical robustness | 1 |
Returns:
| Type | Description |
|---|---|
Tuple[float, ...] | Tuple of objective values for Optuna optimization |
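The fallback behavior for failed trials can be sketched as a try/except wrapper around the evaluation. The fallback tuple, the metric names, and the `safe_objective` helper are assumptions for illustration; the values Octuner actually returns for failed trials are not stated in this reference.

```python
from typing import Any, Callable, Dict, Tuple

# Assumed worst-case objectives (quality, cost, latency_ms) for a failed trial.
FALLBACK: Tuple[float, float, float] = (0.0, 1e9, 1e9)


def safe_objective(
    evaluate: Callable[[Dict[str, Any]], Dict[str, float]],
    params: Dict[str, Any],
) -> Tuple[float, float, float]:
    """Run one evaluation and return objectives, or the fallback if it raises."""
    try:
        metrics = evaluate(params)
        return (metrics["quality"], metrics["cost"], metrics["latency_ms"])
    except Exception:
        # A failed trial should look bad to the optimizer without stopping the run.
        return FALLBACK


def good_eval(params: Dict[str, Any]) -> Dict[str, float]:
    return {"quality": 0.9, "cost": 0.002, "latency_ms": 350.0}


def failing_eval(params: Dict[str, Any]) -> Dict[str, float]:
    raise RuntimeError("provider timeout")


print(safe_objective(good_eval, {"llm.temperature": 0.3}))     # → (0.9, 0.002, 350.0)
print(safe_objective(failing_eval, {"llm.temperature": 0.3}))  # → (0.0, 1000000000.0, 1000000000.0)
```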
optimize(executor, max_trials=120, replicates=1, timeout=None) ¶
Run the optimization process to find optimal parameter configurations.
This method orchestrates the entire optimization process using Optuna. It runs the specified number of trials, each evaluating a different parameter configuration, and returns comprehensive results from all trials.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
executor | Any | DatasetExecutor instance for evaluating parameter configurations | required |
max_trials | int | Maximum number of optimization trials to run. More trials generally lead to better results but take longer. | 120 |
replicates | int | Number of replicates per trial for statistical robustness. Higher values reduce noise but increase computation time. | 1 |
timeout | Optional[float] | Maximum time in seconds for the entire optimization process. If None, optimization runs until max_trials is reached. | None |
Returns:
| Type | Description |
|---|---|
List[TrialResult] | List of TrialResult objects containing results from all completed trials, including both successful and failed trials with error information. |
get_best_parameters() ¶
Get the best parameter configuration found during optimization.
This method returns the parameter values from the best trial according to the optimization strategy. The definition of "best" depends on the strategy used (e.g., highest quality for Pareto, lowest combined score for scalarized).
Returns:
| Type | Description |
|---|---|
Dict[str, Any] | Dictionary mapping parameter paths to their optimal values. Returns an empty dictionary if no trials completed successfully. |
get_best_trial() ¶
Get the best trial result from the optimization.
This method returns the complete TrialResult object for the best trial, including parameters, metrics, and success status. The best trial is determined by the optimization strategy used.
Returns:
| Type | Description |
|---|---|
Optional[TrialResult] | TrialResult object for the best trial, or None if no trials completed successfully. |
OptimizationStrategy¶
Abstract base class for different optimization strategies (single objective, multi-objective).
octuner.optimization.optimizer.OptimizationStrategy(constraints=None, scalarization_weights=None) ¶
Bases: ABC
Abstract base class for optimization strategies in Octuner.
This class defines the interface that all optimization strategies must implement to work with the LLMOptimizer. It provides a unified way to handle different optimization approaches while maintaining compatibility with Optuna.
Each strategy must implement methods for creating Optuna studies, computing objective values from metric results, and determining the best trial from completed studies.
create_study(study_name, seed=None) abstractmethod ¶
Create an Optuna study configured for this optimization strategy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
study_name | str | Name for the Optuna study | required |
seed | Optional[int] | Random seed for reproducible optimization | None |
Returns:
| Type | Description |
|---|---|
Study | Configured Optuna study ready for optimization |
compute_objectives(result) abstractmethod ¶
Convert the raw metric results (quality, cost, latency) into objective values that Optuna can optimize. The conversion depends on the specific strategy (e.g., Pareto uses multiple objectives, scalarized combines them into one).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
result | MetricResult | MetricResult containing quality, cost, and latency | required |
Returns:
| Type | Description |
|---|---|
Tuple[float, ...] | Tuple of objective values for Optuna optimization |
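To make the difference between the two conversions concrete, here is a minimal sketch. The `MetricResult` dataclass mirrors the fields listed for `octuner.tunable.types.MetricResult`. The function names, the treatment of missing cost or latency as 0.0, and the scalarization formula (weighted penalties minus quality) are assumptions for illustration, not the library's actual formulas.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MetricResult:
    # Mirrors the documented fields; redefined here so the sketch is standalone.
    quality: float
    cost: Optional[float] = None
    latency_ms: Optional[float] = None


def pareto_objectives(result: MetricResult) -> Tuple[float, float, float]:
    """One value per objective; missing cost/latency become 0.0 (assumption)."""
    return (result.quality, result.cost or 0.0, result.latency_ms or 0.0)


def scalarized_objective(
    result: MetricResult, cost_weight: float = 1.0, latency_weight: float = 1.0
) -> Tuple[float]:
    """Single score to minimize: weighted penalties minus quality (assumed formula)."""
    penalty = cost_weight * (result.cost or 0.0)
    penalty += latency_weight * (result.latency_ms or 0.0)
    return (penalty - result.quality,)


result = MetricResult(quality=0.5, cost=0.25, latency_ms=2.0)
multi = pareto_objectives(result)
single = scalarized_objective(result, cost_weight=2.0, latency_weight=0.5)
```

Both functions return tuples so that a caller can treat single-objective and multi-objective strategies uniformly.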
get_fallback_objectives() abstractmethod ¶
When a trial fails (e.g., due to parameter constraints or errors), this method provides objective values that represent the worst possible performance, ensuring failed trials are properly ranked.
Returns:
| Type | Description |
|---|---|
Tuple[float, ...] | Tuple of fallback objective values |
get_best_trial_from_study(study) abstractmethod ¶
Get the best trial from the study according to this strategy.
Different strategies define "best" differently:
- Pareto: Highest quality among Pareto-optimal solutions
- Constrained: Highest quality among constraint-satisfying trials
- Scalarized: Lowest combined objective score
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
study | Study | Completed Optuna study | required |
Returns:
| Type | Description |
|---|---|
Optional[FrozenTrial] | Best trial according to this strategy, or None if no trials completed |
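The Pareto rule ("highest quality among Pareto-optimal solutions") can be illustrated on plain tuples. This sketch does not use Optuna's study objects; the `Point` layout and the helper names are assumptions for illustration.

```python
from typing import List, Optional, Tuple

# (quality, cost, latency_ms): quality is maximized, the other two are minimized.
Point = Tuple[float, float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]
    better = a[0] > b[0] or a[1] < b[1] or a[2] < b[2]
    return no_worse and better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def best_by_quality(points: List[Point]) -> Optional[Point]:
    """Pick the highest-quality point on the front, or None if there are no points."""
    front = pareto_front(points)
    return max(front, key=lambda p: p[0]) if front else None


trials = [(0.90, 0.020, 800.0), (0.85, 0.005, 300.0), (0.80, 0.010, 400.0)]
front = pareto_front(trials)
best = best_by_quality(trials)
```

Here the third trial is dominated by the second (lower quality, higher cost, higher latency), so the front contains the first two trials and the first one wins on quality.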
DatasetExecutor¶
Executes trials on datasets with parallel processing support.
octuner.optimization.executor.DatasetExecutor(component, entrypoint, dataset, metric, max_workers=1) ¶
DatasetExecutor is a core component of the optimization system that handles the execution of evaluation trials during parameter optimization. It manages the evaluation of components over datasets, collects performance metrics, and provides both sequential and parallel execution capabilities.
How it works:
-
Parameter Application: Applies trial parameters to the component using the parameter setter utilities, ensuring consistent parameter configuration across trials.
-
Dataset Evaluation: Executes the entrypoint function over each dataset item, collecting outputs and computing quality scores using the provided metric function.
-
Metrics Collection: Automatically collects comprehensive metrics including quality scores, execution costs, and latency measurements for each trial.
-
Parallel Execution: Supports concurrent evaluation of dataset items using ThreadPoolExecutor for faster trial execution when max_workers > 1.
-
Statistical Aggregation: Provides robust statistical aggregation across replicates and dataset items using median-based aggregation for stability.
Key Features:
- Parallel Execution: Multi-threaded evaluation for faster optimization
- Comprehensive Metrics: Quality, cost, and latency tracking
- Statistical Robustness: Median-based aggregation to handle outliers
- Error Handling: Graceful handling of individual item failures
- Replicate Support: Multiple trial runs for statistical significance
- Cost Tracking: Automatic cost collection from tunable components
Constructor
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
component | Any | The component to evaluate. Must be tunable and support parameter setting via the parameter setter utilities. | required |
entrypoint | EntrypointFunction | Function that evaluates the component with input data. Called as | required |
dataset | Dataset | List of input/target pairs for evaluation. Each item should be a dictionary with 'input' and 'target' keys containing the input data and expected output respectively. | required |
metric | MetricFunction | Function that computes quality scores from evaluation results. Called as | required |
max_workers | int | Maximum number of concurrent workers for parallel evaluation. Use 1 for sequential execution, >1 for parallel execution. Higher values speed up I/O-bound tasks but may not help with CPU-bound tasks due to Python's GIL. | 1 |
execute_trial(parameters) ¶
Execute a single evaluation trial with the given parameters. It applies the trial parameters to the component, executes the evaluation over all dataset items, and returns aggregated metrics including quality, cost, and latency.
The execution process:
- Parameter Application: Sets the trial parameters on the component
- Call Log Clearing: Clears any previous call logs for clean metrics
- Dataset Evaluation: Runs the entrypoint function over each dataset item
- Metrics Collection: Collects quality scores, costs, and timing data
- Statistical Aggregation: Computes median quality and total cost
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
parameters | Dict[str, Any] | Dictionary of parameter values to set on the component. Keys should match the parameter paths from the search space. Example: {"llm.temperature": 0.7, "llm.max_tokens": 100} | required |
Returns:
| Type | Description |
|---|---|
MetricResult | MetricResult containing the aggregated quality, cost, and latency for the trial. |
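The evaluation loop behind a trial can be sketched as below. This is a simplified illustration, not the DatasetExecutor code: the `run_trial` name, the call conventions `entrypoint(input)` and `metric(output, target)`, and the choice to report median latency are all assumptions. Parameter application and cost tracking are omitted.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Tuple


def run_trial(
    entrypoint: Callable[[Any], Any],
    metric: Callable[[Any, Any], float],
    dataset: List[Dict[str, Any]],
    max_workers: int = 1,
) -> Tuple[float, float]:
    """Return (median quality, median latency in ms) over the dataset."""

    def evaluate(item: Dict[str, Any]) -> Tuple[float, float]:
        start = time.perf_counter()
        output = entrypoint(item["input"])
        latency_ms = (time.perf_counter() - start) * 1000.0
        return metric(output, item["target"]), latency_ms

    if max_workers > 1:
        # Threads help when the entrypoint is I/O-bound (e.g. waiting on an API).
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            results = list(pool.map(evaluate, dataset))
    else:
        results = [evaluate(item) for item in dataset]

    qualities = [quality for quality, _ in results]
    latencies = [latency for _, latency in results]
    return statistics.median(qualities), statistics.median(latencies)


dataset = [
    {"input": "a", "target": "A"},
    {"input": "b", "target": "B"},
    {"input": "c", "target": "x"},
]


def exact_match(output: str, target: str) -> float:
    return 1.0 if output == target else 0.0


sequential = run_trial(str.upper, exact_match, dataset, max_workers=1)
parallel = run_trial(str.upper, exact_match, dataset, max_workers=3)
```

Because the median is taken over per-item scores, a single failing item (the third one here) does not change the reported quality.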
execute_with_replicates(parameters, replicates=1) ¶
Execute a trial multiple times and aggregate results for statistical robustness. It's particularly useful for optimization scenarios where individual trials may have high variance due to non-deterministic components or external factors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
parameters | Dict[str, Any] | Dictionary of parameter values to set on the component. Same format as execute_trial(). | required |
replicates | int | Number of times to run the trial. Higher values provide better statistical significance but take longer. Typical values range from 1-10 depending on variance requirements. | 1 |
Returns:
| Type | Description |
|---|---|
MetricResult | MetricResult containing aggregated metrics across all replicates. |
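As a rough illustration of replicate aggregation, the sketch below repeats a trial and takes the median of each metric. The `aggregate_replicates` helper and the `(quality, cost)` tuple layout are hypothetical; how Octuner combines cost and latency across replicates is not specified here, so using the median for both values is an assumption.

```python
import statistics
from typing import Callable, Tuple


def aggregate_replicates(
    run_once: Callable[[], Tuple[float, float]], replicates: int = 1
) -> Tuple[float, float]:
    """Run a trial several times and return the median (quality, cost)."""
    if replicates < 1:
        raise ValueError("replicates must be at least 1")
    runs = [run_once() for _ in range(replicates)]
    qualities = [quality for quality, _ in runs]
    costs = [cost for _, cost in runs]
    return statistics.median(qualities), statistics.median(costs)


# Simulated noisy trial: each call yields a slightly different result.
noisy_runs = iter([(0.7, 0.01), (0.9, 0.03), (0.8, 0.02)])
quality, cost = aggregate_replicates(lambda: next(noisy_runs), replicates=3)
```

The median is less sensitive than the mean to one unusually good or bad replicate, which matters when the component is non-deterministic.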
execute(parameters) ¶
Execute the full dataset and return per-item results, unlike execute_trial(), which returns aggregated metrics. It is useful for detailed analysis, debugging, or when you need to examine individual item performance rather than overall trial performance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
parameters | Dict[str, Any] | Dictionary of parameter values to set on the component. Same format as execute_trial(). | required |
Returns:
| Type | Description |
|---|---|
List[MetricResult] | List of MetricResult objects, one for each dataset item. Each result |
List[MetricResult] | contains the quality, cost, and latency for that specific item. |
Discovery¶
ComponentDiscovery¶
Discovers tunable components in a component tree and builds search spaces.
octuner.discovery.discovery.ComponentDiscovery(include_patterns=None, exclude_patterns=None) ¶
This is a core component of Octuner that automatically finds and catalogs all tunable parameters within an object hierarchy. It recursively traverses object attributes to identify components that implement the TunableMixin protocol, building a search space that can be optimized by the AutoTuner.
How it works:
-
Recursive Traversal: Starting from a root component, it recursively explores all object attributes using __dict__ introspection, avoiding circular references and method calls.
-
Tunable Detection: For each object found, it checks whether it implements the TunableMixin protocol using is_llm_tunable(), which supports both instance-based tunables (legacy: objects with a get_tunable_parameters() method) and registry-based tunables (recommended: objects registered through register_tunable_class()).
-
Parameter Extraction: For tunable objects, it extracts parameter definitions including type, range, and default values using get_tunable_parameters().
-
Path Mapping: Each discovered parameter is mapped to a dotted path (e.g., "classifier_llm.temperature") that uniquely identifies its location in the hierarchy.
-
Filtering: Optional include/exclude patterns can be used to focus the search space on specific parameters or components.
Example
```python
from octuner import ComponentDiscovery, MultiProviderTunableLLM

# Create a complex component hierarchy
# (TunableSentimentAnalyzer and config_file are assumed to be defined elsewhere)
analyzer = TunableSentimentAnalyzer(config_file)

# Discover all tunable parameters
discovery = ComponentDiscovery()
tunables = discovery.discover(analyzer)
# Result: {
#     "classifier_llm": {
#         "temperature": ("float", 0.0, 2.0),
#         "max_tokens": ("int", 64, 4096),
#         "provider_model": ("choice", ["openai:gpt-4", "gemini:gemini-pro"])
#     },
#     "confidence_llm": { ... },
#     "reasoning_llm": { ... }
# }

# Focus on specific parameters
focused_discovery = ComponentDiscovery(
    include_patterns=["*.temperature", "*.provider_model"],
    exclude_patterns=["*.verbose"],
)
focused_tunables = focused_discovery.discover(analyzer)
```
Integration with AutoTuner
ComponentDiscovery is automatically used by AutoTuner to build the search space before optimization begins. The discovered parameters become the dimensions that the optimizer explores to find the best configuration.
Attributes:
| Name | Type | Description |
|---|---|---|
include_patterns | | List of glob patterns to include in discovery |
exclude_patterns | | List of glob patterns to exclude from discovery |
Initialize discovery with optional filters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
include_patterns | Optional[List[str]] | Glob patterns to include (e.g., ["*.temperature"]) | None |
exclude_patterns | Optional[List[str]] | Glob patterns to exclude (e.g., ["*.verbose"]) | None |
discover(component) ¶
Discover all tunable components in the component tree.
This is the main entry point for component discovery. It performs a recursive traversal of the component hierarchy, identifying all objects that implement the TunableMixin protocol and extracting their tunable parameters.
The discovery process:
- Starts from the provided root component
- Recursively explores all object attributes
- Identifies tunable components using is_llm_tunable()
- Extracts parameter definitions from each tunable component
- Applies include/exclude filters (optional)
- Returns a structured mapping of paths to parameters
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
component | Any | Root component to search. Can be any Python object, but typically a component hierarchy containing multiple tunable LLMs or other tunable components. | required |
Returns:
| Type | Description |
|---|---|
Dict[str, Dict[str, Tuple[ParamType, Any, Any]]] | Dictionary mapping dotted component paths to tunable parameter definitions, structured as {"component_path": {"param_name": (param_type, min_value, max_value), "another_param": (param_type, choices_or_default, ...)}} |
Note
This method should be safe to call on any object. It returns an empty dictionary if no tunable components are found.
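The traversal described above can be approximated with a short sketch. It is not the ComponentDiscovery implementation: `discover_sketch`, `FakeLLM`, and `Analyzer` are hypothetical names, detection is reduced to "has a callable get_tunable_parameters()", and filters are matched against the full dotted parameter path with `fnmatch`, which is an assumption about how the glob patterns are applied.

```python
from fnmatch import fnmatch
from typing import Any, Dict, List, Optional, Set


def discover_sketch(
    root: Any,
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> Dict[str, Dict[str, Any]]:
    found: Dict[str, Dict[str, Any]] = {}
    seen: Set[int] = set()  # ids of visited objects, to survive reference cycles

    def keep(full_path: str) -> bool:
        if include and not any(fnmatch(full_path, p) for p in include):
            return False
        if exclude and any(fnmatch(full_path, p) for p in exclude):
            return False
        return True

    def visit(obj: Any, path: str) -> None:
        if id(obj) in seen or not hasattr(obj, "__dict__"):
            return
        seen.add(id(obj))
        getter = getattr(obj, "get_tunable_parameters", None)
        # The root itself is not recorded in this sketch (it has no dotted path).
        if callable(getter) and path:
            params = {n: s for n, s in getter().items() if keep(f"{path}.{n}")}
            if params:
                found[path] = params
        for attr, value in vars(obj).items():
            if not attr.startswith("_"):
                visit(value, f"{path}.{attr}" if path else attr)

    visit(root, "")
    return found


class FakeLLM:
    def __init__(self) -> None:
        self.owner = None

    def get_tunable_parameters(self) -> Dict[str, Any]:
        return {"temperature": ("float", 0.0, 2.0), "max_tokens": ("int", 64, 4096)}


class Analyzer:
    def __init__(self) -> None:
        self.classifier_llm = FakeLLM()
        self.confidence_llm = FakeLLM()
        self.classifier_llm.owner = self  # deliberate cycle


everything = discover_sketch(Analyzer())
focused = discover_sketch(Analyzer(), include=["*.temperature"])
```

Tracking visited objects by `id()` is what lets the walk terminate when a child holds a reference back to its parent.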
Discovery Functions¶
octuner.discovery.discovery.build_search_space(discovered) ¶
This function transforms the hierarchical discovery results into a flat search space suitable for optimization. It converts the nested structure from ComponentDiscovery into a flat dictionary where each parameter is identified by its full dotted path.
The transformation:
- Input: {"component": {"param": (type, min, max)}}
- Output: {"component.param": (type, min, max)}
This flattened structure is what the optimizer uses to define the search space dimensions for parameter optimization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discovered | Dict[str, Dict[str, Tuple[ParamType, Any, Any]]] | Discovery results from ComponentDiscovery.discover() or discover_tunable_components(). Dictionary mapping component paths to their tunable parameters. | required |
Returns:
| Type | Description |
|---|---|
Dict[str, Tuple[ParamType, Any, Any]] | Dictionary mapping full parameter paths to parameter definitions. Each key is a dotted path like "classifier_llm.temperature" and each value is a tuple containing the parameter type and constraints. |
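The flattening step is small enough to show in full. The sketch below reproduces the documented input/output shape with a standalone helper; `flatten_discovered` is a hypothetical name, not the library function.

```python
from typing import Any, Dict, Tuple

ParamDef = Tuple[Any, ...]


def flatten_discovered(
    discovered: Dict[str, Dict[str, ParamDef]]
) -> Dict[str, ParamDef]:
    """Join component path and parameter name into one dotted key."""
    return {
        f"{component}.{name}": definition
        for component, params in discovered.items()
        for name, definition in params.items()
    }


discovered = {
    "classifier_llm": {
        "temperature": ("float", 0.0, 2.0),
        "max_tokens": ("int", 64, 4096),
    },
    "confidence_llm": {"temperature": ("float", 0.0, 2.0)},
}
search_space = flatten_discovered(discovered)
```

Each resulting key, such as "classifier_llm.temperature", becomes one dimension of the search space.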
Configuration¶
ConfigLoader¶
Loads and validates YAML configuration files for LLM providers and models.
octuner.config.loader.ConfigLoader(config_file) ¶
This class provides utilities to load YAML configuration files as described in config_templates/*.yaml.
It exposes the available providers, models, parameters, pricing, and capabilities. The tuning algorithms use this information to determine which parameters can be optimized, along with their ranges, types, and default values.
IMPORTANT: Note that when instantiating this class, the configuration file is loaded immediately.
Initialize config loader with a specific file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config_file | str | Path to the YAML configuration file | required |
get_providers() ¶
Get list of available providers.
get_provider_config(provider_name) ¶
Get configuration for a specific provider.
get_default_model(provider_name) ¶
Get default model for a provider.
get_available_models(provider_name) ¶
Get available models for a provider.
get_pricing(provider_name, model) ¶
Get pricing for a model (input_cost, output_cost per 1M tokens).
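Since prices are expressed per 1M tokens, the cost of a single call follows from simple arithmetic. The sketch below shows that calculation; the `call_cost` helper, the token counts, and the prices are hypothetical and are not taken from any Octuner configuration.

```python
def call_cost(
    input_tokens: int,
    output_tokens: int,
    input_cost_per_1m: float,
    output_cost_per_1m: float,
) -> float:
    """Cost of one call when prices are quoted per one million tokens."""
    total = input_tokens * input_cost_per_1m + output_tokens * output_cost_per_1m
    return total / 1_000_000


# Hypothetical prices: 2.50 per 1M input tokens, 10.00 per 1M output tokens.
cost = call_cost(1200, 300, input_cost_per_1m=2.50, output_cost_per_1m=10.00)
```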
get_model_capabilities(provider_name, model) ¶
Get capabilities for a specific model.
get_supported_parameters(provider_name, model) ¶
Get list of parameters that can be optimized for a model.
model_supports_parameter(provider_name, model, parameter) ¶
Check if a model supports a specific parameter.
get_parameter_range(provider_name, model, parameter) ¶
Get optimization range for a parameter.
get_parameter_default(provider_name, model, parameter) ¶
Get default value for a parameter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
provider_name | str | Name of the provider | required |
model | str | Name of the model | required |
parameter | str | Name of the parameter | required |
Returns:
| Type | Description |
|---|---|
| Default parameter value from configuration |
Raises:
| Type | Description |
|---|---|
ValueError | If parameter default is not defined in configuration |
get_parameter_type(provider_name, model, parameter) ¶
Get the expected type for a parameter from YAML configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
provider_name | str | Name of the provider | required |
model | str | Name of the model | required |
parameter | str | Name of the parameter | required |
Returns:
| Type | Description |
|---|---|
str | Parameter type ('int', 'float', 'str', 'bool', 'choice') |
Raises:
| Type | Description |
|---|---|
ValueError | If parameter type is not defined in configuration |
get_forced_parameter(provider_name, model, parameter) ¶
Get forced value for a parameter (if any).
validate_config() ¶
Validate the configuration structure.
Utilities¶
Exporter Functions¶
Functions for saving and loading optimized parameters.
octuner.utils.exporter.save_parameters_to_yaml(parameters, path, metadata=None) ¶
Save parameters to a YAML file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
parameters | Dict[str, Any] | Dictionary of parameter values | required |
path | str | Path to save the YAML file | required |
metadata | Optional[Dict[str, Any]] | Optional metadata to include | None |
octuner.utils.exporter.load_parameters_from_yaml(path) ¶
Load parameters from a YAML file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
path | str | Path to the YAML file | required |
Returns:
| Type | Description |
|---|---|
Dict[str, Any] | Dictionary of parameter values |
Raises:
| Type | Description |
|---|---|
FileNotFoundError | If the file doesn't exist |
YAMLError | If the file is invalid YAML |
octuner.utils.exporter.create_metadata_summary(trials, optimization_mode='pareto', dataset_size=0, total_trials=0, best_quality=0.0, best_cost=None, best_latency_ms=None, dataset_fingerprint=None) ¶
Create a metadata summary for the optimization results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
trials | list | List of trial results | required |
optimization_mode | str | Mode used for optimization | 'pareto' |
dataset_size | int | Number of examples in the dataset | 0 |
total_trials | int | Total number of trials run | 0 |
best_quality | float | Best quality score achieved | 0.0 |
best_cost | Optional[float] | Best cost achieved (if available) | None |
best_latency_ms | Optional[float] | Best latency achieved (if available) | None |
dataset_fingerprint | Optional[str] | Dataset fingerprint (if available) | None |
Returns:
| Type | Description |
|---|---|
Dict[str, Any] | Dictionary of metadata |
Parameter Setter¶
Apply optimized parameters to components.
octuner.utils.setter.set_parameters(component, parameters, strict=False) ¶
Convenience function to set parameters on a component.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
component | Any | Component to set parameters on | required |
parameters | Dict[str, Any] | Dictionary mapping dotted paths to values | required |
strict | bool | If True, raise exceptions instead of logging warnings | False |
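A minimal sketch of how dotted paths and the strict flag might interact is shown below. It is an assumption-laden illustration rather than the actual setter: `set_by_path` is a hypothetical helper, and treating a missing attribute as the failure case is a guess at what "raise exceptions instead of logging warnings" covers.

```python
import logging
from types import SimpleNamespace
from typing import Any, Dict

logger = logging.getLogger(__name__)


def set_by_path(component: Any, parameters: Dict[str, Any], strict: bool = False) -> None:
    """Walk each dotted path and set the final attribute if it already exists."""
    for path, value in parameters.items():
        *parents, leaf = path.split(".")
        target = component
        try:
            for name in parents:
                target = getattr(target, name)
            if not hasattr(target, leaf):
                raise AttributeError(f"{path!r} does not exist on the component")
            setattr(target, leaf, value)
        except AttributeError as exc:
            if strict:
                raise
            logger.warning("Skipping %s: %s", path, exc)


component = SimpleNamespace(llm=SimpleNamespace(temperature=1.0, max_tokens=256))
# "llm.top_k" does not exist, so it is skipped with a warning in non-strict mode.
set_by_path(component, {"llm.temperature": 0.7, "llm.top_k": 5})
```

With `strict=True` the same call would raise on "llm.top_k" instead of skipping it, which is useful when a stale parameter file should fail loudly.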
Type Definitions¶
Core Types¶
Type definitions and data classes used throughout Octuner.
octuner.tunable.types.ParamType = Literal['float', 'int', 'choice', 'bool'] module-attribute ¶
octuner.tunable.types.Dataset = List[DatasetItem] module-attribute ¶
octuner.tunable.types.MetricResult(quality, cost=None, latency_ms=None) dataclass ¶
Result of a single metric evaluation.
octuner.tunable.types.TrialResult(trial_number, parameters, metrics, success=True, error=None) dataclass ¶
Result of a single optimization trial.
octuner.tunable.types.SearchResult(best_trial, all_trials, optimization_mode, dataset_size, total_trials, best_parameters, metrics_summary) dataclass ¶
Result of an optimization search.
save_best(path) ¶
Save best parameters to YAML file.
octuner.tunable.types.Constraints = Dict[str, Union[float, int]] module-attribute ¶
octuner.tunable.types.ScalarizationWeights(cost_weight=1.0, latency_weight=1.0) dataclass ¶
Weights for scalarized optimization mode.