API Reference

This section contains the auto-generated documentation for Octuner's classes, methods, and functions extracted from the source code docstrings.


Core Components

AutoTuner

The main entry point for auto-tuning LLM components. Orchestrates discovery, execution, and optimization.

octuner.optimization.auto.AutoTuner(component, entrypoint=None, dataset=None, metric=None, entrypoint_function=None, metric_function=None, max_workers=1, optimization_mode='pareto', n_trials=120, constraints=None, scalarization_weights=None)

This is the main orchestrator for auto-tuning LLM components. As the central class of the Octuner optimization system, it coordinates the entire parameter optimization workflow. It automatically discovers tunable parameters in complex component hierarchies, builds search spaces, and runs optimization algorithms to find the best configuration.

How it works:

  1. Component Discovery: Automatically finds all tunable components in the object hierarchy using ComponentDiscovery, identifying parameters that can be optimized (temperature, model selection, provider choices, etc.)

  2. Search Space Construction: Builds a multi-dimensional search space from discovered parameters, defining the optimization landscape with parameter types, ranges, and constraints.

  3. Optimization Execution: Runs intelligent search algorithms (Pareto, constrained, or scalarized optimization) to explore the search space and find optimal parameter combinations.

  4. Result Analysis: Provides comprehensive results including the best parameters, performance metrics, and detailed trial information for analysis and application.

Key Features:

  • Multi-Objective Optimization: Supports Pareto optimization for balancing quality, cost, and latency objectives
  • Constraint Handling: Supports hard constraints for real-world deployment requirements
  • Scalarization: Converts multi-objective problems to single-objective optimization with custom weights
  • Parallel Execution: Supports concurrent trial execution for faster optimization
  • Flexible Filtering: Include/exclude specific parameters to focus optimization
  • Reproducible Results: Seed support for consistent optimization runs
Example
from octuner import AutoTuner, MultiProviderTunableLLM

# Create a tunable component
llm = MultiProviderTunableLLM(config_file="config.yaml")

# Define evaluation function and dataset
def evaluate(component, input_data):
    result = component.call(input_data["text"])
    return {"quality": compute_quality(result, input_data["target"])}

dataset = [{"text": "Hello", "target": "Hi there"}]

# Create and configure tuner
tuner = AutoTuner(
    component=llm,
    entrypoint=evaluate,
    dataset=dataset,
    metric=lambda output, target: output["quality"]
)

# Focus on specific parameters
tuner.include(["*.temperature", "*.provider_model"])

# Run optimization
result = tuner.search(max_trials=50, mode="pareto")

# Apply the best parameters
from octuner import apply_best
apply_best(llm, result.best_parameters)

Optimization Modes:

  • "pareto": Multi-objective optimization finding Pareto-optimal solutions
  • "constrained": Single-objective optimization with hard constraints
  • "scalarized": Multi-objective converted to single-objective with weights
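
To make the "pareto" mode more concrete, here is a minimal, self-contained sketch of Pareto filtering. It does not use Octuner: the trial dictionaries and the `dominates` and `pareto_front` helpers are hypothetical, and the sketch assumes quality is maximized while cost and latency are minimized.

```python
from typing import Dict, List

Trial = Dict[str, float]


def dominates(a: Trial, b: Trial) -> bool:
    """Return True if `a` is no worse than `b` on every objective and better on one."""
    no_worse = (
        a["quality"] >= b["quality"]
        and a["cost"] <= b["cost"]
        and a["latency_ms"] <= b["latency_ms"]
    )
    strictly_better = (
        a["quality"] > b["quality"]
        or a["cost"] < b["cost"]
        or a["latency_ms"] < b["latency_ms"]
    )
    return no_worse and strictly_better


def pareto_front(trials: List[Trial]) -> List[Trial]:
    """Keep only the trials that no other trial dominates."""
    return [
        t for t in trials
        if not any(dominates(other, t) for other in trials if other is not t)
    ]


trials = [
    {"quality": 0.90, "cost": 0.020, "latency_ms": 900.0},
    {"quality": 0.85, "cost": 0.005, "latency_ms": 400.0},
    {"quality": 0.80, "cost": 0.010, "latency_ms": 600.0},  # dominated by the second trial
]
front = pareto_front(trials)
```

The first two trials survive because neither beats the other on every objective; choosing between them is the trade-off that the "constrained" and "scalarized" modes resolve automatically.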

Initialize the AutoTuner with a component and evaluation setup.

This constructor sets up the AutoTuner with all necessary components for parameter optimization. It validates the inputs and prepares the internal state for discovery and optimization.

Parameters:

Name Type Description Default
component Any

The component to optimize. Must contain tunable parameters (implement TunableMixin or be registered as tunable). Can be a single component or a complex hierarchy of components.

required
entrypoint EntrypointFunction

Function that evaluates the component with input data. Called as entrypoint(component, input_data) for each dataset item. Should return a dictionary with metrics. (Legacy parameter name)

None
dataset Dataset

List of input/target pairs for evaluation. Each item should contain the input data and expected output for evaluation.

None
metric MetricFunction

Function that computes quality scores from evaluation results. Called as metric(output, target) where output is the result from entrypoint and target is the expected output. (Legacy parameter name)

None
entrypoint_function EntrypointFunction

Same as entrypoint but with clearer naming.

None
metric_function MetricFunction

Same as metric but with clearer naming.

None
max_workers int

Maximum number of concurrent workers for parallel evaluation during optimization trials. Higher values speed up optimization but use more resources.

1
optimization_mode str

Optimization strategy to use. Options:

  • "pareto": Multi-objective optimization (default)
  • "constrained": Single-objective with constraints
  • "scalarized": Multi-objective with custom weights

'pareto'
n_trials int

Default number of optimization trials to run. Can be overridden in search() calls.

120
constraints Optional[Constraints]

Hard constraints for constrained optimization mode. Dictionary with constraint names and values.

None
scalarization_weights Optional[ScalarizationWeights]

Weights for scalarized optimization mode. Dictionary mapping objective names to weights.

None
from_component(*, component, entrypoint, dataset, metric, max_workers=1) classmethod

Create an AutoTuner instance using the factory pattern. It's the recommended way to create AutoTuner instances for most use cases.

Parameters:

Name Type Description Default
component Any

The component to optimize. Must contain tunable parameters (implement TunableMixin or be registered as tunable).

required
entrypoint EntrypointFunction

Function that evaluates the component with input data. Called as entrypoint(component, input_data) for each dataset item. Should return a dictionary with metrics.

required
dataset Dataset

List of input/target pairs for evaluation. Each item should contain the input data and expected output for evaluation.

required
metric MetricFunction

Function that computes quality scores from evaluation results. Called as metric(output, target) where output is the result from entrypoint and target is the expected output.

required
max_workers int

Maximum number of concurrent workers for parallel evaluation during optimization trials. Default is 1.

1

Returns:

Type Description
AutoTuner

Configured AutoTuner instance ready for optimization.

Example
# Create tuner using factory method
tuner = AutoTuner.from_component(
    component=my_llm,
    entrypoint=lambda comp, data: comp.call(data["text"]),
    dataset=test_dataset,
    metric=lambda output, target: compute_quality(output, target),
    max_workers=4
)

# Run optimization
result = tuner.search(max_trials=100)
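
The calling convention described above (entrypoint(component, input_data) per dataset item, then metric(output, target)) can be sketched without Octuner as follows. The `evaluate_dataset` helper, the `EchoComponent` class, and the "input"/"target" item keys are assumptions made for illustration; the actual dataset layout and aggregation used by the library may differ.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Any, Callable, Dict, List


def evaluate_dataset(
    component: Any,
    entrypoint: Callable[[Any, Any], Any],
    metric: Callable[[Any, Any], float],
    dataset: List[Dict[str, Any]],
    max_workers: int = 1,
) -> float:
    """Score every dataset item and return the mean metric value."""

    def score(item: Dict[str, Any]) -> float:
        output = entrypoint(component, item["input"])  # run the component on the input
        return metric(output, item["target"])          # compare output with the target

    # Items are scored concurrently; pool.map preserves dataset order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(score, dataset))
    return mean(scores)


class EchoComponent:
    """Hypothetical stand-in for a tunable component."""

    def call(self, text: str) -> str:
        return text.upper()


dataset = [
    {"input": "hi", "target": "HI"},
    {"input": "bye", "target": "BYE!"},
]
mean_score = evaluate_dataset(
    EchoComponent(),
    entrypoint=lambda comp, data: comp.call(data),
    metric=lambda output, target: 1.0 if output == target else 0.0,
    dataset=dataset,
    max_workers=2,
)
```

Threads suit this workload because LLM calls are typically I/O-bound, which is presumably why a higher max_workers speeds up a trial at the cost of more concurrent requests.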
include(patterns)

This method narrows the optimization to specific parameters or components, reducing the search space and focusing on the parameters that matter most for your use case.

Parameters:

Name Type Description Default
patterns List[str]

List of glob patterns to include in optimization. Only parameters matching at least one pattern will be included in the search space. Examples:

  • ["*.temperature"]: Include all temperature parameters
  • ["*.provider_model"]: Include all provider/model selections
  • ["classifier_llm.*"]: Include all parameters in classifier_llm
  • ["*.max_tokens", "*.top_p"]: Include max_tokens and top_p

required

Returns:

Type Description
AutoTuner

Self for method chaining, allowing fluent interface.

Example
# Focus on core LLM parameters
tuner.include(["*.temperature", "*.max_tokens", "*.provider_model"])

# Focus on specific components
tuner.include(["classifier_llm.*", "confidence_llm.*"])

# Chain with exclude for fine control
tuner.include(["*.temperature"]).exclude(["*.verbose"])
Note

Include patterns are applied before exclude patterns. If no include patterns are set, all discovered parameters are considered for inclusion.

exclude(patterns)

This method excludes parameters from the optimization process, typically debug parameters, verbose settings, or other parameters that shouldn't be optimized.

Parameters:

Name Type Description Default
patterns List[str]

List of glob patterns to exclude from optimization. Parameters matching any pattern will be removed from the search space. Examples:

  • ["*.verbose"]: Exclude all verbose parameters
  • ["*.debug", "*.log_level"]: Exclude debug and logging parameters
  • ["*.frequency_penalty"]: Exclude specific parameters
  • ["nested.component.*"]: Exclude all parameters in nested.component

required

Returns:

Type Description
AutoTuner

Self for method chaining

Example
# Exclude debug parameters
tuner.exclude(["*.verbose", "*.debug", "*.log_level"])

# Exclude less important parameters
tuner.exclude(["*.frequency_penalty", "*.presence_penalty"])

# Chain with include for precise control
tuner.include(["*.temperature"]).exclude(["*.verbose"])
Note

Exclude patterns are applied after include patterns. If a parameter matches both include and exclude patterns, it will be excluded.
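
The include-then-exclude ordering described in the two notes above can be sketched with the standard library's `fnmatch` module. The `filter_parameters` helper and the parameter paths are hypothetical, and the library's own matching rules may differ in detail.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional


def filter_parameters(
    paths: Iterable[str],
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> List[str]:
    """Apply include patterns first, then exclude patterns."""
    selected = []
    for path in paths:
        # With no include patterns, every discovered parameter is a candidate.
        if include and not any(fnmatchcase(path, p) for p in include):
            continue
        # A parameter matching both include and exclude is dropped.
        if exclude and any(fnmatchcase(path, p) for p in exclude):
            continue
        selected.append(path)
    return selected


paths = [
    "llm.temperature",
    "llm.verbose",
    "classifier_llm.temperature",
    "classifier_llm.max_tokens",
]
selected = filter_parameters(
    paths,
    include=["*.temperature", "classifier_llm.*"],
    exclude=["*.max_tokens"],
)
```

Here `classifier_llm.max_tokens` matches an include pattern but is still removed, because exclusion is applied last.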

build_search_space()

Performs the component discovery process and constructs the search space that defines the optimization landscape. It's automatically called by search() if not already built, but can be called manually to inspect the discovered parameters.

The discovery process:

  1. Component Traversal: Recursively explores the component hierarchy
  2. Parameter Detection: Identifies all tunable parameters using TunableMixin protocol or registry-based detection
  3. Search Space Construction: Builds a flat dictionary mapping parameter paths to their definitions and constraints
  4. Validation: Ensures at least one tunable parameter is found

The resulting search space contains:

  • Parameter paths (e.g., "llm.temperature", "classifier.max_tokens")
  • Parameter types ("float", "int", "choice", "bool")
  • Value ranges or choices for each parameter
  • Default values where applicable

Returns:

Type Description
None

None. The search space is stored in self.search_space.

Raises:

Type Description
ValueError

If no tunable components are found in the component hierarchy. This usually means the component doesn't implement TunableMixin or isn't registered as tunable.

Example
# Build search space manually
tuner.build_search_space()

# Inspect discovered parameters
summary = tuner.get_search_space_summary()
print(f"Found {summary['total_parameters']} tunable parameters")
print(f"Parameter types: {summary['parameter_types']}")

# View specific parameters
for param_path, param_def in tuner.search_space.items():
    print(f"{param_path}: {param_def}")
Note

This method is idempotent - calling it multiple times has no effect after the first successful call. The search space is cached until the component structure changes.

search(*, max_trials=120, mode='pareto', constraints=None, scalarization_weights=None, replicates=1, timeout=None, seed=None)

Run the optimization search to find the best parameter configuration.

This is the main method that orchestrates the entire optimization process. It automatically discovers tunable parameters, sets up the optimization environment, and runs intelligent search algorithms to find optimal parameter combinations.

The optimization process:

  1. Discovery: Finds all tunable parameters in the component hierarchy
  2. Search Space Setup: Builds the multi-dimensional search space
  3. Optimization: Runs the specified optimization algorithm
  4. Result Analysis: Analyzes results and returns comprehensive findings

Parameters:

Name Type Description Default
max_trials int

Maximum number of optimization trials to run. More trials generally lead to better results but take longer. Typical values range from 50-500 depending on search space size.

120
mode OptimizationMode

Optimization strategy to use:

  • "pareto": Multi-objective optimization finding Pareto-optimal solutions that balance multiple objectives (quality, cost, latency)
  • "constrained": Single-objective optimization with hard constraints for real-world deployment requirements
  • "scalarized": Multi-objective converted to single-objective using custom weights for different objectives

'pareto'
constraints Optional[Constraints]

Hard constraints for constrained optimization mode. Dictionary with constraint names and maximum values. Example: {"max_cost": 0.01, "max_latency_ms": 1000}

None
scalarization_weights Optional[ScalarizationWeights]

Weights for scalarized optimization mode. Dictionary mapping objective names to weights. Weights should sum to 1.0 for best results. Example: {"quality": 0.7, "cost": 0.3}

None
replicates int

Number of replicates per trial for statistical robustness. Higher values reduce noise but increase computation time. Default is 1 for faster optimization.

1
timeout Optional[float]

Maximum time in seconds for the entire optimization process. If None, optimization runs until max_trials is reached.

None
seed Optional[int]

Random seed for reproducible optimization results. Use the same seed to get identical results across runs.

None

Returns:

Type Description
SearchResult

SearchResult containing:

  • best_trial: The best performing trial with optimal parameters
  • all_trials: List of all trials for detailed analysis
  • best_parameters: Dictionary of best parameter values
  • optimization_mode: The mode used for optimization
  • metrics_summary: Statistical summary of all trials
Example
# Basic optimization
result = tuner.search()

# Advanced optimization with constraints
result = tuner.search(
    max_trials=200,
    mode="constrained",
    constraints={"max_cost": 0.01, "max_latency_ms": 500},
    replicates=3,
    timeout=3600,  # 1 hour timeout
    seed=42
)

# Multi-objective optimization with custom weights
result = tuner.search(
    mode="scalarized",
    scalarization_weights={"quality": 0.8, "cost": 0.2},
    max_trials=100
)

# Access results
print(f"Best quality: {result.best_trial.metrics.quality}")
print(f"Best parameters: {result.best_parameters}")
Note

The first call to search() will automatically discover tunable components and build the search space. Subsequent calls reuse the existing search space unless the component structure changes.
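
As a rough illustration of how the constraints and scalarization_weights arguments could act on trial metrics, consider the sketch below. It is not Octuner's implementation: the `satisfies` and `scalarized_loss` helpers are hypothetical, constraint names are assumed to be "max_" followed by a metric name, and the combined score is assumed to be minimized with quality counted as a reward.

```python
from typing import Dict, List

Metrics = Dict[str, float]


def satisfies(metrics: Metrics, constraints: Dict[str, float]) -> bool:
    """Check hard upper bounds such as {"max_cost": 0.01}."""
    # Assumption: "max_cost" caps metrics["cost"], "max_latency_ms" caps metrics["latency_ms"].
    return all(
        metrics[name[len("max_"):]] <= limit for name, limit in constraints.items()
    )


def scalarized_loss(metrics: Metrics, weights: Dict[str, float]) -> float:
    """Collapse several objectives into one number where lower is better."""
    loss = 0.0
    for name, weight in weights.items():
        sign = -1.0 if name == "quality" else 1.0  # reward quality, penalize the rest
        loss += sign * weight * metrics[name]
    return loss


trials: List[Metrics] = [
    {"quality": 0.9, "cost": 0.020, "latency_ms": 900.0},
    {"quality": 0.8, "cost": 0.005, "latency_ms": 400.0},
]

# Constrained mode: discard infeasible trials, then take the best quality.
constraints = {"max_cost": 0.01, "max_latency_ms": 500}
feasible = [t for t in trials if satisfies(t, constraints)]
best_constrained = max(feasible, key=lambda t: t["quality"])

# Scalarized mode: rank all trials by the weighted loss.
weights = {"quality": 0.8, "cost": 0.2}
best_scalarized = min(trials, key=lambda t: scalarized_loss(t, weights))
```

The two modes pick different trials here: the constraint rules out the expensive trial, while the weighted loss still prefers it because raw cost values are small next to quality. Objectives on very different scales usually need normalizing before weights are meaningful.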

get_search_space_summary()

Provides detailed information about the tunable parameters discovered in the component hierarchy, including counts, types, and component distribution. Useful for understanding the optimization landscape before running optimization.

Returns:

Type Description
Dict[str, Any]

Dictionary containing search space summary with keys: "total_parameters", "parameter_types", "components"

Example
# Build search space and get summary
tuner.build_search_space()
summary = tuner.get_search_space_summary()

print(f"Total parameters: {summary['total_parameters']}")
print(f"Parameter types: {summary['parameter_types']}")
print(f"Components: {summary['components']}")

# Output might be:
# Total parameters: 12
# Parameter types: {'float': 6, 'choice': 4, 'int': 2}
# Components: {'llm': 8, 'classifier_llm': 4}
Note

This method requires the search space to be built first, either by calling build_search_space() directly or through search(), which builds it automatically if needed.
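
For intuition, a summary with the three documented keys could be derived from a flat search space as in the sketch below. The `summarize` helper and the sample search space are hypothetical; the sketch assumes each definition is a tuple whose first element is the type and that the component name is everything before the last dot of the path.

```python
from collections import Counter
from typing import Any, Dict, Tuple

# Flat mapping from parameter path to definition, as described for build_search_space().
search_space: Dict[str, Tuple[Any, ...]] = {
    "llm.temperature": ("float", 0.0, 1.0),
    "llm.max_tokens": ("int", 64, 4096),
    "classifier_llm.provider_model": ("choice", ["model-a", "model-b"]),
}


def summarize(space: Dict[str, Tuple[Any, ...]]) -> Dict[str, Any]:
    """Count parameters overall, per type, and per owning component."""
    types = Counter(definition[0] for definition in space.values())
    components = Counter(path.rsplit(".", 1)[0] for path in space)
    return {
        "total_parameters": len(space),
        "parameter_types": dict(types),
        "components": dict(components),
    }


summary = summarize(search_space)
```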

get_current_parameters()

Get the current parameter values on the component.

Returns:

Type Description
Dict[str, Any]

Dictionary of current parameter values

Raises:

Type Description
ValueError

If search space has not been built yet


Tunable LLM

MultiProviderTunableLLM

A tunable LLM wrapper that optimizes provider, model, and parameter selection across multiple LLM providers.

octuner.tunable.tunable_llm.MultiProviderTunableLLM(config_file, default_provider='openai', default_model=None, provider_configs=None)

Bases: TunableMixin

A tunable LLM wrapper that optimizes provider, model, and parameter selection.

This class allows the optimization system to discover the best combination of:

  • LLM provider (OpenAI, Gemini, etc.)
  • Model within that provider
  • Model-specific parameters (temperature, max_tokens, etc.)

Configuration is defined explicitly via YAML files.

Example

llm = MultiProviderTunableLLM(config_file="my_llm_config.yaml")
response = llm.call("What is the capital of France?")
print(response.text)

Initialize the tunable LLM with explicit configuration.

Parameters:

Name Type Description Default
config_file str

Path to YAML configuration file (required)

required
default_provider str

Default provider to use ('openai', 'gemini')

'openai'
default_model Optional[str]

Default model to use (if None, uses provider's default)

None
provider_configs Optional[Dict[str, Dict[str, Any]]]

Configuration for each provider (API keys, etc.)

None
call(prompt, system_prompt=None, **kwargs)

Make an LLM call using the current provider and parameters.

Parameters:

Name Type Description Default
prompt str

The user prompt

required
system_prompt Optional[str]

Optional system prompt

None
**kwargs

Additional parameters that override instance settings

{}

Returns:

Type Description
LLMResponse

LLMResponse with the result

llm_eq_cost(*, input_tokens=None, output_tokens=None, metadata=None)

Calculate the cost of an LLM call based on current provider and model.

Parameters:

Name Type Description Default
input_tokens

Number of input tokens

None
output_tokens

Number of output tokens

None
metadata

Additional metadata from the LLM call

None

Returns:

Type Description

Cost in USD, or None if tokens are not available

get_current_provider_info()

Get information about the current provider and model.

Returns:

Type Description
Dict[str, Any]

Dictionary with provider info

set_provider_configs(configs)

Update provider configurations (API keys, base URLs, etc.).

Parameters:

Name Type Description Default
configs Dict[str, Dict[str, Any]]

Dictionary mapping provider names to configuration dicts

required

TunableMixin

Base mixin class that makes any component tunable by the optimization system.

octuner.tunable.mixin.TunableMixin()

Mixin class for LLM components that can be auto-tuned.

Components can be made tunable by either:

  1. Using instance methods like mark_as_tunable() (legacy approach)
  2. Programmatic registration using register_tunable_class() (recommended)

Example

class MyLLM(TunableMixin):
    def __init__(self):
        super().__init__()
        # Legacy approach
        self.mark_as_tunable("temperature", "float", (0.0, 1.0), 0.7)

        # Or programmatic registration (recommended)
        from octuner.tunable.registry import register_tunable_class
        register_tunable_class(
            self.__class__,
            params={
                "temperature": ("float", 0.0, 1.0),
                "max_tokens": ("int", 64, 4096),
            },
            call_method="send_prompt"
        )

Initialize the tunable mixin.

mark_as_tunable(param_name, param_type, range_vals, default=None)

Mark a parameter as tunable.

Parameters:

Name Type Description Default
param_name str

Name of the parameter

required
param_type str

Type of parameter ("float", "int", "choice", "bool")

required
range_vals Tuple[Any, Any]

Range tuple (min, max) for numeric types, choices for choice type

required
default Any

Default value for the parameter

None
get_tunable_parameters()

Get all tunable parameters.

Returns:

Type Description
Dict[str, Dict[str, Any]]

Dictionary of tunable parameter definitions

is_tunable(param_name)

Check if a parameter is tunable.

Parameters:

Name Type Description Default
param_name str

Name of the parameter

required

Returns:

Type Description
bool

True if the parameter is tunable

get_param_info(param_name)

Get information about a tunable parameter.

Parameters:

Name Type Description Default
param_name str

Name of the parameter

required

Returns:

Type Description
Optional[Dict[str, Any]]

Parameter info dictionary or None if not found
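
The bookkeeping behind mark_as_tunable(), get_tunable_parameters(), is_tunable(), and get_param_info() can be pictured with the stand-in class below. `SketchTunable` is hypothetical and independent of Octuner; in particular, the "type", "range", and "default" keys of the stored info dictionary are assumptions.

```python
from typing import Any, Dict, Optional


class SketchTunable:
    """Hypothetical stand-in that mimics the method contracts described above."""

    def __init__(self) -> None:
        self._tunable: Dict[str, Dict[str, Any]] = {}

    def mark_as_tunable(self, param_name: str, param_type: str,
                        range_vals: Any, default: Any = None) -> None:
        # Store one definition per parameter name; later calls overwrite earlier ones.
        self._tunable[param_name] = {
            "type": param_type,
            "range": range_vals,
            "default": default,
        }

    def get_tunable_parameters(self) -> Dict[str, Dict[str, Any]]:
        return dict(self._tunable)

    def is_tunable(self, param_name: str) -> bool:
        return param_name in self._tunable

    def get_param_info(self, param_name: str) -> Optional[Dict[str, Any]]:
        # Returns None rather than raising when the parameter is unknown.
        return self._tunable.get(param_name)


component = SketchTunable()
component.mark_as_tunable("temperature", "float", (0.0, 1.0), 0.7)
component.mark_as_tunable("style", "choice", ("formal", "casual"))
info = component.get_param_info("temperature")
```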

llm_eq_cost(*, input_tokens=None, output_tokens=None, metadata=None)

Calculate the cost of an LLM call (optional).

Override this method to enable cost tracking during optimization.

Parameters:

Name Type Description Default
input_tokens Optional[int]

Number of input tokens

None
output_tokens Optional[int]

Number of output tokens

None
metadata Optional[Dict[str, Any]]

Additional metadata from the LLM call

None

Returns:

Type Description
Optional[float]

Cost in your preferred currency, or None to disable cost tracking


Providers

Base Provider

Abstract base class for LLM providers and standard response format.

octuner.providers.base.BaseLLMProvider(config_loader, **kwargs)

Bases: ABC

Abstract base class for implementing custom LLM providers in Octuner.

This class serves as the foundation for creating custom LLM provider implementations, enabling you to integrate your own self-hosted models, proprietary APIs, or any other LLM service.

Key Features:

  • Configuration-driven: Integrates with YAML-based configuration system
  • Parameter optimization: Supports automatic parameter tuning through the config loader
  • Type conversion: Automatic parameter type conversion based on configuration
  • Cost tracking: Built-in cost calculation and token usage tracking
  • Standardized responses: Returns consistent LLMResponse objects across all providers

To create a custom provider, you must:

  1. Inherit from BaseLLMProvider and set the provider_name attribute
  2. Implement abstract methods:
       • call(): Main interface for making LLM requests
       • _make_request(): Low-level API communication
       • _parse_response(): Convert raw API response to LLMResponse
       • _calculate_cost(): Calculate cost based on token usage
  3. Create a configuration file (YAML) defining:
       • Available models and their parameters
       • Parameter types, ranges, and defaults
       • Pricing information for cost calculation
       • Provider-specific settings

Example Usage:

from typing import Optional

from octuner.providers.base import BaseLLMProvider, LLMResponse
from octuner.config.loader import ConfigLoader

class CustomProvider(BaseLLMProvider):
    def __init__(self, config_loader, **kwargs):
        super().__init__(config_loader, **kwargs)
        self.provider_name = "custom"
        # Initialize your API client here

    def call(self, prompt: str, system_prompt: Optional[str] = None, **kwargs) -> LLMResponse:
        # Implementation details...
        pass

    # Implement other abstract methods...

# Usage with configuration
config_loader = ConfigLoader("my_custom_config.yaml")
provider = CustomProvider(config_loader, api_key="your-key")
response = provider.call("Hello, world!")

Configuration File Structure:

providers:
  custom:
    default_model: "my-model-v1"
    available_models: ["my-model-v1", "my-model-v2"]
    pricing_usd_per_1m_tokens:
      my-model-v1: [0.5, 1.0]  # [input_cost, output_cost]
    model_capabilities:
      my-model-v1:
        supported_parameters: ["temperature", "max_tokens"]
        parameters:
          temperature:
            type: float
            range: [0.0, 2.0]
            default: 0.7
          max_tokens:
            type: int
            range: [1, 4000]
            default: 1000
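
To show how a parameters block like the one above might drive defaults and type conversion, here is a small sketch that mirrors the YAML as a plain Python dict. The `resolve_parameter` helper is hypothetical, and rejecting out-of-range values with ValueError is an assumption rather than documented Octuner behavior.

```python
from typing import Any, Dict

# Mirrors the `parameters` block of the configuration above.
PARAMETERS: Dict[str, Dict[str, Any]] = {
    "temperature": {"type": "float", "range": [0.0, 2.0], "default": 0.7},
    "max_tokens": {"type": "int", "range": [1, 4000], "default": 1000},
}

# Maps configured type names to Python conversion functions.
CASTS = {"float": float, "int": int}


def resolve_parameter(name: str, overrides: Dict[str, Any]) -> Any:
    """Pick the override or the default, convert it, and check the range."""
    spec = PARAMETERS[name]
    raw = overrides.get(name, spec["default"])
    value = CASTS[spec["type"]](raw)
    low, high = spec["range"]
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return value


temperature = resolve_parameter("temperature", {"temperature": "1.5"})  # string override
max_tokens = resolve_parameter("max_tokens", {})                        # falls back to default
```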

This constructor sets up the provider with access to the configuration system and stores any provider-specific parameters. The config_loader is mandatory as it provides access to model capabilities, parameter definitions, and pricing.

Parameters:

Name Type Description Default
config_loader ConfigLoader

Configuration loader instance that provides access to YAML configuration files. This is mandatory.

required
**kwargs

Provider-specific configuration parameters. These can override default values from the configuration file.

{}
call(prompt, system_prompt=None, **kwargs) abstractmethod

This is the main interface method that users will call to interact with the LLM provider. It handles parameter resolution, type conversion, API communication, and response parsing.

Parameters:

Name Type Description Default
prompt str

The user prompt/query to send to the LLM

required
system_prompt Optional[str]

Optional system prompt that sets the context or behavior for the LLM. If None, no system prompt is used.

None
**kwargs

Additional parameters that can override configuration defaults.

{}

Returns:

Name Type Description
LLMResponse LLMResponse

Standardized response object. Its fields are text, provider, model, cost, input_tokens, output_tokens, latency_ms, and metadata.

Example Implementation
def call(self, prompt: str, system_prompt: Optional[str] = None, **kwargs) -> LLMResponse:
    import time
    start_time = time.time()

    # Get model and parameters
    model = self._get_parameter("model", kwargs, "default-model")
    temperature = self._get_parameter("temperature", kwargs, model)
    max_tokens = self._get_parameter("max_tokens", kwargs, model)

    # Convert types
    temperature = self._convert_parameter_type("temperature", temperature, model)
    max_tokens = self._convert_parameter_type("max_tokens", max_tokens, model)

    # Prepare API request
    api_params = {
        "model": model,
        "prompt": prompt,
        "system_prompt": system_prompt,
        "temperature": temperature,
        "max_tokens": max_tokens
    }

    # Make API call
    response = self._make_request(**api_params)

    # Parse and return response
    result = self._parse_response(response)
    result.latency_ms = (time.time() - start_time) * 1000
    return result
get_cost_per_token(model)

Get the input and output token costs for a model, expressed per 1M tokens.

Parameters:

Name Type Description Default
model str

Model identifier

required

Returns:

Type Description
Tuple[float, float]

Tuple of (input_cost_per_1M_tokens, output_cost_per_1M_tokens)
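
Given a per-1M-token price pair like the one returned here, the cost of a single call follows from simple arithmetic. The `call_cost` helper below is a hypothetical sketch, not part of Octuner; it returns None when token counts are missing, matching the behavior described for llm_eq_cost().

```python
from typing import Optional, Tuple


def call_cost(
    pricing_per_1m: Tuple[float, float],
    input_tokens: Optional[int],
    output_tokens: Optional[int],
) -> Optional[float]:
    """Convert token counts into a cost using (input, output) prices per 1M tokens."""
    if input_tokens is None or output_tokens is None:
        return None  # cost cannot be computed without token counts
    input_price, output_price = pricing_per_1m
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Prices taken from the sample configuration: [0.5, 1.0] per 1M tokens.
cost = call_cost((0.5, 1.0), input_tokens=2_000, output_tokens=500)
```

With these numbers the call costs 0.0015 in the pricing currency: 2,000 input tokens contribute 0.001 and 500 output tokens contribute 0.0005.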

get_available_models()

Get list of available models for this provider.

Returns:

Type Description
List[str]

List of model identifiers

octuner.providers.base.LLMResponse(text, provider=None, model=None, cost=None, input_tokens=None, output_tokens=None, latency_ms=None, metadata=None) dataclass

Standard response format from all LLM providers.

OpenAI Provider

OpenAI provider implementation using the OpenAI API.

octuner.providers.openai.OpenAIProvider(config_loader, **kwargs)

Bases: BaseLLMProvider

OpenAI provider implementation for Octuner.

This module contains the OpenAI provider implementation using the Responses API.

call(prompt, system_prompt=None, use_websearch=False, **kwargs)

Make a call to OpenAI Chat Completions API, with optional websearch support.

Parameters:

Name Type Description Default
prompt str

The user prompt

required
system_prompt Optional[str]

Optional system prompt

None
use_websearch bool

Whether to enable websearch tool

False
**kwargs

Additional parameters to override defaults (e.g., model, temperature, max_tokens, etc.)

{}

Gemini Provider

Google Gemini provider implementation using the google-generativeai SDK.

octuner.providers.gemini.GeminiProvider(config_loader, **kwargs)

Bases: BaseLLMProvider

Gemini provider implementation using the google-generativeai SDK.

call(prompt, system_prompt=None, use_websearch=False, **kwargs)

Make a call to Gemini API, with optional websearch support.

Provider Registry

Functions for registering and retrieving LLM providers.

octuner.providers.registry.register_provider(name, provider_class)

Register a custom LLM provider.

This allows you to add custom providers for self-hosted LLMs or other services.

Parameters:

Name Type Description Default
name str

Provider name (e.g., 'ollama', 'vllm', 'custom')

required
provider_class Type[BaseLLMProvider]

Provider class that inherits from BaseLLMProvider

required

Raises:

Type Description
ValueError

If provider_class doesn't inherit from BaseLLMProvider

octuner.providers.registry.get_provider(provider_name, config_loader, **kwargs)

Get a provider instance by name.

Parameters:

Name Type Description Default
provider_name str

Name of the provider ('openai', 'gemini', or custom)

required
config_loader

ConfigLoader for configuration-driven behavior (mandatory)

required
**kwargs

Provider-specific configuration

{}

Returns:

Type Description
BaseLLMProvider

Provider instance

Raises:

Type Description
KeyError

If provider is not supported

octuner.providers.registry.list_providers()

Get list of all registered provider names.

Returns:

Type Description
List[str]

List of provider names
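
The three registry functions follow a common name-to-class registry pattern, sketched below without importing Octuner. `SketchProvider`, `EchoProvider`, and the `register`, `get`, and `names` functions are hypothetical stand-ins; only the error types (ValueError for a non-subclass, KeyError for an unknown name) are taken from the descriptions above.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Type


class SketchProvider(ABC):
    """Hypothetical stand-in for the abstract provider base class."""

    @abstractmethod
    def call(self, prompt: str) -> str:
        ...


_REGISTRY: Dict[str, Type[SketchProvider]] = {}


def register(name: str, provider_class: Any) -> None:
    """Store a provider class under a name after checking its base class."""
    if not (isinstance(provider_class, type) and issubclass(provider_class, SketchProvider)):
        raise ValueError(f"{provider_class!r} must inherit from SketchProvider")
    _REGISTRY[name] = provider_class


def get(name: str, **kwargs: Any) -> SketchProvider:
    """Instantiate the provider registered under `name`."""
    if name not in _REGISTRY:
        raise KeyError(f"Unsupported provider: {name}")
    return _REGISTRY[name](**kwargs)


def names() -> List[str]:
    return sorted(_REGISTRY)


class EchoProvider(SketchProvider):
    def call(self, prompt: str) -> str:
        return prompt


register("echo", EchoProvider)
reply = get("echo").call("Hello")
```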


Optimization

LLMOptimizer

Main optimizer class that uses Optuna to find optimal parameter configurations.

octuner.optimization.optimizer.LLMOptimizer(search_space, mode='pareto', constraints=None, scalarization_weights=None, seed=None)

Core optimization engine that uses Optuna to find optimal parameter configurations.

LLMOptimizer is the main optimization engine that coordinates the search for optimal parameter values using Optuna's sophisticated optimization algorithms. It integrates with DatasetExecutor to evaluate parameter configurations and uses optimization strategies to determine the best solutions.

The optimization process:

  1. Parameter Suggestion: Uses Optuna to suggest parameter values from the search space
  2. Trial Execution: Evaluates suggested parameters using DatasetExecutor
  3. Objective Computation: Converts evaluation results to objective values using the strategy
  4. Optimization: Uses Optuna's TPE sampler to intelligently explore the parameter space
  5. Result Analysis: Extracts best parameters and trial results from completed optimization

Key features:

  • Supports multiple optimization strategies (Pareto, constrained, scalarized)
  • Intelligent parameter suggestion using TPE (Tree-structured Parzen Estimator)
  • Robust error handling with fallback objective values for failed trials
  • Comprehensive trial result tracking and analysis
  • Integration with Optuna's pruning mechanism for constraint handling

Initialize the LLMOptimizer with search space and optimization configuration.

Parameters:

Name Type Description Default
search_space Dict[str, Tuple[ParamType, Any, Any]]

Dictionary mapping parameter paths to their definitions. Each definition is a tuple of (type, min_value, max_value) for numeric parameters or (type, choices) for categorical parameters.

required
mode OptimizationMode

Optimization strategy to use:

  • "pareto": Multi-objective Pareto optimization (default)
  • "constrained": Single-objective with hard constraints
  • "scalarized": Multi-objective converted to single-objective

'pareto'
constraints Optional[Constraints]

Hard constraints for constrained mode. Dictionary with constraint names and maximum values (e.g., {"cost_total": 0.01}).

None
scalarization_weights Optional[ScalarizationWeights]

Weights for scalarized mode. Defines relative importance of quality, cost, and latency objectives.

None
seed Optional[int]

Random seed for reproducible optimization results. Use the same seed to get identical optimization runs.

None
suggest_parameters(trial)

Suggest parameter values for an Optuna trial.

This method uses Optuna's intelligent parameter suggestion to generate parameter values from the search space. It handles different parameter types (float, int, choice, bool, list) and converts them to appropriate Optuna suggestion calls.

Parameters:

Name Type Description Default
trial Trial

Optuna trial object for parameter suggestion

required

Returns:

Type Description
Dict[str, Any]

Dictionary mapping parameter paths to suggested values

Raises:

Type Description
TypeError

If parameter ranges are invalid for numeric types

ValueError

If an unknown parameter type is encountered
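
The dispatch from parameter type to a suggestion call can be sketched as below. To stay self-contained, the sketch swaps Optuna's trial object for Python's `random.Random`, so it samples uniformly instead of using TPE; the `suggest` helper and the sample search space are hypothetical, and the "list" type mentioned above is left out.

```python
import random
from typing import Any, Dict, Tuple


def suggest(space: Dict[str, Tuple[Any, ...]], rng: random.Random) -> Dict[str, Any]:
    """Draw one value per parameter according to its declared type."""
    values: Dict[str, Any] = {}
    for path, definition in space.items():
        kind = definition[0]
        if kind in ("float", "int"):
            low, high = definition[1], definition[2]
            if not isinstance(low, (int, float)) or not isinstance(high, (int, float)):
                raise TypeError(f"Invalid numeric range for {path}: {definition[1:]}")
            values[path] = rng.uniform(low, high) if kind == "float" else rng.randint(low, high)
        elif kind == "choice":
            values[path] = rng.choice(definition[1])
        elif kind == "bool":
            values[path] = rng.choice([True, False])
        else:
            raise ValueError(f"Unknown parameter type: {kind}")
    return values


space = {
    "llm.temperature": ("float", 0.0, 1.0),
    "llm.max_tokens": ("int", 64, 4096),
    "llm.provider_model": ("choice", ["model-a", "model-b"]),
    "llm.verbose": ("bool",),
}
suggested = suggest(space, random.Random(42))  # seeded for repeatable draws
```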

objective_function(trial, executor, replicates=1)

Objective function for Optuna optimization.

This method serves as the objective function that Optuna optimizes. It suggests parameters, executes the trial using the DatasetExecutor, and converts the results to objective values using the optimization strategy.

The function handles errors gracefully by returning fallback objective values for failed trials, ensuring the optimization process continues even when individual trials fail.

Parameters:

Name Type Description Default
trial Trial

Optuna trial object for parameter suggestion

required
executor Any

DatasetExecutor instance for trial evaluation

required
replicates int

Number of replicates to run for statistical robustness

1

Returns:

Type Description
Tuple[float, ...]

Tuple of objective values for Optuna optimization
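
The graceful error handling described above can be sketched as a wrapper that substitutes fallback objectives when evaluation raises. `safe_objective`, `flaky_evaluate`, and the specific fallback values are assumptions made for illustration; the values Octuner actually uses depend on the strategy's get_fallback_objectives().

```python
from typing import Callable, Dict, Tuple

# Assumed fallback for a (quality, cost, latency) tuple where quality is maximized
# and the other two are minimized: worst quality, unbounded cost and latency.
FALLBACK_OBJECTIVES: Tuple[float, ...] = (0.0, float("inf"), float("inf"))


def safe_objective(
    evaluate: Callable[[Dict[str, float]], Tuple[float, ...]],
    parameters: Dict[str, float],
) -> Tuple[float, ...]:
    """Evaluate one configuration; return fallback objectives if it raises."""
    try:
        return evaluate(parameters)
    except Exception:
        # A failed trial should not abort the whole optimization run.
        return FALLBACK_OBJECTIVES


def flaky_evaluate(parameters: Dict[str, float]) -> Tuple[float, ...]:
    """Hypothetical evaluation that fails for some configurations."""
    if parameters["llm.temperature"] > 1.5:
        raise RuntimeError("provider rejected the request")
    return (0.9, 0.002, 350.0)


ok = safe_objective(flaky_evaluate, {"llm.temperature": 0.7})
failed = safe_objective(flaky_evaluate, {"llm.temperature": 1.9})
```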

optimize(executor, max_trials=120, replicates=1, timeout=None)

Run the optimization process to find optimal parameter configurations.

This method orchestrates the entire optimization process using Optuna. It runs the specified number of trials, each evaluating a different parameter configuration, and returns comprehensive results from all trials.

Parameters:

Name Type Description Default
executor Any

DatasetExecutor instance for evaluating parameter configurations

required
max_trials int

Maximum number of optimization trials to run. More trials generally lead to better results but take longer.

120
replicates int

Number of replicates per trial for statistical robustness. Higher values reduce noise but increase computation time.

1
timeout Optional[float]

Maximum time in seconds for the entire optimization process. If None, optimization runs until max_trials is reached.

None

Returns:

Type Description
List[TrialResult]

List of TrialResult objects containing results from all completed trials, including both successful and failed trials with error information.

get_best_parameters()

Get the best parameter configuration found during optimization.

This method returns the parameter values from the best trial according to the optimization strategy. The definition of "best" depends on the strategy used (e.g., highest quality for Pareto, lowest combined score for scalarized).

Returns:

Type Description
Dict[str, Any]

Dictionary mapping parameter paths to their optimal values. Returns an empty dictionary if no trials completed successfully.

get_best_trial()

Get the best trial result from the optimization.

This method returns the complete TrialResult object for the best trial, including parameters, metrics, and success status. The best trial is determined by the optimization strategy used.

Returns:

Type Description
Optional[TrialResult]

TrialResult object for the best trial, or None if no trials completed successfully.

OptimizationStrategy

Abstract base class for different optimization strategies (single objective, multi-objective).

octuner.optimization.optimizer.OptimizationStrategy(constraints=None, scalarization_weights=None)

Bases: ABC

Abstract base class for optimization strategies in Octuner.

This class defines the interface that all optimization strategies must implement to work with the LLMOptimizer. It provides a unified way to handle different optimization approaches while maintaining compatibility with Optuna.

Each strategy must implement methods for creating Optuna studies, computing objective values from metric results, and determining the best trial from completed studies.

create_study(study_name, seed=None) abstractmethod

Create an Optuna study configured for this optimization strategy.

Parameters:

Name Type Description Default
study_name str

Name for the Optuna study

required
seed Optional[int]

Random seed for reproducible optimization

None

Returns:

Type Description
Study

Configured Optuna study ready for optimization

compute_objectives(result) abstractmethod

Convert the raw metric results (quality, cost, latency) into objective values that Optuna can optimize. The conversion depends on the specific strategy (e.g., Pareto uses multiple objectives, scalarized combines them into one).

Parameters:

Name Type Description Default
result MetricResult

MetricResult containing quality, cost, and latency

required

Returns:

Type Description
Tuple[float, ...]

Tuple of objective values for Optuna optimization
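
As a rough illustration of the difference between the two conversions, the sketch below computes a multi-objective tuple and a single scalarized score from the same result. The two dataclasses are local mirrors of the MetricResult and ScalarizationWeights fields listed under Type Definitions. The formulas, the treatment of missing values as 0.0, and the function names are assumptions, not Octuner's actual code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MetricResult:  # local mirror of the documented dataclass fields
    quality: float
    cost: Optional[float] = None
    latency_ms: Optional[float] = None


@dataclass
class ScalarizationWeights:  # local mirror of the documented dataclass fields
    cost_weight: float = 1.0
    latency_weight: float = 1.0


def pareto_objectives(result: MetricResult) -> Tuple[float, ...]:
    """One objective per metric; missing cost/latency become 0.0 (assumption)."""
    return (result.quality, result.cost or 0.0, result.latency_ms or 0.0)


def scalarized_objective(result: MetricResult, w: ScalarizationWeights) -> Tuple[float, ...]:
    """Single score to minimize: weighted penalties minus quality (assumed formula)."""
    penalty = w.cost_weight * (result.cost or 0.0) + w.latency_weight * (result.latency_ms or 0.0)
    return (penalty - result.quality,)


result = MetricResult(quality=0.8, cost=0.01, latency_ms=0.5)
multi = pareto_objectives(result)
single = scalarized_objective(result, ScalarizationWeights(cost_weight=10.0, latency_weight=0.2))
```

In this sketch a lower scalarized score is better, which matches the "lowest combined score" wording used for the scalarized strategy.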

get_fallback_objectives() abstractmethod

When a trial fails (e.g., due to parameter constraints or errors), this method provides objective values that represent the worst possible performance, ensuring failed trials are properly ranked.

Returns:

Type Description
Tuple[float, ...]

Tuple of fallback objective values

get_best_trial_from_study(study) abstractmethod

Get the best trial from the study according to this strategy.

Different strategies define "best" differently:

  • Pareto: Highest quality among Pareto-optimal solutions
  • Constrained: Highest quality among constraint-satisfying trials
  • Scalarized: Lowest combined objective score

Parameters:

Name Type Description Default
study Study

Completed Optuna study

required

Returns:

Type Description
Optional[FrozenTrial]

Best trial according to this strategy, or None if no trials completed
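
The three selection rules can be sketched on plain records, without Optuna. `Candidate` and the helper functions are hypothetical, constraints are simplified to upper bounds keyed by attribute name, and the scalarized score is an assumed formula.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Candidate:  # hypothetical record of one completed trial
    name: str
    quality: float
    cost: float
    latency_ms: float


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on every objective and better on at least one."""
    no_worse = a.quality >= b.quality and a.cost <= b.cost and a.latency_ms <= b.latency_ms
    better = a.quality > b.quality or a.cost < b.cost or a.latency_ms < b.latency_ms
    return no_worse and better


def best_pareto(trials: List[Candidate]) -> Optional[Candidate]:
    """Highest quality among non-dominated trials."""
    front = [t for t in trials if not any(dominates(o, t) for o in trials)]
    return max(front, key=lambda t: t.quality, default=None)


def best_constrained(trials: List[Candidate], limits: Dict[str, float]) -> Optional[Candidate]:
    """Highest quality among trials that respect every upper bound."""
    feasible = [t for t in trials if all(getattr(t, k) <= v for k, v in limits.items())]
    return max(feasible, key=lambda t: t.quality, default=None)


def best_scalarized(
    trials: List[Candidate], cost_weight: float = 1.0, latency_weight: float = 1.0
) -> Optional[Candidate]:
    """Lowest combined score (assumed: weighted penalties minus quality)."""

    def score(t: Candidate) -> float:
        return cost_weight * t.cost + latency_weight * t.latency_ms - t.quality

    return min(trials, key=score, default=None)


trials = [
    Candidate("a", quality=0.90, cost=0.020, latency_ms=900.0),
    Candidate("b", quality=0.85, cost=0.005, latency_ms=400.0),
    Candidate("c", quality=0.80, cost=0.010, latency_ms=800.0),  # dominated by "b"
]
pareto_pick = best_pareto(trials)
constrained_pick = best_constrained(trials, {"cost": 0.01})
scalarized_pick = best_scalarized(trials, latency_weight=0.001)
```

Each helper returns None for an empty or fully infeasible list, mirroring the documented None return when no trials completed.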

DatasetExecutor

Executes trials on datasets with parallel processing support.

octuner.optimization.executor.DatasetExecutor(component, entrypoint, dataset, metric, max_workers=1)

DatasetExecutor is a core component of the optimization system that handles the execution of evaluation trials during parameter optimization. It manages the evaluation of components over datasets, collects performance metrics, and provides both sequential and parallel execution capabilities.

How it works:

  1. Parameter Application: Applies trial parameters to the component using the parameter setter utilities, ensuring consistent parameter configuration across trials.

  2. Dataset Evaluation: Executes the entrypoint function over each dataset item, collecting outputs and computing quality scores using the provided metric function.

  3. Metrics Collection: Automatically collects comprehensive metrics including quality scores, execution costs, and latency measurements for each trial.

  4. Parallel Execution: Supports concurrent evaluation of dataset items using ThreadPoolExecutor for faster trial execution when max_workers > 1.

  5. Statistical Aggregation: Provides robust statistical aggregation across replicates and dataset items using median-based aggregation for stability.

Key Features:

  • Parallel Execution: Multi-threaded evaluation for faster optimization
  • Comprehensive Metrics: Quality, cost, and latency tracking
  • Statistical Robustness: Median-based aggregation to handle outliers
  • Error Handling: Graceful handling of individual item failures
  • Replicate Support: Multiple trial runs for statistical significance
  • Cost Tracking: Automatic cost collection from tunable components

Constructor

Parameters:

Name Type Description Default
component Any

The component to evaluate. Must be tunable and support parameter setting via the parameter setter utilities.

required
entrypoint EntrypointFunction

Function that evaluates the component with input data. Called as entrypoint(component, input_data) for each dataset item. Should return a dictionary or object that can be processed by the metric function.

required
dataset Dataset

List of input/target pairs for evaluation. Each item should be a dictionary with 'input' and 'target' keys containing the input data and expected output respectively.

required
metric MetricFunction

Function that computes quality scores from evaluation results. Called as metric(output, target) where output is the result from entrypoint and target is the expected output. Should return a float score (higher is better).

required
max_workers int

Maximum number of concurrent workers for parallel evaluation. Use 1 for sequential execution, >1 for parallel execution. Higher values speed up I/O-bound tasks but may not help with CPU-bound tasks due to Python's GIL.

1
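
The calling conventions described in the table above can be illustrated with a minimal, self-contained sketch. `KeywordClassifier` is a hypothetical stand-in for a tunable component, and the shape of the input and target payloads is an assumption; only the entrypoint(component, input_data) and metric(output, target) signatures and the 'input'/'target' keys come from the descriptions above.

```python
from typing import Any, Dict, List


class KeywordClassifier:
    """Hypothetical component; a real one would wrap tunable LLM calls."""

    def classify(self, text: str) -> str:
        return "positive" if "good" in text.lower() else "negative"


def entrypoint(component: Any, input_data: Dict[str, Any]) -> Dict[str, Any]:
    """Called once per dataset item as entrypoint(component, input_data)."""
    return {"label": component.classify(input_data["text"])}


def metric(output: Dict[str, Any], target: Dict[str, Any]) -> float:
    """Called as metric(output, target); higher is better."""
    return 1.0 if output["label"] == target["label"] else 0.0


dataset: List[Dict[str, Any]] = [
    {"input": {"text": "Good service"}, "target": {"label": "positive"}},
    {"input": {"text": "Slow and rude"}, "target": {"label": "negative"}},
    {"input": {"text": "Not good at all"}, "target": {"label": "negative"}},
]

component = KeywordClassifier()
# Sequential evaluation, equivalent in spirit to max_workers=1.
scores = [metric(entrypoint(component, item["input"]), item["target"]) for item in dataset]
```

Objects shaped like these would be passed as the component, entrypoint, dataset, and metric arguments of the DatasetExecutor constructor.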
execute_trial(parameters)

Execute a single evaluation trial with the given parameters. It applies the trial parameters to the component, executes the evaluation over all dataset items, and returns aggregated metrics including quality, cost, and latency.

The execution process:

  1. Parameter Application: Sets the trial parameters on the component
  2. Call Log Clearing: Clears any previous call logs for clean metrics
  3. Dataset Evaluation: Runs the entrypoint function over each dataset item
  4. Metrics Collection: Collects quality scores, costs, and timing data
  5. Statistical Aggregation: Computes median quality and total cost

Parameters:

Name Type Description Default
parameters Dict[str, Any]

Dictionary of parameter values to set on the component. Keys should match the parameter paths from the search space. Example: {"llm.temperature": 0.7, "llm.max_tokens": 100}

required

Returns:

Type Description
MetricResult

MetricResult containing:

  • quality: Median quality score across all dataset items
  • cost: Total cost from all component calls (if available)
  • latency_ms: Total execution time in milliseconds
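
A simplified sketch of the evaluation and aggregation steps follows. It uses `ThreadPoolExecutor` when max_workers > 1, drops items that fail, and reports the median quality with the total wall-clock time. `evaluate_items` and `score_item` are hypothetical helpers, and cost collection is omitted; the real executor may differ in how it handles failures.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional


def evaluate_items(
    items: List[Dict],
    score_item: Callable[[Dict], Optional[float]],
    max_workers: int = 1,
) -> Dict[str, float]:
    """Score every item, then aggregate: median quality, total wall-clock latency."""
    start = time.perf_counter()
    if max_workers > 1:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            scores = list(pool.map(score_item, items))
    else:
        scores = [score_item(item) for item in items]
    latency_ms = (time.perf_counter() - start) * 1000.0

    valid = [s for s in scores if s is not None]  # drop items that failed
    quality = statistics.median(valid) if valid else 0.0
    return {"quality": quality, "latency_ms": latency_ms}


def score_item(item: Dict) -> Optional[float]:
    """Hypothetical per-item scorer; returns None instead of raising on bad items."""
    try:
        return 1.0 if item["prediction"] == item["target"] else 0.0
    except KeyError:
        return None


items = [
    {"prediction": "a", "target": "a"},
    {"prediction": "b", "target": "a"},
    {"prediction": "a", "target": "a"},
    {"target": "a"},  # missing prediction: counted as a failed item
]
sequential = evaluate_items(items, score_item, max_workers=1)
parallel = evaluate_items(items, score_item, max_workers=4)
```

Using the median rather than the mean keeps a single badly scored item from dominating the trial's quality.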
execute_with_replicates(parameters, replicates=1)

Execute a trial multiple times and aggregate results for statistical robustness. It's particularly useful for optimization scenarios where individual trials may have high variance due to non-deterministic components or external factors.

Parameters:

Name Type Description Default
parameters Dict[str, Any]

Dictionary of parameter values to set on the component. Same format as execute_trial().

required
replicates int

Number of times to run the trial. Higher values provide better statistical significance but take longer. Typical values range from 1-10 depending on variance requirements.

1

Returns:

Type Description
MetricResult

MetricResult containing aggregated metrics across all replicates.
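
Replicate aggregation can be sketched as running the same trial several times and taking the median of each metric. The per-metric median is an assumption based on the median-based aggregation mentioned for this class; `run_replicates` and the sample outcomes are hypothetical.

```python
import statistics
from typing import Callable, Dict, List


def run_replicates(run_once: Callable[[], Dict[str, float]], replicates: int = 1) -> Dict[str, float]:
    """Run a trial `replicates` times and take the median of each metric."""
    if replicates < 1:
        raise ValueError("replicates must be at least 1")
    runs: List[Dict[str, float]] = [run_once() for _ in range(replicates)]
    return {key: statistics.median(run[key] for run in runs) for key in runs[0]}


# Deterministic stand-in for a noisy trial: one outlier among three runs.
_outcomes = iter([
    {"quality": 0.80, "cost": 0.010, "latency_ms": 400.0},
    {"quality": 0.20, "cost": 0.012, "latency_ms": 2500.0},  # outlier
    {"quality": 0.82, "cost": 0.011, "latency_ms": 420.0},
])
aggregated = run_replicates(lambda: next(_outcomes), replicates=3)
```

The outlier run does not affect the aggregated result, which is the point of using a median here.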

execute(parameters)

Execute the full dataset and return per-item results, unlike execute_trial(), which returns aggregated metrics. It's useful for detailed analysis, debugging, or when you need to examine individual item performance rather than overall trial performance.

Parameters:

Name Type Description Default
parameters Dict[str, Any]

Dictionary of parameter values to set on the component. Same format as execute_trial().

required

Returns:

Type Description
List[MetricResult]

List of MetricResult objects, one for each dataset item. Each result

List[MetricResult]

contains the quality, cost, and latency for that specific item.


Discovery

ComponentDiscovery

Discovers tunable components in a component tree and builds search spaces.

octuner.discovery.discovery.ComponentDiscovery(include_patterns=None, exclude_patterns=None)

This is a core component of Octuner that automatically finds and catalogs all tunable parameters within an object hierarchy. It recursively traverses object attributes to identify components that implement the TunableMixin protocol, building a search space that can be optimized by the AutoTuner.

How it works:

  1. Recursive Traversal: Starting from a root component, it recursively explores all object attributes using __dict__ introspection, avoiding circular references and method calls.

  2. Tunable Detection: For each object found, it checks if it implements the TunableMixin protocol using is_llm_tunable(), which supports both:
       • Instance-based tunables (legacy): Objects with get_tunable_parameters() method
       • Registry-based tunables (recommended): Objects registered through register_tunable_class()

  3. Parameter Extraction: For tunable objects, it extracts parameter definitions including type, range, and default values using get_tunable_parameters().

  4. Path Mapping: Each discovered parameter is mapped to a dotted path (e.g., "classifier_llm.temperature") that uniquely identifies its location in the hierarchy.

  5. Filtering: Optional include/exclude patterns can be used to focus the search space on specific parameters or components.
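
The traversal and path-mapping steps can be approximated with a short recursive walk over `__dict__`. This is a simplified sketch, not the library's implementation: `FakeTunableLLM`, `Analyzer`, and `walk` are hypothetical, only instance-based tunables are detected, private attributes are skipped, and the root object itself is not reported.

```python
from typing import Any, Dict, Optional, Set


class FakeTunableLLM:
    """Hypothetical tunable object exposing get_tunable_parameters()."""

    def get_tunable_parameters(self) -> Dict[str, tuple]:
        return {"temperature": ("float", 0.0, 2.0)}


class Analyzer:
    """Hypothetical component tree with a back-reference (a cycle)."""

    def __init__(self) -> None:
        self.classifier_llm = FakeTunableLLM()
        self.confidence_llm = FakeTunableLLM()
        self.name = "analyzer"
        self.parent = self  # circular reference


def walk(obj: Any, path: str = "", seen: Optional[Set[int]] = None) -> Dict[str, Dict[str, tuple]]:
    """Collect {dotted_path: parameters} for tunable objects reachable via __dict__."""
    seen = set() if seen is None else seen
    if id(obj) in seen or not hasattr(obj, "__dict__"):
        return {}  # already visited, or a leaf value such as str or int
    seen.add(id(obj))

    found: Dict[str, Dict[str, tuple]] = {}
    if path and callable(getattr(obj, "get_tunable_parameters", None)):
        found[path] = obj.get_tunable_parameters()
    for name, value in vars(obj).items():
        if name.startswith("_"):
            continue  # skip private attributes (a simplification)
        child_path = f"{path}.{name}" if path else name
        found.update(walk(value, child_path, seen))
    return found


discovered = walk(Analyzer())
```

Tracking visited objects by `id()` is what keeps the self-referencing `parent` attribute from causing infinite recursion.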

Example
from octuner import ComponentDiscovery, MultiProviderTunableLLM

# Create a complex component hierarchy
# (TunableSentimentAnalyzer and config_file are assumed to be defined by the user)
analyzer = TunableSentimentAnalyzer(config_file)

# Discover all tunable parameters
discovery = ComponentDiscovery()
tunables = discovery.discover(analyzer)

# Result: {
#     "classifier_llm": {
#         "temperature": ("float", 0.0, 2.0),
#         "max_tokens": ("int", 64, 4096),
#         "provider_model": ("choice", ["openai:gpt-4", "gemini:gemini-pro"])
#     },
#     "confidence_llm": { ... },
#     "reasoning_llm": { ... }
# }

# Focus on specific parameters
focused_discovery = ComponentDiscovery(
    include_patterns=["*.temperature", "*.provider_model"],
    exclude_patterns=["*.verbose"]
)
focused_tunables = focused_discovery.discover(analyzer)
Integration with AutoTuner

ComponentDiscovery is automatically used by AutoTuner to build the search space before optimization begins. The discovered parameters become the dimensions that the optimizer explores to find the best configuration.

Attributes:

Name Type Description
include_patterns

List of glob patterns to include in discovery

exclude_patterns

List of glob patterns to exclude from discovery

Initialize discovery with optional filters.

Parameters:

Name Type Description Default
include_patterns Optional[List[str]]

Glob patterns to include (e.g., ["*.temperature"])

None
exclude_patterns Optional[List[str]]

Glob patterns to exclude (e.g., ["*.verbose"])

None
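
Glob-style filtering of dotted parameter paths can be sketched with the standard library's `fnmatch`. Whether Octuner uses `fnmatch` internally, and whether exclude patterns take precedence over include patterns, are assumptions; `is_selected` is a hypothetical helper.

```python
from fnmatch import fnmatch
from typing import List, Optional


def is_selected(
    path: str,
    include_patterns: Optional[List[str]] = None,
    exclude_patterns: Optional[List[str]] = None,
) -> bool:
    """Keep a path if it matches an include pattern (or none are given) and no exclude pattern."""
    if include_patterns and not any(fnmatch(path, p) for p in include_patterns):
        return False
    if exclude_patterns and any(fnmatch(path, p) for p in exclude_patterns):
        return False
    return True


paths = [
    "classifier_llm.temperature",
    "classifier_llm.verbose",
    "reasoning_llm.provider_model",
]
kept = [
    p for p in paths
    if is_selected(p, ["*.temperature", "*.provider_model"], ["*.verbose"])
]
```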
discover(component)

Discover all tunable components in the component tree.

This is the main entry point for component discovery. It performs a recursive traversal of the component hierarchy, identifying all objects that implement the TunableMixin protocol and extracting their tunable parameters.

The discovery process:
  1. Starts from the provided root component
  2. Recursively explores all object attributes
  3. Identifies tunable components using is_llm_tunable()
  4. Extracts parameter definitions from each tunable component
  5. Applies "include/exclude" filters (optional)
  6. Returns a structured mapping of paths to parameters

Parameters:

Name Type Description Default
component Any

Root component to search. Can be any Python object, but typically a component hierarchy containing multiple tunable LLMs or other tunable components.

required

Returns:

Name Type Description
Dict[str, Dict[str, Tuple[ParamType, Any, Any]]]

Dictionary mapping dotted paths to tunable parameter definitions. The structure is:

```
{
    "component_path": {
        "param_name": (param_type, min_value, max_value),
        "another_param": (param_type, choices_or_default, ...)
    }
}
```

Where:

  • component_path: Dotted path to the component (e.g., "classifier_llm")
  • param_name: Name of the tunable parameter
  • param_type: Type of parameter ("float", "int", "choice", "bool")
  • min_value, max_value: Range for numeric parameters
  • choices_or_default: Choices for choice parameters or default values

Note

This method is intended to be safe to call on any object. It returns an empty dictionary if no tunable components are found.

Discovery Functions

octuner.discovery.discovery.build_search_space(discovered)

This function transforms the hierarchical discovery results into a flat search space suitable for optimization. It converts the nested structure from ComponentDiscovery into a flat dictionary where each parameter is identified by its full dotted path.

The transformation:

  • Input: {"component": {"param": (type, min, max)}}
  • Output: {"component.param": (type, min, max)}

This flattened structure is what the optimizer uses to define the search space dimensions for parameter optimization.

Parameters:

Name Type Description Default
discovered Dict[str, Dict[str, Tuple[ParamType, Any, Any]]]

Discovery results from ComponentDiscovery.discover() or discover_tunable_components(). Dictionary mapping component paths to their tunable parameters.

required

Returns:

Type Description
Dict[str, Tuple[ParamType, Any, Any]]

Dictionary mapping full parameter paths to parameter definitions. Each key is a dotted path like "classifier_llm.temperature" and each value is a tuple containing the parameter type and constraints.
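
The flattening step amounts to joining each component path and parameter name with a dot. The sketch below is a minimal equivalent written for illustration; `flatten` is a hypothetical name, not the library function itself.

```python
from typing import Any, Dict, Tuple


def flatten(discovered: Dict[str, Dict[str, Tuple[Any, ...]]]) -> Dict[str, Tuple[Any, ...]]:
    """Join component path and parameter name into one dotted key."""
    return {
        f"{component_path}.{param_name}": definition
        for component_path, params in discovered.items()
        for param_name, definition in params.items()
    }


discovered = {
    "classifier_llm": {
        "temperature": ("float", 0.0, 2.0),
        "max_tokens": ("int", 64, 4096),
    },
    "confidence_llm": {
        "temperature": ("float", 0.0, 2.0),
    },
}
search_space = flatten(discovered)
```

The resulting keys, such as "classifier_llm.temperature", use the same dotted-path format that execute_trial() expects for its parameters argument.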


Configuration

ConfigLoader

Loads and validates YAML configuration files for LLM providers and models.

octuner.config.loader.ConfigLoader(config_file)

This class provides utilities to load YAML configuration files as described in config_templates/*.yaml.

It provides access to the available providers, models, parameters, pricing, and capabilities. The tuning algorithms use this information to determine which parameters can be optimized, along with their ranges, types, and default values.

IMPORTANT: The configuration file is loaded immediately when this class is instantiated.

Initialize config loader with a specific file.

Parameters:

Name Type Description Default
config_file str

Path to the YAML configuration file

required
get_providers()

Get list of available providers.

get_provider_config(provider_name)

Get configuration for a specific provider.

get_default_model(provider_name)

Get default model for a provider.

get_available_models(provider_name)

Get available models for a provider.

get_pricing(provider_name, model)

Get pricing for a model (input_cost, output_cost per 1M tokens).
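
Since prices are expressed per 1M tokens, the cost of a single call follows from simple arithmetic. The sketch below shows that calculation; the token counts and rates are made-up example values, and `call_cost` is a hypothetical helper rather than part of ConfigLoader.

```python
def call_cost(
    input_tokens: int,
    output_tokens: int,
    input_cost_per_1m: float,
    output_cost_per_1m: float,
) -> float:
    """Cost of one call when prices are quoted per one million tokens."""
    total = input_tokens * input_cost_per_1m + output_tokens * output_cost_per_1m
    return total / 1_000_000


# Example values only: 1,200 prompt tokens and 300 completion tokens
# at hypothetical rates of 2.50 and 10.00 per 1M tokens.
cost = call_cost(1_200, 300, 2.50, 10.00)
```

Here the input side contributes 1,200 × 2.50 = 3,000 and the output side 300 × 10.00 = 3,000, giving 6,000 / 1,000,000 = 0.006 for the call.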

get_model_capabilities(provider_name, model)

Get capabilities for a specific model.

get_supported_parameters(provider_name, model)

Get list of parameters that can be optimized for a model.

model_supports_parameter(provider_name, model, parameter)

Check if a model supports a specific parameter.

get_parameter_range(provider_name, model, parameter)

Get optimization range for a parameter.

get_parameter_default(provider_name, model, parameter)

Get default value for a parameter.

Parameters:

Name Type Description Default
provider_name str

Name of the provider

required
model str

Name of the model

required
parameter str

Name of the parameter

required

Returns:

Type Description

Default parameter value from configuration

Raises:

Type Description
ValueError

If parameter default is not defined in configuration

get_parameter_type(provider_name, model, parameter)

Get the expected type for a parameter from YAML configuration.

Parameters:

Name Type Description Default
provider_name str

Name of the provider

required
model str

Name of the model

required
parameter str

Name of the parameter

required

Returns:

Type Description
str

Parameter type ('int', 'float', 'str', 'bool', 'choice')

Raises:

Type Description
ValueError

If parameter type is not defined in configuration

get_forced_parameter(provider_name, model, parameter)

Get forced value for a parameter (if any).

validate_config()

Validate the configuration structure.


Utilities

Exporter Functions

Functions for saving and loading optimized parameters.

octuner.utils.exporter.save_parameters_to_yaml(parameters, path, metadata=None)

Save parameters to a YAML file.

Parameters:

Name Type Description Default
parameters Dict[str, Any]

Dictionary of parameter values

required
path str

Path to save the YAML file

required
metadata Optional[Dict[str, Any]]

Optional metadata to include

None

octuner.utils.exporter.load_parameters_from_yaml(path)

Load parameters from a YAML file.

Parameters:

Name Type Description Default
path str

Path to the YAML file

required

Returns:

Type Description
Dict[str, Any]

Dictionary of parameter values

Raises:

Type Description
FileNotFoundError

If the file doesn't exist

YAMLError

If the file is invalid YAML

octuner.utils.exporter.create_metadata_summary(trials, optimization_mode='pareto', dataset_size=0, total_trials=0, best_quality=0.0, best_cost=None, best_latency_ms=None, dataset_fingerprint=None)

Create a metadata summary for the optimization results.

Parameters:

Name Type Description Default
trials list

List of trial results

required
optimization_mode str

Mode used for optimization

'pareto'
dataset_size int

Number of examples in the dataset

0
total_trials int

Total number of trials run

0
best_quality float

Best quality score achieved

0.0
best_cost Optional[float]

Best cost achieved (if available)

None
best_latency_ms Optional[float]

Best latency achieved (if available)

None
dataset_fingerprint Optional[str]

Dataset fingerprint (if available)

None

Returns:

Type Description
Dict[str, Any]

Dictionary of metadata

Parameter Setter

Apply optimized parameters to components.

octuner.utils.setter.set_parameters(component, parameters, strict=False)

Convenience function to set parameters on a component.

Parameters:

Name Type Description Default
component Any

Component to set parameters on

required
parameters Dict[str, Any]

Dictionary mapping dotted paths to values

required
strict bool

If True, raise exceptions instead of logging warnings

False
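
How dotted paths and the strict flag might behave can be sketched with plain attribute access. `apply_parameters` is a hypothetical re-implementation for illustration, and treating a missing attribute as the error condition is an assumption; the real function may validate more or behave differently.

```python
import logging
from types import SimpleNamespace
from typing import Any, Dict

logger = logging.getLogger(__name__)


def apply_parameters(component: Any, parameters: Dict[str, Any], strict: bool = False) -> None:
    """Walk each dotted path and set the final attribute."""
    for path, value in parameters.items():
        *parents, attr = path.split(".")
        target = component
        try:
            for name in parents:
                target = getattr(target, name)
            if not hasattr(target, attr):
                raise AttributeError(f"{path!r} does not exist on the component")
            setattr(target, attr, value)
        except AttributeError as exc:
            if strict:
                raise
            # Non-strict mode: log a warning and continue with the other parameters.
            logger.warning("Skipping parameter: %s", exc)


component = SimpleNamespace(llm=SimpleNamespace(temperature=1.0, max_tokens=256))
# "llm.top_k" does not exist, so it is skipped with a warning in non-strict mode.
apply_parameters(component, {"llm.temperature": 0.7, "llm.top_k": 40})
```

Calling the same function with strict=True would raise the AttributeError for "llm.top_k" instead of logging it.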

Type Definitions

Core Types

Type definitions and data classes used throughout Octuner.

octuner.tunable.types.ParamType = Literal['float', 'int', 'choice', 'bool'] module-attribute

octuner.tunable.types.Dataset = List[DatasetItem] module-attribute

octuner.tunable.types.MetricResult(quality, cost=None, latency_ms=None) dataclass

Result of a single metric evaluation.

octuner.tunable.types.TrialResult(trial_number, parameters, metrics, success=True, error=None) dataclass

Result of a single optimization trial.

octuner.tunable.types.SearchResult(best_trial, all_trials, optimization_mode, dataset_size, total_trials, best_parameters, metrics_summary) dataclass

Result of an optimization search.

save_best(path)

Save best parameters to YAML file.

octuner.tunable.types.Constraints = Dict[str, Union[float, int]] module-attribute

octuner.tunable.types.ScalarizationWeights(cost_weight=1.0, latency_weight=1.0) dataclass

Weights for scalarized optimization mode.