API Reference¶
Core¶
Collector¶
- class aiobs.collector.Collector[source]¶
Bases:
object
Simple, global-style collector with pluggable provider instrumentation.
- API:
observe(): enable instrumentation and start a session
end(): finish current session
flush(): write captured data to JSON (default: ./<session-id>.json)
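A minimal lifecycle sketch (hedged: this reference does not show how a collector instance is exposed, so the import and direct instantiation below are assumptions; adapt them to how your installation exports the Collector):
from aiobs.collector import Collector  # assumed import path, taken from the class path above

collector = Collector()                              # assumes direct instantiation is supported
session_id = collector.observe(session_name="demo")  # enable instrumentation and start a session
# ... instrumented provider calls (OpenAI, Gemini, ...) happen here ...
collector.end()                                      # finish the current session
out_path = collector.flush()                         # writes ./<session-id>.json by default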
- add_label(key: str, value: str) None[source]¶
Add a single label to the current session.
- Parameters:
key – Label key (lowercase alphanumeric with underscores).
value – Label value (UTF-8 string, max 256 chars).
- Raises:
RuntimeError – If no active session.
ValueError – If key or value is invalid.
- flush(path: str | None = None, include_trace_tree: bool = True, exporter: 'BaseExporter' | None = None, **exporter_kwargs: Any) str | 'ExportResult'[source]¶
Flush all sessions and events to a file or custom exporter.
- Parameters:
path – Output file path. Defaults to LLM_OBS_OUT env var or ‘<session-id>.json’. Ignored if exporter is provided.
include_trace_tree – Whether to include the nested trace_tree structure. Defaults to True.
exporter – Optional exporter instance (e.g., GCSExporter, CustomExporter). If provided, data is exported using this exporter instead of writing to a local file.
**exporter_kwargs – Additional keyword arguments passed to the exporter’s export() method.
- Returns:
If exporter is provided: the ExportResult returned by the exporter. Otherwise: the output file path used.
- Return type:
str | ExportResult
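For example (a sketch; collector is the Collector instance from the lifecycle sketch above, and the exporter import path and constructor arguments are assumptions, not part of this reference):
# Write to an explicit local path, omitting the nested trace tree
out_path = collector.flush(path="run_output.json", include_trace_tree=False)

# Or route the export through an exporter instead of a local file
# (GCSExporter is named in the docstring above; its import path and arguments here are illustrative)
# from aiobs.exporters.gcs import GCSExporter
# result = collector.flush(exporter=GCSExporter(bucket="my-bucket"))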
- get_current_span_id() str | None[source]¶
Get the current span ID from context (for parent-child linking).
- get_labels() Dict[str, str][source]¶
Get all labels for the current session.
- Returns:
Dictionary of current labels (empty dict if none).
- Raises:
RuntimeError – If no active session.
- observe(session_name: str | None = None, api_key: str | None = None, labels: Dict[str, str] | None = None) str[source]¶
Enable instrumentation (once) and start a new session.
- Parameters:
session_name – Optional name for the session.
api_key – API key (aiobs_sk_…) for usage tracking with shepherd-server. Can also be set via AIOBS_API_KEY environment variable.
labels – Optional dictionary of key-value labels for filtering and categorization. Keys must be lowercase alphanumeric with underscores (matching ^[a-z][a-z0-9_]{0,62}$). Values are UTF-8 strings (max 256 chars). Labels from AIOBS_LABEL_* environment variables are automatically merged.
- Returns:
A session id.
- Raises:
ValueError – If no API key is provided, the API key is invalid, or labels contain invalid keys/values.
RuntimeError – If unable to connect to shepherd server.
- remove_label(key: str) None[source]¶
Remove a label from the current session.
- Parameters:
key – Label key to remove.
- Raises:
RuntimeError – If no active session.
ValueError – If trying to remove a system label.
- set_current_span_id(span_id: str | None) Token[source]¶
Set the current span ID in context. Returns a token to restore previous value.
- set_labels(labels: Dict[str, str], merge: bool = True) None[source]¶
Set or update labels for the current session.
- Parameters:
labels – Dictionary of labels to set.
merge – If True, merge with existing labels. If False, replace all user labels (system labels are preserved).
- Raises:
RuntimeError – If no active session.
ValueError – If labels contain invalid keys or values.
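A short sketch of label management on an active session, using the methods documented above (collector is the Collector instance from the earlier lifecycle sketch):
collector.observe(session_name="labels-demo", labels={"env": "staging"})
collector.add_label("team", "search")               # add a single label
collector.set_labels({"env": "prod"}, merge=True)   # merge into existing labels
print(collector.get_labels())                       # {"env": "prod", "team": "search", ...}
collector.remove_label("team")                      # system labels cannot be removed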
Observe Decorator¶
@observe decorator for tracing function execution.
- aiobs.observe.observe(func: F) F[source]¶
- aiobs.observe.observe(*, name: str | None = None, capture_args: bool = True, capture_result: bool = True, enh_prompt: bool = False, auto_enhance_after: int | None = None) Callable[[F], F]
Decorator to trace function execution.
- Can be used with or without arguments:
@observe
def my_func(): ...
@observe(name="custom_name")
def my_func(): ...
@observe(enh_prompt=True, auto_enhance_after=10)
def my_func(): ...
- Parameters:
func – The function to wrap (when used without parentheses)
name – Optional custom name for the traced function
capture_args – Whether to capture function arguments (default: True)
capture_result – Whether to capture the return value (default: True)
enh_prompt – Whether to include this trace in enh_prompt_traces for enhanced prompt analysis (default: False)
auto_enhance_after – Number of traces after which to run auto prompt enhancer (only relevant when enh_prompt=True)
- Returns:
The wrapped function that records execution traces
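A hedged usage sketch (the decorator is documented as aiobs.observe.observe above; whether it is also re-exported from the top-level package is not shown here):
from aiobs.observe import observe

@observe(name="summarize", capture_args=True, capture_result=True)
def summarize(text: str) -> str:
    # provider calls made inside a decorated function are recorded as part of this trace
    return text[:100]

summarize("a long document ...")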
Models¶
- class aiobs.models.observability.Callsite(*, file: str | None = None, line: int | None = None, function: str | None = None)[source]¶
Bases:
BaseModel
- file: str | None¶
- function: str | None¶
- line: int | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class aiobs.models.observability.Event(*, provider: str, api: str, request: Any, response: Any | None = None, error: str | None = None, started_at: float, ended_at: float, duration_ms: float, callsite: Callsite | None = None, span_id: str | None = None, parent_span_id: str | None = None)[source]¶
Bases:
BaseModel
- api: str¶
- duration_ms: float¶
- ended_at: float¶
- error: str | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- parent_span_id: str | None¶
- provider: str¶
- request: Any¶
- response: Any | None¶
- span_id: str | None¶
- started_at: float¶
- class aiobs.models.observability.FunctionEvent(*, provider: str = 'function', api: str, name: str, module: str | None = None, args: List[Any] | None = None, kwargs: dict | None = None, result: Any | None = None, error: str | None = None, started_at: float, ended_at: float, duration_ms: float, callsite: Callsite | None = None, span_id: str | None = None, parent_span_id: str | None = None, enh_prompt: bool = False, enh_prompt_id: str | None = None, auto_enhance_after: int | None = None)[source]¶
Bases:
BaseModel
Event model for tracing decorated functions.
- api: str¶
- args: List[Any] | None¶
- auto_enhance_after: int | None¶
- duration_ms: float¶
- ended_at: float¶
- enh_prompt: bool¶
- enh_prompt_id: str | None¶
- error: str | None¶
- kwargs: dict | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- module: str | None¶
- name: str¶
- parent_span_id: str | None¶
- provider: str¶
- result: Any | None¶
- span_id: str | None¶
- started_at: float¶
- class aiobs.models.observability.ObservabilityExport(*, sessions: ~typing.List[~aiobs.models.observability.Session], events: ~typing.List[~aiobs.models.observability.ObservedEvent], function_events: ~typing.List[~aiobs.models.observability.ObservedFunctionEvent] = <factory>, trace_tree: ~typing.List[~typing.Any] | None = None, enh_prompt_traces: ~typing.List[str] | None = None, generated_at: float, version: int = 1)[source]¶
Bases:
BaseModel
- enh_prompt_traces: List[str] | None¶
- events: List[ObservedEvent]¶
- function_events: List[ObservedFunctionEvent]¶
- generated_at: float¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- trace_tree: List[Any] | None¶
- version: int¶
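Because ObservabilityExport is a Pydantic model, a file produced by Collector.flush() can be loaded back for offline analysis (a sketch; assumes Pydantic v2's model_validate_json, consistent with the ConfigDict usage above):
from pathlib import Path
from aiobs.models.observability import ObservabilityExport

export = ObservabilityExport.model_validate_json(Path("session.json").read_text())
print(export.version, len(export.sessions), len(export.events), len(export.function_events))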
- class aiobs.models.observability.ObservedEvent(*, provider: str, api: str, request: Any, response: Any | None = None, error: str | None = None, started_at: float, ended_at: float, duration_ms: float, callsite: Callsite | None = None, span_id: str | None = None, parent_span_id: str | None = None, session_id: str)[source]¶
Bases:
Event
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- session_id: str¶
- class aiobs.models.observability.ObservedFunctionEvent(*, provider: str = 'function', api: str, name: str, module: str | None = None, args: List[Any] | None = None, kwargs: dict | None = None, result: Any | None = None, error: str | None = None, started_at: float, ended_at: float, duration_ms: float, callsite: Callsite | None = None, span_id: str | None = None, parent_span_id: str | None = None, enh_prompt: bool = False, enh_prompt_id: str | None = None, auto_enhance_after: int | None = None, session_id: str)[source]¶
Bases:
FunctionEvent
Function event with session_id for export.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- session_id: str¶
- class aiobs.models.observability.Session(*, id: str, name: str, started_at: float, ended_at: float | None = None, meta: SessionMeta, labels: Dict[str, str] | None = None)[source]¶
Bases:
BaseModel
- ended_at: float | None¶
- id: str¶
- labels: Dict[str, str] | None¶
- meta: SessionMeta¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- name: str¶
- started_at: float¶
Providers¶
Base Provider¶
- class aiobs.providers.base.BaseProvider[source]¶
Bases:
ABC
Abstract base class for provider instrumentation.
Subclasses install monkeypatches or hooks to capture request/response details and call collector._record_event(…) with normalized payloads.
- abstractmethod install(collector: Any) Callable[[], None] | None[source]¶
Apply instrumentation and return an optional unpatch function.
- classmethod is_available() bool[source]¶
Return True if the provider can be instrumented (deps present).
- name: str = 'provider'¶
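A skeletal custom provider, following the contract above (illustrative: the payload expected by collector._record_event is not specified in this reference):
from typing import Any, Callable, Optional
from aiobs.providers.base import BaseProvider

class MyProvider(BaseProvider):
    name = "my-provider"

    @classmethod
    def is_available(cls) -> bool:
        # in a real provider, check that the target SDK can be imported
        return True

    def install(self, collector: Any) -> Optional[Callable[[], None]]:
        # monkeypatch the target SDK here and forward normalized request/response
        # payloads to collector._record_event(...); return a callable that undoes the patch
        def unpatch() -> None:
            pass
        return unpatch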
OpenAI Provider¶
- class aiobs.providers.openai.provider.OpenAIProvider[source]¶
Bases:
BaseProvider
- install(collector: Any) Callable[[], None] | None[source]¶
Apply instrumentation and return an optional unpatch function.
- classmethod is_available() bool[source]¶
Return True if the provider can be instrumented (deps present).
- name: str = 'openai'¶
OpenAI API Modules¶
- class aiobs.providers.openai.apis.base_api.BaseOpenAIAPIModule[source]¶
Bases:
ABC
Abstract interface for an OpenAI API module.
- abstractmethod install(collector: Any) Callable[[], None] | None[source]¶
Install instrumentation and return optional unpatch function.
- name: str = 'openai-api'¶
OpenAI API Models¶
- class aiobs.providers.openai.apis.models.base.BaseOpenAIRequest(*, model: str | None = None)[source]¶
Bases:
BaseModel
Base class for OpenAI request capture models.
- model: str | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- redacted() BaseOpenAIRequest[source]¶
Return a copy safe for logging (override in subclasses).
- class aiobs.providers.openai.apis.models.base.BaseOpenAIResponse(*, id: str | None = None, model: str | None = None, usage: Dict[str, Any] | None = None)[source]¶
Bases:
BaseModel
Base class for OpenAI response capture models.
- id: str | None¶
- model: str | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- redacted() BaseOpenAIResponse[source]¶
Return a copy safe for logging (override in subclasses).
- usage: Dict[str, Any] | None¶
- class aiobs.providers.openai.apis.models.chat_completions.ChatCompletionsRequest(*, model: str | None = None, messages: ~typing.List[~aiobs.providers.openai.apis.models.chat_completions.Message] | None = None, temperature: float | None = None, max_tokens: int | None = None, other: ~typing.Dict[str, ~typing.Any] = <factory>)[source]¶
Bases:
BaseOpenAIRequest
- max_tokens: int | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- other: Dict[str, Any]¶
- temperature: float | None¶
- class aiobs.providers.openai.apis.models.chat_completions.ChatCompletionsResponse(*, id: str | None = None, model: str | None = None, usage: Dict[str, Any] | None = None, text: str | None = None)[source]¶
Bases:
BaseOpenAIResponse
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- text: str | None¶
- class aiobs.providers.openai.apis.models.chat_completions.Message(*, role: str, content: Any)[source]¶
Bases:
BaseModel
- content: Any¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- role: str¶
- class aiobs.providers.openai.apis.models.embeddings.EmbeddingData(*, index: int, embedding: ~typing.List[float] = <factory>, object: str = 'embedding')[source]¶
Bases:
BaseModel
Single embedding object in the response.
- embedding: List[float]¶
- index: int¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- object: str¶
- class aiobs.providers.openai.apis.models.embeddings.EmbeddingsRequest(*, model: str | None = None, input: str | ~typing.List[str] | ~typing.List[int] | ~typing.List[~typing.List[int]] | None = None, encoding_format: str | None = None, dimensions: int | None = None, user: str | None = None, other: ~typing.Dict[str, ~typing.Any] = <factory>)[source]¶
Bases:
BaseOpenAIRequest
Request model for OpenAI embeddings.create API.
- dimensions: int | None¶
- encoding_format: str | None¶
- input: str | List[str] | List[int] | List[List[int]] | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- other: Dict[str, Any]¶
- user: str | None¶
- class aiobs.providers.openai.apis.models.embeddings.EmbeddingsResponse(*, id: str | None = None, model: str | None = None, usage: Dict[str, Any] | None = None, object: str | None = None, data: List[EmbeddingData] | None = None, embedding_dimensions: int | None = None)[source]¶
Bases:
BaseOpenAIResponse
Response model for OpenAI embeddings.create API.
- data: List[EmbeddingData] | None¶
- embedding_dimensions: int | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- object: str | None¶
Gemini Provider¶
- class aiobs.providers.gemini.provider.GeminiProvider[source]¶
Bases:
BaseProvider
- install(collector: Any) Callable[[], None] | None[source]¶
Apply instrumentation and return an optional unpatch function.
- classmethod is_available() bool[source]¶
Return True if the provider can be instrumented (deps present).
- name: str = 'gemini'¶
Gemini API Modules¶
- class aiobs.providers.gemini.apis.base_api.BaseGeminiAPIModule[source]¶
Bases:
ABC
Abstract interface for a Gemini API module.
- abstractmethod install(collector: Any) Callable[[], None] | None[source]¶
Install instrumentation and return optional unpatch function.
- name: str = 'gemini-api'¶
- class aiobs.providers.gemini.apis.generate_content.GenerateContentAPI[source]¶
Bases:
BaseGeminiAPIModule
- install(collector: Any) Callable[[], None] | None[source]¶
Install instrumentation and return optional unpatch function.
- name: str = 'models.generate_content'¶
Gemini API Models¶
- class aiobs.providers.gemini.apis.models.base.BaseGeminiRequest(*, model: str | None = None)[source]¶
Bases:
BaseModel
Base class for Gemini request capture models.
- model: str | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- redacted() BaseGeminiRequest[source]¶
Return a copy safe for logging (override in subclasses).
- class aiobs.providers.gemini.apis.models.base.BaseGeminiResponse(*, model: str | None = None, usage: Dict[str, Any] | None = None)[source]¶
Bases:
BaseModel
Base class for Gemini response capture models.
- model: str | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- redacted() BaseGeminiResponse[source]¶
Return a copy safe for logging (override in subclasses).
- usage: Dict[str, Any] | None¶
- class aiobs.providers.gemini.apis.models.generate_content.Content(*, role: str | None = None, parts: List[ContentPart] | None = None)[source]¶
Bases:
BaseModel
Content structure for Gemini messages.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- parts: List[ContentPart] | None¶
- role: str | None¶
- class aiobs.providers.gemini.apis.models.generate_content.ContentPart(*, text: str | None = None, inline_data: Dict[str, Any] | None = None)[source]¶
Bases:
BaseModel
A part of content (text, image, etc.).
- inline_data: Dict[str, Any] | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- text: str | None¶
- class aiobs.providers.gemini.apis.models.generate_content.GenerateContentRequest(*, model: str | None = None, contents: str | ~typing.List[~aiobs.providers.gemini.apis.models.generate_content.Content] | ~typing.Any | None = None, system_instruction: ~typing.Any | None = None, config: ~typing.Dict[str, ~typing.Any] | None = None, other: ~typing.Dict[str, ~typing.Any] = <factory>)[source]¶
Bases:
BaseGeminiRequest
Request model for generate_content API.
- config: Dict[str, Any] | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- other: Dict[str, Any]¶
- system_instruction: Any | None¶
- class aiobs.providers.gemini.apis.models.generate_content.GenerateContentResponse(*, model: str | None = None, usage: Dict[str, Any] | None = None, text: str | None = None, candidates: List[Dict[str, Any]] | None = None)[source]¶
Bases:
BaseGeminiResponse
Response model for generate_content API.
- candidates: List[Dict[str, Any]] | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- text: str | None¶
- class aiobs.providers.gemini.apis.models.generate_videos.GenerateVideosRequest(*, model: str | None = None, prompt: str | None = None, image: ~typing.Dict[str, ~typing.Any] | None = None, video: ~typing.Dict[str, ~typing.Any] | None = None, config: ~typing.Dict[str, ~typing.Any] | None = None, other: ~typing.Dict[str, ~typing.Any] = <factory>)[source]¶
Bases:
BaseGeminiRequest
Request model for generate_videos API.
- config: Dict[str, Any] | None¶
- image: Dict[str, Any] | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- other: Dict[str, Any]¶
- prompt: str | None¶
- video: Dict[str, Any] | None¶
- class aiobs.providers.gemini.apis.models.generate_videos.GenerateVideosResponse(*, model: str | None = None, usage: Dict[str, Any] | None = None, operation_name: str | None = None, done: bool | None = None, generated_videos: List[Dict[str, Any]] | None = None)[source]¶
Bases:
BaseGeminiResponse
Response model for generate_videos API.
- done: bool | None¶
- generated_videos: List[Dict[str, Any]] | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- operation_name: str | None¶
- class aiobs.providers.gemini.apis.models.generate_videos.GeneratedVideo(*, video: Dict[str, Any] | None = None)[source]¶
Bases:
BaseModel
A generated video result.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- video: Dict[str, Any] | None¶
- class aiobs.providers.gemini.apis.models.generate_videos.VideoGenerationConfig(*, aspect_ratio: str | None = None, number_of_videos: int | None = None, resolution: str | None = None, duration_seconds: int | None = None, negative_prompt: str | None = None, generate_audio: bool | None = None, enhance_prompt: bool | None = None, person_generation: str | None = None, seed: int | None = None, output_gcs_uri: str | None = None)[source]¶
Bases:
BaseModel
Configuration for video generation.
- aspect_ratio: str | None¶
- duration_seconds: int | None¶
- enhance_prompt: bool | None¶
- generate_audio: bool | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- negative_prompt: str | None¶
- number_of_videos: int | None¶
- output_gcs_uri: str | None¶
- person_generation: str | None¶
- resolution: str | None¶
- seed: int | None¶
Classifiers¶
Base Classifier¶
Base classifier interface for aiobs.
- class aiobs.classifier.base.BaseClassifier(config: ClassificationConfig | None = None)[source]¶
Bases:
ABC
Abstract base class for response classifiers.
Classifiers evaluate model outputs against user inputs and system prompts to determine if the response quality is good, bad, or uncertain.
- Subclasses must implement:
classify(): Synchronous classification
classify_async(): Asynchronous classification
classify_batch(): Batch classification for multiple inputs
classify_batch_async(): Asynchronous batch classification for multiple inputs
- Example usage:
from aiobs.classifier import OpenAIClassifier
classifier = OpenAIClassifier(api_key="…")
result = classifier.classify(
    system_prompt="You are a helpful assistant.",
    user_input="What is 2+2?",
    model_output="2+2 equals 4.",
)
print(result.verdict)  # ClassificationVerdict.GOOD
- abstractmethod classify(user_input: str, model_output: str, system_prompt: str | None = None, **kwargs: Any) ClassificationResult[source]¶
Classify a model response synchronously.
- Parameters:
user_input – The user’s input/query to the model.
model_output – The model’s generated response.
system_prompt – Optional system prompt provided to the model.
**kwargs – Additional arguments for the classifier.
- Returns:
ClassificationResult with verdict, confidence, and reasoning.
- abstractmethod async classify_async(user_input: str, model_output: str, system_prompt: str | None = None, **kwargs: Any) ClassificationResult[source]¶
Classify a model response asynchronously.
- Parameters:
user_input – The user’s input/query to the model.
model_output – The model’s generated response.
system_prompt – Optional system prompt provided to the model.
**kwargs – Additional arguments for the classifier.
- Returns:
ClassificationResult with verdict, confidence, and reasoning.
- abstractmethod classify_batch(inputs: List[ClassificationInput], **kwargs: Any) List[ClassificationResult][source]¶
Classify multiple model responses in batch.
- Parameters:
inputs – List of ClassificationInput objects to classify.
**kwargs – Additional arguments for the classifier.
- Returns:
List of ClassificationResult objects, one per input.
- abstractmethod async classify_batch_async(inputs: List[ClassificationInput], **kwargs: Any) List[ClassificationResult][source]¶
Classify multiple model responses asynchronously in batch.
- Parameters:
inputs – List of ClassificationInput objects to classify.
**kwargs – Additional arguments for the classifier.
- Returns:
List of ClassificationResult objects, one per input.
- classmethod is_available() bool[source]¶
Check if this classifier can be used (dependencies present).
- Returns:
True if all required dependencies are available.
- name: str = 'base'¶
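A skeletal subclass sketch (illustrative only; import paths follow the module names above, and the BAD member of ClassificationVerdict is assumed from the good/bad/uncertain verdicts described earlier):
from typing import Any, List
from aiobs.classifier.base import BaseClassifier
from aiobs.classifier.models.classification import ClassificationInput, ClassificationResult
from aiobs.classifier.models.classification import ClassificationVerdict  # assumed location

class LengthClassifier(BaseClassifier):
    name = "length"

    def classify(self, user_input: str, model_output: str, system_prompt: str | None = None, **kwargs: Any) -> ClassificationResult:
        ok = len(model_output.strip()) > 0
        return ClassificationResult(
            verdict=ClassificationVerdict.GOOD if ok else ClassificationVerdict.BAD,
            confidence=0.5,
            reasoning="toy length-based check",
        )

    async def classify_async(self, user_input, model_output, system_prompt=None, **kwargs):
        return self.classify(user_input, model_output, system_prompt, **kwargs)

    def classify_batch(self, inputs: List[ClassificationInput], **kwargs: Any) -> List[ClassificationResult]:
        return [self.classify(i.user_input, i.model_output, i.system_prompt) for i in inputs]

    async def classify_batch_async(self, inputs, **kwargs):
        return self.classify_batch(inputs, **kwargs)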
Classification Models¶
Pydantic models for classification inputs and outputs.
- class aiobs.classifier.models.classification.ClassificationConfig(*, model: str = 'gpt-4o-mini', temperature: Annotated[float, Ge(ge=0.0), Le(le=2.0)] = 0.0, max_tokens: int = 1024, classification_prompt: str | None = None, confidence_threshold: Annotated[float, Ge(ge=0.0), Le(le=1.0)] = 0.7)[source]¶
Bases:
BaseModel
Configuration for classifier behavior.
- classification_prompt: str | None¶
- confidence_threshold: float¶
- max_tokens: int¶
- model: str¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- temperature: float¶
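For example, a stricter configuration passed to a classifier (a sketch; OpenAIClassifier is documented below):
from aiobs.classifier import OpenAIClassifier
from aiobs.classifier.models.classification import ClassificationConfig

config = ClassificationConfig(model="gpt-4o-mini", temperature=0.0, confidence_threshold=0.9)
classifier = OpenAIClassifier(api_key="sk-...", config=config)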
- class aiobs.classifier.models.classification.ClassificationInput(*, system_prompt: str | None = None, user_input: str, model_output: str, context: Dict[str, Any] | None = None)[source]¶
Bases:
BaseModel
Input model for classification.
Contains the system prompt, user input, and model output that will be evaluated by the classifier.
- context: Dict[str, Any] | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- model_output: str¶
- system_prompt: str | None¶
- user_input: str¶
- class aiobs.classifier.models.classification.ClassificationResult(*, verdict: ClassificationVerdict, confidence: Annotated[float, Ge(ge=0.0), Le(le=1.0)], reasoning: str | None = None, categories: List[str] | None = None, raw_response: Any | None = None, metadata: Dict[str, Any] | None = None)[source]¶
Bases:
BaseModel
Result model for classification.
Contains the verdict (good/bad/uncertain), confidence score, reasoning, and any additional metadata.
- categories: List[str] | None¶
- confidence: float¶
- metadata: Dict[str, Any] | None¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- raw_response: Any | None¶
- reasoning: str | None¶
- verdict: ClassificationVerdict¶
OpenAI Classifier¶
OpenAI-based classifier implementation.
- class aiobs.classifier.openai.classifier.OpenAIClassifier(api_key: str | None = None, config: ClassificationConfig | None = None, client: Any | None = None, async_client: Any | None = None)[source]¶
Bases:
BaseClassifier
Classifier using OpenAI’s models to evaluate response quality.
Uses OpenAI’s chat completion API to analyze model outputs and determine if they are good, bad, or uncertain.
Example
from aiobs.classifier import OpenAIClassifier
classifier = OpenAIClassifier(api_key="sk-…")
result = classifier.classify(
    user_input="What is the capital of France?",
    model_output="The capital of France is Paris.",
    system_prompt="You are a helpful geography assistant.",
)
if result.verdict == ClassificationVerdict.GOOD:
    print("Response is good!")
else:
    print(f"Issues: {result.categories}")
- classify(user_input: str, model_output: str, system_prompt: str | None = None, **kwargs: Any) ClassificationResult[source]¶
Classify a model response synchronously using OpenAI.
- Parameters:
user_input – The user’s input/query to the model.
model_output – The model’s generated response.
system_prompt – Optional system prompt provided to the model.
**kwargs – Additional arguments (passed to context).
- Returns:
ClassificationResult with verdict, confidence, and reasoning.
- async classify_async(user_input: str, model_output: str, system_prompt: str | None = None, **kwargs: Any) ClassificationResult[source]¶
Classify a model response asynchronously using OpenAI.
- Parameters:
user_input – The user’s input/query to the model.
model_output – The model’s generated response.
system_prompt – Optional system prompt provided to the model.
**kwargs – Additional arguments (passed to context).
- Returns:
ClassificationResult with verdict, confidence, and reasoning.
- classify_batch(inputs: List[ClassificationInput], **kwargs: Any) List[ClassificationResult][source]¶
Classify multiple model responses in batch (sequential).
Note: This runs classifications sequentially. For true parallel execution, use classify_batch_async.
- Parameters:
inputs – List of ClassificationInput objects to classify.
**kwargs – Additional arguments for the classifier.
- Returns:
List of ClassificationResult objects, one per input.
- async classify_batch_async(inputs: List[ClassificationInput], **kwargs: Any) List[ClassificationResult][source]¶
Classify multiple model responses asynchronously in parallel.
Uses asyncio.gather for concurrent classification requests.
- Parameters:
inputs – List of ClassificationInput objects to classify.
**kwargs – Additional arguments for the classifier.
- Returns:
List of ClassificationResult objects, one per input.
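A sketch of concurrent batch classification (input fields follow ClassificationInput as documented above):
import asyncio
from aiobs.classifier import OpenAIClassifier
from aiobs.classifier.models.classification import ClassificationInput

classifier = OpenAIClassifier(api_key="sk-...")
inputs = [
    ClassificationInput(user_input="What is 2+2?", model_output="2+2 equals 4."),
    ClassificationInput(user_input="Capital of France?", model_output="Paris."),
]
results = asyncio.run(classifier.classify_batch_async(inputs))
for result in results:
    print(result.verdict, result.confidence)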
- name: str = 'openai'¶
LLM Abstraction¶
The LLM module provides a unified interface for interacting with different
LLM providers. It is used internally by LLM-based evaluators like
HallucinationDetectionEval.
LLM Factory¶
LLM factory for auto-detecting and creating LLM adapters.
- class aiobs.llm.factory.LLM[source]¶
Bases:
object
Factory class for creating LLM adapters.
Provides a unified interface for different LLM providers through automatic client detection or explicit provider specification.
Example
from openai import OpenAI
from aiobs.llm import LLM

# Auto-detect from client
client = OpenAI()
llm = LLM.from_client(client, model="gpt-4o")

response = llm.complete("What is 2+2?")
print(response.content)  # "4"

# Async usage
response = await llm.complete_async("What is 2+2?")
- static anthropic(client: Any, model: str = 'claude-3-sonnet-20240229', temperature: float = 0.0, max_tokens: int | None = 1024) AnthropicLLM[source]¶
Create an Anthropic LLM adapter explicitly.
- Parameters:
client – Anthropic client instance.
model – Model name (default: “claude-3-sonnet-20240229”).
temperature – Sampling temperature.
max_tokens – Maximum tokens to generate (default: 1024).
- Returns:
AnthropicLLM adapter instance.
- static from_client(client: Any, model: str, temperature: float = 0.0, max_tokens: int | None = None) BaseLLM[source]¶
Create an LLM adapter by auto-detecting the client type.
- Parameters:
client – The LLM provider’s client instance.
model – Model name/identifier.
temperature – Sampling temperature (0.0 = deterministic).
max_tokens – Maximum tokens to generate.
- Returns:
Appropriate LLM adapter instance.
- Raises:
ValueError – If client type is not recognized.
Example
from openai import OpenAI
llm = LLM.from_client(OpenAI(), model="gpt-4o")

from google import genai
llm = LLM.from_client(genai.Client(), model="gemini-2.0-flash")

from anthropic import Anthropic
llm = LLM.from_client(Anthropic(), model="claude-3-sonnet-20240229")
- static gemini(client: Any, model: str = 'gemini-2.0-flash', temperature: float = 0.0, max_tokens: int | None = None) GeminiLLM[source]¶
Create a Gemini LLM adapter explicitly.
- Parameters:
client – Google GenAI client instance.
model – Model name (default: “gemini-2.0-flash”).
temperature – Sampling temperature.
max_tokens – Maximum tokens to generate.
- Returns:
GeminiLLM adapter instance.
- static openai(client: Any, model: str = 'gpt-4o-mini', temperature: float = 0.0, max_tokens: int | None = None) OpenAILLM[source]¶
Create an OpenAI LLM adapter explicitly.
- Parameters:
client – OpenAI client instance.
model – Model name (default: “gpt-4o-mini”).
temperature – Sampling temperature.
max_tokens – Maximum tokens to generate.
- Returns:
OpenAILLM adapter instance.
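Explicit construction when auto-detection is not desired (a sketch mirroring the static-method signatures above):
from openai import OpenAI
from anthropic import Anthropic
from aiobs.llm import LLM

openai_llm = LLM.openai(OpenAI(), model="gpt-4o-mini")
claude_llm = LLM.anthropic(Anthropic(), model="claude-3-sonnet-20240229", max_tokens=1024)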
Base LLM¶
Base LLM interface for aiobs.
- class aiobs.llm.base.BaseLLM(client: Any, model: str, temperature: float = 0.0, max_tokens: int | None = None)[source]¶
Bases:
ABC
Abstract base class for LLM adapters.
Provides a unified interface for different LLM providers.
- abstractmethod complete(prompt: str, system_prompt: str | None = None, **kwargs: Any) LLMResponse[source]¶
Generate a completion synchronously.
- Parameters:
prompt – The user prompt.
system_prompt – Optional system prompt.
**kwargs – Additional provider-specific arguments.
- Returns:
LLMResponse with generated content.
- abstractmethod async complete_async(prompt: str, system_prompt: str | None = None, **kwargs: Any) LLMResponse[source]¶
Generate a completion asynchronously.
- Parameters:
prompt – The user prompt.
system_prompt – Optional system prompt.
**kwargs – Additional provider-specific arguments.
- Returns:
LLMResponse with generated content.
- complete_messages(messages: List[LLMMessage], **kwargs: Any) LLMResponse[source]¶
Generate a completion from a list of messages.
- Parameters:
messages – List of conversation messages.
**kwargs – Additional provider-specific arguments.
- Returns:
LLMResponse with generated content.
- async complete_messages_async(messages: List[LLMMessage], **kwargs: Any) LLMResponse[source]¶
Generate a completion from messages asynchronously.
- Parameters:
messages – List of conversation messages.
**kwargs – Additional provider-specific arguments.
- Returns:
LLMResponse with generated content.
- provider: str = 'base'¶
- class aiobs.llm.base.LLMMessage(*, role: str, content: str)[source]¶
Bases:
BaseModel
A message in a conversation.
- content: str¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- role: str¶
- class aiobs.llm.base.LLMResponse(*, content: str, model: str, usage: Dict[str, int] | None = None, raw_response: Any | None = None)[source]¶
Bases:
BaseModel
Response from an LLM completion.
- content: str¶
- model: str¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- raw_response: Any | None¶
- usage: Dict[str, int] | None¶
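A sketch of a multi-turn completion using LLMMessage and reading the LLMResponse fields documented above:
from openai import OpenAI
from aiobs.llm import LLM
from aiobs.llm.base import LLMMessage

llm = LLM.from_client(OpenAI(), model="gpt-4o-mini")
messages = [
    LLMMessage(role="system", content="You are terse."),
    LLMMessage(role="user", content="Name one prime number."),
]
response = llm.complete_messages(messages)
print(response.content, response.usage)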
OpenAI LLM¶
OpenAI LLM adapter.
- class aiobs.llm.openai.OpenAILLM(client: Any, model: str, temperature: float = 0.0, max_tokens: int | None = None)[source]¶
Bases:
BaseLLM
LLM adapter for OpenAI and OpenAI-compatible APIs.
Works with:
- OpenAI
- Azure OpenAI
- Groq
- Together AI
- Any OpenAI-compatible API
Example
from openai import OpenAI
from aiobs.llm import LLM

client = OpenAI()
llm = LLM.from_client(client, model="gpt-4o")
response = llm.complete("Hello!")
- complete(prompt: str, system_prompt: str | None = None, **kwargs: Any) LLMResponse[source]¶
Generate a completion synchronously.
- Parameters:
prompt – The user prompt.
system_prompt – Optional system prompt.
**kwargs – Additional arguments passed to the API.
- Returns:
LLMResponse with generated content.
- async complete_async(prompt: str, system_prompt: str | None = None, **kwargs: Any) LLMResponse[source]¶
Generate a completion asynchronously.
- Parameters:
prompt – The user prompt.
system_prompt – Optional system prompt.
**kwargs – Additional arguments passed to the API.
- Returns:
LLMResponse with generated content.
- complete_messages(messages: List[LLMMessage], **kwargs: Any) LLMResponse[source]¶
Generate a completion from a list of messages.
- Parameters:
messages – List of conversation messages.
**kwargs – Additional arguments passed to the API.
- Returns:
LLMResponse with generated content.
- classmethod is_compatible(client: Any) bool[source]¶
Check if client is OpenAI-compatible.
- Parameters:
client – Client instance to check.
- Returns:
True if client has OpenAI-compatible interface.
- provider: str = 'openai'¶
Gemini LLM¶
Google Gemini LLM adapter.
- class aiobs.llm.gemini.GeminiLLM(client: Any, model: str, temperature: float = 0.0, max_tokens: int | None = None)[source]¶
Bases:
BaseLLM
LLM adapter for Google Gemini API.
Example
from google import genai
from aiobs.llm import LLM

client = genai.Client()
llm = LLM.from_client(client, model="gemini-2.0-flash")
response = llm.complete("Hello!")
- complete(prompt: str, system_prompt: str | None = None, **kwargs: Any) LLMResponse[source]¶
Generate a completion synchronously.
- Parameters:
prompt – The user prompt.
system_prompt – Optional system prompt.
**kwargs – Additional arguments passed to the API.
- Returns:
LLMResponse with generated content.
- async complete_async(prompt: str, system_prompt: str | None = None, **kwargs: Any) LLMResponse[source]¶
Generate a completion asynchronously.
- Parameters:
prompt – The user prompt.
system_prompt – Optional system prompt.
**kwargs – Additional arguments passed to the API.
- Returns:
LLMResponse with generated content.
- complete_messages(messages: List[LLMMessage], **kwargs: Any) LLMResponse[source]¶
Generate a completion from a list of messages.
- Parameters:
messages – List of conversation messages.
**kwargs – Additional arguments passed to the API.
- Returns:
LLMResponse with generated content.
- classmethod is_compatible(client: Any) bool[source]¶
Check if client is Gemini-compatible.
- Parameters:
client – Client instance to check.
- Returns:
True if client has Gemini-compatible interface.
- provider: str = 'gemini'¶
Anthropic LLM¶
Anthropic Claude LLM adapter.
- class aiobs.llm.anthropic.AnthropicLLM(client: Any, model: str, temperature: float = 0.0, max_tokens: int | None = 1024)[source]¶
Bases:
BaseLLM
LLM adapter for Anthropic Claude API.
Example
from anthropic import Anthropic
from aiobs.llm import LLM

client = Anthropic()
llm = LLM.from_client(client, model="claude-3-sonnet-20240229")
response = llm.complete("Hello!")
- complete(prompt: str, system_prompt: str | None = None, **kwargs: Any) LLMResponse[source]¶
Generate a completion synchronously.
- Parameters:
prompt – The user prompt.
system_prompt – Optional system prompt.
**kwargs – Additional arguments passed to the API.
- Returns:
LLMResponse with generated content.
- async complete_async(prompt: str, system_prompt: str | None = None, **kwargs: Any) LLMResponse[source]¶
Generate a completion asynchronously.
- Parameters:
prompt – The user prompt.
system_prompt – Optional system prompt.
**kwargs – Additional arguments passed to the API.
- Returns:
LLMResponse with generated content.
- complete_messages(messages: List[LLMMessage], **kwargs: Any) LLMResponse[source]¶
Generate a completion from a list of messages.
- Parameters:
messages – List of conversation messages.
**kwargs – Additional arguments passed to the API.
- Returns:
LLMResponse with generated content.
- classmethod is_compatible(client: Any) bool[source]¶
Check if client is Anthropic-compatible.
- Parameters:
client – Client instance to check.
- Returns:
True if client has Anthropic-compatible interface.
- provider: str = 'anthropic'¶