Documentation ¶
Overview ¶
Package model defines the LLM interface for Hector v2.
This package is strictly aligned with ADK-Go's model architecture:
- Unified GenerateContent method with stream boolean parameter
- Returns iter.Seq2[*Response, error] for both streaming and non-streaming
- Streaming uses Partial flag to distinguish chunks from aggregated response
- Aggregator pattern for accumulating streaming text
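For orientation, a minimal non-streaming sketch (llm, ctx, and history are assumed to be in scope; error handling abbreviated):

    req := &model.Request{
        Messages:          history,
        SystemInstruction: "You are a helpful assistant.",
    }
    // stream=false: the iterator yields exactly one complete Response.
    for resp, err := range llm.GenerateContent(ctx, req, false) {
        if err != nil {
            return err
        }
        fmt.Println(resp.TextContent())
    }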
Index ¶
- type Content
- type FinishReason
- type GenerateConfig
- func (c *GenerateConfig) Clone() *GenerateConfig
- type LLM
- type Provider
- type Request
- type Response
- func (r *Response) HasToolCalls() bool
- func (r *Response) TextContent() string
- type StreamingAggregator
- func NewStreamingAggregator() *StreamingAggregator
- func (s *StreamingAggregator) Close() *Response
- func (s *StreamingAggregator) ProcessTextDelta(text string) iter.Seq2[*Response, error]
- func (s *StreamingAggregator) ProcessThinkingComplete(content, signature string)
- func (s *StreamingAggregator) ProcessThinkingDelta(thinking string) iter.Seq2[*Response, error]
- func (s *StreamingAggregator) ProcessToolCall(tc tool.ToolCall) iter.Seq2[*Response, error]
- func (s *StreamingAggregator) SetFinishReason(reason FinishReason)
- func (s *StreamingAggregator) SetUsage(usage *Usage)
- func (s *StreamingAggregator) ThinkingText() string
- type ThinkingBlock
- type Usage
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Content ¶
type Content struct {
    // Parts contains the content parts (text, data, files).
    Parts []a2a.Part
    // Role identifies the sender (agent/user).
    Role a2a.MessageRole
}
Content represents the content of a response.
type FinishReason ¶
type FinishReason string
FinishReason indicates why generation stopped.
const (
    FinishReasonStop      FinishReason = "stop"
    FinishReasonLength    FinishReason = "length"
    FinishReasonToolCalls FinishReason = "tool_calls"
    FinishReasonContent   FinishReason = "content_filter"
    FinishReasonError     FinishReason = "error"
)
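Callers typically branch on the reason once a turn completes. A sketch, given a final Response resp:

    switch resp.FinishReason {
    case model.FinishReasonToolCalls:
        // The model requested tools; execute them and continue the loop.
    case model.FinishReasonLength:
        // Output was truncated by MaxTokens; consider raising the limit.
    case model.FinishReasonContent:
        // Generation was stopped by a content filter.
    case model.FinishReasonError:
        // Inspect resp.ErrorCode and resp.ErrorMessage.
    case model.FinishReasonStop:
        // Normal completion.
    }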
type GenerateConfig ¶
type GenerateConfig struct {
    // Temperature controls randomness (0-2).
    Temperature *float64
    // MaxTokens limits the response length.
    MaxTokens *int
    // TopP controls nucleus sampling.
    TopP *float64
    // TopK controls top-k sampling.
    TopK *int
    // StopSequences terminate generation.
    StopSequences []string
    // ResponseMIMEType for structured output (e.g., "application/json").
    ResponseMIMEType string
    // ResponseSchema for structured output.
    ResponseSchema map[string]any
    // ResponseSchemaName identifies the schema for providers that require it.
    // Used by OpenAI's json_schema format.
    // Default: "response"
    ResponseSchemaName string
    // ResponseSchemaStrict enables strict schema validation.
    // When true, the LLM is constrained to output only valid, schema-conforming JSON.
    // Default: true (nil means true)
    ResponseSchemaStrict *bool
    // EnableThinking enables extended thinking (model-specific).
    EnableThinking bool
    // ThinkingBudget limits thinking tokens (model-specific).
    ThinkingBudget int
    // Metadata contains additional key-value pairs for LLM providers.
    // Used for authentication tokens, custom headers, etc.
    Metadata map[string]string
}
GenerateConfig contains configuration for generation.
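Pointer fields distinguish "unset" from zero values, so construction usually goes through a small address-of helper. A sketch (the ptr generic is illustrative and not part of this package):

    func ptr[T any](v T) *T { return &v } // illustrative helper, not in this package

    cfg := &model.GenerateConfig{
        Temperature:      ptr(0.2),
        MaxTokens:        ptr(1024),
        ResponseMIMEType: "application/json",
        ResponseSchema: map[string]any{
            "type": "object",
            "properties": map[string]any{
                "answer": map[string]any{"type": "string"},
            },
        },
        // ResponseSchemaName defaults to "response"; leaving
        // ResponseSchemaStrict nil means strict validation (true).
    }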
func (*GenerateConfig) Clone ¶
func (c *GenerateConfig) Clone() *GenerateConfig
Clone creates a deep copy of the GenerateConfig. This is important for processor pipelines to avoid shared state between requests.
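For example, a pipeline stage that caps MaxTokens should clone before mutating so other stages still see the caller's original config (sketch):

    cfg := req.Config.Clone() // deep copy; mutations stay local
    limit := 512
    cfg.MaxTokens = &limit
    req.Config = cfg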
type LLM ¶
type LLM interface {
    // Name returns the model identifier.
    Name() string

    // Provider returns the provider type (e.g., "openai", "anthropic", "gemini").
    // Used for model-specific message formatting and content processing.
    Provider() Provider

    // GenerateContent produces responses for the given request.
    //
    // When stream=false:
    //   - Yields exactly one Response with complete content
    //   - Response.Partial will be false
    //
    // When stream=true:
    //   - Yields multiple partial Responses with Partial=true
    //   - Finally yields aggregated Response with Partial=false and full content
    //   - The aggregated response is for session persistence
    //   - Partial responses are for real-time UI updates
    GenerateContent(ctx context.Context, req *Request, stream bool) iter.Seq2[*Response, error]

    // Close releases any resources held by the LLM.
    Close() error
}
LLM is the interface for language models.
This interface is aligned with ADK-Go's model.LLM interface. Key design principles:
- Single GenerateContent method handles both streaming and non-streaming
- Returns iter.Seq2 which yields one or more Response objects
- For non-streaming: yields exactly one Response
- For streaming: yields multiple partial Responses (Partial=true), then final aggregated (Partial=false)
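A streaming consumption sketch that separates UI deltas from the persisted aggregate (renderDelta and persist are illustrative stand-ins; error handling abbreviated):

    for resp, err := range llm.GenerateContent(ctx, req, true) {
        if err != nil {
            return err
        }
        if resp.Partial {
            renderDelta(resp.TextContent()) // streaming chunk for the UI
            continue
        }
        persist(resp) // aggregated final response, Partial=false
    }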
type Provider ¶
type Provider string
Provider identifies the LLM provider. Used for model-specific message formatting and content processing.
const (
    // ProviderOpenAI represents OpenAI models (GPT-4, etc.)
    // Tool results are separate function_call_output items.
    ProviderOpenAI Provider = "openai"

    // ProviderAnthropic represents Anthropic models (Claude)
    // Tool results must be paired with tool_use in the same message.
    ProviderAnthropic Provider = "anthropic"

    // ProviderGemini represents Google Gemini models
    // Similar to Anthropic - tool results paired with function calls.
    ProviderGemini Provider = "gemini"

    // ProviderOllama represents Ollama local models
    // Follows OpenAI-compatible format.
    ProviderOllama Provider = "ollama"

    // ProviderUnknown for unrecognized providers.
    ProviderUnknown Provider = "unknown"
)
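Because tool-result placement differs per provider, message formatting typically branches once on the provider value (sketch):

    switch llm.Provider() {
    case model.ProviderAnthropic, model.ProviderGemini:
        // Pair each tool result with its tool_use/function call in the same message.
    case model.ProviderOpenAI, model.ProviderOllama:
        // Emit tool results as separate function_call_output items.
    default: // model.ProviderUnknown
        // Fall back to a generic format.
    }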
type Request ¶
type Request struct {
    // Messages is the conversation history.
    Messages []*a2a.Message
    // Tools available for the model to call.
    Tools []tool.Definition
    // Config contains generation configuration.
    Config *GenerateConfig
    // SystemInstruction is prepended to the conversation.
    SystemInstruction string
}
Request contains the input for an LLM call.
type Response ¶
type Response struct {
    // Content is the generated content (text, tool calls, etc.).
    Content *Content
    // Partial indicates whether this is a streaming chunk (true) or final response (false).
    // In streaming mode:
    //   - Partial=true: This is a delta chunk for real-time display
    //   - Partial=false: This is the aggregated final response for persistence
    Partial bool
    // TurnComplete indicates whether the model has finished its turn.
    TurnComplete bool
    // ToolCalls requested by the model.
    ToolCalls []tool.ToolCall
    // Usage statistics.
    Usage *Usage
    // Thinking contains the model's reasoning (if enabled).
    Thinking *ThinkingBlock
    // FinishReason indicates why generation stopped.
    FinishReason FinishReason
    // ErrorCode for provider-specific errors.
    ErrorCode string
    // ErrorMessage for provider-specific error messages.
    ErrorMessage string
}
Response contains the result of an LLM call.
This is aligned with ADK-Go's LLMResponse structure.
func (*Response) HasToolCalls ¶
func (r *Response) HasToolCalls() bool
HasToolCalls returns whether the response contains tool calls.
func (*Response) TextContent ¶
func (r *Response) TextContent() string
TextContent extracts text from a response.
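Together these helpers cover the common branch after a completed turn (executeTool is an illustrative stand-in):

    if resp.HasToolCalls() {
        for _, tc := range resp.ToolCalls {
            executeTool(ctx, tc) // run the requested tool call
        }
    } else {
        fmt.Println(resp.TextContent())
    }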
type StreamingAggregator ¶
type StreamingAggregator struct {
    // contains filtered or unexported fields
}
StreamingAggregator aggregates partial streaming responses.
This is aligned with ADK-Go's streamingResponseAggregator. It accumulates content from partial responses and generates:
- Partial responses for real-time UI updates (Partial=true)
- Aggregated response for session persistence (Partial=false)
Usage:
    aggregator := NewStreamingAggregator()
    for event := range providerEvents { // provider-specific stream of deltas
        // Dispatch each event to the matching Process* method; a text delta is shown here.
        for resp, err := range aggregator.ProcessTextDelta(event.Text) {
            yield(resp, err)
        }
    }
    if final := aggregator.Close(); final != nil {
        yield(final, nil)
    }
func NewStreamingAggregator ¶
func NewStreamingAggregator() *StreamingAggregator
NewStreamingAggregator creates a new streaming aggregator.
func (*StreamingAggregator) Close ¶
func (s *StreamingAggregator) Close() *Response
Close generates the final aggregated response. This should be called after all streaming chunks are processed. The returned response has Partial=false and is suitable for persistence.
func (*StreamingAggregator) ProcessTextDelta ¶
func (s *StreamingAggregator) ProcessTextDelta(text string) iter.Seq2[*Response, error]
ProcessTextDelta processes a text delta chunk. Returns a partial response for the UI.
func (*StreamingAggregator) ProcessThinkingComplete ¶
func (s *StreamingAggregator) ProcessThinkingComplete(content, signature string)
ProcessThinkingComplete processes a completed thinking block with signature.
func (*StreamingAggregator) ProcessThinkingDelta ¶
func (s *StreamingAggregator) ProcessThinkingDelta(thinking string) iter.Seq2[*Response, error]
ProcessThinkingDelta processes a thinking delta chunk. Returns a partial response with thinking metadata.
func (*StreamingAggregator) ProcessToolCall ¶
func (s *StreamingAggregator) ProcessToolCall(tc tool.ToolCall) iter.Seq2[*Response, error]
ProcessToolCall processes a complete tool call. Returns a partial response with the tool call.
func (*StreamingAggregator) SetFinishReason ¶
func (s *StreamingAggregator) SetFinishReason(reason FinishReason)
SetFinishReason sets the finish reason.
func (*StreamingAggregator) SetUsage ¶
func (s *StreamingAggregator) SetUsage(usage *Usage)
SetUsage sets the usage statistics (typically from the done event).
func (*StreamingAggregator) ThinkingText ¶
func (s *StreamingAggregator) ThinkingText() string
ThinkingText returns the accumulated thinking text.
type ThinkingBlock ¶
type ThinkingBlock struct {
    // ID uniquely identifies this thinking block in the conversation.
    // Generated by the LLM provider or assigned during aggregation.
    ID string
    // Content is the thinking/reasoning text.
    Content string
    // Signature is used for multi-turn verification (e.g., Anthropic).
    Signature string
}
ThinkingBlock contains the model's reasoning.
Directories ¶
| Path | Synopsis |
|---|---|
| anthropic | Package anthropic provides an Anthropic Claude LLM implementation. |
| gemini | Package gemini implements the model.LLM interface for Google Gemini models. |
| ollama | Package ollama provides an Ollama LLM implementation. |
| openai | Package openai provides an OpenAI LLM implementation using the Responses API. |