compress

package
v0.4.0
Warning

This package is not in the latest version of its module.

Published: Feb 24, 2026 License: AGPL-3.0 Imports: 8 Imported by: 0

Documentation

Overview

Package compress provides semantic compression for LLM context. It reduces token count while preserving semantic meaning.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Compressor

type Compressor interface {
	// Compress reduces the token count of chunks while preserving meaning.
	Compress(ctx context.Context, chunks []types.Chunk, opts Options) ([]types.Chunk, Stats, error)
}

Compressor defines the interface for semantic compression.

type ExtractiveCompressor

type ExtractiveCompressor struct {
	// SentenceDelimiters defines characters that end sentences.
	SentenceDelimiters string
}

ExtractiveCompressor selects salient spans from verbose content. It uses sentence-level extraction based on embedding similarity to preserve the most semantically relevant portions.

func NewExtractiveCompressor

func NewExtractiveCompressor() *ExtractiveCompressor

NewExtractiveCompressor creates an extractive compressor with default settings.

func (*ExtractiveCompressor) Compress

func (e *ExtractiveCompressor) Compress(ctx context.Context, chunks []types.Chunk, opts Options) ([]types.Chunk, Stats, error)

Compress extracts salient spans from each chunk.
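The sentence-level split implied by the SentenceDelimiters field can be sketched as follows. This is an illustrative re-implementation, not the package's actual code; the real compressor also ranks sentences by embedding similarity, which is omitted here:

```go
package main

import (
	"fmt"
	"strings"
)

// splitSentences splits text into sentences on the given delimiter
// characters, keeping each delimiter attached to its sentence.
func splitSentences(text, delimiters string) []string {
	var sentences []string
	var b strings.Builder
	for _, r := range text {
		b.WriteRune(r)
		if strings.ContainsRune(delimiters, r) {
			if s := strings.TrimSpace(b.String()); s != "" {
				sentences = append(sentences, s)
			}
			b.Reset()
		}
	}
	if s := strings.TrimSpace(b.String()); s != "" {
		sentences = append(sentences, s)
	}
	return sentences
}

func main() {
	got := splitSentences("First point. Second point! A question?", ".!?")
	fmt.Println(len(got), got[1]) // 3 Second point!
}
```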

type Mode

type Mode string

Mode specifies the compression strategy.

const (
	// ModeExtractive selects salient spans from content.
	ModeExtractive Mode = "extractive"
	// ModePlaceholder replaces verbose outputs with compact summaries.
	ModePlaceholder Mode = "placeholder"
	// ModeHybrid combines extractive and placeholder strategies.
	ModeHybrid Mode = "hybrid"
)

type Options

type Options struct {
	// TargetReduction is the target output-to-input token ratio (e.g., 0.3 keeps roughly 30% of the original tokens).
	TargetReduction float64

	// PreserveStructure keeps JSON/code structure intact when possible.
	PreserveStructure bool

	// Mode selects the compression strategy.
	Mode Mode

	// MinChunkLength is the minimum length to consider for compression.
	MinChunkLength int

	// MaxOutputTokens caps the total output tokens (0 = no limit).
	MaxOutputTokens int
}

Options configures compression behavior.
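The interaction of TargetReduction and MaxOutputTokens suggests an output token budget along the following lines. This is a sketch of the documented semantics, not the package's actual code; the function name and formula are assumptions:

```go
package main

import "fmt"

// tokenBudget derives an output token target from Options semantics:
// TargetReduction scales the input count, and MaxOutputTokens
// (0 = no limit) caps the result.
func tokenBudget(inputTokens int, targetReduction float64, maxOutputTokens int) int {
	budget := int(float64(inputTokens) * targetReduction)
	if maxOutputTokens > 0 && budget > maxOutputTokens {
		budget = maxOutputTokens
	}
	return budget
}

func main() {
	fmt.Println(tokenBudget(2000, 0.3, 0))   // 600
	fmt.Println(tokenBudget(2000, 0.3, 500)) // 500
}
```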

func DefaultOptions

func DefaultOptions() Options

DefaultOptions returns sensible defaults for compression.

type Pipeline

type Pipeline struct {
	// contains filtered or unexported fields
}

Pipeline chains multiple compression strategies.

func NewPipeline

func NewPipeline(compressors ...Compressor) *Pipeline

NewPipeline creates a compression pipeline with the given strategies.

func (*Pipeline) Compress

func (p *Pipeline) Compress(ctx context.Context, chunks []types.Chunk, opts Options) ([]types.Chunk, Stats, error)

Compress applies all compressors in sequence.

type PlaceholderCompressor

type PlaceholderCompressor struct {
	// PreserveKeys keeps these JSON keys in the output.
	PreserveKeys []string

	// MaxArrayItems limits array elements shown before truncation.
	MaxArrayItems int

	// MaxObjectDepth limits nested object depth.
	MaxObjectDepth int
}

PlaceholderCompressor replaces verbose tool outputs with compact summaries. It detects JSON, XML, tables, and other structured content and replaces them with descriptive placeholders.

func NewPlaceholderCompressor

func NewPlaceholderCompressor() *PlaceholderCompressor

NewPlaceholderCompressor creates a placeholder compressor with defaults.

func (*PlaceholderCompressor) Compress

func (p *PlaceholderCompressor) Compress(ctx context.Context, chunks []types.Chunk, opts Options) ([]types.Chunk, Stats, error)

Compress replaces verbose structured content with placeholders.
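The effect of MaxArrayItems on a JSON array can be sketched like this. It is an illustration of the idea only; the placeholder text and truncation format are assumptions, not the package's actual output:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// truncateArray keeps at most max items of a JSON array and appends a
// placeholder describing what was elided.
func truncateArray(raw []byte, max int) (string, error) {
	var items []json.RawMessage
	if err := json.Unmarshal(raw, &items); err != nil {
		return "", err
	}
	if len(items) <= max {
		return string(raw), nil
	}
	kept, err := json.Marshal(items[:max])
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s /* ... %d more items */", kept, len(items)-max), nil
}

func main() {
	out, err := truncateArray([]byte(`[1, 2, 3, 4, 5]`), 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // [1,2] /* ... 3 more items */
}
```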

type Pruner

type Pruner struct {
	// FillerPhrases are phrases to remove entirely.
	FillerPhrases []string

	// RedundantPatterns are regex patterns for redundant content.
	RedundantPatterns []*regexp.Regexp

	// CollapseWhitespace normalizes runs of multiple spaces to a single space.
	CollapseWhitespace bool
}

Pruner removes low-importance tokens and filler phrases.

func NewPruner

func NewPruner() *Pruner

NewPruner creates a pruner with default filler phrases.

func (*Pruner) AddFillerPhrase

func (p *Pruner) AddFillerPhrase(phrase string)

AddFillerPhrase adds a custom filler phrase to remove.

func (*Pruner) AddRedundantPattern

func (p *Pruner) AddRedundantPattern(pattern string) error

AddRedundantPattern adds a custom regex pattern for removal.

func (*Pruner) Compress

func (p *Pruner) Compress(ctx context.Context, chunks []types.Chunk, opts Options) ([]types.Chunk, Stats, error)

Compress removes filler phrases and redundant patterns.

type Result

type Result struct {
	// Chunks are the compressed chunks.
	Chunks []types.Chunk

	// Stats contains compression metrics.
	Stats Stats
}

Result holds compressed chunks and statistics.

type Stats

type Stats struct {
	// InputTokens is the estimated token count before compression.
	InputTokens int

	// OutputTokens is the estimated token count after compression.
	OutputTokens int

	// ReductionPercent is the percentage of tokens removed.
	ReductionPercent float64

	// ChunksProcessed is the number of chunks that were compressed.
	ChunksProcessed int

	// ChunksSkipped is the number of chunks below MinChunkLength.
	ChunksSkipped int

	// Latency is the compression processing time.
	Latency time.Duration
}

Stats tracks compression metrics.
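The relationship the Stats fields imply between InputTokens, OutputTokens, and ReductionPercent can be written out as follows. The formula is an assumption from the field docs, not taken from the package source:

```go
package main

import "fmt"

// reductionPercent computes the percentage of tokens removed,
// guarding against a zero input count.
func reductionPercent(inputTokens, outputTokens int) float64 {
	if inputTokens == 0 {
		return 0
	}
	return 100 * float64(inputTokens-outputTokens) / float64(inputTokens)
}

func main() {
	fmt.Println(reductionPercent(1000, 300)) // 70
}
```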
