tokenizer

package
v0.0.0-...-e803270

This package is not in the latest version of its module.
Published: Dec 19, 2025 License: MIT Imports: 4 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CalculateSimilarity

func CalculateSimilarity(s1, s2 string) int

CalculateSimilarity calculates the Levenshtein distance between two strings.

Types

type Token

type Token struct {
	Text     string
	Position int
}

Token represents a processed token together with its position.

type Tokenizer

type Tokenizer struct {
	// contains filtered or unexported fields
}

Tokenizer handles text tokenization and normalization.

func NewTokenizer

func NewTokenizer() *Tokenizer

NewTokenizer creates a new tokenizer instance.

func (*Tokenizer) Tokenize

func (t *Tokenizer) Tokenize(text string) []Token

Tokenize splits text into tokens with positions.

func (*Tokenizer) TokenizeToStrings

func (t *Tokenizer) TokenizeToStrings(text string) []string

TokenizeToStrings returns just the token strings, without positions.
