gladia

package

Version: v0.3.0 (latest) | Published: Mar 1, 2026 | License: Apache-2.0 | Imports: 9 | Imported by: 0

Repository

github.com/digitallysavvy/go-ai

README ¶

Gladia Provider

The Gladia provider for the Go AI SDK adds transcription model support backed by the Gladia transcription API.

Gladia offers advanced speech recognition with async processing and multi-language support.

Installation

go get github.com/digitallysavvy/go-ai/pkg/providers/gladia

Setup

To use the Gladia provider, you need an API key from Gladia.

Provider Instance

You can create a new provider instance with your API key:

import "github.com/digitallysavvy/go-ai/pkg/providers/gladia"

provider := gladia.New(gladia.Config{
    APIKey: "your-api-key",
})

Example: Basic Transcription

package main

import (
    "context"
    "fmt"
    "log"
    "os"

    "github.com/digitallysavvy/go-ai/pkg/providers/gladia"
    "github.com/digitallysavvy/go-ai/pkg/provider"
)

func main() {
    // Create provider
    p := gladia.New(gladia.Config{
        APIKey: os.Getenv("GLADIA_API_KEY"),
    })

    // Get transcription model
    model, err := p.TranscriptionModel("whisper-v3")
    if err != nil {
        log.Fatal(err)
    }

    // Read audio file
    audioData, err := os.ReadFile("audio.mp3")
    if err != nil {
        log.Fatal(err)
    }

    // Transcribe
    result, err := model.DoTranscribe(context.Background(), &provider.TranscriptionOptions{
        Audio:    audioData,
        MimeType: "audio/mpeg",
        Language: "en",
    })
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println("Transcription:", result.Text)
    fmt.Printf("Duration: %.2f seconds\n", result.Usage.DurationSeconds)
}

Example: Transcription with Timestamps

result, err := model.DoTranscribe(context.Background(), &provider.TranscriptionOptions{
    Audio:      audioData,
    MimeType:   "audio/mpeg",
    Language:   "en",
    Timestamps: true, // Request timestamps
})
if err != nil {
    log.Fatal(err)
}

// Print transcription with timestamps
for _, ts := range result.Timestamps {
    fmt.Printf("[%.2f-%.2f] %s\n", ts.Start, ts.End, ts.Text)
}
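The loop above prints raw floating-point offsets. For display, a clock-style format is often easier to read. The sketch below assumes the Start and End values are seconds expressed as float64, which matches the %.2f verb used above but is not stated on this page. The segment type and the formatClock helper are hypothetical and stand in for the SDK's own timestamp type.

```go
package main

import (
	"fmt"
	"math"
)

// segment is a local stand-in with the three fields used in the loop above.
type segment struct {
	Start float64 // assumed to be seconds
	End   float64 // assumed to be seconds
	Text  string
}

// formatClock renders a second offset as MM:SS.mmm, rounding to the
// nearest millisecond. Negative inputs are clamped to zero.
func formatClock(seconds float64) string {
	if seconds < 0 {
		seconds = 0
	}
	ms := int64(math.Round(seconds * 1000))
	return fmt.Sprintf("%02d:%02d.%03d", ms/60000, (ms%60000)/1000, ms%1000)
}

func main() {
	segments := []segment{
		{Start: 0, End: 3.25, Text: "Hello and welcome."},
		{Start: 75.5, End: 78, Text: "Let's get started."},
	}
	for _, s := range segments {
		fmt.Printf("[%s - %s] %s\n", formatClock(s.Start), formatClock(s.End), s.Text)
	}
}
```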

Configuration

Config Fields

APIKey (required): Your Gladia API key
BaseURL (optional): Custom API base URL (defaults to "https://api.gladia.io/v2")
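The optional BaseURL field falls back to the production endpoint when left empty. A minimal sketch of that kind of fallback follows. The config type is a local mirror of the two documented fields, and effectiveBaseURL is a hypothetical helper; the package's actual resolution logic is not shown on this page, and trimming a trailing slash is this sketch's own choice.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultBaseURL is the default quoted in the field list above.
const defaultBaseURL = "https://api.gladia.io/v2"

// config is a local stand-in mirroring the documented Config fields.
type config struct {
	APIKey  string
	BaseURL string
}

// effectiveBaseURL returns the custom base URL when one is set and the
// documented default otherwise. Trailing slashes are trimmed so that
// paths can be appended consistently.
func effectiveBaseURL(c config) string {
	base := strings.TrimRight(strings.TrimSpace(c.BaseURL), "/")
	if base == "" {
		return defaultBaseURL
	}
	return base
}

func main() {
	fmt.Println(effectiveBaseURL(config{APIKey: "k"}))
	fmt.Println(effectiveBaseURL(config{APIKey: "k", BaseURL: "http://localhost:8080/v2/"}))
}
```

Overriding BaseURL in this way is mainly useful for pointing the provider at a proxy or a local test server.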

TranscriptionOptions

Audio (required): Audio data as byte slice
MimeType (required): MIME type of the audio (e.g., "audio/mpeg", "audio/wav")
Language (optional): Language code (e.g., "en", "fr", "es")
Timestamps (optional): Whether to include word-level timestamps
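Since Audio and MimeType are required, it can help to check them before making a request. The sketch below validates a local mirror of the four documented fields. The transcriptionOptions type and the validate helper are hypothetical, and the "audio/" prefix check is an assumption of this sketch rather than a rule stated by the SDK.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// transcriptionOptions is a local stand-in mirroring the documented fields.
type transcriptionOptions struct {
	Audio      []byte
	MimeType   string
	Language   string
	Timestamps bool
}

// validate reports every problem with the required fields at once.
// It returns nil when both required fields look usable.
func validate(o transcriptionOptions) error {
	var errs []error
	if len(o.Audio) == 0 {
		errs = append(errs, errors.New("audio data is required"))
	}
	switch mt := strings.TrimSpace(o.MimeType); {
	case mt == "":
		errs = append(errs, errors.New("MIME type is required"))
	case !strings.HasPrefix(mt, "audio/"):
		// Assumption: only audio/* types are meaningful here.
		errs = append(errs, fmt.Errorf("unexpected MIME type %q", mt))
	}
	return errors.Join(errs...) // nil when errs is empty (Go 1.20+)
}

func main() {
	fmt.Println(validate(transcriptionOptions{}))
	fmt.Println(validate(transcriptionOptions{Audio: []byte{0x01}, MimeType: "audio/mpeg", Language: "en"}))
}
```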

Supported Audio Formats

Gladia supports various audio formats including:

MP3 (audio/mpeg)
WAV (audio/wav)
M4A (audio/m4a)
FLAC (audio/flac)
OGG (audio/ogg)
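When audio is read from disk, the MimeType value can be derived from the file extension. The sketch below uses a fixed table built from the list above; the mimeByExt map and the mimeTypeForFile helper are hypothetical names and are not part of the gladia package.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// mimeByExt maps file extensions to the MIME types listed above.
var mimeByExt = map[string]string{
	".mp3":  "audio/mpeg",
	".wav":  "audio/wav",
	".m4a":  "audio/m4a",
	".flac": "audio/flac",
	".ogg":  "audio/ogg",
}

// mimeTypeForFile looks up the MIME type for a file name, ignoring case.
// The boolean is false when the extension is not in the table.
func mimeTypeForFile(name string) (string, bool) {
	mt, ok := mimeByExt[strings.ToLower(filepath.Ext(name))]
	return mt, ok
}

func main() {
	for _, name := range []string{"audio.mp3", "Interview.WAV", "notes.txt"} {
		if mt, ok := mimeTypeForFile(name); ok {
			fmt.Printf("%s -> %s\n", name, mt)
		} else {
			fmt.Printf("%s -> unsupported extension\n", name)
		}
	}
}
```

A fixed table keeps the result independent of the host system's MIME database, at the cost of needing an update if more formats are used.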

Documentation

For more information about the Gladia API, visit the Gladia documentation.

Documentation ¶

Overview ¶

Package gladia provides a Gladia AI speech-to-text provider for the Go AI SDK. Gladia offers advanced speech recognition with async processing and multi-language support.

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Config ¶

type Config struct {
	// API key for authentication
	APIKey string

	// Base URL for the Gladia API (optional, defaults to production)
	BaseURL string
}

Config holds configuration for the Gladia provider

type Provider ¶

type Provider struct {
	// contains filtered or unexported fields
}

Provider represents the Gladia AI provider

func New ¶

func New(config Config) *Provider

New creates a new Gladia provider instance

func (*Provider) EmbeddingModel ¶

func (p *Provider) EmbeddingModel(modelID string) (provider.EmbeddingModel, error)

EmbeddingModel returns an embedding model (not supported by Gladia)

func (*Provider) ImageModel ¶

func (p *Provider) ImageModel(modelID string) (provider.ImageModel, error)

ImageModel returns an image model (not supported by Gladia)

func (*Provider) LanguageModel ¶

func (p *Provider) LanguageModel(modelID string) (provider.LanguageModel, error)

LanguageModel returns a language model (not supported by Gladia)

func (*Provider) Name ¶

func (p *Provider) Name() string

Name returns the provider name

func (*Provider) RerankingModel ¶

func (p *Provider) RerankingModel(modelID string) (provider.RerankingModel, error)

RerankingModel returns a reranking model (not supported by Gladia)

func (*Provider) SpeechModel ¶

func (p *Provider) SpeechModel(modelID string) (provider.SpeechModel, error)

SpeechModel returns a speech synthesis model (not supported by Gladia)

func (*Provider) TranscriptionModel ¶

func (p *Provider) TranscriptionModel(modelID string) (provider.TranscriptionModel, error)

TranscriptionModel returns a speech-to-text model

type TranscriptionModel ¶

type TranscriptionModel struct {
	// contains filtered or unexported fields
}

TranscriptionModel represents a Gladia transcription model

func (*TranscriptionModel) DoTranscribe ¶

func (m *TranscriptionModel) DoTranscribe(ctx context.Context, opts *provider.TranscriptionOptions) (*types.TranscriptionResult, error)

DoTranscribe performs speech-to-text transcription

func (*TranscriptionModel) ModelID ¶

func (m *TranscriptionModel) ModelID() string

ModelID returns the model ID

func (*TranscriptionModel) Provider ¶

func (m *TranscriptionModel) Provider() string

Provider returns the provider name

func (*TranscriptionModel) SpecificationVersion ¶

func (m *TranscriptionModel) SpecificationVersion() string

SpecificationVersion returns the specification version
