burndown

package v0.0.0-...-d1eea97

Published: Feb 18, 2026 License: Apache-2.0 Imports: 30 Imported by: 0

README

Burndown Analysis

Preface

Understanding the evolution of a codebase is crucial for maintaining its health. Knowing how old the code is, who wrote it, and how actively it's being modified can reveal hidden risks and opportunities for improvement.

Problem

As projects grow, "knowledge silos" form. Developers leave, and their code remains, often becoming "legacy" that no one dares to touch. It's difficult to answer questions like:

  • "How much of the code is actively maintained?"
  • "Who are the main contributors to this module?"
  • "Is the project accumulating technical debt in the form of untouched, aging code?"

How the analyzer solves it

The Burndown analyzer calculates "line burndown" statistics. It tracks every line of code through the history of the project, recording when it was introduced and by whom. It aggregates this into a matrix in which one dimension is the sampling time and the other is the age band the lines were written in. This makes it possible to visualize how code "survives" over time.
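To make the matrix concrete, here is a minimal sketch of reading such a matrix through the package's `DenseHistory` alias. The values and the `survivalRate` helper are illustrative, not part of the package API:

```go
package main

import "fmt"

// DenseHistory mirrors the package's alias: rows are sampling times,
// columns are age bands keyed by when the lines were written.
type DenseHistory = [][]int64

// survivalRate returns the fraction of a band's peak line count that is
// still alive at the last sample. (Illustrative helper, not in the API.)
func survivalRate(h DenseHistory, band int) float64 {
	var peak int64
	for _, row := range h {
		if row[band] > peak {
			peak = row[band]
		}
	}
	if peak == 0 {
		return 0
	}
	return float64(h[len(h)-1][band]) / float64(peak)
}

func main() {
	history := DenseHistory{
		{100, 0, 0},  // sample 0: 100 lines, all in the oldest band
		{80, 50, 0},  // sample 1: 20 old lines deleted, 50 new lines
		{60, 40, 30}, // sample 2: further churn, 30 more new lines
	}
	fmt.Printf("band 0 survival: %.0f%%\n", survivalRate(history, 0)*100)
}
```

Reading down a column shows one band of code decaying; reading across a row shows the age composition of the codebase at one point in time.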

Historical context

Code burndown charts are a well-established visualization in software engineering analytics. They provide a high-level view of code churn and stability, often used to assess project maturity and developer retention.

Real world examples

  • Bus Factor Estimation: By looking at code ownership (which developer "owns" which lines), you can identify modules that depend on a single person.
  • Refactoring Planning: Identify "stable" (old) parts of the code vs. "volatile" (frequently changed) parts. Volatile legacy code is often a good target for refactoring.
  • Onboarding: Help new developers identify who to ask about specific parts of the codebase.

How the analyzer works here

The analyzer iterates through the Git commit history.

  1. Tree Diff: It compares the file tree of the current commit with its parents to find modified files.
  2. File Diff: For modified files, it computes the line-level differences (diffs).
  3. Ownership Tracking: It maintains a data structure (using RBTree for efficiency) that maps every line in every file to its original author and creation time.
  4. Sparse Matrix: It aggregates this data into sparse matrices to save memory, representing the "burndown" state at sampled intervals.
  5. Hibernation: To handle large repositories, it supports "hibernating" file structures to disk to keep memory usage low.
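Steps 2 and 3 can be sketched with a deliberately simplified model: each file is a slice of per-line records instead of the package's RBTree, and a diff hunk is applied as a splice. All names here are hypothetical:

```go
package main

import "fmt"

// line records who wrote a line and at which tick it was introduced.
// A simplified stand-in for the package's RBTree-backed structure.
type line struct {
	author int
	tick   int
}

// applyEdit replaces lines [start, start+deleted) with `inserted` new
// lines owned by author at the given tick, mimicking one diff hunk.
func applyEdit(file []line, start, deleted, inserted, author, tick int) []line {
	newLines := make([]line, inserted)
	for i := range newLines {
		newLines[i] = line{author: author, tick: tick}
	}
	out := append([]line{}, file[:start]...)
	out = append(out, newLines...)
	return append(out, file[start+deleted:]...)
}

// bandCounts aggregates surviving lines by the tick they were written in,
// which is exactly one row of the burndown matrix.
func bandCounts(file []line) map[int]int {
	counts := map[int]int{}
	for _, l := range file {
		counts[l.tick]++
	}
	return counts
}

func main() {
	// Tick 0: author 0 creates a 5-line file.
	file := applyEdit(nil, 0, 0, 5, 0, 0)
	// Tick 3: author 1 rewrites lines 1-2 and adds one more line.
	file = applyEdit(file, 1, 2, 3, 1, 3)
	fmt.Println(bandCounts(file)) // surviving lines per tick band
}
```

The real analyzer does the same bookkeeping, but with an RBTree over line intervals so that a hunk costs O(log n) instead of a slice copy.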

Limitations

  • Memory Usage: Tracking every line of code in a massive repository can be memory-intensive, although the hibernation feature mitigates this.
  • Binary Files: The analysis is line-based, so it is only meaningful for text files.

Further plans

  • Improved visualization of the generated matrices.
  • Deeper integration with team structure (mapping authors to teams).

Documentation

Overview

Package burndown implements line burndown analysis: it tracks when each line of code was introduced, by whom, and how long it survives across the commit history.

Index

Constants

const (
	// ConfigBurndownGranularity is the configuration key for the burndown band granularity.
	ConfigBurndownGranularity = "Burndown.Granularity"
	// ConfigBurndownSampling is the configuration key for the burndown sampling rate.
	ConfigBurndownSampling = "Burndown.Sampling"
	// ConfigBurndownTrackFiles is the configuration key for enabling per-file burndown tracking.
	ConfigBurndownTrackFiles = "Burndown.TrackFiles"
	// ConfigBurndownTrackPeople is the configuration key for enabling per-developer burndown tracking.
	ConfigBurndownTrackPeople = "Burndown.TrackPeople"
	// ConfigBurndownHibernationThreshold defines the hibernation memory threshold.
	ConfigBurndownHibernationThreshold = "Burndown.HibernationThreshold"
	// ConfigBurndownHibernationToDisk defines the hibernation to disk configuration constant.
	ConfigBurndownHibernationToDisk = "Burndown.HibernationOnDisk"
	// ConfigBurndownHibernationDirectory defines the hibernation directory configuration constant.
	ConfigBurndownHibernationDirectory = "Burndown.HibernationDirectory"
	// ConfigBurndownDebug defines the debug mode configuration constant.
	ConfigBurndownDebug = "Burndown.Debug"
	// ConfigBurndownGoroutines defines the goroutines configuration constant.
	ConfigBurndownGoroutines = "Burndown.Goroutines"
	// DefaultBurndownGranularity defines the default granularity in days.
	DefaultBurndownGranularity = 30
	// DefaultBurndownSampling defines the default sampling in ticks.
	// Matches Hercules: sampling equals granularity (30) for comparable output.
	DefaultBurndownSampling = 30
	// DefaultBurndownHibernationThreshold defines the default node count threshold for hibernation.
	DefaultBurndownHibernationThreshold = 1000
)
const (
	// TickSizeThresholdHigh is the maximum tick size in hours for burndown granularity.
	TickSizeThresholdHigh = 24
)

Configuration constants for burndown analysis.
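Assuming Configure receives a plain facts map keyed by these constants, a configuration might look like the following sketch (values are the documented defaults; the hibernation directory is hypothetical):

```go
package main

import "fmt"

func main() {
	// A facts map as it might be passed to HistoryAnalyzer.Configure.
	// Keys mirror the Config* constants above.
	facts := map[string]any{
		"Burndown.Granularity":          30,   // band width in ticks (days)
		"Burndown.Sampling":             30,   // sampling interval in ticks
		"Burndown.TrackFiles":           true, // per-file histories
		"Burndown.TrackPeople":          true, // per-developer histories
		"Burndown.HibernationThreshold": 1000, // node count before hibernation
		"Burndown.HibernationOnDisk":    true,
		"Burndown.HibernationDirectory": "/tmp/burndown-hibernate", // hypothetical
	}
	fmt.Println(len(facts), "facts configured")
}
```

Smaller granularity yields finer age bands at the cost of larger matrices; the defaults keep sampling equal to granularity for Hercules-comparable output.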

Variables

var ErrInvalidReport = errors.New("invalid burndown report: expected DenseHistory")

ErrInvalidReport indicates that a report does not contain the expected DenseHistory data.

Functions

func RegisterPlotSections

func RegisterPlotSections()

RegisterPlotSections registers the burndown plot section renderer with the analyze package.

Types

type AggregateData

type AggregateData struct {
	TotalCurrentLines   int64   `json:"total_current_lines"   yaml:"total_current_lines"`
	TotalPeakLines      int64   `json:"total_peak_lines"      yaml:"total_peak_lines"`
	OverallSurvivalRate float64 `json:"overall_survival_rate" yaml:"overall_survival_rate"`
	AnalysisPeriodDays  int     `json:"analysis_period_days"  yaml:"analysis_period_days"`
	NumBands            int     `json:"num_bands"             yaml:"num_bands"`
	NumSamples          int     `json:"num_samples"           yaml:"num_samples"`
	TrackedFiles        int     `json:"tracked_files"         yaml:"tracked_files"`
	TrackedDevelopers   int     `json:"tracked_developers"    yaml:"tracked_developers"`
}

AggregateData contains summary statistics.
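The relationship between the fields is presumably current lines over the historical peak; a one-function sketch of that assumption (not taken from the package source):

```go
package main

import "fmt"

// overallSurvival shows the presumed derivation of OverallSurvivalRate:
// surviving lines divided by the historical peak line count.
func overallSurvival(current, peak int64) float64 {
	if peak == 0 {
		return 0
	}
	return float64(current) / float64(peak)
}

func main() {
	// Hypothetical project: peaked at 6000 lines, 4200 still alive.
	fmt.Printf("%.2f\n", overallSurvival(4200, 6000)) // 0.70
}
```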

type AggregateMetric

type AggregateMetric struct {
	metrics.MetricMeta
}

AggregateMetric computes summary statistics.

func NewAggregateMetric

func NewAggregateMetric() *AggregateMetric

NewAggregateMetric creates the aggregate metric.

func (*AggregateMetric) Compute

func (m *AggregateMetric) Compute(input *ReportData) AggregateData

Compute calculates aggregate statistics.

type ComputedMetrics

type ComputedMetrics struct {
	Aggregate         AggregateData           `json:"aggregate"          yaml:"aggregate"`
	GlobalSurvival    []SurvivalData          `json:"global_survival"    yaml:"global_survival"`
	FileSurvival      []FileSurvivalData      `json:"file_survival"      yaml:"file_survival"`
	DeveloperSurvival []DeveloperSurvivalData `json:"developer_survival" yaml:"developer_survival"`
	Interaction       []InteractionData       `json:"interactions"       yaml:"interactions"`
}

ComputedMetrics holds all computed metric results for the burndown analyzer.

func ComputeAllMetrics

func ComputeAllMetrics(report analyze.Report) (*ComputedMetrics, error)

ComputeAllMetrics runs all burndown metrics and returns the results.

func (*ComputedMetrics) AnalyzerName

func (m *ComputedMetrics) AnalyzerName() string

AnalyzerName returns the analyzer identifier.

func (*ComputedMetrics) ToJSON

func (m *ComputedMetrics) ToJSON() any

ToJSON returns the metrics in JSON-serializable format.

func (*ComputedMetrics) ToYAML

func (m *ComputedMetrics) ToYAML() any

ToYAML returns the metrics in YAML-serializable format.

type DenseHistory

type DenseHistory = [][]int64

DenseHistory is a two-dimensional matrix of line counts over time intervals.

type DeveloperSurvivalData

type DeveloperSurvivalData struct {
	ID           int     `json:"id"            yaml:"id"`
	Name         string  `json:"name"          yaml:"name"`
	CurrentLines int64   `json:"current_lines" yaml:"current_lines"`
	PeakLines    int64   `json:"peak_lines"    yaml:"peak_lines"`
	SurvivalRate float64 `json:"survival_rate" yaml:"survival_rate"`
}

DeveloperSurvivalData contains survival data for a developer's code.

type DeveloperSurvivalInput

type DeveloperSurvivalInput struct {
	PeopleHistories    []DenseHistory
	ReversedPeopleDict []string
}

DeveloperSurvivalInput holds input for developer survival computation.

type DeveloperSurvivalMetric

type DeveloperSurvivalMetric struct {
	metrics.MetricMeta
}

DeveloperSurvivalMetric computes per-developer code survival.

func NewDeveloperSurvivalMetric

func NewDeveloperSurvivalMetric() *DeveloperSurvivalMetric

NewDeveloperSurvivalMetric creates the developer survival metric.

func (*DeveloperSurvivalMetric) Compute

Compute calculates developer survival statistics.

type FileSurvivalData

type FileSurvivalData struct {
	Path         string      `json:"path"                 yaml:"path"`
	CurrentLines int64       `json:"current_lines"        yaml:"current_lines"`
	Ownership    map[int]int `json:"ownership"            yaml:"ownership"`
	TopOwnerID   int         `json:"top_owner_id"         yaml:"top_owner_id"`
	TopOwnerName string      `json:"top_owner_name"       yaml:"top_owner_name"`
	TopOwnerPct  float64     `json:"top_owner_percentage" yaml:"top_owner_percentage"`
}

FileSurvivalData contains survival data for a single file.
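A sketch of how TopOwnerID and TopOwnerPct presumably relate to the Ownership map (developer ID to surviving line count); the helper and values are hypothetical:

```go
package main

import "fmt"

// topOwner picks the developer with the most surviving lines in a file
// and their share of the total, mirroring TopOwnerID / TopOwnerPct.
func topOwner(ownership map[int]int) (id int, pct float64) {
	total, best, bestID := 0, -1, -1
	for devID, lines := range ownership {
		total += lines
		if lines > best {
			best, bestID = lines, devID
		}
	}
	if total <= 0 {
		return -1, 0
	}
	return bestID, float64(best) / float64(total) * 100
}

func main() {
	// Hypothetical ownership: developer ID -> surviving lines.
	id, pct := topOwner(map[int]int{0: 120, 1: 60, 2: 20})
	fmt.Printf("top owner %d owns %.0f%% of the file\n", id, pct)
}
```

A file whose top owner holds a very high percentage is a bus-factor candidate, per the Real world examples above.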

type FileSurvivalInput

type FileSurvivalInput struct {
	FileHistories      map[string]DenseHistory
	FileOwnership      map[string]map[int]int
	ReversedPeopleDict []string
}

FileSurvivalInput holds input for file survival computation.

type FileSurvivalMetric

type FileSurvivalMetric struct {
	metrics.MetricMeta
}

FileSurvivalMetric computes per-file survival statistics.

func NewFileSurvivalMetric

func NewFileSurvivalMetric() *FileSurvivalMetric

NewFileSurvivalMetric creates the file survival metric.

func (*FileSurvivalMetric) Compute

Compute calculates file survival statistics.

type GlobalSurvivalMetric

type GlobalSurvivalMetric struct {
	metrics.MetricMeta
}

GlobalSurvivalMetric computes code survival time series.

func NewGlobalSurvivalMetric

func NewGlobalSurvivalMetric() *GlobalSurvivalMetric

NewGlobalSurvivalMetric creates the global survival metric.

func (*GlobalSurvivalMetric) Compute

func (m *GlobalSurvivalMetric) Compute(input *ReportData) []SurvivalData

Compute calculates global survival time series.

type HistoryAnalyzer

type HistoryAnalyzer struct {
	BlobCache *plumbing.BlobCacheAnalyzer

	Ticks                *plumbing.TicksSinceStart
	Identity             *plumbing.IdentityDetector
	FileDiff             *plumbing.FileDiffAnalyzer
	TreeDiff             *plumbing.TreeDiffAnalyzer
	HibernationDirectory string

	HibernationThreshold int
	Granularity          int
	PeopleNumber         int
	TickSize             time.Duration
	Goroutines           int

	Sampling          int
	GlobalMu          sync.Mutex
	Debug             bool
	TrackFiles        bool
	HibernationToDisk bool
	// contains filtered or unexported fields
}

HistoryAnalyzer tracks line survival rates across commit history.

func (*HistoryAnalyzer) ApplySnapshot

func (b *HistoryAnalyzer) ApplySnapshot(snap analyze.PlumbingSnapshot)

ApplySnapshot restores plumbing state from a snapshot.

func (*HistoryAnalyzer) Boot

func (b *HistoryAnalyzer) Boot() error

Boot performs early initialization before repository processing. Ensures per-shard tracking maps are ready for the next chunk.

func (*HistoryAnalyzer) CPUHeavy

func (b *HistoryAnalyzer) CPUHeavy() bool

CPUHeavy returns false because burndown tracks line ownership without UAST processing.

func (*HistoryAnalyzer) CheckpointSize

func (b *HistoryAnalyzer) CheckpointSize() int64

CheckpointSize returns an estimated size of the checkpoint in bytes.

func (*HistoryAnalyzer) Configure

func (b *HistoryAnalyzer) Configure(facts map[string]any) error

Configure sets up the analyzer with the provided facts.

func (*HistoryAnalyzer) Consume

func (b *HistoryAnalyzer) Consume(_ context.Context, ac *analyze.Context) error

Consume processes a single commit with the provided dependency results.

func (*HistoryAnalyzer) ConsumePrepared

func (b *HistoryAnalyzer) ConsumePrepared(prepared *analyze.PreparedCommit) error

ConsumePrepared processes a pre-prepared commit. This is used by the pipelined runner for parallel commit preparation.

func (*HistoryAnalyzer) Description

func (b *HistoryAnalyzer) Description() string

Description returns a human-readable description of the analyzer.

func (*HistoryAnalyzer) Descriptor

func (b *HistoryAnalyzer) Descriptor() analyze.Descriptor

Descriptor returns stable analyzer metadata.

func (*HistoryAnalyzer) Finalize

func (b *HistoryAnalyzer) Finalize() (analyze.Report, error)

Finalize completes the analysis and returns the result.

func (*HistoryAnalyzer) Flag

func (b *HistoryAnalyzer) Flag() string

Flag returns the CLI flag for the analyzer.

func (*HistoryAnalyzer) Fork

Fork creates a copy of the analyzer for parallel processing.

func (*HistoryAnalyzer) FormatReport

func (b *HistoryAnalyzer) FormatReport(report analyze.Report, writer io.Writer) error

FormatReport writes the formatted analysis report to the given writer.

func (*HistoryAnalyzer) GenerateChart

func (b *HistoryAnalyzer) GenerateChart(report analyze.Report) (components.Charter, error)

GenerateChart implements PlotGenerator interface.

func (*HistoryAnalyzer) GenerateSections

func (b *HistoryAnalyzer) GenerateSections(report analyze.Report) ([]plotpage.Section, error)

GenerateSections returns the sections for combined reports.

func (*HistoryAnalyzer) Hibernate

func (b *HistoryAnalyzer) Hibernate() error

Hibernate releases resources between processing phases. Clears per-shard tracking maps (mergedByID, deletionsByID) that are only needed within a chunk. Also compacts file timelines to reduce memory usage.

func (*HistoryAnalyzer) Initialize

func (b *HistoryAnalyzer) Initialize(repository *gitlib.Repository) error

Initialize prepares the analyzer for processing commits.

func (*HistoryAnalyzer) ListConfigurationOptions

func (b *HistoryAnalyzer) ListConfigurationOptions() []pipeline.ConfigurationOption

ListConfigurationOptions returns the configuration options for the analyzer.

func (*HistoryAnalyzer) LoadCheckpoint

func (b *HistoryAnalyzer) LoadCheckpoint(dir string) error

LoadCheckpoint restores the analyzer state from the given directory.

func (*HistoryAnalyzer) Merge

func (b *HistoryAnalyzer) Merge(branches []analyze.HistoryAnalyzer)

Merge combines results from forked analyzer branches.

func (*HistoryAnalyzer) Name

func (b *HistoryAnalyzer) Name() string

Name returns the name of the analyzer.

func (*HistoryAnalyzer) ReleaseSnapshot

func (b *HistoryAnalyzer) ReleaseSnapshot(_ analyze.PlumbingSnapshot)

ReleaseSnapshot is a no-op for burndown (no UAST resources).

func (*HistoryAnalyzer) SaveCheckpoint

func (b *HistoryAnalyzer) SaveCheckpoint(dir string) error

SaveCheckpoint writes the analyzer state to the given directory.

func (*HistoryAnalyzer) SequentialOnly

func (b *HistoryAnalyzer) SequentialOnly() bool

SequentialOnly returns true because burndown tracks cumulative per-file line state across all commits and cannot be parallelized.

func (*HistoryAnalyzer) Serialize

func (b *HistoryAnalyzer) Serialize(result analyze.Report, format string, writer io.Writer) error

Serialize writes the analysis result to the given writer.

func (*HistoryAnalyzer) SnapshotPlumbing

func (b *HistoryAnalyzer) SnapshotPlumbing() analyze.PlumbingSnapshot

SnapshotPlumbing captures the current plumbing state.

func (*HistoryAnalyzer) StateGrowthPerCommit

func (b *HistoryAnalyzer) StateGrowthPerCommit() int64

StateGrowthPerCommit returns the estimated per-commit memory growth in bytes.

type InteractionData

type InteractionData struct {
	AuthorID      int    `json:"author_id"      yaml:"author_id"`
	AuthorName    string `json:"author_name"    yaml:"author_name"`
	ModifierID    int    `json:"modifier_id"    yaml:"modifier_id"`
	ModifierName  string `json:"modifier_name"  yaml:"modifier_name"`
	LinesModified int64  `json:"lines_modified" yaml:"lines_modified"`
	IsSelfModify  bool   `json:"is_self_modify" yaml:"is_self_modify"`
}

InteractionData contains developer interaction statistics.

type InteractionInput

type InteractionInput struct {
	PeopleMatrix       DenseHistory
	ReversedPeopleDict []string
}

InteractionInput holds input for interaction computation.
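A hedged sketch of flattening a people matrix into InteractionData-like records, assuming matrix[author][modifier] counts lines written by `author` and later changed by `modifier` (a simplification of the real layout):

```go
package main

import "fmt"

// interaction mirrors the InteractionData fields used here.
type interaction struct {
	author, modifier int
	lines            int64
	self             bool
}

// interactionsFrom flattens a people matrix into interaction records,
// skipping empty cells and flagging self-modification on the diagonal.
func interactionsFrom(matrix [][]int64) []interaction {
	var out []interaction
	for a, row := range matrix {
		for m, lines := range row {
			if lines == 0 {
				continue
			}
			out = append(out, interaction{a, m, lines, a == m})
		}
	}
	return out
}

func main() {
	matrix := [][]int64{
		{40, 10}, // author 0: 40 self-modified lines, 10 changed by dev 1
		{5, 25},  // author 1: 5 lines changed by dev 0, 25 self-modified
	}
	for _, it := range interactionsFrom(matrix) {
		fmt.Printf("dev %d -> dev %d: %d lines (self=%v)\n",
			it.author, it.modifier, it.lines, it.self)
	}
}
```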

type InteractionMetric

type InteractionMetric struct {
	metrics.MetricMeta
}

InteractionMetric computes developer interaction statistics.

func NewInteractionMetric

func NewInteractionMetric() *InteractionMetric

NewInteractionMetric creates the interaction metric.

func (*InteractionMetric) Compute

Compute calculates developer interaction data.

type PathID

type PathID uint32

PathID is a stable numeric id for an interned path. Used to index slice-backed state instead of map[string] so iteration is over a slice of active IDs, not map iteration.

type PathInterner

type PathInterner struct {
	// contains filtered or unexported fields
}

PathInterner maps path strings to stable PathIDs. Thread-safe. IDs are assigned sequentially (0, 1, 2, ...) so slice-backed state can use PathID as index.

func NewPathInterner

func NewPathInterner() *PathInterner

NewPathInterner creates an empty PathInterner.

func (*PathInterner) Intern

func (pi *PathInterner) Intern(path string) PathID

Intern returns the PathID for path, creating a new ID if path has not been seen. Safe for concurrent use.

func (*PathInterner) Len

func (pi *PathInterner) Len() int

Len returns the number of interned paths (next Intern will return PathID(Len())).

func (*PathInterner) Lookup

func (pi *PathInterner) Lookup(id PathID) string

Lookup returns the path string for id. Panics if id >= Len().
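The documented contract (sequential IDs, thread safety, panic on out-of-range Lookup) can be satisfied with a small map-plus-slice structure; this is a sketch of one plausible implementation, not the package's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// PathID and PathInterner, sketched from the documented API.
type PathID uint32

type PathInterner struct {
	mu    sync.Mutex
	ids   map[string]PathID
	paths []string
}

func NewPathInterner() *PathInterner {
	return &PathInterner{ids: map[string]PathID{}}
}

// Intern returns the PathID for path, assigning the next sequential ID
// on first sight. Safe for concurrent use.
func (pi *PathInterner) Intern(path string) PathID {
	pi.mu.Lock()
	defer pi.mu.Unlock()
	if id, ok := pi.ids[path]; ok {
		return id
	}
	id := PathID(len(pi.paths))
	pi.ids[path] = id
	pi.paths = append(pi.paths, path)
	return id
}

// Len returns the number of interned paths.
func (pi *PathInterner) Len() int {
	pi.mu.Lock()
	defer pi.mu.Unlock()
	return len(pi.paths)
}

// Lookup panics if id >= Len(), matching the documented contract.
func (pi *PathInterner) Lookup(id PathID) string {
	pi.mu.Lock()
	defer pi.mu.Unlock()
	return pi.paths[id]
}

func main() {
	pi := NewPathInterner()
	a := pi.Intern("cmd/main.go")
	b := pi.Intern("pkg/util.go")
	fmt.Println(a, b, pi.Intern("cmd/main.go"), pi.Lookup(b), pi.Len())
}
```

Because IDs are dense, per-path state can live in a plain slice indexed by PathID, which is the point of the type.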

type ReportData

type ReportData struct {
	GlobalHistory      DenseHistory
	FileHistories      map[string]DenseHistory
	FileOwnership      map[string]map[int]int
	PeopleHistories    []DenseHistory
	PeopleMatrix       DenseHistory
	ReversedPeopleDict []string
	TickSize           time.Duration
	Sampling           int
	Granularity        int
	ProjectName        string
	EndTime            time.Time
}

ReportData is the parsed input data for burndown metrics computation.

func ParseReportData

func ParseReportData(report analyze.Report) (*ReportData, error)

ParseReportData extracts ReportData from an analyzer report.

type Shard

type Shard struct {
	// contains filtered or unexported fields
}

Shard holds per-file burndown data within a partition. Uses PathID-indexed slices and activeIDs so iteration is over a slice (touched list), not map iteration (Track B).

type SurvivalData

type SurvivalData struct {
	SampleIndex   int     `json:"sample_index"   yaml:"sample_index"`
	TotalLines    int64   `json:"total_lines"    yaml:"total_lines"`
	SurvivalRate  float64 `json:"survival_rate"  yaml:"survival_rate"`
	BandBreakdown []int64 `json:"band_breakdown" yaml:"band_breakdown"`
}

SurvivalData contains code survival statistics for a time period.
