burndown

package v0.0.0-...-d1eea97

Published: Feb 18, 2026 License: Apache-2.0 Imports: 30 Imported by: 0

README

Burndown Analysis

Preface

Understanding the evolution of a codebase is crucial for maintaining its health. Knowing how old the code is, who wrote it, and how actively it's being modified can reveal hidden risks and opportunities for improvement.

Problem

As projects grow, "knowledge silos" form. Developers leave, and their code remains, often becoming "legacy" that no one dares to touch. It's difficult to answer questions like:

  • "How much of the code is actively maintained?"
  • "Who are the main contributors to this module?"
  • "Is the project accumulating technical debt in the form of untouched, aging code?"

How the analyzer solves it

The Burndown analyzer calculates "line burndown" statistics. It tracks every line of code through the history of the project, recording when it was introduced and by whom. It aggregates this into a matrix in which one dimension is the sampling time and the other is the age band the lines were written in. This makes it possible to visualize how code "survives" over time.
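To make the matrix concrete, here is a minimal sketch of reading such a matrix through the package's `DenseHistory` alias. The values and the `survivalRate` helper are illustrative, not part of the package API:

```go
package main

import "fmt"

// DenseHistory mirrors the package's alias: rows are sampling times,
// columns are age bands keyed by when the lines were written.
type DenseHistory = [][]int64

// survivalRate returns the fraction of a band's peak line count that is
// still alive at the last sample. (Illustrative helper, not in the API.)
func survivalRate(h DenseHistory, band int) float64 {
	var peak int64
	for _, row := range h {
		if row[band] > peak {
			peak = row[band]
		}
	}
	if peak == 0 {
		return 0
	}
	return float64(h[len(h)-1][band]) / float64(peak)
}

func main() {
	history := DenseHistory{
		{100, 0, 0},  // sample 0: 100 lines, all in the oldest band
		{80, 50, 0},  // sample 1: 20 old lines deleted, 50 new lines
		{60, 40, 30}, // sample 2: further churn, 30 more new lines
	}
	fmt.Printf("band 0 survival: %.0f%%\n", survivalRate(history, 0)*100)
}
```

Reading down a column shows one band of code decaying; reading across a row shows the age composition of the codebase at one point in time.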

Historical context

Code burndown charts are a well-established visualization in software engineering analytics. They provide a high-level view of code churn and stability, often used to assess project maturity and developer retention.

Real world examples

  • Bus Factor Estimation: By looking at code ownership (which developer "owns" which lines), you can identify modules that depend on a single person.
  • Refactoring Planning: Identify "stable" (old) parts of the code vs. "volatile" (frequently changed) parts. Volatile legacy code is often a good target for refactoring.
  • Onboarding: Help new developers identify who to ask about specific parts of the codebase.

How the analyzer works here

The analyzer iterates through the Git commit history.

  1. Tree Diff: It compares the file tree of the current commit with its parents to find modified files.
  2. File Diff: For modified files, it computes the line-level differences (diffs).
  3. Ownership Tracking: It maintains a data structure (using RBTree for efficiency) that maps every line in every file to its original author and creation time.
  4. Sparse Matrix: It aggregates this data into sparse matrices to save memory, representing the "burndown" state at sampled intervals.
  5. Hibernation: To handle large repositories, it supports "hibernating" file structures to disk to keep memory usage low.
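Steps 2 and 3 can be sketched with a deliberately simplified model: each file is a slice of per-line records instead of the package's RBTree, and a diff hunk is applied as a splice. All names here are hypothetical:

```go
package main

import "fmt"

// line records who wrote a line and at which tick it was introduced.
// A simplified stand-in for the package's RBTree-backed structure.
type line struct {
	author int
	tick   int
}

// applyEdit replaces lines [start, start+deleted) with `inserted` new
// lines owned by author at the given tick, mimicking one diff hunk.
func applyEdit(file []line, start, deleted, inserted, author, tick int) []line {
	newLines := make([]line, inserted)
	for i := range newLines {
		newLines[i] = line{author: author, tick: tick}
	}
	out := append([]line{}, file[:start]...)
	out = append(out, newLines...)
	return append(out, file[start+deleted:]...)
}

// bandCounts aggregates surviving lines by the tick they were written in,
// which is exactly one row of the burndown matrix.
func bandCounts(file []line) map[int]int {
	counts := map[int]int{}
	for _, l := range file {
		counts[l.tick]++
	}
	return counts
}

func main() {
	// Tick 0: author 0 creates a 5-line file.
	file := applyEdit(nil, 0, 0, 5, 0, 0)
	// Tick 3: author 1 rewrites lines 1-2 and adds one more line.
	file = applyEdit(file, 1, 2, 3, 1, 3)
	fmt.Println(bandCounts(file)) // surviving lines per tick band
}
```

The real analyzer does the same bookkeeping, but with an RBTree over line intervals so that a hunk costs O(log n) instead of a slice copy.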

Limitations

  • Memory Usage: Tracking every line of code in a massive repository can be memory-intensive, although the hibernation feature mitigates this.
  • Binary Files: The analysis is line-based, so it is only meaningful for text files.

Further plans

  • Improved visualization of the generated matrices.
  • Deeper integration with team structure (mapping authors to teams).

Documentation

Overview

Package burndown implements line burndown analysis: it tracks when each line of code was introduced, by whom, and how long it survives across the commit history.

Index

Constants

const (
	// ConfigBurndownGranularity is the configuration key for the burndown band granularity.
	ConfigBurndownGranularity = "Burndown.Granularity"
	// ConfigBurndownSampling is the configuration key for the burndown sampling rate.
	ConfigBurndownSampling = "Burndown.Sampling"
	// ConfigBurndownTrackFiles is the configuration key for enabling per-file burndown tracking.
	ConfigBurndownTrackFiles = "Burndown.TrackFiles"
	// ConfigBurndownTrackPeople is the configuration key for enabling per-developer burndown tracking.
	ConfigBurndownTrackPeople = "Burndown.TrackPeople"
	// ConfigBurndownHibernationThreshold defines the hibernation memory threshold.
	ConfigBurndownHibernationThreshold = "Burndown.HibernationThreshold"
	// ConfigBurndownHibernationToDisk defines the hibernation to disk configuration constant.
	ConfigBurndownHibernationToDisk = "Burndown.HibernationOnDisk"
	// ConfigBurndownHibernationDirectory defines the hibernation directory configuration constant.
	ConfigBurndownHibernationDirectory = "Burndown.HibernationDirectory"
	// ConfigBurndownDebug defines the debug mode configuration constant.
	ConfigBurndownDebug = "Burndown.Debug"
	// ConfigBurndownGoroutines defines the goroutines configuration constant.
	ConfigBurndownGoroutines = "Burndown.Goroutines"
	// DefaultBurndownGranularity defines the default granularity in days.
	DefaultBurndownGranularity = 30
	// DefaultBurndownSampling defines the default sampling in ticks.
	// Matches Hercules: sampling equals granularity (30) for comparable output.
	DefaultBurndownSampling = 30
	// DefaultBurndownHibernationThreshold defines the default node count threshold for hibernation.
	DefaultBurndownHibernationThreshold = 1000
)
const (
	// TickSizeThresholdHigh is the maximum tick size in hours for burndown granularity.
	TickSizeThresholdHigh = 24
)

Configuration constants for burndown analysis.
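Assuming Configure receives a plain facts map keyed by these constants, a configuration might look like the following sketch (values are the documented defaults; the hibernation directory is hypothetical):

```go
package main

import "fmt"

func main() {
	// A facts map as it might be passed to HistoryAnalyzer.Configure.
	// Keys mirror the Config* constants above.
	facts := map[string]any{
		"Burndown.Granularity":          30,   // band width in ticks (days)
		"Burndown.Sampling":             30,   // sampling interval in ticks
		"Burndown.TrackFiles":           true, // per-file histories
		"Burndown.TrackPeople":          true, // per-developer histories
		"Burndown.HibernationThreshold": 1000, // node count before hibernation
		"Burndown.HibernationOnDisk":    true,
		"Burndown.HibernationDirectory": "/tmp/burndown-hibernate", // hypothetical
	}
	fmt.Println(len(facts), "facts configured")
}
```

Smaller granularity yields finer age bands at the cost of larger matrices; the defaults keep sampling equal to granularity for Hercules-comparable output.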

Variables

var ErrInvalidReport = errors.New("invalid burndown report: expected DenseHistory")

ErrInvalidReport indicates that a report does not contain the expected DenseHistory data.

Functions

func RegisterPlotSections

func RegisterPlotSections()

RegisterPlotSections registers the burndown plot section renderer with the analyze package.

Types

type AggregateData

type AggregateData struct {
	TotalCurrentLines   int64   `json:"total_current_lines"   yaml:"total_current_lines"`
	TotalPeakLines      int64   `json:"total_peak_lines"      yaml:"total_peak_lines"`
	OverallSurvivalRate float64 `json:"overall_survival_rate" yaml:"overall_survival_rate"`
	AnalysisPeriodDays  int     `json:"analysis_period_days"  yaml:"analysis_period_days"`
	NumBands            int     `json:"num_bands"             yaml:"num_bands"`
	NumSamples          int     `json:"num_samples"           yaml:"num_samples"`
	TrackedFiles        int     `json:"tracked_files"         yaml:"tracked_files"`
	TrackedDevelopers   int     `json:"tracked_developers"    yaml:"tracked_developers"`
}

AggregateData contains summary statistics.
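The relationship between the fields is presumably current lines over the historical peak; a one-function sketch of that assumption (not taken from the package source):

```go
package main

import "fmt"

// overallSurvival shows the presumed derivation of OverallSurvivalRate:
// surviving lines divided by the historical peak line count.
func overallSurvival(current, peak int64) float64 {
	if peak == 0 {
		return 0
	}
	return float64(current) / float64(peak)
}

func main() {
	// Hypothetical project: peaked at 6000 lines, 4200 still alive.
	fmt.Printf("%.2f\n", overallSurvival(4200, 6000)) // 0.70
}
```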

type AggregateMetric

type AggregateMetric struct {
	metrics.MetricMeta
}

AggregateMetric computes summary statistics.

func NewAggregateMetric

func NewAggregateMetric() *AggregateMetric

NewAggregateMetric creates the aggregate metric.

func (*AggregateMetric) Compute

func (m *AggregateMetric) Compute(input *ReportData) AggregateData

Compute calculates aggregate statistics.

type ComputedMetrics

type ComputedMetrics struct {
	Aggregate         AggregateData           `json:"aggregate"          yaml:"aggregate"`
	GlobalSurvival    []SurvivalData          `json:"global_survival"    yaml:"global_survival"`
	FileSurvival      []FileSurvivalData      `json:"file_survival"      yaml:"file_survival"`
	DeveloperSurvival []DeveloperSurvivalData `json:"developer_survival" yaml:"developer_survival"`
	Interaction       []InteractionData       `json:"interactions"       yaml:"interactions"`
}

ComputedMetrics holds all computed metric results for the burndown analyzer.

func ComputeAllMetrics

func ComputeAllMetrics(report analyze.Report) (*ComputedMetrics, error)

ComputeAllMetrics runs all burndown metrics and returns the results.

func (*ComputedMetrics) AnalyzerName

func (m *ComputedMetrics) AnalyzerName() string

AnalyzerName returns the analyzer identifier.

func (*ComputedMetrics) ToJSON

func (m *ComputedMetrics) ToJSON() any

ToJSON returns the metrics in JSON-serializable format.

func (*ComputedMetrics) ToYAML

func (m *ComputedMetrics) ToYAML() any

ToYAML returns the metrics in YAML-serializable format.

type DenseHistory

type DenseHistory = [][]int64

DenseHistory is a two-dimensional matrix of line counts over time intervals.

type DeveloperSurvivalData

type DeveloperSurvivalData struct {
	ID           int     `json:"id"            yaml:"id"`
	Name         string  `json:"name"          yaml:"name"`
	CurrentLines int64   `json:"current_lines" yaml:"current_lines"`
	PeakLines    int64   `json:"peak_lines"    yaml:"peak_lines"`
	SurvivalRate float64 `json:"survival_rate" yaml:"survival_rate"`
}

DeveloperSurvivalData contains survival data for a developer's code.

type DeveloperSurvivalInput

type DeveloperSurvivalInput struct {
	PeopleHistories    []DenseHistory
	ReversedPeopleDict []string
}

DeveloperSurvivalInput holds input for developer survival computation.

type DeveloperSurvivalMetric

type DeveloperSurvivalMetric struct {
	metrics.MetricMeta
}

DeveloperSurvivalMetric computes per-developer code survival.

func NewDeveloperSurvivalMetric

func NewDeveloperSurvivalMetric() *DeveloperSurvivalMetric

NewDeveloperSurvivalMetric creates the developer survival metric.

func (*DeveloperSurvivalMetric) Compute

Compute calculates developer survival statistics.

type FileSurvivalData

type FileSurvivalData struct {
	Path         string      `json:"path"                 yaml:"path"`
	CurrentLines int64       `json:"current_lines"        yaml:"current_lines"`
	Ownership    map[int]int `json:"ownership"            yaml:"ownership"`
	TopOwnerID   int         `json:"top_owner_id"         yaml:"top_owner_id"`
	TopOwnerName string      `json:"top_owner_name"       yaml:"top_owner_name"`
	TopOwnerPct  float64     `json:"top_owner_percentage" yaml:"top_owner_percentage"`
}

FileSurvivalData contains survival data for a single file.
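A sketch of how TopOwnerID and TopOwnerPct presumably relate to the Ownership map (developer ID to surviving line count); the helper and values are hypothetical:

```go
package main

import "fmt"

// topOwner picks the developer with the most surviving lines in a file
// and their share of the total, mirroring TopOwnerID / TopOwnerPct.
func topOwner(ownership map[int]int) (id int, pct float64) {
	total, best, bestID := 0, -1, -1
	for devID, lines := range ownership {
		total += lines
		if lines > best {
			best, bestID = lines, devID
		}
	}
	if total <= 0 {
		return -1, 0
	}
	return bestID, float64(best) / float64(total) * 100
}

func main() {
	// Hypothetical ownership: developer ID -> surviving lines.
	id, pct := topOwner(map[int]int{0: 120, 1: 60, 2: 20})
	fmt.Printf("top owner %d owns %.0f%% of the file\n", id, pct)
}
```

A file whose top owner holds a very high percentage is a bus-factor candidate, per the Real world examples above.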

type FileSurvivalInput

type FileSurvivalInput struct {
	FileHistories      map[string]DenseHistory
	FileOwnership      map[string]map[int]int
	ReversedPeopleDict []string
}

FileSurvivalInput holds input for file survival computation.

type FileSurvivalMetric

type FileSurvivalMetric struct {
	metrics.MetricMeta
}

FileSurvivalMetric computes per-file survival statistics.

func NewFileSurvivalMetric

func NewFileSurvivalMetric() *FileSurvivalMetric

NewFileSurvivalMetric creates the file survival metric.

func (*FileSurvivalMetric) Compute

Compute calculates file survival statistics.

type GlobalSurvivalMetric

type GlobalSurvivalMetric struct {
	metrics.MetricMeta
}

GlobalSurvivalMetric computes code survival time series.

func NewGlobalSurvivalMetric

func NewGlobalSurvivalMetric() *GlobalSurvivalMetric

NewGlobalSurvivalMetric creates the global survival metric.

func (*GlobalSurvivalMetric) Compute

func (m *GlobalSurvivalMetric) Compute(input *ReportData) []SurvivalData

Compute calculates global survival time series.

type HistoryAnalyzer

type HistoryAnalyzer struct {
	BlobCache *plumbing.BlobCacheAnalyzer

	Ticks                *plumbing.TicksSinceStart
	Identity             *plumbing.IdentityDetector
	FileDiff             *plumbing.FileDiffAnalyzer
	TreeDiff             *plumbing.TreeDiffAnalyzer
	HibernationDirectory string

	HibernationThreshold int
	Granularity          int
	PeopleNumber         int
	TickSize             time.Duration
	Goroutines           int

	Sampling          int
	GlobalMu          sync.Mutex
	Debug             bool
	TrackFiles        bool
	HibernationToDisk bool
	// contains filtered or unexported fields
}

HistoryAnalyzer tracks line survival rates across commit history.

func (*HistoryAnalyzer) ApplySnapshot

func (b *HistoryAnalyzer) ApplySnapshot(snap analyze.PlumbingSnapshot)

ApplySnapshot restores plumbing state from a snapshot.

func (*HistoryAnalyzer) Boot

func (b *HistoryAnalyzer) Boot() error

Boot performs early initialization before repository processing. Ensures per-shard tracking maps are ready for the next chunk.

func (*HistoryAnalyzer) CPUHeavy

func (b *HistoryAnalyzer) CPUHeavy() bool

CPUHeavy returns false because burndown tracks line ownership without UAST processing.

func (*HistoryAnalyzer) CheckpointSize

func (b *HistoryAnalyzer) CheckpointSize() int64

CheckpointSize returns an estimated size of the checkpoint in bytes.

func (*HistoryAnalyzer) Configure

func (b *HistoryAnalyzer) Configure(facts map[string]any) error

Configure sets up the analyzer with the provided facts.

func (*HistoryAnalyzer) Consume

func (b *HistoryAnalyzer) Consume(_ context.Context, ac *analyze.Context) error

Consume processes a single commit with the provided dependency results.

func (*HistoryAnalyzer) ConsumePrepared

func (b *HistoryAnalyzer) ConsumePrepared(prepared *analyze.PreparedCommit) error

ConsumePrepared processes a pre-prepared commit. This is used by the pipelined runner for parallel commit preparation.

func (*HistoryAnalyzer) Description

func (b *HistoryAnalyzer) Description() string

Description returns a human-readable description of the analyzer.

func (*HistoryAnalyzer) Descriptor

func (b *HistoryAnalyzer) Descriptor() analyze.Descriptor

Descriptor returns stable analyzer metadata.

func (*HistoryAnalyzer) Finalize

func (b *HistoryAnalyzer) Finalize() (analyze.Report, error)

Finalize completes the analysis and returns the result.

func (*HistoryAnalyzer) Flag

func (b *HistoryAnalyzer) Flag() string

Flag returns the CLI flag for the analyzer.

func (*HistoryAnalyzer) Fork

Fork creates a copy of the analyzer for parallel processing.

func (*HistoryAnalyzer) FormatReport

func (b *HistoryAnalyzer) FormatReport(report analyze.Report, writer io.Writer) error

FormatReport writes the formatted analysis report to the given writer.

func (*HistoryAnalyzer) GenerateChart

func (b *HistoryAnalyzer) GenerateChart(report analyze.Report) (components.Charter, error)

GenerateChart implements PlotGenerator interface.

func (*HistoryAnalyzer) GenerateSections

func (b *HistoryAnalyzer) GenerateSections(report analyze.Report) ([]plotpage.Section, error)

GenerateSections returns the sections for combined reports.

func (*HistoryAnalyzer) Hibernate

func (b *HistoryAnalyzer) Hibernate() error

Hibernate releases resources between processing phases. Clears per-shard tracking maps (mergedByID, deletionsByID) that are only needed within a chunk. Also compacts file timelines to reduce memory usage.

func (*HistoryAnalyzer) Initialize

func (b *HistoryAnalyzer) Initialize(repository *gitlib.Repository) error

Initialize prepares the analyzer for processing commits.

func (*HistoryAnalyzer) ListConfigurationOptions

func (b *HistoryAnalyzer) ListConfigurationOptions() []pipeline.ConfigurationOption

ListConfigurationOptions returns the configuration options for the analyzer.

func (*HistoryAnalyzer) LoadCheckpoint

func (b *HistoryAnalyzer) LoadCheckpoint(dir string) error

LoadCheckpoint restores the analyzer state from the given directory.

func (*HistoryAnalyzer) Merge

func (b *HistoryAnalyzer) Merge(branches []analyze.HistoryAnalyzer)

Merge combines results from forked analyzer branches.

func (*HistoryAnalyzer) Name

func (b *HistoryAnalyzer) Name() string

Name returns the name of the analyzer.

func (*HistoryAnalyzer) ReleaseSnapshot

func (b *HistoryAnalyzer) ReleaseSnapshot(_ analyze.PlumbingSnapshot)

ReleaseSnapshot is a no-op for burndown (no UAST resources).

func (*HistoryAnalyzer) SaveCheckpoint

func (b *HistoryAnalyzer) SaveCheckpoint(dir string) error

SaveCheckpoint writes the analyzer state to the given directory.

func (*HistoryAnalyzer) SequentialOnly

func (b *HistoryAnalyzer) SequentialOnly() bool

SequentialOnly returns true because burndown tracks cumulative per-file line state across all commits and cannot be parallelized.

func (*HistoryAnalyzer) Serialize

func (b *HistoryAnalyzer) Serialize(result analyze.Report, format string, writer io.Writer) error

Serialize writes the analysis result to the given writer.

func (*HistoryAnalyzer) SnapshotPlumbing

func (b *HistoryAnalyzer) SnapshotPlumbing() analyze.PlumbingSnapshot

SnapshotPlumbing captures the current plumbing state.

func (*HistoryAnalyzer) StateGrowthPerCommit

func (b *HistoryAnalyzer) StateGrowthPerCommit() int64

StateGrowthPerCommit returns the estimated per-commit memory growth in bytes.

type InteractionData

type InteractionData struct {
	AuthorID      int    `json:"author_id"      yaml:"author_id"`
	AuthorName    string `json:"author_name"    yaml:"author_name"`
	ModifierID    int    `json:"modifier_id"    yaml:"modifier_id"`
	ModifierName  string `json:"modifier_name"  yaml:"modifier_name"`
	LinesModified int64  `json:"lines_modified" yaml:"lines_modified"`
	IsSelfModify  bool   `json:"is_self_modify" yaml:"is_self_modify"`
}

InteractionData contains developer interaction statistics.

type InteractionInput

type InteractionInput struct {
	PeopleMatrix       DenseHistory
	ReversedPeopleDict []string
}

InteractionInput holds input for interaction computation.
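A hedged sketch of flattening a people matrix into InteractionData-like records, assuming matrix[author][modifier] counts lines written by `author` and later changed by `modifier` (a simplification of the real layout):

```go
package main

import "fmt"

// interaction mirrors the InteractionData fields used here.
type interaction struct {
	author, modifier int
	lines            int64
	self             bool
}

// interactionsFrom flattens a people matrix into interaction records,
// skipping empty cells and flagging self-modification on the diagonal.
func interactionsFrom(matrix [][]int64) []interaction {
	var out []interaction
	for a, row := range matrix {
		for m, lines := range row {
			if lines == 0 {
				continue
			}
			out = append(out, interaction{a, m, lines, a == m})
		}
	}
	return out
}

func main() {
	matrix := [][]int64{
		{40, 10}, // author 0: 40 self-modified lines, 10 changed by dev 1
		{5, 25},  // author 1: 5 lines changed by dev 0, 25 self-modified
	}
	for _, it := range interactionsFrom(matrix) {
		fmt.Printf("dev %d -> dev %d: %d lines (self=%v)\n",
			it.author, it.modifier, it.lines, it.self)
	}
}
```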

type InteractionMetric

type InteractionMetric struct {
	metrics.MetricMeta
}

InteractionMetric computes developer interaction statistics.

func NewInteractionMetric

func NewInteractionMetric() *InteractionMetric

NewInteractionMetric creates the interaction metric.

func (*InteractionMetric) Compute

Compute calculates developer interaction data.

type PathID

type PathID uint32

PathID is a stable numeric id for an interned path. Used to index slice-backed state instead of map[string] so iteration is over a slice of active IDs, not map iteration.

type PathInterner

type PathInterner struct {
	// contains filtered or unexported fields
}

PathInterner maps path strings to stable PathIDs. Thread-safe. IDs are assigned sequentially (0, 1, 2, ...) so slice-backed state can use PathID as index.

func NewPathInterner

func NewPathInterner() *PathInterner

NewPathInterner creates an empty PathInterner.

func (*PathInterner) Intern

func (pi *PathInterner) Intern(path string) PathID

Intern returns the PathID for path, creating a new ID if path has not been seen. Safe for concurrent use.

func (*PathInterner) Len

func (pi *PathInterner) Len() int

Len returns the number of interned paths (next Intern will return PathID(Len())).

func (*PathInterner) Lookup

func (pi *PathInterner) Lookup(id PathID) string

Lookup returns the path string for id. Panics if id >= Len().
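The documented contract (sequential IDs, thread safety, panic on out-of-range Lookup) can be satisfied with a small map-plus-slice structure; this is a sketch of one plausible implementation, not the package's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// PathID and PathInterner, sketched from the documented API.
type PathID uint32

type PathInterner struct {
	mu    sync.Mutex
	ids   map[string]PathID
	paths []string
}

func NewPathInterner() *PathInterner {
	return &PathInterner{ids: map[string]PathID{}}
}

// Intern returns the PathID for path, assigning the next sequential ID
// on first sight. Safe for concurrent use.
func (pi *PathInterner) Intern(path string) PathID {
	pi.mu.Lock()
	defer pi.mu.Unlock()
	if id, ok := pi.ids[path]; ok {
		return id
	}
	id := PathID(len(pi.paths))
	pi.ids[path] = id
	pi.paths = append(pi.paths, path)
	return id
}

// Len returns the number of interned paths.
func (pi *PathInterner) Len() int {
	pi.mu.Lock()
	defer pi.mu.Unlock()
	return len(pi.paths)
}

// Lookup panics if id >= Len(), matching the documented contract.
func (pi *PathInterner) Lookup(id PathID) string {
	pi.mu.Lock()
	defer pi.mu.Unlock()
	return pi.paths[id]
}

func main() {
	pi := NewPathInterner()
	a := pi.Intern("cmd/main.go")
	b := pi.Intern("pkg/util.go")
	fmt.Println(a, b, pi.Intern("cmd/main.go"), pi.Lookup(b), pi.Len())
}
```

Because IDs are dense, per-path state can live in a plain slice indexed by PathID, which is the point of the type.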

type ReportData

type ReportData struct {
	GlobalHistory      DenseHistory
	FileHistories      map[string]DenseHistory
	FileOwnership      map[string]map[int]int
	PeopleHistories    []DenseHistory
	PeopleMatrix       DenseHistory
	ReversedPeopleDict []string
	TickSize           time.Duration
	Sampling           int
	Granularity        int
	ProjectName        string
	EndTime            time.Time
}

ReportData is the parsed input data for burndown metrics computation.

func ParseReportData

func ParseReportData(report analyze.Report) (*ReportData, error)

ParseReportData extracts ReportData from an analyzer report.

type Shard

type Shard struct {
	// contains filtered or unexported fields
}

Shard holds per-file burndown data within a partition. Uses PathID-indexed slices and activeIDs so iteration is over a slice (touched list), not map iteration (Track B).

type SurvivalData

type SurvivalData struct {
	SampleIndex   int     `json:"sample_index"   yaml:"sample_index"`
	TotalLines    int64   `json:"total_lines"    yaml:"total_lines"`
	SurvivalRate  float64 `json:"survival_rate"  yaml:"survival_rate"`
	BandBreakdown []int64 `json:"band_breakdown" yaml:"band_breakdown"`
}

SurvivalData contains code survival statistics for a time period.
