Versioning and Release History

SmartKNN follows a clear, incremental versioning strategy focused on correctness, performance, and production readiness.

Major releases introduce architectural or behavioral changes.
Minor releases focus on stability, safety, and incremental improvements.

This page documents notable releases and their design intent.


Release 0.2.0 — SmartKNN v2

SmartKNN v2 marks a major architectural milestone.

This release introduces a fully optimized, scalable system supporting both classification and regression with ultra-low latency inference.


Major Changes

  • Full classification support restored
  • Introduction of a scalable ANN backend for fast neighbor search
  • Brute-force backend retained and optimized for small datasets
  • Designed to scale to millions of rows with predictable latency

New Features

Backend Architecture

  • Approximate Nearest Neighbor (ANN) backend
  • Optimized for large datasets
  • Optional GPU support for neighbor search
  • Tunable parameters:
    • nlist — number of coarse clusters
    • nprobe — number of clusters searched per query
  • Safe default values provided

  • Automatic backend selection (see the sketch after this list):
    • Brute-force backend for small datasets
    • ANN backend for medium and large datasets
    • Automatic fallback to brute-force if ANN quality checks fail
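
The selection rule itself is internal to SmartKNN; the Python sketch below is only meant to illustrate the idea, using hypothetical names and thresholds rather than SmartKNN's actual API or defaults. nlist and nprobe are the usual coarse-clustering knobs of an IVF-style ANN index.

    SMALL_DATASET_ROWS = 20_000               # hypothetical cutoff, not SmartKNN's real default

    def choose_backend(n_rows, ann_quality_check=None):
        """Pick 'brute' for small data, otherwise 'ann'; fall back to brute force
        if an optional ANN quality check (e.g. recall@k on a sample) fails."""
        if n_rows <= SMALL_DATASET_ROWS:
            return "brute"
        if ann_quality_check is not None and not ann_quality_check():
            return "brute"                    # automatic fallback when ANN quality is poor
        return "ann"

    # nlist / nprobe trade accuracy against speed in an IVF-style index:
    #   nlist:  number of coarse clusters built at index time
    #   nprobe: number of those clusters scanned per query (higher = more exact, slower)
    ann_params = {"nlist": 256, "nprobe": 8}  # illustrative values only

Raising nprobe buys back accuracy at the cost of latency, which is why the quality-check fallback matters: on data where the coarse clustering fits poorly, brute force can simply be the better choice.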

Core Capabilities

  • Full support for classification and regression
  • Distance-weighted voting for classification
  • Distance-weighted local regression for regression targets (both sketched after this list)
  • Feature masking via weight thresholding
  • Robust internal handling of NaN / Inf values
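
As a rough illustration of how distance weighting works for both task types, here is a minimal NumPy sketch. It uses plain inverse-distance weights over the k retrieved neighbors and is not necessarily SmartKNN's exact weighting scheme.

    import numpy as np

    def weighted_predict(neigh_dists, neigh_targets, task, eps=1e-12):
        """neigh_dists: (k,) distances to the neighbors; neigh_targets: their labels or values."""
        w = 1.0 / (neigh_dists + eps)                 # closer neighbors get larger weights
        if task == "regression":
            # distance-weighted local average of the neighbors' target values
            return float(np.sum(w * neigh_targets) / np.sum(w))
        # classification: accumulate weight per label and return the heaviest label
        labels, inv = np.unique(neigh_targets, return_inverse=True)
        return labels[np.argmax(np.bincount(inv, weights=w))]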

Evaluation and Reporting

  • Unified automatic evaluation engine
  • Automatic task-type inference from target values (see the sketch after this list)
  • Built-in metrics:
    • Regression: MSE, RMSE, MAE, R²
    • Classification: Accuracy, Precision, Recall, F1, Confusion Matrix
  • Safe handling of NaN / Inf during evaluation
  • Supports non-numeric classification labels
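
SmartKNN's exact inference rule is internal, but heuristics of this kind usually look roughly like the sketch below (illustrative only; max_classes is a made-up threshold, not a SmartKNN parameter).

    import numpy as np

    def infer_task(y, max_classes=20):
        """Guess 'classification' vs 'regression' from the target values."""
        y = np.asarray(y)
        if not np.issubdtype(y.dtype, np.number):
            return "classification"                   # string / object labels
        finite = y[np.isfinite(y)]                    # ignore NaN / Inf when inspecting targets
        few_ints = np.all(finite == np.round(finite)) and np.unique(finite).size <= max_classes
        if finite.size and few_ints:
            return "classification"                   # a handful of integer-like values
        return "regression"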

Performance Improvements

  • Fully vectorized NumPy execution
  • Numba acceleration added for:
    • Distance computation (see the sketch after this list)
    • Core inner loops
  • Substantially lower inference latency than v1
  • Faster configuration and training than v1
  • Stress-tested on large and demanding datasets
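
The accelerated kernels are internal to SmartKNN; the snippet below is only a sketch of what a Numba-jitted, feature-weighted distance kernel typically looks like, with the weight-thresholding (feature masking) idea from the capabilities list shown in the usage lines.

    import numpy as np
    from numba import njit, prange

    @njit(parallel=True, fastmath=True)
    def weighted_sq_dists(X, q, w):
        """Squared weighted Euclidean distance from query q to every row of X."""
        n, d = X.shape
        out = np.empty(n)
        for i in prange(n):
            acc = 0.0
            for j in range(d):
                diff = X[i, j] - q[j]
                acc += w[j] * diff * diff             # per-feature weight scales each dimension
            out[i] = acc
        return out

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1_000, 8))
    q = rng.normal(size=8)
    w = rng.uniform(size=8)
    w = np.where(w < 0.1, 0.0, w)                     # masked features contribute nothing
    d2 = weighted_sq_dists(X, q, w)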

Benchmarks

  • Extensive benchmarking across:
    • Classification tasks
    • Regression tasks
    • Large-scale datasets
  • Demonstrates:
    • Significant speedups over v1
    • Competitive CPU-only latency compared to tree-based models

Known Limitations

  • ANN quality depends on dataset characteristics and parameter tuning
  • GPU support is limited to neighbor search
  • Probability calibration is not yet available

Release 0.1.1 — SmartKNN v1.1

This release focused on stability and correctness.


Changes

  • Classification temporarily disabled due to correctness concerns
  • Introduction of explicit feature weight control parameters:
    • alpha
    • beta
    • gamma

Notes

  • Feature weight learning remained accuracy-focused but computationally expensive
  • This release prioritized correctness over functionality
  • Served as a stabilization phase before the v2 redesign

Release 0.1.0 — SmartKNN v1 (Initial Release)

SmartKNN is born.

This was the first public release introducing the core ideas behind SmartKNN.


Core Features

  • Automatic feature weight learning
  • Automatic preprocessing
  • Automatic detection of:
    • Classification
    • Regression
  • Internal handling of NaN / Inf values
  • Feature masking via weight thresholding
  • Pure Python + NumPy implementation

Design Characteristics

  • Accuracy-first approach
  • Not fully vectorized
  • No ANN or GPU support
  • Limited scalability for large datasets

Known Issues

  • Feature weight learning was slow and did not scale well
  • Classification outputs returned numeric values instead of labels
  • Focused primarily on correctness rather than performance

Versioning Philosophy

SmartKNN versioning follows these principles:

  • Correctness before performance
  • Explicit documentation of breaking changes
  • Stable inference behavior within a major version
  • Incremental evolution with clear intent

Users are encouraged to review release notes before upgrading between major versions.


Summary

SmartKNN has evolved from a correctness-focused prototype into a production-ready, high-performance nearest-neighbor system.

Each release reflects deliberate engineering decisions rather than incremental hacks, with transparency around trade-offs and limitations.