Prediction Logic

Prediction logic in SmartKNN defines how retrieved neighbors are transformed into final outputs.

It operates strictly after distance computation and neighbor retrieval, and does not modify feature weights, distance behavior, or backend configuration.


Design Principles

SmartKNN prediction logic is guided by the following principles:

  • Distance should matter more than raw neighbor count
  • Closer neighbors should contribute more strongly
  • Duplicate or near-duplicate points should not destabilize outputs
  • Aggregation should remain stable under noise and class imbalance

Feature weights influence distance computation only.
Prediction aggregation uses distance-derived neighbor weights.
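This separation can be illustrated with a minimal sketch (not the SmartKNN API; the weighted-Euclidean form of the metric is an assumption for illustration):

```python
import numpy as np

def weighted_distance(x, xi, feature_weights):
    # Feature weights act inside the metric only (hypothetical
    # weighted-Euclidean form for illustration).
    diff = x - xi
    return float(np.sqrt(np.sum(feature_weights * diff ** 2)))

def neighbor_weight(distance, eps=1e-8):
    # Neighbor weights are derived from the distance alone, after
    # retrieval; eps is the distance floor used during aggregation.
    return 1.0 / (distance + eps)
```

Feature weights never appear in `neighbor_weight`: once distances are computed, aggregation depends only on them.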


Regression Prediction

For regression tasks, SmartKNN performs distance-weighted local regression.

Each neighbor contributes to the prediction in proportion to its proximity to the query point, as determined by the learned distance metric.

Aggregation Strategy

  • Feature weights are applied inside the distance computation
  • Neighbor influence is weighted by inverse distance
  • A distance floor is applied to prevent division by zero
  • Contributions are normalized for numerical stability

The prediction is computed as:

$$
\hat{y}(x) = \frac{\sum_{i=1}^{K} \alpha_i \, y_i}{\sum_{i=1}^{K} \alpha_i}
\qquad \text{where} \qquad
\alpha_i = \frac{1}{d(x, x_i) + \epsilon}
$$

This corresponds to a kernel-style local weighted regression estimator.
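The estimator above can be sketched in a few lines (an illustrative implementation, not the SmartKNN internals; `eps` stands in for the distance floor \(\epsilon\)):

```python
import numpy as np

def predict_regression(distances, targets, eps=1e-8):
    """Distance-weighted local regression over K retrieved neighbors.

    distances: shape (K,), distances d(x, x_i) >= 0
    targets:   shape (K,), neighbor targets y_i
    eps:       distance floor preventing division by zero
    """
    alpha = 1.0 / (np.asarray(distances, dtype=float) + eps)
    alpha /= alpha.sum()  # normalize contributions for stability
    return float(alpha @ np.asarray(targets, dtype=float))
```

With equal distances this reduces to the plain neighbor mean; as one neighbor approaches the query, its target dominates the prediction.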

Key Properties

  • Stable under duplicate points
    Distance flooring prevents singularities and dominance by identical samples.

  • Robust to noise
    Distant or weakly related neighbors contribute less.

  • Locally adaptive behavior
    Predictions adapt to neighborhood structure rather than enforcing a global model.


Classification Prediction

For classification tasks, SmartKNN uses distance-weighted class voting rather than simple majority voting.

Voting Strategy

  • Each neighbor contributes a vote weighted by inverse distance
  • Votes are accumulated per class
  • The class with the highest total weighted score is selected

This avoids the brittleness of simple majority voting: a close minority-class neighbor can outweigh several distant majority-class neighbors, which improves minority-class recall.
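The voting strategy above can be sketched as follows (an illustrative implementation, not the SmartKNN API):

```python
from collections import defaultdict

def predict_class(distances, labels, eps=1e-8):
    # Accumulate inverse-distance-weighted votes per class.
    scores = defaultdict(float)
    for d, label in zip(distances, labels):
        scores[label] += 1.0 / (d + eps)
    # The class with the highest total weighted score is selected.
    return max(scores, key=scores.get)
```

Note how one very close neighbor (weight 1/0.1 = 10) can outvote two distant ones (weight 1/5 = 0.2 each), which simple majority voting cannot do.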


Numerical Stability Guarantees

Prediction logic enforces:

  • Distance flooring to prevent division by zero
  • Sanitized distance inputs
  • Bounded, deterministic aggregation

These guarantees ensure consistent behavior across datasets and execution modes.
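A minimal sketch of what such sanitization and flooring can look like (a hypothetical helper, assuming NaN and negative distances are mapped to safe values before 1/d weighting):

```python
import numpy as np

def sanitize_distances(distances, eps=1e-8):
    # Map NaN/inf to +inf (zero weight after inversion), clamp any
    # negative values to zero, then apply the eps floor so that
    # downstream 1/d weighting is always finite and bounded.
    d = np.asarray(distances, dtype=float)
    d = np.nan_to_num(d, nan=np.inf, posinf=np.inf, neginf=np.inf)
    d = np.clip(d, 0.0, None)
    return d + eps
```

After this step, every inverse-distance weight is finite, deterministic, and non-negative regardless of the raw distance input.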


Consistency Across Backends

Prediction logic is backend-agnostic.

Backend choice affects which neighbors are retrieved, not how predictions are computed.


Determinism and Reproducibility

Given identical input, configuration, and retrieved neighbors, SmartKNN guarantees:

  • Deterministic predictions
  • Reproducible outputs
  • Stable interpretability reports

No hidden state or adaptation occurs during inference.


Summary

SmartKNN prediction logic uses distance-weighted aggregation over a learned metric space.

By cleanly separating:

  • feature-weighted distance computation
  • distance-weighted prediction aggregation

SmartKNN achieves robust, interpretable, and production-safe predictions.