Prediction Logic
Prediction logic in SmartKNN defines how retrieved neighbors are transformed into final outputs.
It operates strictly after distance computation and neighbor retrieval, and does not modify feature weights, distance behavior, or backend configuration.
Design Principles
SmartKNN prediction logic is guided by the following principles:
- Distance should matter more than raw neighbor count
- Closer neighbors should contribute more strongly
- Duplicate or near-duplicate points should not destabilize outputs
- Aggregation should remain stable under noise and class imbalance
Feature weights influence distance computation only.
Prediction aggregation uses distance-derived neighbor weights.
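This separation can be sketched in a few lines (the function and variable names here are illustrative, not SmartKNN's actual API): feature weights appear only inside the distance computation, while a neighbor's aggregation weight is derived from the resulting distance.

```python
import numpy as np

def weighted_distance(x, x_i, feature_weights):
    """Feature weights scale each dimension inside the distance only."""
    diff = x - x_i
    return np.sqrt(np.sum(feature_weights * diff ** 2))

def neighbor_weight(dist, eps=1e-8):
    """Aggregation weight is derived from the distance, not the features."""
    return 1.0 / (dist + eps)
```

A zero feature weight removes that feature from the metric entirely, but has no direct effect on how neighbor contributions are aggregated afterwards.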
Regression Prediction
For regression tasks, SmartKNN performs distance-weighted local regression.
Each neighbor contributes to the prediction in proportion to its proximity to the query point, as determined by the learned distance metric.
Aggregation Strategy
- Feature weights are applied inside the distance computation
- Neighbor influence is weighted by inverse distance
- A distance floor is applied to prevent division by zero
- Contributions are normalized for numerical stability
The prediction is computed as:
$$
\hat{y}(x) = \frac{\sum_{i=1}^{K} \alpha_i \, y_i}{\sum_{i=1}^{K} \alpha_i}
\quad \text{where} \quad
\alpha_i = \frac{1}{d(x, x_i) + \epsilon}
$$
This corresponds to a kernel-style local weighted regression estimator.
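A minimal NumPy sketch of this estimator, assuming distances to the K neighbors have already been computed under the learned metric (names are illustrative, not SmartKNN internals):

```python
import numpy as np

def weighted_regression_predict(distances, y_neighbors, eps=1e-8):
    """Distance-weighted average of neighbor targets.

    distances   : shape (K,), distances to the K retrieved neighbors
    y_neighbors : shape (K,), their target values
    eps         : distance floor preventing division by zero
    """
    alpha = 1.0 / (np.asarray(distances, dtype=float) + eps)
    alpha /= alpha.sum()              # normalize for numerical stability
    return float(np.dot(alpha, y_neighbors))
```

With equidistant neighbors this reduces to a plain mean; as one neighbor's distance approaches zero, its floored weight dominates and the prediction approaches that neighbor's target.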
Key Properties
- Stable under duplicate points: distance flooring prevents singularities and dominance by identical samples.
- Robust to noise: distant or weakly related neighbors contribute less.
- Locally adaptive behavior: predictions adapt to neighborhood structure rather than enforcing a global model.
Classification Prediction
For classification tasks, SmartKNN uses distance-weighted class voting rather than simple majority voting.
Voting Strategy
- Each neighbor contributes a vote weighted by inverse distance
- Votes are accumulated per class
- The class with the highest total weighted score is selected
This avoids brittle majority voting and improves minority-class recall.
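The voting strategy can be sketched as follows (illustrative code, not SmartKNN's actual implementation): each class accumulates the inverse-distance weights of its neighbors, and the highest-scoring class wins.

```python
import numpy as np

def weighted_vote_predict(distances, labels, eps=1e-8):
    """Distance-weighted class voting: each neighbor votes with weight 1/(d + eps)."""
    weights = 1.0 / (np.asarray(distances, dtype=float) + eps)
    scores = {}
    for w, label in zip(weights, labels):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)   # class with highest total weighted score
```

Note how a single very close minority-class neighbor can outvote several distant majority-class neighbors, which is exactly the behavior that improves minority-class recall over plain majority voting.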
Numerical Stability Guarantees
Prediction logic enforces:
- Distance flooring to prevent division by zero
- Sanitized distance inputs
- Bounded, deterministic aggregation
These guarantees ensure consistent behavior across datasets and execution modes.
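One hypothetical way to enforce the first two guarantees, assuming raw distances may contain NaN or infinite values (this is a sketch, not SmartKNN's actual sanitization routine):

```python
import numpy as np

def sanitize_distances(distances, floor=1e-8):
    """Floor distances and replace non-finite values.

    - NaN/inf entries become a large finite distance, so corrupted
      inputs cannot dominate or poison the aggregation
    - all distances are floored so 1/(d + eps) stays bounded
    """
    d = np.asarray(distances, dtype=float)
    big = np.finfo(float).max ** 0.5          # large but finite sentinel
    d = np.where(np.isfinite(d), d, big)
    return np.maximum(d, floor)
```

Because sanitization is deterministic and element-wise, applying it before aggregation preserves the bounded, reproducible behavior described above.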
Consistency Across Backends
Prediction logic is backend-agnostic.
Backend choice affects which neighbors are retrieved, not how predictions are computed.
Determinism and Reproducibility
Given identical input, configuration, and retrieved neighbors, SmartKNN guarantees:
- Deterministic predictions
- Reproducible outputs
- Stable interpretability reports
No hidden state or adaptation occurs during inference.
Summary
SmartKNN prediction logic uses distance-weighted aggregation over a learned metric space.
By cleanly separating:
- feature-weighted distance computation
- distance-weighted prediction aggregation
SmartKNN achieves robust, interpretable, and production-safe predictions.