Evaluation and Reporting
SmartKNN includes a unified evaluation and reporting engine designed to support both regression and classification tasks in a consistent, production-friendly manner.
Rather than exposing fragmented metric utilities, SmartKNN treats evaluation as a first-class system component with explicit guarantees around correctness, safety, and downstream integration.
Unified Evaluation Model
SmartKNN automatically adapts its evaluation behavior based on the task type inferred during configuration.
This unified model ensures that:
- The correct set of metrics is applied
- Outputs are consistent across tasks
- Evaluation logic remains centralized and auditable
Users do not need to manually switch evaluation modes or metric definitions.
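The sketch below illustrates what such task-aware dispatch could look like. The function name, arguments, and result keys are illustrative assumptions rather than SmartKNN's public API, and scikit-learn is used only as a reference implementation of the individual metrics.

```python
# Minimal sketch of task-aware metric dispatch; names and result keys are
# illustrative assumptions, not SmartKNN's actual API.
from sklearn import metrics

def evaluate(y_true, y_pred, task):
    """Apply the metric set matching the task type inferred at configuration."""
    if task == "regression":
        mse = metrics.mean_squared_error(y_true, y_pred)
        return {
            "mse": mse,
            "rmse": mse ** 0.5,
            "mae": metrics.mean_absolute_error(y_true, y_pred),
            "r2": metrics.r2_score(y_true, y_pred),
        }
    if task == "classification":
        return {
            "accuracy": metrics.accuracy_score(y_true, y_pred),
            "precision_macro": metrics.precision_score(y_true, y_pred, average="macro", zero_division=0),
            "recall_macro": metrics.recall_score(y_true, y_pred, average="macro", zero_division=0),
            "f1_macro": metrics.f1_score(y_true, y_pred, average="macro", zero_division=0),
            "confusion_matrix": metrics.confusion_matrix(y_true, y_pred).tolist(),
        }
    raise ValueError(f"unsupported task type: {task!r}")
```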
Regression Metrics
For regression tasks, SmartKNN reports the following metrics:
- Mean Squared Error (MSE): measures average squared prediction error.
- Root Mean Squared Error (RMSE): provides error magnitude on the same scale as the target variable.
- Mean Absolute Error (MAE): measures average absolute deviation between predictions and targets.
- R² Score: indicates the proportion of variance explained by the model.
These metrics together provide a balanced view of error magnitude, variance, and overall fit.
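For reference, the plain-NumPy sketch below computes each of these quantities directly from their definitions; it is intended to clarify what is being reported, not to show SmartKNN's internal implementation, and the function name is illustrative.

```python
# Reference definitions of the regression metrics in plain NumPy; a sketch
# for clarity, not SmartKNN's internal implementation.
import numpy as np

def regression_metrics(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)                          # average squared error
    rmse = np.sqrt(mse)                              # same scale as the target
    mae = np.mean(np.abs(err))                       # average absolute deviation
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                       # proportion of variance explained
    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}
```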
Classification Metrics
For classification tasks, SmartKNN reports:
- Accuracy: overall fraction of correct predictions.
- Precision (Macro-Averaged): measures correctness of positive predictions across classes.
- Recall (Macro-Averaged): measures coverage of true class instances across classes.
- F1 Score (Macro-Averaged): balances precision and recall, especially under class imbalance.
- Confusion Matrix: provides a detailed view of prediction outcomes across classes.
Macro-averaging is used by default to ensure fair treatment of all classes, including minority classes.
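The toy snippet below illustrates what macro-averaging means here: per-class scores are computed first and then averaged with equal weight, so a minority class contributes as much as a majority class. The labels are placeholder data and scikit-learn is used purely as a reference implementation.

```python
# Toy illustration of macro-averaging: per-class precision is computed first,
# then averaged with equal weight, so minority classes are not drowned out.
# Placeholder data; scikit-learn used only as a reference implementation.
from sklearn.metrics import confusion_matrix, precision_score

y_true = [0, 0, 0, 0, 1, 1, 2]   # class 2 is a minority class
y_pred = [0, 0, 0, 1, 1, 1, 0]

per_class = precision_score(y_true, y_pred, average=None, zero_division=0)
macro = precision_score(y_true, y_pred, average="macro", zero_division=0)

print(per_class)                          # one precision value per class
print(macro)                              # unweighted mean of the per-class values
print(confusion_matrix(y_true, y_pred))   # rows: true class, columns: predicted class
```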
Input Validation and Safety
All evaluation routines enforce strict validation rules:
- Input arrays are sanitized before evaluation
- Length mismatches are detected and rejected
- Invalid or missing values are handled explicitly
These checks prevent silent errors and ensure that reported metrics accurately reflect model behavior.
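A minimal sketch of these kinds of checks for numeric targets is shown below; the function name and error messages are illustrative assumptions, not SmartKNN's actual validation code.

```python
# Minimal sketch of input validation for numeric targets; names and messages
# are illustrative, not SmartKNN's actual validation code.
import numpy as np

def validate_inputs(y_true, y_pred):
    """Sanitize arrays, reject length mismatches, and surface invalid values."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    if y_true.shape[0] != y_pred.shape[0]:
        raise ValueError(
            f"length mismatch: {y_true.shape[0]} targets vs {y_pred.shape[0]} predictions"
        )
    if y_true.size == 0:
        raise ValueError("empty inputs cannot be evaluated")
    if not (np.isfinite(y_true).all() and np.isfinite(y_pred).all()):
        raise ValueError("inputs contain NaN or infinite values")
    return y_true, y_pred
```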
Structured Reporting Outputs
Evaluation results are returned as structured outputs rather than unstructured logs.
This design enables:
- Easy integration with logging systems
- Compatibility with CI/CD pipelines
- Programmatic inspection and comparison of results
- Consistent experiment tracking
Metrics are designed to be machine-readable as well as human-interpretable.
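One possible shape for such a structured result is sketched below, serialized to JSON for logging or CI/CD consumption. The dataclass, field names, and metric values are placeholders, not SmartKNN's actual report schema.

```python
# One possible shape for a structured, machine-readable report; the dataclass,
# field names, and metric values are placeholders, not SmartKNN's schema.
import json
from dataclasses import asdict, dataclass

@dataclass
class EvaluationReport:
    task: str
    metrics: dict
    n_samples: int

report = EvaluationReport(
    task="regression",
    metrics={"mse": 1.25, "rmse": 1.118, "mae": 0.9, "r2": 0.87},  # placeholder values
    n_samples=200,
)

print(json.dumps(asdict(report), indent=2))  # machine-readable, human-inspectable
```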
Role in Model Development
The evaluation and reporting engine supports multiple stages of the SmartKNN lifecycle:
- Model validation during development
- Performance regression checks during iteration
- Benchmarking against alternative systems
- Production monitoring and analysis
By standardizing evaluation behavior, SmartKNN reduces ambiguity and improves confidence in reported results.
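For example, a performance regression check in a CI pipeline might compare current metrics against a stored baseline, as in the hypothetical sketch below; the baseline values, tolerance, and function name are assumptions rather than part of SmartKNN.

```python
# Hypothetical CI-style performance regression check; baseline values,
# tolerance, and names are assumptions, not part of SmartKNN.
BASELINE = {"rmse": 1.20, "r2": 0.85}

def check_no_regression(current, baseline=BASELINE, tol=0.05):
    """Fail loudly if key metrics degrade beyond the allowed relative tolerance."""
    if current["rmse"] > baseline["rmse"] * (1 + tol):
        raise AssertionError(f"RMSE regressed: {current['rmse']:.3f} vs baseline {baseline['rmse']:.3f}")
    if current["r2"] < baseline["r2"] * (1 - tol):
        raise AssertionError(f"R² regressed: {current['r2']:.3f} vs baseline {baseline['r2']:.3f}")

check_no_regression({"rmse": 1.18, "r2": 0.86})  # passes: within tolerance
```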
Design Intent
The evaluation system in SmartKNN is designed to be:
- Task-aware — adapts automatically to regression or classification
- Safe — validates inputs and prevents silent failures
- Consistent — produces comparable results across runs
- Production-ready — suitable for automated pipelines and monitoring
Evaluation is treated not as an afterthought, but as a core part of the system’s reliability and usability.