A retiring expert used to price every used machine by feel. This project replaces that judgment with a model that gives a price, says how sure it is, and shows why, learned from 412,698 historical auctions.
SHM buys and sells used heavy machinery. Pricing lived in one expert's head, and he is leaving. Predicting a price is only half the job. The model also has to earn the trust he had: explain each valuation, admit uncertainty, and flag a mispriced lot. Because SHM both buys and sells, a calibrated, explainable price beats a slightly sharper point estimate.
The model prices machines sold tomorrow, so it is trained on the past and tested on the most recent sales. A random shuffle would let it peek at neighbouring-week prices and look far better than it is. We ran that shuffle too, to measure exactly how much it would have flattered us.
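The split described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the field name `saledate`, the toy records, and the cutoff date are all assumptions made for the example.

```python
from datetime import date

def temporal_split(rows, cutoff):
    """Train on sales strictly before the cutoff, test on sales on or after it."""
    train = [r for r in rows if r["saledate"] < cutoff]
    test = [r for r in rows if r["saledate"] >= cutoff]
    return train, test

# Toy records; field names and values are hypothetical.
sales = [
    {"saledate": date(2010, 3, 1), "price": 21000},
    {"saledate": date(2011, 6, 15), "price": 34000},
    {"saledate": date(2011, 11, 2), "price": 18500},
    {"saledate": date(2012, 2, 20), "price": 27000},
]

train, test = temporal_split(sales, cutoff=date(2011, 10, 1))
print(len(train), "training sales,", len(test), "test sales")
```

Every test sale is later than every training sale, so the model never sees prices from the period it is scored on. A random shuffle gives no such guarantee.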
The neural net and Random Forest finish in a statistical tie at the top, with CatBoost just behind. We deploy Random Forest: it is interpretable, pairs cleanly with exact explanations, and calibrates without bias. LightGBM and XGBoost sit a little back on untuned defaults (with early stopping). More trees and a different encoding were tried and neither closed the gap; a hyperparameter search is what they would need.
| Model | RMSLE | MAE | MAPE | R² |
|---|---|---|---|---|
| Neural net (embeddings), best | 0.298 | $7,694 | 24.2% | 0.760 |
| Random Forest, production | 0.302 | $7,567 | 22.0% | 0.762 |
| CatBoost | 0.308 | $7,889 | 21.6% | 0.743 |
| Ridge | 0.317 | $7,660 | 23.9% | 0.748 |
| XGBoost | 0.334 | $8,270 | 23.2% | 0.722 |
| LightGBM | 0.335 | $8,482 | 23.4% | 0.713 |
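For readers who want the table's error columns made concrete, here is a minimal sketch using the conventional definitions. Whether the project computes RMSLE with `log1p` or plain `log` is not stated, so the `log1p` form below is an assumption, and the prices are invented for illustration.

```python
import math

def rmsle(actual, predicted):
    """Root mean squared difference between log1p(predicted) and log1p(actual)."""
    sq = [(math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(sq) / len(sq))

def mae(actual, predicted):
    """Mean absolute error, in dollars."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)

# Toy prices, invented for illustration only.
actual = [10_000, 50_000, 120_000]
predicted = [12_000, 45_000, 120_000]
print(round(rmsle(actual, predicted), 3))
print(round(mae(actual, predicted), 2))
print(round(mape(actual, predicted), 1))
```

RMSLE scores ratio errors, so a $5,000 miss counts for more on a cheap machine than on an expensive one, while MAE treats both the same. That is why the table's ordering by RMSLE and by MAE need not agree: Random Forest has the lower MAE even though the neural net has the lower RMSLE.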
Age is the most important feature, followed by the machine's size class parsed out of its product description: the levers an appraiser pulls first. Every prediction breaks down per machine, so a buyer sees why a number came out the way it did.
Conformal prediction wraps each estimate in a calibrated band - the band is what turns a guess into a buy or sell decision. A flat band in log space is wider in dollars for expensive machines, which is the right shape.
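One common way to build such a band is split conformal prediction on absolute residuals in log space. The sketch below assumes that variant; the source does not say which conformal method the project uses, and the function names, the 90% coverage level, and the calibration numbers are hypothetical.

```python
import math

def conformal_halfwidth(log_actual, log_pred, alpha=0.1):
    """Half-width q in log space, taken from a held-out calibration set,
    so that [pred - q, pred + q] covers about 1 - alpha of new sales."""
    scores = sorted(abs(a - p) for a, p in zip(log_actual, log_pred))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if k > n:
        return math.inf  # too few calibration points for this coverage level
    return scores[k - 1]

def dollar_band(pred_price, q):
    """Turn a flat log-space half-width into a dollar interval."""
    return pred_price * math.exp(-q), pred_price * math.exp(q)

# Toy calibration set: 20 log-space residuals of 0.01 .. 0.20 (invented).
cal_actual = [i / 100 for i in range(1, 21)]
cal_pred = [0.0] * 20
q = conformal_halfwidth(cal_actual, cal_pred, alpha=0.1)

for price in (20_000, 200_000):
    lo, hi = dollar_band(price, q)
    print(price, round(lo), round(hi))
```

The same `q` multiplies every price by the same factor, so the band is a fixed percentage either side of the estimate and grows in dollars with the price of the machine.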
Measured only on data it never trained on, the error climbs from about 0.25 to 0.34 across 2009-2012 as conditions shift away from the training window. That is the signal to retrain, and the chart catches it while the error is still small.
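A drift check of this kind can be sketched by grouping out-of-sample errors by sale year. The sketch assumes the tracked error is RMSLE (the text does not name the metric); the retraining threshold, the function names, and the records are hypothetical.

```python
import math
from collections import defaultdict

def rmsle_by_year(records):
    """records: iterable of (sale_year, actual_price, predicted_price)."""
    buckets = defaultdict(list)
    for year, actual, predicted in records:
        buckets[year].append((math.log1p(predicted) - math.log1p(actual)) ** 2)
    return {y: math.sqrt(sum(v) / len(v)) for y, v in sorted(buckets.items())}

def years_over_threshold(errors_by_year, threshold):
    """Years whose error exceeds the (hypothetical) retraining threshold."""
    return [y for y, e in errors_by_year.items() if e > threshold]

# Toy out-of-sample predictions, invented for illustration only.
records = [
    (2009, 30_000, 31_000),
    (2009, 55_000, 52_000),
    (2012, 30_000, 45_000),
    (2012, 55_000, 38_000),
]
errors = rmsle_by_year(records)
print(errors)
print(years_over_threshold(errors, threshold=0.3))
```

Only sales the model never trained on should feed this check; otherwise the error curve stays flat and hides the shift it is meant to reveal.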
Enter a machine's specs and the model returns a price, its confidence band, the factors behind it, and the most similar past sales.