Abstract: Many real-world problems require predicting the arrival time of an object: a container, a flight, a ride, and so on. These problems can become complex because the object often travels through a multi-hop journey. How performance is measured depends on the use case being targeted. Although the underlying machine learning problem is usually solved with regression models, adopting these solutions and building stakeholder trust requires presenting model performance in a way that is tailored to the use case. This goes beyond reporting the mean or median absolute error and also involves the platform or engine through which predictions are generated and updated. The questions of interest are typically the following: How frequently are predictions updated during the journey? How many hours, days, or weeks ahead can the model make its first correct prediction? How reliable are the predictions: do they change frequently for the same journey, and is that change an improvement? How do the predictions compare with a predefined benchmark?
These questions in turn change with the end use case. Some use cases require capturing performance at every intermediate hop rather than only at the final destination. Others prefer predictions that are conservative by design, where predicting an arrival later than the actual one, within some threshold, is acceptable (for example, customer service).
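As a rough illustration of how such questions can be turned into per-journey numbers, here is a minimal Python sketch. It is not the method presented in the talk: the function name `journey_kpis`, the 6-hour tolerance that defines a "correct" prediction, and the timestamps are all assumptions made up for this example.

```python
from datetime import datetime, timedelta


def journey_kpis(predictions, actual_arrival, tolerance=timedelta(hours=6)):
    """Summarise the ETA predictions made for one journey.

    predictions: list of (made_at, predicted_arrival) datetime pairs.
    tolerance: assumed definition of a "correct" prediction (hypothetical).
    """
    preds = sorted(predictions)  # order by the time each prediction was made
    errors = [abs(eta - actual_arrival) for _, eta in preds]

    # Stability: how often the predicted ETA changed between consecutive updates.
    n_changes = sum(1 for (_, a), (_, b) in zip(preds, preds[1:]) if a != b)

    # Lead time: how far ahead of arrival the first prediction within tolerance was made.
    first_correct = next(
        (made_at for (made_at, _), err in zip(preds, errors) if err <= tolerance),
        None,
    )
    lead = actual_arrival - first_correct if first_correct is not None else None

    # Classic accuracy metric, in hours, over all updates of this journey.
    mae_hours = sum(e.total_seconds() for e in errors) / len(errors) / 3600

    return {"n_changes": n_changes, "first_correct_lead": lead, "mae_hours": mae_hours}


# Illustrative (invented) journey with four prediction updates.
actual = datetime(2024, 1, 20, 12)
updates = [
    (datetime(2024, 1, 1), datetime(2024, 1, 18, 12)),
    (datetime(2024, 1, 8), datetime(2024, 1, 20, 8)),
    (datetime(2024, 1, 15), datetime(2024, 1, 20, 8)),
    (datetime(2024, 1, 19), datetime(2024, 1, 20, 13)),
]
kpis = journey_kpis(updates, actual)
print(kpis)
```

Different use cases would likely swap in their own definitions, for example an asymmetric tolerance for conservative predictions or the same computation repeated per intermediate hop.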
Apart from performance, interpretability becomes important when the model changes its prediction partway through an object's journey. The most anticipated and important question is “What are the reasons the prediction changed from x to y in the middle of the journey?” Failing to provide reasons, or answers to the questions above, can easily lead to a loss of trust in the system. In this talk, you will learn about different KPIs and visualisation techniques that capture the behaviour of ETA models. We will also walk through why it is important to have a model that is interpretable by design in order to answer some of these questions.
Bio: Ghassen, currently a data scientist at Portcast.io (real-time predictive visibility and demand forecasting to optimize the supply chain), has close to four years of experience and extensive expertise in leading full-spectrum descriptive and predictive analyses to support high-level decision-making.