List of publications

Computing confidence intervals for cost-effectiveness ratios for renewal reward processes
In maintenance and many other OR application areas, the objective is often to choose the optimal policy within a class of policies under the so-called average cost criterion. Under certain assumptions, such a comparison requires computing the cost-effectiveness ratio of a renewal reward process (also known as the long-run rate of cost). For some stylized models this can be done analytically, but in general one relies on simulation. When simulating, one wants to compute both the ratio and a corresponding confidence interval accurately. Estimating the cost-effectiveness ratio itself is straightforward and merely relies on estimating the two quantities in the ratio (the expected cost/reward per cycle and the expected cycle length). The construction of a corresponding confidence interval, however, is a delicate matter and is often misunderstood. In this paper, we propose two approaches for constructing confidence regions for cost-effectiveness ratios: 1) a large-sample confidence interval and 2) Fieller's confidence region. For these two approaches and their respective estimators, we study structural properties such as asymptotic bias and consistency, and we provide sufficient conditions for them to hold. For illustration, we compare the performance of both approaches in a small simulation experiment in which we optimize a renewal reward process arising from a stylized maintenance model for which the cost-effectiveness ratio can be computed explicitly.
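The large-sample interval mentioned above can be sketched with the delta method applied to the ratio of sample means. The following is a minimal illustration (not the paper's code), with invented exponential cycle data in which the true long-run rate of cost is 3.0 by construction:

```python
import numpy as np

def ratio_ci(costs, lengths, z=1.96):
    """Point estimate and large-sample (delta-method) CI for E[cost] / E[length].

    costs[i] and lengths[i] are the total cost and the length of renewal cycle i.
    """
    costs = np.asarray(costs, dtype=float)
    lengths = np.asarray(lengths, dtype=float)
    n = len(costs)
    r_hat = costs.mean() / lengths.mean()           # ratio of sample means
    # Delta method: Var(r_hat) ~= Var(C - r*T) / (n * E[T]^2)
    s2 = np.var(costs - r_hat * lengths, ddof=1)
    se = np.sqrt(s2 / n) / lengths.mean()
    return r_hat, (r_hat - z * se, r_hat + z * se)

# Illustrative data: exponential cycle lengths, cost proportional to length
# plus noise, so the true cost-effectiveness ratio is 3.0.
rng = np.random.default_rng(1)
T = rng.exponential(scale=2.0, size=10_000)
C = 3.0 * T + rng.normal(0.0, 0.5, size=10_000)
r_hat, (lo, hi) = ratio_ci(C, T)   # r_hat close to 3.0
```

Note the interval is centered at the ratio of sample means rather than at the mean of per-cycle ratios; the latter is a biased estimator of the long-run rate of cost. Fieller's region, the paper's second approach, instead inverts a test on the numerator and denominator jointly and is not shown here.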

Policies for the Dynamic Traveling Maintainer Problem with Alerts
Companies require modern industrial assets equipped with monitoring devices to be maintained right before failure to ensure maximum availability at minimum cost. To this purpose, two challenges need to be overcome: failure times of assets are unknown, and assets can be part of a larger asset network. The first challenge is nowadays overcome by the availability of monitoring devices which emit alerts, typically triggered by the first signs of asset degradation. Regarding the second challenge, we study a maintenance planning problem in the presence of alerts, stochastic failures and multiple asset locations served by a single maintainer, named the Dynamic Traveling Maintainer Problem with Alerts (DTMPA). Firstly, we develop a modeling framework for the DTMPA, aimed at minimizing discounted maintenance costs accrued over an infinite time horizon under different information levels from the alerts. Secondly, we propose three solution approaches to the DTMPA: (i) greedy heuristics that rank assets on proximity, urgency and economic risk; (ii) a novel Traveling Maintainer Heuristic optimizing near-future costs; (iii) a Deep Reinforcement Learning (DRL) method optimizing long-term costs, each leveraging different levels of information from the alerts. Our experiments with small asset networks show that all methods can approximate optimal policies with access to complete condition information. For larger networks, for which computing the optimal policy is intractable, the proposed methods yield competitive policies, with DRL consistently achieving the lowest costs.
DTMPA AI environment page
Link to ArXiv 
Scalable Policies for the Dynamic Traveling Maintainer Problem with Alerts
Downtime of industrial assets such as wind turbines and medical imaging devices is costly. To avoid such downtime costs, companies seek to initiate maintenance just before failure, which is challenging because: (i) asset failures are notoriously difficult to predict, even in the presence of real-time monitoring devices which signal degradation; and (ii) limited resources are available to serve a network of geographically dispersed assets. In this work, we study the dynamic traveling multi-maintainer problem with alerts (K-DTMPA) under perfect condition information, with the objective to devise scalable solution approaches to maintain large networks with K maintenance engineers. Since such large-scale K-DTMPA instances are computationally intractable, we propose an iterative deep reinforcement learning (DRL) algorithm optimizing long-term discounted maintenance costs. The efficiency of the DRL approach is vastly improved by a reformulation of the action space (which relies on the Markov structure of the underlying problem) and by choosing a smart, suitable initial solution. The initial solution is created by extending existing heuristics with a dispatching mechanism. These extensions further serve as quality benchmarks for tailored instances. We demonstrate through extensive numerical experiments that DRL can solve single-maintainer instances to optimality, regardless of the chosen initial solution. Experiments with hospital networks containing up to 35 assets show that the proposed DRL algorithm is scalable. Lastly, the trained policies are shown to be robust against network modifications, such as removing an asset or an engineer, or to yield a suitable initial solution after such modifications.
K-DTMPA AI environment page