PhD project in Computer Science, Data and AI
Under the supervision of Leo Liberti, Antonio Frangioni and Claudia D'Ambrosio.
Thesis in preparation at the Institut polytechnique de Paris, jointly supervised with the University of Pisa, within the École doctorale de l'Institut polytechnique de Paris, in partnership with LIX - Laboratoire d'informatique, since 01-10-2017.
Algorithmic configuration by learning and optimization
THE AUTOMATIC PARAMETER TUNING PROBLEM ON OPTIMIZATION SOLVERS

INTRODUCTION TO THE PROBLEM

Given the considerable advances in the fields of ML and AI over the last 20 years, it has become possible to reach ever-growing predictive power in learning the complex relations underlying sets of data. However, for many applications, discovering such relations is only the first step of the learning process, inasmuch as they depend upon two distinct sets of elements: "features" and "controls". Features are unalterable attributes of the problem at hand, whereas controls can be freely selected (they represent actionable choices, possibly subject to constraints) in order to obtain a more desirable outcome; they can therefore influence how a problem is solved. In particular, even if the set of controls is small and easily identifiable, the optimal values of the controls for a given set of features must still be selected.

APPLICATION: MATHEMATICAL OPTIMIZATION SOLVERS

We chose to focus on the parameter tuning of solvers for mathematical optimization problems. Numerically solving hard optimization problems with such solvers typically means handling highly complex algorithms. The algorithmic parameters of a solver influence how those algorithms work together, and hence the quality of the solution obtained. Solvers' default parameter settings are tested by their vendors to yield good average performance on large sets of instances. Yet, when an instance belongs to a specific class of problems, the default algorithmic configuration can prove suboptimal. When this happens, solver users are compelled to undertake an empirical quest for the best parameter values. However, manually tuning the parameters ("controls") for a given instance or class of instances ("features") is almost always a hard, inaccurate and time-consuming process, even for experts.
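The empirical quest described above can be sketched as a brute-force sweep over a small parameter grid. The sketch below is purely illustrative: the solver, its two parameters ("cuts", "presolve") and the synthetic runtime model are hypothetical stand-ins, not the actual API of CPLEX or any real solver.

```python
import itertools
import random

# Hypothetical stand-in for timing a real solver on one instance under
# one parameter configuration; the "runtime" is a synthetic function of
# the configuration, chosen so this instance class favours aggressive
# cuts and active presolve (illustrative only).
def solve_time(instance_seed, config):
    rng = random.Random(instance_seed)
    base = 1.0 + rng.random()  # instance-dependent baseline runtime
    penalty = 0.5 * abs(config["cuts"] - 2) + 0.3 * (1 - config["presolve"])
    return base + penalty

# Small grid over two illustrative parameters.
grid = [{"cuts": c, "presolve": p}
        for c, p in itertools.product([0, 1, 2], [0, 1])]

instances = range(10)  # a toy "class" of instances

def avg_time(config):
    return sum(solve_time(i, config) for i in instances) / len(instances)

best = min(grid, key=avg_time)
default = {"cuts": 0, "presolve": 1}
print("best config:", best)
print("best avg time:", avg_time(best), "default avg time:", avg_time(default))
```

Even this toy sweep is exponential in the number of parameters, which is why the manual process does not scale and motivates the learning-based approach below.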
We want to investigate the feasibility of combining ML and Mathematical Optimization to achieve a double goal: automate and improve the process of learning a performance model of a complex system with many parameters, as a function of the numerical and structural properties of the instance being solved; and recommend a good algorithmic configuration of the solver. We try to accomplish this in two steps:

- a Learning Phase (or "Learning Process") (LeP), where Machine Learning is employed to learn a performance model of the solver that can evaluate and predict the performance of a pair (instance features, parameter configuration);
- a Recommendation Phase (or "Recommendation Problem") (ReP), where the performance map delivered by the previous stage is used as an objective function to be minimized over the linearly constrained space of all possible controls; ReP thus consists of an optimization model that relies on the performance function produced by LeP to drive the search for a good solver configuration.

Keeping LeP and ReP in balance proves to be quite a delicate and challenging task: once a map has been learnt by means of ML, one is still left to handle a hard, nonconvex, possibly high-dimensional optimization problem (ReP).

MSC DISSERTATION

My dissertation provided a first working example of the system as envisioned, limited to a very narrow class of solvers (IBM ILOG CPLEX) and optimization problems (Hydro Unit Commitment): experiments focused on SVR for LeP; this yielded a Mixed-Integer NonConvex optimization problem for ReP.
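The two phases above can be sketched end to end on synthetic data. As a stand-in for the SVR of the dissertation, the sketch fits a quadratic least-squares surrogate (LeP) and then minimizes it over a box-constrained configuration range by fine grid search (ReP); the data, the single configuration parameter and the true optimum at 2.5 are all invented for illustration.

```python
import numpy as np

# --- Learning Phase (LeP): fit a surrogate performance model ----------
# Synthetic training data: one configuration parameter x and an observed
# solver "runtime" y; a quadratic least-squares fit stands in for the
# SVR used in the dissertation (purely illustrative).
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 4.0, size=50)
y_train = (x_train - 2.5) ** 2 + 1.0 + rng.normal(0.0, 0.05, size=50)
coeffs = np.polyfit(x_train, y_train, deg=2)
surrogate = np.poly1d(coeffs)

# --- Recommendation Phase (ReP): minimize the surrogate ---------------
# The learnt map becomes the objective, minimized here over a simple
# box-constrained configuration range by a fine grid; the real ReP is a
# hard, nonconvex, possibly high-dimensional optimization problem.
candidates = np.linspace(0.0, 4.0, 401)
recommended = float(candidates[np.argmin(surrogate(candidates))])
print("recommended configuration:", recommended)  # near the true optimum 2.5
```

The sketch makes the balance problem visible: a richer LeP model (e.g. a kernel SVR) predicts better but turns ReP into a much harder optimization problem than minimizing a single quadratic.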
OBJECTIVES OF THE PHD

- apply and combine existing ML best practices and optimization algorithms in order to show that a carefully balanced LeP-ReP combination can outperform the default, "general-purpose" algorithmic settings provided by current tools and, potentially, even manually tuned settings provided by experts;
- review various existing ML methods to try to find, combine or develop the one(s) showing the best compromise between (automatic) learning precision and the cost of the resulting optimization problem;
- generalise our approach and extend its applicability beyond HUC, so as to provide a methodological and algorithmic reference for the many possible applications of this framework;
- construct a platform and an organized set of libraries for conducting tests;
- investigate ways to automate the identification of the right features to be used in the ML for a given class of instances.
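The last objective concerns extracting instance features automatically. A minimal sketch, assuming a MILP instance given by its constraint matrix A and an integrality mask: the descriptors below (size, density, integer fraction, coefficient magnitude spread) are common illustrative choices, not the feature set of the thesis; which descriptors are actually predictive is precisely what this objective proposes to automate.

```python
import numpy as np

# Hypothetical feature extractor: simple numerical and structural
# descriptors of a MILP instance (illustrative choices only).
def instance_features(A, is_integer):
    m, n = A.shape
    nonzeros = np.abs(A[A != 0])
    return {
        "n_constraints": m,
        "n_variables": n,
        "density": np.count_nonzero(A) / (m * n),
        "integer_fraction": float(np.mean(is_integer)),
        # Orders of magnitude spanned by the nonzero coefficients.
        "coef_magnitude_range": float(np.log10(nonzeros.max() / nonzeros.min())),
    }

# Tiny example: 2 constraints, 3 variables, 2 of them integer.
A = np.array([[1.0, 0.0, 3.0],
              [0.0, 2.0, 0.0]])
feats = instance_features(A, np.array([True, False, True]))
print(feats)
```

In the envisioned pipeline, such a vector would form the "features" half of the (instance features, parameter configuration) pairs fed to LeP.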