
This page contains my academic papers written in English.

Some other articles written in French are available here.


List of my publications: PDF, HTML, BibTeX.

My academic papers are also available on HAL - Inria (open archive).


Ph.D. Thesis in Computer Science  

Hybridization of dynamic optimization methodologies  

This thesis is dedicated to sequential decision making (also known as multistage optimization) in uncertain complex environments. The algorithms studied are mainly applied to power systems ("Unit Commitment" problems). Among my main contributions, I present a new algorithm named "Direct Value Search" (DVS), designed to solve large-scale unit commitment problems with few assumptions on the model, and to tackle some new challenges in the Power Systems community. Noisy optimization (a key component of the DVS algorithm) is studied and a new convergence bound is proved. Some variance reduction techniques aimed at improving the convergence rate of gray-box noisy optimization problems are also presented.


Presented and publicly defended on November 28, 2014 in Orsay, France.


Laboratory: Inria Saclay / LRI (Université Paris-Sud 11)
Advisor: Olivier Teytaud
Reviewers: Pierre-Olivier Malaterre and Liva Ralaivola

Reference:

Jérémie Decock. Hybridization of dynamic optimization methodologies. Theses, Université Paris-Sud, November 2014.

Open Archive: HAL
Download: Manuscript (PDF), Slides (PDF)

Refereed International Conference Publications  

Direct Model Predictive Control: A Theoretical and Numerical Analysis  

This paper focuses on online control policies applied to power systems management. In this study, the power system problem is formulated as a stochastic decision process with large constrained action space, high stochasticity and dozens of state variables. Direct Model Predictive Control has previously been proposed to encompass a large class of stochastic decision making problems. It is a hybrid model which merges the properties of two different dynamic optimization methods, Model Predictive Control and Stochastic Dual Dynamic Programming. In this paper, we prove that Direct Model Predictive Control reaches an optimal policy for a wider class of decision processes than those solved by Model Predictive Control (suboptimal by nature), Stochastic Dynamic Programming (which needs a moderate size of state space) or Stochastic Dual Dynamic Programming (which requires convexity of Bellman values and a moderate complexity of the random value state). The algorithm is tested on a multiple-battery management problem and two hydroelectric problems. Direct Model Predictive Control clearly outperforms Model Predictive Control on the tested problems.

To be presented at:

PSCC 2018 (Power Systems Computation Conference), Dublin (Ireland), June 2018.

Evolutionary Cutting Planes  

The cutting-plane method is a simple and efficient method for optimizing convex functions for which subgradients are available. This paper proposes several ways of parallelizing it, in particular using a typically evolutionary method, and compares them experimentally in well-conditioned and ill-conditioned settings.
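For readers unfamiliar with the baseline, the classical (sequential) cutting-plane method can be sketched as follows. This is a minimal 1-D version of Kelley's scheme, not the parallel algorithm of the paper; the function names and the toy objective are illustrative assumptions.

```python
# Minimal 1-D cutting-plane (Kelley's method) sketch.
# Each subgradient yields a linear lower bound ("cut") on the convex
# objective; the next iterate minimizes the piecewise-linear model
# formed by all cuts so far.

def cutting_plane(f, grad, lo, hi, iters=30):
    cuts = []  # each cut (a, b) encodes the lower bound f(x) >= a*x + b
    x = (lo + hi) / 2.0
    best = (f(x), x)
    for _ in range(iters):
        fx, g = f(x), grad(x)
        best = min(best, (fx, x))
        cuts.append((g, fx - g * x))  # linear lower bound tangent at x
        # Minimize max_i (a_i*x + b_i) on [lo, hi] by scanning the
        # endpoints and all pairwise cut intersections.
        candidates = [lo, hi]
        for i in range(len(cuts)):
            for j in range(i + 1, len(cuts)):
                a1, b1 = cuts[i]
                a2, b2 = cuts[j]
                if a1 != a2:
                    xc = (b2 - b1) / (a1 - a2)
                    if lo <= xc <= hi:
                        candidates.append(xc)
        model = lambda t: max(a * t + b for a, b in cuts)
        x = min(candidates, key=model)
    return best[1]

# Example: minimize f(x) = (x - 3)^2 on [0, 10].
xmin = cutting_plane(lambda x: (x - 3) ** 2,
                     lambda x: 2 * (x - 3), 0.0, 10.0)
```

The paper's contribution is to generate and evaluate several such cuts in parallel per iteration; the sequential loop above is only the starting point.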

Presented at:

Artificial Evolution (EA2015), Lyon (France), 2015.

Open Archive: HAL
Download: Article (PDF)

Variance Reduction in Population-Based Optimization: Application to Unit Commitment  

We consider noisy optimization and some traditional variance reduction techniques aimed at improving the convergence rate, namely (i) common random numbers (CRN), which is relevant for population-based noisy optimization and (ii) stratified sampling, which is relevant for most noisy optimization problems. We present artificial models of noise for which common random numbers are very efficient, and artificial models of noise for which common random numbers are detrimental. We then experiment on a desperately expensive unit commitment problem. As expected, stratified sampling is never detrimental. Nonetheless, in practice, common random numbers provided, by far, most of the improvement.
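The intuition behind common random numbers can be shown in a few lines: when two candidate solutions are evaluated on the *same* random scenarios, the noise largely cancels in their difference. This is a toy illustration (the objective and all names are assumptions, not the paper's unit commitment model).

```python
import random

# Toy noisy objective: f(x, w) = (x - w)^2, where w is a random scenario.
def f(x, w):
    return (x - w) ** 2

def estimate_difference(x1, x2, n, crn):
    # With CRN, both candidates see the same scenario stream;
    # without, each candidate gets an independent stream.
    rng1 = random.Random(42)
    rng2 = random.Random(42 if crn else 123)
    diffs = [f(x1, rng1.gauss(0, 1)) - f(x2, rng2.gauss(0, 1))
             for _ in range(n)]
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean, var

_, var_crn = estimate_difference(1.0, 1.1, 1000, crn=True)
_, var_indep = estimate_difference(1.0, 1.1, 1000, crn=False)
# The shared noise terms cancel in the pairwise difference, so the
# CRN estimator of f(x1) - f(x2) has far smaller variance.
```

Lower variance in the comparison means a population-based optimizer can rank nearby candidates reliably with far fewer scenario evaluations, which is exactly where the improvement reported in the paper comes from.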

Presented at:

Genetic and Evolutionary Computation Conference (GECCO), Madrid (Spain), 2015.

Open Archive: HAL
Download: Article (PDF)

Optimization of Energy Policies Using Direct Value Search  

Direct Policy Search (DPS) is a widely used tool for reinforcement learning; however, it is usually not suitable for handling high-dimensional constrained action spaces such as those arising in power system control (unit commitment problems). We propose Direct Value Search, a hybridization of DPS with Bellman decomposition techniques. We prove runtime properties, and apply the results to an energy management problem.

Presented at:

9èmes Journées Francophones de Planification, Décision et Apprentissage (JFPDA'14), Liège (Belgium), May 2014.

Open Archive: HAL
Download: Slides (PDF)

Direct model predictive control  

Due to its simplicity and convenience, Model Predictive Control, which consists in optimizing future decisions based on a pessimistic deterministic forecast of the random processes, is one of the main tools for stochastic control. Yet it suffers from a large computation time unless the tactical horizon (i.e. the number of future time steps included in the optimization) is strongly reduced, and from a lack of genuine stochasticity handling. We propose a combination of Model Predictive Control and Direct Policy Search.
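The receding-horizon loop at the heart of Model Predictive Control is easy to sketch: optimize decisions over a short tactical horizon against a deterministic forecast, apply only the first decision, then re-optimize at the next step. The toy energy-storage model below, its parameters, and the brute-force inner optimizer are all illustrative assumptions, not the paper's formulation.

```python
from itertools import product

def mpc_step(stock, forecast_prices, horizon,
             capacity=10.0, step=1.0, end_value=3.0):
    # Brute-force over {-step, 0, +step} charge/discharge decisions
    # across the tactical horizon (a real controller would call an
    # LP/MILP solver here instead).
    best_cost, best_first = float("inf"), 0.0
    for plan in product((-step, 0.0, step), repeat=horizon):
        s, cost, feasible = stock, 0.0, True
        for u, price in zip(plan, forecast_prices[:horizon]):
            s += u
            if not (0.0 <= s <= capacity):
                feasible = False
                break
            cost += price * u  # buy (u > 0) or sell (u < 0) at forecast price
        if feasible:
            cost -= end_value * s  # value the energy left at horizon end
            if cost < best_cost:
                best_cost, best_first = cost, plan[0]
    return best_first  # receding horizon: apply only the first decision

# With a cheap first period and expensive later ones, the controller
# charges now in order to sell later.
u = mpc_step(stock=5.0, forecast_prices=[1.0, 5.0, 5.0], horizon=3)
```

Note that the forecast is a single deterministic scenario: this is exactly the "lack of genuine stochasticity handling" that the proposed hybrid with Direct Policy Search aims to fix.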

Reference:

Jean-Joseph Christophe, Jérémie Decock, and Olivier Teytaud. Direct model predictive control. In European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Bruges, Belgium, April 2014.

Open Archive: HAL
Download: Article (PDF)

Linear Convergence of Evolution Strategies with Derandomized Sampling Beyond Quasi-Convex Functions  

We study the linear convergence of a simple evolutionary algorithm on non-quasi-convex functions on continuous domains. Assumptions include a condition on the sampling performed by the evolutionary algorithm (which is supposed to cover the neighborhood of the current search point efficiently), the conditioning of the objective function (so that the probability of improvement is not too low at each time step, given a correct step size), and the uniqueness of the optimum.
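As background, the kind of evolution strategy whose convergence is analyzed here can be sketched with a textbook (1+1)-ES using the classical 1/5th success rule for step-size adaptation. This is a generic scheme, not the specific derandomized-sampling algorithm of the paper; names and constants are illustrative.

```python
import random

def one_plus_one_es(f, x0, sigma=1.0, iters=500, seed=0):
    # (1+1)-ES: sample one Gaussian offspring per iteration, keep it
    # if it improves, and adapt the step size with the 1/5th rule.
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    for _ in range(iters):
        y = [xi + sigma * rng.gauss(0, 1) for xi in x]
        fy = f(y)
        if fy < fx:
            x, fx = y, fy
            sigma *= 1.5            # success: widen the search
        else:
            sigma *= 1.5 ** -0.25   # failure: shrink the step size
    return x, fx

# On the sphere function the step size tracks the distance to the
# optimum, giving the log-linear (linear) convergence studied here.
x, fx = one_plus_one_es(lambda v: sum(t * t for t in v), [5.0, -3.0])
```

The paper's contribution is proving that this log-linear behavior persists beyond quasi-convex objectives, under the sampling and conditioning assumptions listed above.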

Reference:

Jérémie Decock and Olivier Teytaud. Linear Convergence of Evolution Strategies with Derandomized Sampling Beyond Quasi-Convex Functions. In EA - 11th Biennial International Conference on Artificial Evolution - 2013, Lecture Notes in Computer Science, Bordeaux, France, August 2013. Springer.

Open Archive: HAL
Download: Article (PDF), Slides (PDF)

Noisy Optimization Complexity Under Locality Assumption  

In spite of various recent publications on the subject, there are still gaps between upper and lower bounds in evolutionary optimization for noisy objective functions. In this paper we reduce the gap, and get tight bounds within logarithmic factors in the case of small noise and no long-distance influence on the objective function.

Reference:

Jérémie Decock and Olivier Teytaud. Noisy Optimization Complexity Under Locality Assumption. In FOGA - Foundations of Genetic Algorithms XII - 2013, Adelaide, Australia, February 2013.

Open Archive: HAL
Download: Article (PDF), Slides (PDF)

Learning Cost-Efficient Control Policies with XCSF  

In this paper we present a method based on the "learning from demonstration" paradigm to get a cost-efficient control policy in a continuous state and action space. The controlled plant is a planar arm with two degrees of freedom, actuated by six muscles. We learn a parametric control policy with XCSF from a few near-optimal trajectories, and we study its capability to generalize over the whole reachable space. Furthermore, we show that an additional Cross-Entropy Policy Search method can improve the global performance of the parametric controller.

Reference:

Didier Marin, Jérémie Decock, Lionel Rigoux, and Olivier Sigaud. Learning Cost-Efficient Control Policies with XCSF: Generalization Capabilities and Further Improvement. In Proceedings of the 13th annual conference on Genetic and evolutionary computation (GECCO'11), pages 1235-1242, Dublin, Ireland, 2011.

Open Archive: HAL
Download: Article (PDF), Slides (PDF)

Apprentissage de politiques efficaces avec XCSF et CEPS  

In this contribution we propose a method for obtaining an efficient policy in a setting where both state and action are continuous. The controlled system is an arm with two degrees of freedom, actuated by six muscles. We learn a parametric policy by demonstration with the XCSF classifier system from near-optimal trajectories, and we study the capability of XCSF to generalize what it has learned along these trajectories to the whole reachable space. Moreover, we show that a stochastic optimization method called Cross-Entropy Policy Search can further improve the performance of the parametric controller.

Reference:

Didier Marin, Jérémie Decock, Lionel Rigoux, and Olivier Sigaud. Apprentissage de politiques efficaces avec XCSF et CEPS. In Sixièmes journées francophones MFI/JFPDA, pages 298-310, Rouen, France, 2011.

Open Archive: HAL
Download: Article (PDF)