Although RL has been a research area since at least the 1980s, it has recently attracted renewed interest due to breakthroughs such as DeepStack, an algorithm able to defeat professional poker players, and an algorithm from DeepMind that is claimed to have reduced the cooling energy bill of Google data centres by 40%. However, RL can be complex and challenging to understand.
The application we created is aimed at making RL technology more accessible to industry. It explains, in a friendly and understandable way, how the Deep Q-Network (DQN) algorithm and its main parameters work. Since the benefits of RL are potentially enormous, the goal is for companies to become familiar with RL and to encourage them to consider implementing it in their own industries.
Reinforcement Learning (RL) is a machine learning paradigm in which an agent learns, by trial and error, to take actions in its environment that maximise a cumulative reward. RL is not yet widely used in industry or in many real-world applications, but it shows promise for addressing challenges ranging from energy conservation to autonomous driving.
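To make the agent, action and reward loop concrete, the following is a minimal sketch of tabular Q-learning, the classical method that DQN extends by replacing the table with a neural network. The five-state corridor environment, the function name run_q_learning and all parameter values are assumptions chosen for illustration; they are not taken from the CeADAR demonstrator.

```python
import random


def run_q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy 5-state corridor (illustrative only).

    The agent starts in state 0 and receives a reward of 1 when it
    reaches the right-most state; every other step yields 0.
    """
    rng = random.Random(seed)
    n_states = 5
    moves = (-1, +1)          # action 0 = left, action 1 = right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy: explore at random with probability epsilon,
            # otherwise take the action with the highest estimated value.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Environment step: move, clamped to the corridor bounds.
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: move the estimate towards the reward plus
            # the discounted value of the best action in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = run_q_learning()
greedy_policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
```

After training, greedy_policy lists the preferred action in each non-terminal state. In this toy setting the agent learns to move right towards the reward, which is the "learning to maximise a reward" behaviour described above.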
CeADAR’s Reinforcement Learning project provides a state-of-the-art report on RL as well as a demonstrator application that shows how RL can be applied in a financial trading scenario.
The aim of the project is to help industry partners to understand the capabilities of RL and to identify scenarios in which it may be applied.
CeADAR’s Deep Q-Network (DQN) demonstrator is targeted at people in industry who have no previous knowledge of RL and want to become familiar with it by interacting with a friendly application.
The key features of the demonstrator are:
Dataset and parameter configuration: the user can upload their own time series dataset and configure the parameters and rewards for the DQN (Deep Q-Network) algorithm.
Experiment overview: the effects of different parameters can be observed in terms of performance, training time, agent behaviour, etc.
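As a rough illustration of what configuring "parameters and rewards" can mean for a DQN trading agent, the sketch below defines a settings object, a linear exploration schedule and a simple profit-based reward. Every name and default value here (DQNConfig, epsilon_at, trading_reward, the cost term) is hypothetical and chosen for illustration; the demonstrator's actual settings and reward scheme may differ.

```python
from dataclasses import dataclass


@dataclass
class DQNConfig:
    """Hypothetical DQN settings; names and defaults are assumptions."""
    learning_rate: float = 1e-3       # step size for the network optimiser
    discount_factor: float = 0.99     # weight given to future rewards
    epsilon_start: float = 1.0        # initial exploration probability
    epsilon_end: float = 0.05         # final exploration probability
    epsilon_decay_steps: int = 10_000 # steps over which epsilon decays
    replay_buffer_size: int = 50_000  # number of stored past transitions
    batch_size: int = 32              # transitions sampled per update


def epsilon_at(step: int, cfg: DQNConfig) -> float:
    """Linearly decay the exploration rate from epsilon_start to epsilon_end."""
    fraction = min(step / cfg.epsilon_decay_steps, 1.0)
    return cfg.epsilon_start + fraction * (cfg.epsilon_end - cfg.epsilon_start)


def trading_reward(position: int, price_now: float, price_next: float,
                   trade_cost: float = 0.0) -> float:
    """Reward for holding a position over one time step.

    position is +1 (long), 0 (flat) or -1 (short). The reward is the
    profit or loss from the price change, minus an optional trading cost.
    """
    return position * (price_next - price_now) - trade_cost


cfg = DQNConfig()
long_gain = trading_reward(+1, 100.0, 102.0)
short_loss = trading_reward(-1, 100.0, 102.0, trade_cost=0.5)
```

Changing values such as discount_factor or the trading cost alters what the agent is rewarded for, which is the kind of effect the experiment overview lets the user observe.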
RL is still in the early stages of adoption within industry. CeADAR’s Reinforcement Learning project aims to provide an understanding of the technology, its advantages and where it can be applied. The most distinctive feature of the application is that it presents RL applied to the stock market in an intuitive way, with the results shown in four plots. An agent can be configured, and its performance and learning progress are displayed as they are generated. The screen is updated every 30 seconds, so the user can follow how the agent evolves during training. The user can manipulate the agent’s behaviour by adjusting the parameters and the agent’s reward system.
Dr Luis Miralles, University College Dublin
Saad Shahid, University College Dublin
Shridhar Kulkarni, University College Dublin
Dr Oisín Boydell, University College Dublin