
· One min read
RangL

The Pathways to Net Zero challenge took place from 17th to 31st January 2022. After consideration of both the leaderboard and the provided executive summaries, the selection panel chose three joint winners:

  • Epsilon-greedy (Delft University of Technology)
  • Lanterne-Rouge-BOKU-AIT (University of Natural Resources and Life Sciences, Vienna, and Austrian Institute of Technology)
  • VUltures (Vrije Universiteit Amsterdam).

Additionally, Team AIM-Mate were highly commended for their efforts.

The Net Zero Technology Centre, which sponsored the challenge, will hold a webinar with the winning teams on 28th March 2022.

The final leaderboards can be viewed at the challenge repository. Thank you to all who participated!

· One min read
RangL

As the environment had reached a suitable state of development, this week the group discussed possible evaluation criteria for the challenge. The RangL project aims to collect best practice in the application of RL (and also other optimisation approaches) to control problems in industry. Thus while it would be possible to use only leaderboard scores, we decided to ask participants to submit a one-page executive summary of their approach, which would be considered in the overall evaluation.

· One min read
RangL

In RL, agents learn from experience. However, if rewards occur only after a long sequence of actions, it can be difficult for an RL agent to associate that sequence with the eventual reward. An example is a game of chess in which the reward is simply 1 for winning and 0 for losing. Similarly, if problem constraints are handled by awarding a large negative penalty whenever a constraint is breached, the resulting sparse signal can make reinforcement learning challenging. To address this, the group experimented with a Lagrangian reformulation, transforming the constrained problem into an unconstrained one by modifying the reward function. As a result, the constraint on job numbers was replaced by a reward term proportional to the number of jobs created at each step.
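As a rough sketch of the idea (with a hypothetical jobs floor and weighting, not the actual challenge code), the change looks something like this:

```python
# Sketch only: replacing a hard constraint (large penalty on breach) with a
# Lagrangian-style reward term. JOBS_FLOOR, LAMBDA and the function names are
# hypothetical, not part of the RangL environment.

JOBS_FLOOR = 150_000   # assumed minimum acceptable number of jobs
LAMBDA = 1e-4          # assumed weight of the jobs term in the reward

def penalised_reward(profit, total_jobs):
    """Original formulation: hard constraint enforced by a large penalty."""
    if total_jobs < JOBS_FLOOR:
        return profit - 1e9   # sparse, hard-to-learn signal
    return profit

def shaped_reward(profit, jobs_created_this_step):
    """Reformulation: the constraint becomes a dense per-step reward term."""
    return profit + LAMBDA * jobs_created_this_step
```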

· One min read
RangL

Typically a spreadsheet will not include any model of randomness – in other words, it is deterministic. In reinforcement learning it can be straightforward to include randomness, and this is one of the strengths of RL.

We will leave the choice between deterministic (without randomness) and stochastic (with randomness) modelling for another blog post. The interesting point for us this week was correlations: how we describe the extent to which random factors tend to move together. In deterministic modelling there is simply no need to think about how, for example, the prices of natural gas and blue hydrogen (which is derived from natural gas) move together. In contrast, in stochastic modelling different correlations could even drive different solutions. Fortunately, in this project we have the luxury of discussions with some of the creators of the Integrated Energy Vision (the spreadsheet model on which the challenge is based), so correlations can be chosen in an informed way.
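To illustrate the point (with made-up volatilities and a made-up correlation, not the figures used in the challenge), correlated price shocks can be drawn like this:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

rho = 0.8                          # assumed gas / blue-hydrogen correlation
sigma = np.array([0.15, 0.12])     # assumed relative volatilities
cov = np.outer(sigma, sigma) * np.array([[1.0, rho],
                                         [rho, 1.0]])

base_prices = np.array([40.0, 55.0])   # placeholder gas and blue-H2 prices

# one correlated multiplicative shock per price, drawn each time step
shocks = rng.multivariate_normal(mean=np.zeros(2), cov=cov)
gas_price, blue_h2_price = base_prices * np.exp(shocks)
```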

· One min read
RangL

Today the group agreed a modification to the challenge outline: the RL agent will now directly specify the rate of deployment of offshore wind, blue and green hydrogen. This is achieved by allowing the RL environment to interact directly and repeatedly with the IEV spreadsheet model, both simplifying the approach and increasing its transparency and interpretability. Initial results from training an RL agent with this environment were shared and sense-checked.
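A minimal sketch of what such an environment's interface might look like (the class name, bounds and observation contents below are hypothetical; the real environment wraps the IEV spreadsheet model):

```python
import numpy as np
import gym
from gym import spaces

class PathwaysEnvSketch(gym.Env):
    """Agent directly sets per-step deployment of three technologies."""

    def __init__(self):
        # action = [offshore wind, blue hydrogen, green hydrogen] deployment
        # added this step; the upper bounds are placeholders
        self.action_space = spaces.Box(
            low=np.zeros(3, dtype=np.float32),
            high=np.array([30.0, 20.0, 20.0], dtype=np.float32),
            dtype=np.float32,
        )
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf, shape=(4,), dtype=np.float32
        )

    def reset(self):
        self.t = 0
        return np.zeros(4, dtype=np.float32)

    def step(self, action):
        # in the real challenge, `action` would be passed to the IEV model,
        # which returns revenues, costs, emissions and job numbers
        self.t += 1
        obs = np.zeros(4, dtype=np.float32)
        reward, done, info = 0.0, self.t >= 20, {}
        return obs, reward, done, info
```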

· One min read
RangL

Having decided the general shape of the challenge, today the project group agreed to begin working in an agile fashion through a GitHub project board.

Excitingly, RangL was also invited to be part of the Net Zero Technology Centre's virtual showcase "Road to Glasgow: Destination Net Zero" at the 26th UN Climate Change Conference of the Parties (COP26) in Glasgow in November 2021. A virtual exhibition booth will include "meet the developers" live sessions and a project video explaining the Pathways to Net Zero challenge.

· One min read
RangL

In reinforcement learning the agent learns to maximise the rewards it receives. In this way the reward function is an integral part of the problem statement, and this week's efforts centred on finalising its exact form. Given the aims of the study, the project group decided to include the cost of total carbon emissions in the reward alongside UK energy sector profits (that is, energy revenues minus capital, operating and decommissioning costs).
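In pseudocode, with a placeholder carbon price and illustrative variable names (the actual quantities come from the IEV model), the per-step reward is simply:

```python
CARBON_PRICE = 100.0   # assumed cost per tonne of CO2, placeholder value

def step_reward(revenue, capex, opex, decommissioning, emissions_tonnes):
    """Profit for the step minus the social cost of its carbon emissions."""
    profit = revenue - (capex + opex + decommissioning)
    return profit - CARBON_PRICE * emissions_tonnes
```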

The reward function can also be used to place constraints on acceptable solutions. While the shift to zero-carbon technologies can lead to increased employment in the long term under our modelling, it is important to ensure that job numbers are also managed in the short and medium term.

It was agreed that recent volatility in energy market prices highlights the importance of incorporating randomness in the RL environment. In addition to reflecting real-world uncertainty, this allows the agent to learn to adapt under a variety of scenarios.

· One min read
RangL

RangL aims to apply reinforcement learning (RL) to real-world industrial problems by involving participants from the wider AI community. Today, our focus was therefore on developing an appropriate RL environment for the Pathways to Net Zero challenge.

The objective is to find optimal deployments for technologies such as offshore wind, blue and green hydrogen, and carbon capture and storage. These technologies will be instrumental in reaching the UK’s target of net zero carbon by 2050.

After brainstorming we opted to take Breeze, Gale and Storm as baseline scenarios from which others can be built. An agent will interact with the RL environment by choosing a mix of those scenarios and also by varying the speed with which they are implemented. For instance, earlier deployment reduces lifetime emissions but generally implies higher capital costs. Solutions will also need to meet some non-monetary constraints, e.g. balancing job creation in new technologies against the loss of roles in decommissioned infrastructure. We will also work with the Net Zero Technology Centre and ORE Catapult to extend the Integrated Energy Vision appropriately, so that lifetime emissions and their social cost can be considered in the RL reward function.
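A toy illustration of the blending idea, with invented deployment figures rather than the Integrated Energy Vision numbers:

```python
import numpy as np

SCENARIOS = ("Breeze", "Gale", "Storm")

# yearly offshore-wind deployment (GW) under each baseline; made-up figures
pathways = {
    "Breeze": np.array([1.0, 1.5, 2.0, 2.5]),
    "Gale":   np.array([1.5, 2.5, 3.0, 3.5]),
    "Storm":  np.array([2.0, 3.0, 4.0, 5.0]),
}

def blended_pathway(weights):
    """Convex combination of the three baseline pathways."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * pathways[name] for wi, name in zip(w, SCENARIOS))

# e.g. an agent action leaning towards Storm:
print(blended_pathway([0.1, 0.3, 0.6]))
```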

· 2 min read
RangL

The RangL competition platform exists to accelerate progress in data-driven control problems for industry, and today marks the real beginning of that journey.

The purpose of the Net Zero Technology Centre is to develop and deploy technology for an affordable net zero energy industry, and the Offshore Renewable Energy (ORE) Catapult is the UK’s leading technology innovation and research centre for offshore renewable energy.

Today, colleagues from the Net Zero Technology Centre, ORE Catapult and RangL team gathered to make a start on the Pathways to Net Zero challenge. The agenda was focused on first introducing the competition platform and then understanding the Net Zero Technology Centre / ORE Catapult Integrated Energy Vision (IEV) model, on which the challenge will be based.

The IEV is the result of a major modelling exercise undertaken collaboratively by the Net Zero Technology Centre and ORE Catapult, and addresses the UK's vision of achieving net zero carbon emissions by 2050 for the North Sea offshore energy industry. The range of possibilities is illustrated by three imagined pathways, Breeze, Gale and Storm, each addressing the four main technology pillars of offshore energy: offshore wind, oil and gas, hydrogen, and carbon capture and storage (CCS).

The Pathways to Net Zero challenge aims to build on the IEV by first allowing a range of intermediate pathways between Breeze, Gale and Storm, then defining a criterion to measure the quality of each pathway in a specific sense. Challenge participants will be invited to apply reinforcement learning, or any other approach of their choice, to find the ‘best’ pathway. The challenge will be made more realistic and difficult by the inclusion of uncertainty over future parameters such as energy revenues and technological progress.

· 2 min read
RangL

From 18 to 25 January 2021 the RangL team fulfilled a long-held ambition: to run a generation scheduling challenge. The problem involves using continually updated forecasts of energy demand and renewable energy generation to schedule, and so to minimise, the use of fossil fuels. It is challenging partly because the observation space is large (at each step, the agent is given forecasts for all time periods) and also because the forecasts are updated as new information arrives, and so are guaranteed to be superseded by better ones.
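A rough sketch of why the observation is so large (the horizon length and forecast names are illustrative, not the challenge specification):

```python
import numpy as np

HORIZON = 96   # assumed number of future periods covered by each forecast

def make_observation(step, demand_forecast, wind_forecast, solar_forecast):
    """At every step the agent sees the latest full-horizon forecasts."""
    assert len(demand_forecast) == len(wind_forecast) == len(solar_forecast) == HORIZON
    return np.concatenate(([float(step)], demand_forecast,
                           wind_forecast, solar_forecast))
```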

This ‘look-ahead mode’ generation scheduling was one of the first motivations for RangL, when the project was conceived in early 2019 during the Mathematics of Energy Systems research programme at the Isaac Newton Institute in Cambridge. While not directly connected, it’s interesting to note that the forthcoming special issue of Philosophical Transactions of the Royal Society A based on the MES programme has an article by Peter Glynn and Jacques de Chalendar on theoretical aspects of this kind of problem (titled “On incorporating forecasts into linear state space model Markov decision processes”).

The competition itself was heavily oversubscribed, with applicants from Argentina, Denmark, the Netherlands, Italy, France and the UK, drawn from academia, industry and the third sector. We'd like to thank all participating teams, who generated a fantastic atmosphere on our Slack channel throughout the week. It must have been good, as one competitor even joined the RangL team. The winners were team zeepkist, with members from the Intelligent Electrical Power Grids group at TU Delft and TenneT, the Dutch power system operator. The final scores, and zeepkist's winning code (which used RL), are available in the challenge repository.

We recently argued on the Turing blog that as the world reopens following the pandemic, we will need to make more flexible, responsive and data-driven decisions. Hopefully this first challenge illustrates a small part of the potential role that reinforcement learning can play.