DATAZERO: data centers powered 100% by renewable energies

What is the best configuration for a medium-sized data center (1 MW) powered by renewable energy alone? How can energy production be optimized to meet the needs of IT resources, and how can the flow of services be adjusted to the electricity available? These are the scientific questions that the DATAZERO research project is trying to answer.

Launched in 2015 by the French National Research Agency (ANR) and a Franco-American consortium, the DATAZERO project (DATA center with Zero CO2 Emission) aims to get data centers powered solely by renewable energies off the ground through a rigorous scientific approach. We met with Jean-Marc Pierson, a researcher and lecturer in computer science at the University of Toulouse III and coordinator of the project. He explains the scientific objectives and presents the challenges related to the design and operation of carbon-free, self-sufficient data centers.

What are the main challenges that a company – a cloud provider, for example – faces when it wants to run a data center with renewable energy?

The first difficulty is finding the right contacts. Today, data center providers and specialized consulting and engineering firms do not always have the expertise to design data centers powered by renewable energies, and they do not offer it spontaneously. After a few exchanges with various market players, I understood that clients have to state explicitly that this is what they want and insist that such a project be carried out.

Sizing the data center is the second challenge, and it raises real questions about the size and longevity of the infrastructure. The payback time of a carbon-free data center is long: at least eight to ten years.

Avoiding oversizing the data center while keeping it scalable enough to guarantee its longevity: that must be extremely complicated…

Absolutely! How many solar panels or wind turbines will be needed? How many lithium batteries or fuel cells? Which renewable sources should be favored, and which storage systems installed? The sizing software that we developed as part of the DATAZERO project answers these questions. It recommends types of equipment for a given workload, taking into account the location of the data center and the weather observed there in previous years (typically over a period of ten years). The challenge, of course, is to estimate the workload: when you want to create a cloud from scratch, you often have only a vague idea of how your data center will be used or how fast it will grow each year.

What are the scientific objectives of the DATAZERO project?

The DATAZERO project was launched in 2015, starting from the following question: is it possible to run a data center solely on renewable energies? At the time, several players were announcing that they supplied their data centers with 100% renewable energy, which was not entirely true since they were still connected to the electricity grid. We wondered whether it was possible to go further and become completely independent of the grid.

Two main scientific questions arose. The first is how to size a self-sufficient infrastructure during the construction phase. The second is how to optimize its functioning during the operation phase, with a twofold challenge: on the one hand, optimizing the flow of electricity in the electrical system according to data and computation flows; on the other hand, optimizing the flow of services in the IT system according to electricity production.

How did you answer the first question? Could you describe the sizing software used in the construction phase?

We used integer linear programming (ILP; PLNE in French) to solve this problem. This type of optimization is known to be time-consuming, since it seeks an optimal solution. But unlike the operation phase, where we work in real time and the algorithms must therefore be very fast, here we can afford to take the time to find the best solution. In the end, we were able to develop a relatively fast algorithm that produces a result (the optimal sizing for a data center containing thousands of servers performing hundreds of thousands of tasks) in minutes. This lets us simulate several scenarios. For example, if the workload is x, we need a given configuration; if that load grows by 10% per year, it will be necessary to add y servers and z wind turbines.
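
To make the sizing question concrete, here is a toy sketch in Python. It is not the DATAZERO software: it replaces the integer linear program with a brute-force search over integer equipment counts, it ignores storage, and every number in it (unit outputs, costs, hourly profiles) is invented for illustration.

```python
from itertools import product

# Hypothetical unit data, not taken from the DATAZERO project.
PANEL_KW = 0.3        # peak output of one solar panel (kW)
TURBINE_KW = 100.0    # rated output of one wind turbine (kW)
PANEL_COST = 1.0      # arbitrary cost units
TURBINE_COST = 250.0


def cheapest_sizing(load_kw, sun, wind,
                    max_panels=4000, panel_step=100, max_turbines=10):
    """Return (cost, panels, turbines) for the cheapest mix covering every hour.

    load_kw, sun and wind are equal-length hourly series; sun and wind are
    capacity factors between 0 and 1. Returns None if no mix is sufficient.
    """
    best = None
    panel_counts = range(0, max_panels + 1, panel_step)
    turbine_counts = range(max_turbines + 1)
    for panels, turbines in product(panel_counts, turbine_counts):
        covered = all(
            panels * PANEL_KW * s + turbines * TURBINE_KW * w >= demand
            for demand, s, w in zip(load_kw, sun, wind)
        )
        if not covered:
            continue
        cost = panels * PANEL_COST + turbines * TURBINE_COST
        if best is None or cost < best[0]:
            best = (cost, panels, turbines)
    return best


# Four made-up hours of load (kW) and weather (capacity factors).
load = [400.0, 600.0, 800.0, 500.0]
sun = [0.0, 0.5, 0.9, 0.1]
wind = [0.6, 0.4, 0.3, 0.7]

base = cheapest_sizing(load, sun, wind)
grown = cheapest_sizing([1.1 * x for x in load], sun, wind)  # load +10%
```

With these invented numbers, the base scenario needs 2,200 panels and 7 turbines, and a 10% heavier load needs 2,400 panels and 8 turbines. A real instance with thousands of servers and ten years of weather data is far too large for enumeration, which is why a dedicated ILP formulation is used in the project.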

In the operation phase, the idea is to optimize energy production to meet the electrical demand of the IT resources on the one hand, and to adjust the flow of IT services to energy production on the other. How do you do it?

There are several ways to solve this rather classic optimization problem in computer science. The traditional approach is to build a single mathematical model that takes into account all the constraints, both electrical and IT. We took a different approach, in which the electrical part and the IT part are optimized separately, and then a negotiation loop takes place between them.

On the one hand, the electrical system is optimized for a given workload, using linear programming techniques, so that it meets the demand of the servers over a given time horizon (three days in this project). On the other hand, the IT system optimizes its operation for the planned electricity supply using heuristic algorithms, which produce approximate results.
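
As an illustration of what a heuristic on the IT side can look like, the Python sketch below places tasks into time slots under a power budget. It is a generic greedy rule (earliest deadline first, then the slot with the most spare power), not the project's algorithm, and the task names, power figures and one-slot task model are assumptions made for the example.

```python
def greedy_schedule(tasks, power_budget_kw):
    """Place one-slot tasks into time slots without exceeding the power budget.

    tasks: list of (name, power_kw, earliest_slot, deadline_slot).
    Returns ({name: slot}, [names that could not be placed]).
    """
    remaining = list(power_budget_kw)
    placement, rejected = {}, []
    # Earliest deadline first: urgent tasks choose before flexible ones.
    for name, power, earliest, deadline in sorted(tasks, key=lambda t: t[3]):
        window = range(earliest, deadline + 1)
        # Pick the slot in the allowed window with the most spare power.
        slot = max(window, key=lambda s: remaining[s])
        if remaining[slot] >= power:
            remaining[slot] -= power
            placement[name] = slot
        else:
            rejected.append(name)
    return placement, rejected


# Planned supply for three slots (kW), e.g. low at night, high at midday.
budget = [50.0, 200.0, 120.0]
tasks = [
    ("backup", 80.0, 0, 2),   # can run in any slot
    ("web", 40.0, 0, 0),      # urgent: must run in slot 0
    ("batch", 100.0, 0, 2),   # can run in any slot
]
placement, rejected = greedy_schedule(tasks, budget)
```

Here the urgent task keeps slot 0 and the two flexible tasks are pushed to slot 1, where the planned supply is highest. A greedy rule like this is fast but gives no optimality guarantee, which is the trade-off the interview describes.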

To do this, there are two options. First option: we can act at the server level by varying processor speed. A typical computer has a processor frequency of around 3 GHz, which can be scaled down. If it runs at 1 GHz, it will be three times slower but will cut its energy consumption by more than a factor of three! Second option: we can act on the scheduling of computing tasks by delaying the execution of non-urgent ones.
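
The frequency example can be checked against a textbook model. The Python sketch below assumes that dynamic power follows C·V²·f and that voltage is scaled in proportion to frequency, so that power varies with the cube of the frequency. It ignores static power, so it is an idealized upper bound on the savings rather than a measurement of any real server.

```python
def dvfs_estimate(base_ghz, scaled_ghz):
    """Idealized effect of frequency scaling on a fixed amount of CPU-bound work.

    Assumes dynamic power ~ C * V^2 * f with voltage proportional to frequency
    (so power ~ f^3) and ignores static power entirely.
    Returns (slowdown, power_factor, energy_factor).
    """
    ratio = scaled_ghz / base_ghz
    slowdown = 1.0 / ratio                   # run time is multiplied by this
    power_factor = ratio ** 3                # instantaneous power is multiplied by this
    energy_factor = power_factor * slowdown  # energy = power * time
    return slowdown, power_factor, energy_factor


slowdown, power_factor, energy_factor = dvfs_estimate(3.0, 1.0)
```

Under these assumptions, going from 3 GHz to 1 GHz makes the work take three times longer, divides power draw by about 27 and divides the energy used for the same work by about 9. Real processors save less than this because of static power and minimum-voltage limits.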

We then developed a negotiation algorithm based on game theory to reconcile the constraints of both sides. These are the three building blocks used to optimize the use of renewable energies in the data center.
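
The interview does not detail the game-theoretic algorithm. The Python sketch below only illustrates the general shape of a negotiation loop, using a simple alternating-concession rule for a single time window: the electrical side raises its offer, the IT side lowers its request, and each closes a fixed fraction of the gap per round. The function name, parameters and figures are all hypothetical.

```python
def negotiate(request_kw, offer_kw, max_offer_kw, min_request_kw,
              concession=0.5, tolerance_kw=1.0, max_rounds=100):
    """Alternate concessions until the offer covers the request within tolerance.

    Returns (agreed_kw, rounds), or (None, rounds) if the sides cannot meet.
    """
    rounds = 0
    while rounds < max_rounds:
        if request_kw - offer_kw <= tolerance_kw:
            # The offer covers the request (within tolerance): deal.
            return min(request_kw, offer_kw), rounds
        if offer_kw >= max_offer_kw and request_kw <= min_request_kw:
            break  # neither side can concede any further
        # Electrical side: commit more power, up to what it can deliver.
        offer_kw = min(max_offer_kw,
                       offer_kw + concession * (request_kw - offer_kw))
        # IT side: slow down or postpone work, down to its minimum need.
        request_kw = max(min_request_kw,
                         request_kw - concession * (request_kw - offer_kw))
        rounds += 1
    return None, rounds


# IT asks for 900 kW, the electrical side first offers 600 kW.
agreed, rounds = negotiate(900.0, 600.0, max_offer_kw=800.0, min_request_kw=700.0)

# If the limits do not overlap, no agreement is possible.
failed, failed_rounds = negotiate(900.0, 600.0, max_offer_kw=650.0, min_request_kw=800.0)
```

In the first call the two sides settle just under 800 kW after five rounds. In the second, the electrical side cannot go above 650 kW while the IT side cannot go below 800 kW, so the loop stops without an agreement.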

What is new in DATAZERO2, the second phase of the project, which started in 2020?

In DATAZERO2, we are working extensively on the notion of uncertainty. Weather forecasts and load predictions are inherently uncertain. In DATAZERO1, when we realized that a forecast had been wrong and that we would not have the anticipated electrical power, we reran the optimization. For the second stage of the project, we wanted to take uncertainty into account from the outset.

We have attached an “uncertainty object” to the power generation and computing load predictions, and developed new algorithms for optimization under uncertainty. We hope the results will show that forecast errors have a smaller impact and that we no longer have to rerun an optimization, because the system is able to anticipate errors and adapt automatically.
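
The interview does not say what the “uncertainty object” contains. As one possible reading, the Python sketch below attaches a standard deviation to each power forecast and plans against a conservative lower bound instead of the expected value. The class, its fields and the figures are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainForecast:
    """A power forecast for one time slot with a simple uncertainty measure."""
    expected_kw: float
    std_kw: float

    def lower_bound(self, k=1.0):
        """Conservative estimate: expected value minus k standard deviations."""
        return max(0.0, self.expected_kw - k * self.std_kw)


def plan_load(forecasts, k=1.0):
    """Commit, for each slot, only the load the conservative estimate covers."""
    return [f.lower_bound(k) for f in forecasts]


def needs_reoptimization(planned_kw, actual_kw):
    """True if actual production falls short of the plan in any slot."""
    return any(actual < planned for planned, actual in zip(planned_kw, actual_kw))


forecasts = [UncertainForecast(500.0, 50.0), UncertainForecast(300.0, 100.0)]
actual = [470.0, 230.0]  # production turned out lower than expected

naive_plan = plan_load(forecasts, k=0.0)   # plans on expected values only
robust_plan = plan_load(forecasts, k=1.0)  # plans with a safety margin

naive_rerun = needs_reoptimization(naive_plan, actual)
robust_rerun = needs_reoptimization(robust_plan, actual)
```

In this example the plan built on expected values has to be redone when production comes in low, while the plan that already accounts for the uncertainty still holds. The price of the margin is that some available energy goes unplanned when the forecast turns out to be right.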

How will the different algorithms you described be made available to companies to design and operate decarbonized data centers?

With DATAZERO2, we are seeking to increase the maturity of our software. Our goal is to reach maturity level 5, which would allow us to work with a software development company to turn the solution into a commercial product.

We are leaning more towards an open source model, providing software that can be used by anyone wishing to offer expertise in the design of renewable energy data centers. This is under discussion with our industrial partner, Eaton.

The maturation of software resulting from research is a major challenge and we are trying to find ways to promote our work. For example, we could be supported by a technology transfer acceleration company (SATT), a structure whose objective is to transfer research results to companies and bring inventions to a level of maturity close to that of the market.

Last question: what does an “ideal DATAZERO” data center look like to you?

First of all, it should be noted that the DATAZERO project targets medium-sized data centers, consuming up to 1 MW of electrical power. An ideal renewable energy data center is first and foremost a completely independent and autonomous infrastructure, which does not need to resort to an external electricity supplier.

Today, however, we are faced with a psychological barrier. Customers are wary of the idea of operating a data center that is not connected to the electricity grid. Our challenge is to show that it is possible, while limiting redundancy as much as possible during sizing. If a data center needs three wind turbines to operate, one might estimate that six must be installed to guarantee its resilience. But it is unlikely that three wind turbines will all fail at the same time. We therefore seek to determine the optimal configuration. In an ideal scenario, the data center would need only four wind turbines, and we would have managed to convince the buyer that this is sufficient.
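
The redundancy argument can be put in numbers with a simple binomial model. The Python sketch below assumes that each turbine is available independently with the same probability; the 95% figure is invented, and the model covers equipment failures only, not periods without wind, which affect all turbines at once.

```python
from math import comb


def prob_enough(installed, required, availability):
    """Probability that at least `required` of `installed` turbines are available.

    Assumes independent failures with identical availability per turbine.
    """
    return sum(
        comb(installed, k) * availability ** k * (1 - availability) ** (installed - k)
        for k in range(required, installed + 1)
    )


AVAILABILITY = 0.95  # hypothetical per-turbine availability

p_three = prob_enough(3, 3, AVAILABILITY)  # no spare turbine
p_four = prob_enough(4, 3, AVAILABILITY)   # one spare
p_six = prob_enough(6, 3, AVAILABILITY)    # full duplication
```

With these assumptions, three turbines are all available about 85.7% of the time, four turbines provide at least three about 98.6% of the time, and six provide at least three more than 99.99% of the time. Most of the gain comes from the first spare, which is the intuition behind installing four turbines rather than six.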