Computer Room Air Conditionings (CRACs)


Literature review

There has been a lot of research into data centers in the last couple of years, covering multiple ways of saving energy. As mentioned earlier, this thesis focuses on improving the thermal cooling of data centers by improving the control of the cooling equipment. The pre-studies have therefore focused on papers and theses about control of the cooling equipment.
One example is [3], which explores optimization of the server fans, i.e. finding the minimum cost to produce an air flow that still respects the thermal constraints of the server components. The authors develop a control-oriented model of a server; the model is based on the dynamics of the air flows inside the server, and its parameters are identified and validated through Computational Fluid Dynamics (CFD) tools. To calculate the minimum-cost air flow that the server fans should produce, they use a Model Predictive Control (MPC) controller. Their results indicate that a model-based optimal control strategy for improving data center cooling performance is a viable option. This suggests that this type of control could also be applied beneficially to the CRACs. Their methodology for developing a control-oriented model will be used for developing one of our models, with the difference that the model will be estimated and verified using measured data and not CFD tools.
A different approach for deriving a model is used in [4], a paper in which a transient data-driven model for a data center is developed. The model can be used as the basis for model predictive control algorithms. To describe the thermal dynamics of the data center, the authors choose a linear model. A large dataset is then collected from a data center; the first part is used for training the model and the second part for validating it. Their results indicate that data-driven modelling can potentially form the basis for model predictive control. This black-box method of modelling a system will also be examined in this work.
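The train-then-validate procedure described for [4] can be illustrated with a minimal sketch. The first-order model T[k+1] = a*T[k] + b*u[k], the synthetic data, and all function names below are assumptions made for illustration only; they are not taken from [4], which works with a large measured dataset and its own model structure.

```python
def simulate(a, b, u, t0):
    """Generate T[k+1] = a*T[k] + b*u[k] for a given input sequence u."""
    temps = [t0]
    for uk in u:
        temps.append(a * temps[-1] + b * uk)
    return temps


def fit_first_order(temps, u):
    """Least-squares estimate of (a, b) in T[k+1] = a*T[k] + b*u[k]."""
    # Build the sums of the 2x2 normal equations and solve them in closed form.
    s_tt = sum(t * t for t in temps[:-1])
    s_tu = sum(t * uk for t, uk in zip(temps[:-1], u))
    s_uu = sum(uk * uk for uk in u)
    s_ty = sum(t * y for t, y in zip(temps[:-1], temps[1:]))
    s_uy = sum(uk * y for uk, y in zip(u, temps[1:]))
    det = s_tt * s_uu - s_tu * s_tu
    a = (s_ty * s_uu - s_uy * s_tu) / det
    b = (s_uy * s_tt - s_ty * s_tu) / det
    return a, b


def one_step_rmse(a, b, temps, u):
    """Root-mean-square error of one-step-ahead predictions."""
    errors = [(a * t + b * uk - y) ** 2
              for t, uk, y in zip(temps[:-1], u, temps[1:])]
    return (sum(errors) / len(errors)) ** 0.5


# Hypothetical, noise-free data: the input alternates between two supply temperatures.
u = [18.0 if (k // 7) % 2 == 0 else 22.0 for k in range(200)]
temps = simulate(0.9, 0.1, u, t0=25.0)

split = 100  # first part for training, second part for validation
a_hat, b_hat = fit_first_order(temps[:split + 1], u[:split])
rmse = one_step_rmse(a_hat, b_hat, temps[split:], u[split:])
```

With measured data the validation error would not vanish as it does here; its size on the held-out part is what indicates whether the linear structure is adequate.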
Developing, testing and tuning model-based controllers can be a time-consuming task. For this reason, [5] describes a dynamical non-linear data center model at equipment level, intended for quicker testing of cooling control and optimization algorithms.
There have also been master's theses on improving data center cooling strategies. One is [7], which looks into the use of a stochastic MPC for controlling data centers. A stochastic forecaster predicting the unknown future IT loads and a model describing the servers' thermal dynamics are developed.
The control problem is then solved using an MPC scheme. Three different control strategies are simulated on the server model. In the first, only the CRAC output temperature is used as control variable, while the server fan speed is kept constant. In the second, both the server fan speed and the CRAC output temperature are used as control variables. In the third, only the server fans are used as control variable and the CRAC output temperature is kept constant. All three strategies are then compared to a reference case where current practice is used. The simulations show that the stochastic MPC outperforms current practice, both in terms of constraint violations and in terms of power consumption. Much of what the author proposes as future work is what will be done in this thesis: verifying a thermal model of a data center against a real-world data center, and testing model-based controllers experimentally in a real-world data center.
Another thesis looking into data center cooling is [6], which tests LQR and MPC controllers for the CRACs in a simulated data center. A Simulink model of a data center is developed, along with several controllers. The controllers consist of multiple LQRs and a single MPC; some of the LQRs contain an integral term. The different LQR controllers are used to investigate the performance obtained with CRAC air flow, CRAC output temperature, or both as control variables. The controllers are tested using either the maximum server temperature or the average server temperature as reference signal, and are then simulated in different server load scenarios. The results on which controller is best were inconclusive, since different controllers performed best in different scenarios. The author nevertheless suggests a controller that uses both CRAC air flow and CRAC output temperature as control variables for experiments in a real-world data center.
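To make the LQR idea concrete, the following is a minimal sketch for a scalar system x[k+1] = a*x[k] + b*u[k], where x can be read as the deviation of a server temperature from its reference and u as the deviation of a CRAC control variable. The system, the numerical values, and the weights q and r are hypothetical and are not taken from [6]; the integral term used by some of the controllers in [6] is not included.

```python
def scalar_lqr_gain(a, b, q, r, iterations=1000):
    """Iterate the scalar discrete-time Riccati recursion to a fixed point.

    Minimizes the sum of q*x^2 + r*u^2 subject to x[k+1] = a*x[k] + b*u[k].
    Returns the feedback gain k (for u = -k*x) and the Riccati solution p.
    """
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return k, p


def simulate_closed_loop(a, b, k, x0, steps):
    """Apply u = -k*x to x[k+1] = a*x[k] + b*u[k] and return the trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append((a - b * k) * xs[-1])
    return xs


# Hypothetical numbers: a slow thermal pole and a weak input gain.
A, B, Q, R = 0.95, 0.1, 1.0, 0.1
gain, p = scalar_lqr_gain(A, B, Q, R)
trajectory = simulate_closed_loop(A, B, gain, x0=5.0, steps=50)
```

Increasing r relative to q penalizes control effort more heavily and yields a smaller gain, which is the trade-off that is tuned when such controllers are compared.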
The main way in which this thesis differs from what has been done in the past, such as the works mentioned above, is that the model-based control strategies will be tested on an actual data center, with models of the data center identified using data from that same data center.

Models for the thermal dynamics

This section describes the general setup of an air-cooled data center, presents a description of the dynamics of the volume flows and temperatures in this setup, and eventually combines these dynamics in a discrete physical model.
All the notation is collected in Table 1 on page 7.

Graphic description of the considered system

The general structure of most air-cooled data centers is that the servers are situated in racks placed in rows that create different aisles with temperature gradients. Air-cooled servers are typically equipped with fans that cool these servers locally; the built environment moreover contains some CRACs controlling the ambient temperature in the server room. Figure 1 depicts this general structure.


CRACs

As shown in Figure 1, the room can be logically divided into four zones. The first zone corresponds to the one defined by the CRAC units. Consider an air flow $f_{i,in}$ with temperature $T_{i,in}$ entering the top of the various CRACs; the air flow is then altered by the CRAC’s internal fan (whose rotational speed is represented by $u_i$) and cooled down by the CRAC’s cooling coils, whose temperature is $T_i$. This eventually results in the CRACs outputting a new air flow $f_{i,out}$ at the new temperature $T_{i,out}$.

From CRACs to servers

The second zone within the logical division that we made for the computer room is the space between the CRACs and the servers, as shown in Figure 3. Here the air flow $f_{i,out}$ with temperature $T_{i,out}$ exits the CRACs. This air then gets mixed on its way to the servers, and a fraction of the air flow, $f_{j,k,in}$ with temperature $T_{j,k,in}$, enters each server. This fraction is determined by the dynamics described by $s_{i \to j,k}$.
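As an illustration of the mixing in this zone, the sketch below computes the flow and temperature entering one server from the CRAC outputs. It rests on assumptions that go beyond the description above: the fractions are treated as static numbers rather than dynamics, the flows are assumed to add, and the inlet temperature is taken as the flow-weighted average of the CRAC output temperatures (constant air density and heat capacity). The function name and the numerical values are hypothetical.

```python
def server_inlet(crac_flows, crac_temps, fractions):
    """Return (flow, temperature) of the air entering one server.

    crac_flows[i]  : output flow of CRAC i
    crac_temps[i]  : output temperature of CRAC i
    fractions[i]   : share of CRAC i's output flow reaching this server
                     (a static stand-in for the dynamics s_{i -> j,k})
    """
    contributions = [s * f for s, f in zip(fractions, crac_flows)]
    flow_in = sum(contributions)
    # Flow-weighted average temperature of the mixed air.
    temp_in = sum(c * t for c, t in zip(contributions, crac_temps)) / flow_in
    return flow_in, temp_in


# Two hypothetical CRACs feeding one server.
flow_in, temp_in = server_inlet(
    crac_flows=[2.0, 1.0],
    crac_temps=[18.0, 21.0],
    fractions=[0.1, 0.2],
)
```

In this example both CRACs contribute the same amount of air to the server, so the inlet temperature lies midway between the two supply temperatures.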

Servers

The third zone in our logical division is represented by the servers (graphically shown in Figure 4). Here the air flow $f_{j,k,in}$ with temperature $T_{j,k,in}$ enters a server. This flow is then altered by the server's internal fan, whose rotational speed is represented by $u_{j,k}$, and is moreover heated up by the server's internal temperature $T_{j,k}$. The final result is that the servers output a new air flow $f_{j,k,out}$ with a new temperature $T_{j,k,out}$.
Figure 3: Photo of the space that lies between the CRACs and the servers for the system that we used in our experiments. CRACs seen to the left and server racks seen to the right.
Figure 4: A Dell R430 server. We see the 12 cooling fans and the two cooling blocks for the two CPUs.
Figure 5: Photo of the hot aisle of the data center we used in our experiments, where the heated air from the servers rises and returns to the CRACs.

From servers to CRACs

The fourth zone is the space between the servers and the CRACs, as shown in Figure 5. Here the air flow $f_{j,k,out}$ with temperature $T_{j,k,out}$ exiting each server gets mixed on its way towards the various CRACs. The resulting air flow can be represented through its flow $f_{i,in}$ and temperature $T_{i,in}$. In this way the cycle of the air flow is closed.

Standing hypotheses

In our following derivations we assume that:
there are no air leakages within servers and within the computer room (notice that high air leakage leads to greater non-linearity);
at least theoretically, each air flow from each CRAC unit affects each server; the degree of this effect is captured by some parameters that we have to identify from real data.
Notice that some other case-specific assumptions will be introduced when deriving the different models of the thermal dynamics within the data center.

Dynamics of the volumes of the flows

This subsection describes what happens to the volumes of the flows when they pass through the various zones of the data center room. Referring to Figure 1, the volumes of the flows are modified at four specific points:
1. inside the CRACs;
2. from the CRACs to the servers;
3. inside the servers;
4. from the servers to the CRACs.

Table of contents :

1 Introduction 
1.1 Project outline
1.2 Literature review
2 Models for the thermal dynamics 
2.1 Graphic description of the considered system
2.1.1 Computer Room Air Conditionings (CRACs)
2.1.2 From CRACs to servers
2.1.3 Servers
2.1.4 From servers to CRACs
2.2 Standing hypotheses
2.3 Dynamics of the volumes of the flows
2.4 Dynamics of the temperatures of the flows
2.5 Discretization
2.6 ARX model
3 State space representations for the case constant flow
3.1 Notation for State Space (SS) models
3.2 Model 1 – No time delays, air fractions constant in the whole data center
3.3 Model 2 – No time delays, but air fractions depending on the server
3.4 Model 3 – Complete
3.5 Model 4 – ARX model
4 Control strategies design 
4.1 PID Control
4.1.1 General theory
4.1.2 PID controllers for our specific CRAC control problem
4.1.3 P control
4.1.4 PI control
4.2 LQR Control
4.2.1 LQR for our CRAC control problem
4.2.2 LQRi control
5 Results 
5.1 SICS ICE data center
5.1.1 Module 2
5.1.2 Servers
5.1.3 Sensors
5.2 Description of the native CRAC control strategy
5.3 Limitations in control
5.4 System identification
5.5 In silico tests
5.6 Field experiments
5.6.1 Open loop
5.6.2 P control
5.6.3 PI control
5.6.4 LQR control
5.6.5 LQRi control
6 Discussion 
6.1 Conclusions
6.2 Future works
A Appendix
A.1 Derivation of model a case specific time-delay (5)
A.2 Solution to the LQR-problem via Batch approach
A.2.1 Following reference signals
A.3 Data collection/closing the control loop
A.3.1 Data collection required for system identification
A.3.2 Data collection required for system control
A.3.3 OPC
A.3.4 Modbus
A.3.5 SNMP
