Data centres mean different things to different people. For the IT manager, a data centre is a space to host servers, routers, network firewalls and other equipment; for the facilities manager, the focus is on space, security, utilities, power and cooling.
Building a new data centre brings a number of challenges. Having audited a large number of data centres from both an engineering and operational perspective, I have pulled together what I think are the ten most important things to get right.
1. Planning

As with every major project, planning is essential. Design your data centre to accommodate future growth, not just in terms of space but also in its power and cooling systems; a modular approach may be the most effective. Don’t ignore swing space for when old systems are replaced by new ones: both will need space, power and cooling to operate alongside each other until the old can be decommissioned and, ideally, removed.
Consider what the operational costs will be and get the balance right – lower-priced equipment may use more power and/or require more cooling, making it far more costly over its lifetime than the higher-priced alternative. Give due consideration to ongoing equipment maintenance requirements. Cheapest is not always ideal; as Neil Armstrong once joked, “all the mechanical parts in the Saturn V were made by the lowest bidder.”
2. Location

If you are debating whether to adapt an existing building or build from new, you may find that adaptation costs more and offers less scope for growth than a new build. Ensure you carry out a full due diligence review, including the adjacent threats and risks to the day-to-day operations of a facility in that location.
Cooling is an essential part of keeping a data centre operational. Placing your data centre in a cold and dry climate will help with energy efficiency, which must have been a factor in Microsoft’s decision to locate its new data centre in Finland.
A subterranean location can also help with temperature by reducing solar gains, but this may be counterbalanced by an increased risk of flooding, as well as problematic access for removing failed plant from below ground. Avoiding areas prone to flooding or near fault lines is sensible!
3. Project management
Given the critical nature of the build and fit-out, first-rate project management is essential (and Riskenomics is extremely good for this!). Make sure that one of the first things a project manager asks for, and continuously follows up on, is a detailed testing and commissioning programme.
It is essential to get this planned from the start so that the various testing scenarios can be considered and included in the programme – in my experience, the time allowed for commissioning is never sufficient. The commissioning critical path will also tend to drive the installation timing, as certain aspects of the installation need to be in place and linked to the commissioning before the next stage can progress.
4. Redundancy

Depending on the criticality of the facility, having two incoming power supplies is normally considered best practice, but it may not be appropriate in all circumstances: two fully diverse power supplies can be prohibitively expensive.
When deciding the appropriate level of redundancy, balance the cost to the business of system failure against the cost of risk mitigation. Evaluate the business objectives and requirements and get business agreement to the level of systems failure the business can tolerate and what levels of continuous support it needs. Design the systems to have the right levels of system redundancy to meet the business objectives.
Prepare failure scenarios to prove the operational integrity of the design against the business needs. Dependency models in Riskenomics work very well in demonstrating this and can be used for desktop training.
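The cost-of-risk balance described above can be sketched as a simple comparison of expected annual downtime losses against the annualised cost of the mitigation. All figures and scenario names below are illustrative assumptions, not from the source:

```python
# Illustrative comparison of expected annual downtime cost against the
# annualised cost of a redundancy upgrade. All figures are assumptions.

def expected_annual_downtime_cost(outages_per_year, hours_per_outage, cost_per_hour):
    """Expected yearly loss from system failure."""
    return outages_per_year * hours_per_outage * cost_per_hour

# Hypothetical scenario: single power supply vs. N+1 redundancy.
single_supply = expected_annual_downtime_cost(outages_per_year=0.5,
                                              hours_per_outage=8,
                                              cost_per_hour=25_000)
n_plus_1 = expected_annual_downtime_cost(outages_per_year=0.05,
                                         hours_per_outage=4,
                                         cost_per_hour=25_000)

annualised_upgrade_cost = 60_000  # extra capital + maintenance per year

saving = single_supply - n_plus_1
worthwhile = saving > annualised_upgrade_cost
print(f"Annual saving from redundancy: {saving:,.0f}")
print(f"Upgrade justified: {worthwhile}")
```

With these assumed numbers the redundancy pays for itself; with a lower cost per hour of outage it might not, which is exactly the business agreement the text says you need to secure.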
5. Power and efficiency
As operational costs far exceed build costs over the life of the facility, keeping your PUE (power usage effectiveness) as low as possible will generate ongoing savings.
Keeping power usage as low as possible for the engineering systems that support the IT equipment is the goal of a good design. It is very unlikely that you will be able to match the likes of Google or Microsoft, who can throw vast sums of money at the problem and take different business risks because they have highly diversified operational platforms from which to operate.
Always strive for the lowest PUE value you can get from your design, given the constraints of the site from a space and cost perspective. If you can get below 1.7 in practice, this will generally be a good result.
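PUE is simply the total facility power divided by the power delivered to the IT equipment, so a value closer to 1.0 is better. A minimal worked example, with hypothetical loads:

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power usage effectiveness: total facility power / IT equipment power.
    1.0 is the theoretical ideal; lower is better."""
    if it_equipment_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_equipment_kw

# Hypothetical facility: 1,200 kW of IT load plus 800 kW of cooling,
# power distribution losses and ancillary load.
value = pue(total_facility_kw=2_000, it_equipment_kw=1_200)
print(f"PUE = {value:.2f}")  # PUE = 1.67
```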
6. Cooling

The cooling systems will consume a large part of your facility’s power, so it makes sense to choose systems that are as efficient as possible. Where possible, try to get the design to incorporate free cooling. Many studies have shown that, with the right design solution, free cooling can be used for most of the year in the UK; it depends how comfortable you are with this design solution from a business risk perspective.
Cooling requirements can be calculated across the whole space used. Discuss with IT how they plan to deploy their equipment and what loads they are likely to require in their racks, and consider physically dividing the data centre into sections to cater for different rack loads. This enables the system design to be tailored to each zone, making the systems more efficient and cheaper to run. You can also use thermal zone mapping to view 3D images of the hot and cold zones so you can position equipment optimally.
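The zoning idea can be sketched as aggregating per-rack IT loads (supplied by the IT team) into zones and sizing each zone’s cooling separately. Zone names, rack loads and the design margin below are illustrative assumptions:

```python
# Sketch: size each cooling zone to its own aggregated rack load rather
# than spreading one worst-case figure across the whole floor.
from collections import defaultdict

racks = [
    ("zone-A", 4.0), ("zone-A", 5.5), ("zone-A", 4.5),  # standard racks, kW
    ("zone-B", 18.0), ("zone-B", 22.0),                 # high-density / HPC racks
]

zone_load_kw = defaultdict(float)
for zone, kw in racks:
    zone_load_kw[zone] += kw

DESIGN_MARGIN = 1.2  # assumed headroom above the IT load
for zone, load in sorted(zone_load_kw.items()):
    print(f"{zone}: IT load {load:.1f} kW -> cooling capacity {load * DESIGN_MARGIN:.1f} kW")
```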
High rack loads, from both a power and a weight perspective, are a feature of HPC (high-performance computing) and will require in-rack cooling. The whole floor, and the access route in, needs to be able to take the high weights expected. At one time water in a data centre was strictly forbidden, but HPC and similar high-density equipment is driving a change that needs to be accommodated.
7. Security

Data centre security, to prevent unauthorised access, is essential, as is the physical security of the site.
Control access to the building – minimise the number of access points and use tools such as retractable bollards and vehicle locks to police vehicle access – and secure the perimeter. Construct external walls of suitable strength to resist impact by vehicles and other means of unauthorised entry where needed. Consider installing steel grids within large supply and extract ductwork to prevent man access.
Sophisticated biometric readers are now available to help control access to all areas within the data centre, and all should be reviewed and considered on their merits. CCTV and physical security guards will also be a must-have.
And it is not a good idea to advertise that the building is a data centre for the company – don’t make it too easy for those looking to cause trouble!
8. Compliance

There will be statutory regulations that your data centre will have to comply with, especially around fire protection and prevention. Measures can range from physical firewalls built around the centre to sprinklers and clean-agent fire suppression gas. Remember that the regulations may differ from country to country, so local expertise should be involved.
Make sure all statutory compliance requirements are fully recorded and tested with all documentation in place and continue with that philosophy throughout the life of the DC. You can use Riskenomics to manage this very effectively to avoid missing expiry dates.
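The expiry-date tracking mentioned above amounts to flagging any statutory test or certificate that has lapsed, or will lapse within a warning period. A minimal sketch, with hypothetical compliance items and dates:

```python
# Sketch of tracking statutory test and certificate expiry dates so
# renewals are flagged before they lapse. Items and dates are hypothetical.
from datetime import date, timedelta

compliance_items = {
    "fire suppression system annual test": date(2025, 3, 1),
    "emergency lighting certificate": date(2024, 11, 15),
    "electrical installation inspection": date(2027, 6, 30),
}

def items_due(items, today, warning_period=timedelta(days=90)):
    """Return items already expired or expiring within the warning period."""
    return sorted(name for name, expiry in items.items()
                  if expiry <= today + warning_period)

for name in items_due(compliance_items, today=date(2025, 1, 10)):
    print(f"Action required: {name}")
```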
9. Lights-out operation

Some data centres are operated on a “lights out” basis, in which case many processes, such as provisioning, configuration and release management, will need to be automated as far as possible and remotely controlled where human intervention is required. This makes monitoring and management information even more critical.
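The kind of automated monitoring a lights-out facility relies on can be sketched as polling readings, comparing them against thresholds and raising alerts for remote intervention. The metric names and limits below are assumptions, not from the source:

```python
# Sketch of an automated environment check for a lights-out facility:
# compare readings against thresholds and raise alerts for remote action.
# Metric names and limits are illustrative assumptions.

THRESHOLDS = {  # acceptable (low, high) range per metric
    "supply_air_temp_c": (18.0, 27.0),
    "humidity_pct": (20.0, 80.0),
}

def check_readings(readings):
    """Return a list of alert strings for out-of-range readings."""
    alerts = []
    for metric, value in readings.items():
        low, high = THRESHOLDS[metric]
        if not low <= value <= high:
            alerts.append(f"{metric}={value} outside {low}-{high}")
    return alerts

sample = {"supply_air_temp_c": 29.5, "humidity_pct": 45.0}
for alert in check_readings(sample):
    print("ALERT:", alert)
```

In practice such checks would feed a monitoring platform rather than print to a console, but the threshold-and-alert logic is the same.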
10. Management of information
There is a great deal of information generated in a data centre. Different information and detail will be required by the data centre manager, the IT manager and the facilities manager, plus high-level strategic information at C-level.
DCIM (data centre infrastructure management) is one solution to connect all the different systems and provide the range and scope of management information required. However, DCIM is still in its infancy, can be costly, and there is not yet any standard for implementation, so many organisations are relying on existing, often separate, systems. This is where Riskenomics really comes into its own, incorporating live feeds from other systems and providing highly visual information via dashboards and email alerts.