What Can AI Teach Us About Data Centers? Part 3: Economic, Political, and Environmental Concerns, and New Developments

ODSC - Open Data Science
18 min read · Jul 25, 2023

Environmental, political, and economic concerns can easily derail or complicate decisions about how to invest in or use data center technology. In this article, I report my interactions with ChatGPT to learn more about these concerns. Some of the environmental and political issues described are also related to security, so we start the interaction there. Security and environmental concerns will influence how data centers are likely to evolve over the next decade, as ChatGPT also points out below.

Next, like every technology, data centers may become obsolete at some point, so I asked ChatGPT to speculate about how they may evolve.

Several political, economic, and environmental issues about data centers are controversial. Data scientists and business leaders should consider these before investing. ChatGPT found some of these controversies, and I expand on that later in the Discussion section.

While questions about these topics complete my three-part series with ChatGPT about data centers, keep in mind that recommendations for how to use ChatGPT and other large language model (LLM)-based chatbots continue to evolve. I therefore add some guidance that users can adopt to make sure they get the most out of their interactions with these chatbots.

Our Conversation

The following questions and answers are copied nearly verbatim from my interactions with ChatGPT. I did not correct any logic or substantive errors in this section, so you can see what to expect from this collaboration. I did correct a few spelling and grammar mistakes without changing the meaning of any responses. My questions are in bold italics, and ChatGPT’s answers follow.

What are the major security and other concerns associated with data centers?

Data centers have a number of security and other concerns that must be addressed in order to protect the data and systems they house. Some of the major security and other concerns associated with data centers include:

1. Physical security: Data centers are often targeted by thieves and other criminals, who may attempt to steal equipment or cause damage to the facility. Physical security measures such as biometric access controls, security cameras, and fire suppression systems are used to protect data centers from physical attacks.

2. Cybersecurity: Data centers are also vulnerable to cyber-attacks, including hacking, malware, and denial of service attacks. These attacks can result in the loss of sensitive data, as well as the disruption of critical systems. Data center operators must implement robust cybersecurity measures such as firewalls, intrusion detection systems, and encryption to protect against cyber-attacks.

3. Compliance: Data centers must comply with a variety of regulations and standards, such as HIPAA, SOC 2, and PCI DSS, which dictate how data should be stored, processed, and transmitted. Data center operators must implement policies and procedures to meet these regulatory requirements and be prepared to undergo regular audits.

4. Power and cooling: Data centers require a constant supply of power and cooling to keep the equipment running. Power outages, air conditioning failures, and other cooling issues can result in equipment damage and data loss. Data center operators must implement backup power and cooling systems to ensure the availability of the data center.

5. Environmental risks: Data centers are also vulnerable to natural disasters and other environmental risks such as floods, earthquakes, storms, and fires. These risks can cause damage to the facility and equipment, resulting in data loss and system downtime. Data center operators must implement disaster recovery and business continuity plans to minimize the impact of these risks.

6. Network security: Data centers need to be properly connected to the rest of the organization, as well as to the internet, and they need to be protected from unauthorized access or attacks. Network security includes the use of firewalls and intrusion detection systems.

What are the major cybersecurity threats for a data center?

Data centers, by virtue of being repositories of vast amounts of sensitive and valuable information, are prime targets for cyber-attacks. Here are some of the major cybersecurity threats they face:

1. Distributed Denial of Service (DDoS) Attacks: DDoS attacks are designed to overload a network with excessive traffic, causing servers to crash and making the network inaccessible to legitimate users. Such attacks can disrupt services and cause significant downtime.

2. Ransomware: This type of malware encrypts data and holds it hostage, with the attackers demanding a ransom in exchange for a decryption key. If a data center is hit with ransomware, it could potentially lose access to the data it hosts, which can be disastrous.

3. Insider Threats: Disgruntled employees, or those with malicious intent, can cause significant harm. They might misuse their access privileges to steal data, corrupt systems, or introduce malware. In some cases, they might inadvertently cause damage by falling for phishing attacks or by not following proper cybersecurity protocols.

4. Physical Security Breaches: While not a cyber threat in the traditional sense, physical security breaches can lead to significant cyber-related consequences. Unauthorized access to the data center could allow for the theft or damage of hardware, data tampering, or even the manual shutdown of servers.

5. Advanced Persistent Threats (APTs): These are sophisticated, long-term attacks where the attacker gains access to the network and remains undetected for a long time. The goal of APTs is typically to steal information over a prolonged period, rather than to cause immediate damage.

6. Supply Chain Attacks: These occur when an attacker infiltrates the data center through a third-party supplier with weaker security. For instance, they might compromise a software update from a trusted vendor to gain access to the data center’s systems.

7. Zero-Day Exploits: These are attacks on software vulnerabilities that are unknown to the software vendor, and therefore unpatched. These “zero-day” vulnerabilities can be exploited by attackers to gain unauthorized access or cause disruptions.

To protect against these threats, data centers typically employ a range of security measures, including firewalls, intrusion detection systems, regular system updates and patches, access controls, encryption, physical security measures, and employee training programs. Additionally, many data centers have disaster recovery and incident response plans in place to mitigate the impact of any successful attacks.

Photo by Marco Oriolesi on unsplash.com

What are the political and economic issues about data centers?

There are several political and economic issues related to data centers, including:

1. Data sovereignty: Data sovereignty refers to the legal and regulatory framework that governs the collection, storage, and use of data. Some governments have laws and regulations in place that require certain types of data to be stored within their borders, while others have laws that prohibit the export of certain types of data. This can create challenges for companies that operate data centers in multiple countries.

2. Internet censorship: Some governments have laws and regulations in place that restrict access to certain websites or types of content. This can create challenges for companies that operate data centers in those countries, as they may be required to comply with those restrictions.

3. Cybersecurity: Data centers store large amounts of sensitive information and are vulnerable to cyber-attacks. Governments have implemented regulations to ensure data centers meet certain security standards to protect sensitive information.

4. Energy consumption: Data centers require large amounts of electricity to power and cool the equipment. This can lead to economic and environmental concerns, such as the cost of electricity and the carbon footprint of the data center.

5. Job creation: Data centers can create jobs in the areas where they are located, but they can also lead to the displacement of jobs in other sectors, such as in manufacturing and retail.

6. Competition: Data centers can create intense competition among companies, as they are constantly looking to build bigger and more efficient data centers to attract customers and gain a competitive edge.

7. Privacy: Data centers store large amounts of personal information, which can raise concerns about privacy and data security. Governments have regulations in place to protect the privacy of individuals and their data.

Are cities and states providing tax breaks for data centers that are too high?

Cities and states may offer tax breaks and other incentives to attract data center companies to their locations. These incentives can include property tax exemptions, sales tax exemptions, and abatements, as well as grants and other forms of financial assistance. The reasoning behind offering these incentives is that data centers can bring jobs and economic development to a community.

However, some critics argue that these tax breaks and incentives are too high and that they result in a net loss for the local government, as the revenue generated by the data center may not be enough to offset the lost revenue from the tax breaks. Additionally, some critics argue that these incentives are not always distributed evenly and that they are disproportionately awarded to larger companies, while smaller companies and entrepreneurs may not have access to them.

On the other hand, some governments and economic development agencies argue that these incentives are necessary to compete with other regions and to attract data centers to their locations, as they believe they bring economic benefits to the community, such as job creation and increased tax revenue.

Ultimately, whether or not the tax breaks and incentives offered by cities and states are too high is a matter of opinion and depends on the specific circumstances of each case. It’s important for governments to consider the potential long-term economic benefits and drawbacks before offering these incentives.

What are the major environmental concerns about data centers?

Data centers have several environmental impacts, some of which are more prominent than others. Here are some of the major environmental concerns related to data centers:

1. Energy Consumption: Data centers are massive energy consumers. As per some estimates, they account for about 1% of global electricity use. The servers, cooling systems, power distribution equipment, and auxiliary systems all consume significant amounts of electricity. This high level of energy consumption contributes to greenhouse gas emissions, particularly when the electricity is produced from fossil fuels.

2. Water Use: Many data centers use water cooling systems to manage the heat generated by servers. This can lead to significant water use, which is a concern in regions where water scarcity is an issue. Furthermore, the hot water or steam released into the environment can affect local ecosystems.

3. Electronic Waste: Data centers are frequently updated with new equipment to keep up with the latest technology and increased demand for storage and processing power. This leads to a significant amount of electronic waste (e-waste), which is often not disposed of properly. E-waste contains toxic elements like lead, mercury, and cadmium that can pollute the environment and harm human health.

4. Heat Emission: Data centers generate a lot of heat, which needs to be managed to keep the servers and other equipment operating optimally. If not managed properly, this heat can contribute to localized warming and potentially exacerbate urban heat island effects.

5. Land Use: Data centers require large amounts of physical space, especially considering the need for infrastructure redundancy and future expansion. This can contribute to land use pressures in densely populated areas or lead to the conversion of natural areas, impacting local ecosystems.

6. Chemical Use: To prevent fires, data centers often use fire suppressant systems that rely on chemical agents. These chemicals can have environmental and health impacts if they’re not managed and disposed of properly.

To mitigate these impacts, data centers are increasingly turning to renewable energy, implementing energy-efficient technologies, designing innovative cooling methods that reduce water use, and improving e-waste management. However, as data consumption and the demand for cloud services continue to increase, these environmental concerns will continue to be important challenges.

What about noise pollution — is that a concern about data centers too?

While noise pollution is generally less of a concern with data centers than energy consumption, water usage, and electronic waste, it can still be an issue under certain circumstances.

The major sources of noise in a data center are the HVAC (heating, ventilation, and air conditioning) systems, power equipment, and server fans, all of which are designed to keep the servers cool and the electronic components running smoothly. Typically, these systems are enclosed in sound-dampening infrastructure to minimize noise pollution.

Most data centers are located in industrial zones, away from residential areas, reducing the impact of any noise generated. However, if a data center is located near a residential area or a sensitive ecological habitat, noise could potentially be a concern.

In these cases, noise pollution could affect local wildlife or disturb people living nearby, particularly if the noise is continuous and at a high level. Therefore, it is important to consider noise control measures during the design and site selection process for data centers.

Some strategies for mitigating noise pollution from data centers could include soundproofing equipment, maintaining an appropriate distance from sensitive areas, and implementing noise barriers or acoustic fencing. It’s also essential to adhere to local noise pollution regulations and standards.

Photo by Ehimetalor Akhere on unsplash.com

How are data centers likely to evolve in the next 5–10 years?

Data centers are likely to evolve in several ways over the next 5–10 years:

1. Increased Automation and Artificial Intelligence (AI): Automation and AI will be increasingly used to optimize data center operations, such as monitoring and management of the infrastructure, power, and cooling. This will allow for more efficient use of resources and improved reliability.

2. Edge computing and IoT: With the growth of IoT devices and the increasing amount of data generated, there will be a growing need for edge data centers. Edge data centers are located closer to where data is generated and consumed and will be used to process and analyze data in real time, reducing the need for centralized data centers.

3. Increased use of modular designs and prefabricated data centers: With the need for more data centers and the increasing demand for more energy-efficient and sustainable data centers, prefabricated data centers and modular designs will become more common. These designs allow for faster deployment and the ability to expand capacity as needed.

4. More use of renewable energy: Data centers consume a significant amount of energy, and many are looking to increase the use of renewable energy sources, such as solar and wind power, to reduce their carbon footprint and operating costs.

5. Emphasis on security: Data center operators will increasingly focus on security, with emphasis on physical security, network security, and cyber security. They will look for ways to improve the security of the data center itself and the data it contains, to meet the growing concern of cyber threats.

6. Greater adoption of the cloud: With the growth in cloud computing, data centers will increasingly be used to support cloud services, and more companies will adopt cloud services and move their data centers to the cloud. This will enable them to scale up or down their data center capacity as needed, and gain access to the latest data center technology without having to invest in building or upgrading their own data centers.

7. More energy-efficient and sustainable designs: With the increasing pressure to reduce energy consumption, data centers will adopt more energy-efficient and sustainable designs, including improved power and cooling systems, and the use of renewable energy sources.

It’s important to note that these are just a few examples, and the developments may vary depending on the region and the industry. The data center industry is a rapidly evolving field, and it’s likely that new technologies and trends will emerge in the next 5–10 years.

What would make having a data center obsolete?

Several factors could make having a data center obsolete, including:

1. Advancement in Cloud Computing and Edge Computing: With the increasing popularity of cloud computing, more and more organizations are turning to cloud service providers such as Amazon Web Services, Microsoft Azure, and Google Cloud Platform to store and process their data. With the increasing adoption of 5G networks and edge computing, the ability to process data at the edge of the network, closer to the source, becomes more feasible. This can make maintaining a data center less necessary for some companies.

2. Lower Costs and Increased Efficiency of Cloud Services: Cloud services offer a more cost-effective and efficient way to store and process data, as they allow companies to pay only for the resources they use, and they do not have to worry about maintaining their own data center infrastructure. This can make it less necessary for companies to have their own data center.

3. Internet of Things (IoT) and the Edge: IoT is making it possible for many more devices to be connected, generate, and process data at the source. This means that data can be processed at the edge of the network, near the device itself. This can eliminate the need for a central data center for some use cases.

4. Artificial Intelligence and Machine Learning: The ability to deploy AI and machine learning models at the edge, together with improvements in technology that allow them to process large amounts of data more efficiently, can make it less necessary for companies to have a data center.

5. Development of new technologies: Other technologies, such as quantum computing and others yet to be developed, may also make having a data center unnecessary.

It’s worth noting that while some companies may not need their own data center, they may still rely on data centers, either through cloud services or through co-location arrangements. Data centers will continue to play a role in maintaining and protecting critical data and services for many organizations, but the way data centers are implemented, and the way companies use them is evolving.

Limitations

The major limitations associated with ChatGPT have been noted in Parts 1 and 2 of this series, overall and as pertaining to data center use. A very important one is that large language models are largely retrospective in nature; they are trained on published written material. With ChatGPT that means training was based upon material from September 2021 or earlier. While one can still use these models to create new information or speculate about the future, the most recent events of interest may not contribute to such creative or speculative exercises because ChatGPT won’t know about them yet.

Thus, a key lesson when using ChatGPT is to supplement it with other, more recent source material to build a more complete knowledge base.

Fortunately, the ongoing development of LLMs may help get around this limitation. For example, the LLM underpinning the newest version of another chatbot, Claude 2, is trained on data going into early 2023 and can be used with internet search tools as well. Claude 2 is also less likely to hallucinate than ChatGPT but is not perfect in that regard either. In addition, Google’s new chatbot (Bard) connects with the web, and so does Pi; both may generate insights based on more recent information.

More limitations of ChatGPT have been described by Dr. Mary Reagan. These include transparency problems arising from the huge corpus of knowledge ChatGPT is trying to summarize. That huge corpus sometimes leads to hallucinations (i.e., stuff ChatGPT makes up out of whole cloth) which are hard to spot.

Another limitation Dr. Reagan pointed out relates to ChatGPT’s environmental impacts. Since its underlying deep learning model, GPT-4, is rumored to have trillions of parameters, ChatGPT has a very large carbon footprint associated with the huge amount of electricity required to support it. The data and computations required to run ChatGPT use the services of several data centers around the globe.

Regarding data centers, another limitation is that ChatGPT did not comment on location issues related to local or US history (though I did not ask about that specifically either). For example, a recent article in The Guardian describes frustrations about how plans for a large data center may encroach on the Civil War battlefield in Manassas, Virginia. Firms thinking about building or investing in data centers should search for and investigate tangential issues like this that may be quite vexing or costly to address.

This historical concern is an example of a larger issue that is not really ChatGPT’s fault; it’s mine. I took ChatGPT down the data center road of greatest interest to me. Your data center interests may differ, so the road you take with ChatGPT may differ as well. Keep in mind that chatbots will respond to your guidance; send them down the paths of most interest to you. Don’t rely on them to cover the waterfront on their own since their LLMs may not know which waterfronts you fully want to explore. Guide GPT where you want to go and continue prompting it to provide more useful information if it doesn’t from the get-go.

How to Tailor Your Search

Matthew Mayo recently released a guide showing how to tailor conversations with ChatGPT to get the most value from your search. He offers several principles for “prompting” or “prompt engineering” and covers the basics of this topic. Then he moves on to advanced prompt engineering strategies and provides “power tips” for using ChatGPT. These should work well with other LLM-based chatbots too, such as Bard, Claude 2, and others coming soon.

The basic topics Mayo covered describe the value of being succinct and specific with search queries. He also suggests creating roles for ChatGPT to play and providing information for it to consider about the goals you want to achieve with your interaction. Positively and negatively worded prompts also can help frame the output ChatGPT will provide.

Mayo then describes another nine ways to prompt ChatGPT that vary in complexity. The ones I used in this series were simpler ones such as zero-shot or one-shot prompting, iterative prompting, and chain-of-thought prompting. These provided varying degrees of specificity about data centers. An important lesson for users is that the more detailed and illustrative you can be as you type queries into ChatGPT, the more focused its answers will be. At some point in the future though, AI may be able to generate its own prompts iteratively, based upon initially well-crafted human guidelines.
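To make the prompting styles mentioned above more concrete, here is a minimal Python sketch of how a zero-shot prompt differs from one that adds a role, a single example (one-shot), and a request for step-by-step reasoning (chain-of-thought). It only builds the text you would paste into a chatbot; it does not call any chatbot service. The function name build_prompt, its parameters, and the exact wording of each prompt part are my own illustrative assumptions, not something taken from Mayo's guide.

```python
from typing import Optional, Tuple


def build_prompt(
    question: str,
    role: Optional[str] = None,
    example: Optional[Tuple[str, str]] = None,
    step_by_step: bool = False,
) -> str:
    """Assemble a plain-text prompt from optional parts (hypothetical helper)."""
    parts = []
    if role:
        # Role prompting: tell the chatbot whose perspective to take.
        parts.append(f"You are {role}.")
    if example:
        # One-shot prompting: show a single question/answer pair as a pattern.
        sample_q, sample_a = example
        parts.append(f"Example question: {sample_q}\nExample answer: {sample_a}")
    parts.append(f"Question: {question}")
    if step_by_step:
        # Chain-of-thought prompting: ask for the reasoning before the conclusion.
        parts.append("Explain your reasoning step by step before giving a final answer.")
    return "\n\n".join(parts)


# Zero-shot: the bare question, with no role and no example.
zero_shot = build_prompt("What are the major environmental concerns about data centers?")

# One-shot with a role and a request for step-by-step reasoning.
detailed = build_prompt(
    "Are cities and states providing tax breaks for data centers that are too high?",
    role="an economist advising a city council",
    example=(
        "What are the political and economic issues about data centers?",
        "1. Data sovereignty: some governments require data to be stored within their borders.",
    ),
    step_by_step=True,
)

print(zero_shot)
print("---")
print(detailed)
```

Iterative prompting is not shown in the sketch; in practice it simply means reading the answer and sending a follow-up question that narrows or redirects it, as I did with the noise pollution question above.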

Other New Developments

In addition to prompting advice, a few more developments now make ChatGPT easier to use. Two examples include:

– Shared links — Quoting from the OpenAI help website, “shared links are a new feature that allows users to generate a unique URL for a ChatGPT conversation, which can then be shared with friends, colleagues, and collaborators. Shared links offer a new way for users to share their ChatGPT conversations, replacing the old and burdensome method of sharing screenshots.” Thus, if you want to share or collaborate with others on your data center search, the shared links function can be a great help.

– Plug-ins — A new feature of ChatGPT is its Web Browsing mode. According to Mary Newhauser, a senior data scientist at Wiley, the web browsing plug-in “allows you to conduct research and search the internet without using a search engine and clicking on links. The purpose of this new functionality is to streamline your search process and internet browsing experience, cutting out manual tasks like vetting results, clicking on links, and scanning those web pages for the desired information.” This web browsing feature provides an interactive experience while surfing the web, helping to get around the limitation of old source data mentioned above. More plug-in features are described in her article as well, along with their pros and cons.

Finally, interactive chats are not the only useful features supported by large language models. Many more are described here.

Conclusion

Large language models such as the one which underpins ChatGPT continue to evolve. The resulting competition among them should increase chatbot utility and decrease their subscription or other costs. I focused on ChatGPT for this series because it was the first easy-to-use LLM-derived offering. Arguably it is still King of the Hill, though more climbers are coming and legal and regulatory issues may complicate its use at some point.

Users who want to dig deep into any topic may at some point find value in querying multiple chatbots or in supplementing chatbots with other AI techniques designed to produce images as well. DALL-E is OpenAI’s image generator. Here is the picture it produced when I typed “a large data center” into its query box.

Image by DALL-E using artificial intelligence to generate a picture from text input

Finally, as I noted in Parts 1 and 2 of this series, readers are advised to use multiple platforms to find information about any topic of interest to them. AI’s utility for data centers is no exception. Looking beyond ChatGPT for information about its pros and cons and for information about your subject matter of interest will generate a more complete knowledge base, leading to better decision-making. Despite its limitations, ChatGPT is still likely to be a helpful tool because of its ability to provide several quick and reasonably detailed insights, as it did regarding data centers. It won’t be perfect, but what query or knowledge-gathering process ever is?

Article by Ron Ozminkowski, PhD and ChatGPT

Ron is an internationally recognized consultant, analytics leader, and chief scientist whose published research has been viewed by people in over 90 countries. Educated in the SUNY system and The University of Michigan. Specializing in healthcare analytics, business strategy, economics, policy, and program evaluation. Founder & President of Analytics Strategy & Consulting LLC. Known for exceptional service to clients in The White House, federal and state agencies, life sciences companies, large employers and insurers, industry groups, and health systems. Over 100 endorsements each for leadership, analytic skills, and industry knowledge, on LinkedIn (see https://www.linkedin.com/in/ronozminkowski/). More info also can be found on https://www.analyticstrategiesconsulting.com and ronozminkowski.wordpress.com.

Originally posted on OpenDataScience.com

