South African organisations are facing mounting pressure to reduce their power consumption, a challenge that’s growing more complex by the day. With the National Energy Regulator of South Africa (Nersa) revising its price determination, electricity tariffs are now set to rise by 8.8% annually for the next two years, nearly triple the anticipated inflation rate. This sharp increase comes at a time when businesses are already grappling with intensified Environmental, Social, and Governance (ESG) reporting requirements, carbon pricing mechanisms, and investor-driven sustainability expectations. Add to that South Africa’s commitment to achieving net zero emissions by 2050, and the urgency to rethink energy strategies becomes undeniable.
To complicate matters, the very technologies that promise operational efficiency often add to the strain on energy supply. AI adoption in South Africa has reached global parity, with 72% of employees using it several times a week. From predictive analytics to generative tools like ChatGPT, which alone consumes over 500,000 kilowatt-hours daily, AI’s energy footprint is becoming impossible to ignore.
To lower costs, meet sustainability targets and stave off the worst effects of climate change, organisations must reduce energy use, increase reliance on renewable and carbon-free energy sources, find ways of sequestering carbon or capturing it before it escapes into the atmosphere, and figure out how to reuse resources and eliminate waste. In short, a holistic approach is necessary to achieve sustainable AI.
Four strategies for more energy-efficient AI
Researchers are currently investigating how resources can be used efficiently, whether AI workloads run in the cloud, on supercomputers or in on-premises data centres. Different types of infrastructure run certain workloads more efficiently than others, so the question becomes how a given set of workloads should be divided: some may be best suited to the cloud, others to a supercomputer or an on-premises data centre.
Analogue accelerators – For decades, digital circuits have been the preferred option. They’re fast, powerful and able to process enormous amounts of data in record time. But with widespread AI use, those digital circuits have run into a limitation: they need massive amounts of power to operate. As the saying goes, what’s old is new again. Analogue circuits, composed of components like resistors, capacitors and inductors that operate in the analogue domain, are a promising alternative for reducing energy consumption. Instead of using a binary system of zeros and ones, analogue circuits replace binary logic with a range of continuous signals. When constructed with components like memristors that can store data, data movement between memory and the accelerator is reduced, which in turn reduces energy consumption. Special-purpose accelerators that target specific workloads are generally more efficient than general-purpose ones. This different approach achieves the same goal, but it can yield significant improvements in energy consumption at the chip level.
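The principle of computing with continuous physical quantities can be illustrated with a toy model of a memristor crossbar, where a matrix-vector multiply emerges from Ohm’s and Kirchhoff’s laws rather than from binary logic. This is a hedged sketch, not any vendor’s design; the conductance and voltage values below are arbitrary.

```python
# Toy model of an analogue crossbar array: each cell stores a weight as a
# conductance G (siemens). Applying voltages V to the rows makes each column
# current equal to the dot product sum(G * V) -- Ohm's law per cell,
# Kirchhoff's current law summing down the column.
def crossbar_multiply(conductances, voltages):
    """conductances: rows x cols matrix of cell conductances (S);
    voltages: per-row input voltages (V); returns per-column currents (A)."""
    cols = len(conductances[0])
    currents = [0.0] * cols
    for row, v in zip(conductances, voltages):
        for j in range(cols):
            currents[j] += row[j] * v  # I = G * V, accumulated per column
    return currents

# A 2x3 weight matrix encoded as conductances, driven by two input voltages:
weights = [[0.1, 0.2, 0.3],
           [0.4, 0.5, 0.6]]
inputs = [1.0, 0.5]
print(crossbar_multiply(weights, inputs))
```

In real hardware the multiply-accumulate happens in a single physical step, with no data shuttled between a memory bank and an arithmetic unit, which is where the energy saving comes from.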
Digital twins – A virtual representation of a physical system, like a cloud, supercomputer or on-premises data centre. This virtual representation stays up to date and can be used for real-time optimisation of the physical system. Digital twins are generally classified according to how they interact with the physical system. The most elemental form of a digital twin is a simulation of the physical system: for example, a simulation of the cooling infrastructure in a data centre, used when designing the facility’s layout. These types of “twins” have been used for decades in engineering. If the simulation exchanges data with the physical system to remain current, for example by sensing the power consumption of the physical components in a data centre and updating the model accordingly, it can evolve with the physical system and can be used to, among other things, maintain the operational efficiency of the system.
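A minimal sketch of the second kind of twin described above, a model kept current by sensor data, might look like the following hypothetical class, which tracks a running estimate of a rack’s power draw and measures drift from the design model (all names, figures and thresholds are illustrative):

```python
class RackTwin:
    """Minimal digital twin of one server rack: a model state that is
    continually updated from power-sensor readings."""

    def __init__(self, expected_watts):
        self.expected_watts = expected_watts  # design-time model value
        self.observed_watts = expected_watts  # evolves with sensor data

    def ingest(self, sensor_watts, alpha=0.2):
        # Exponential moving average keeps the twin current while
        # smoothing out momentary noise spikes in the readings.
        self.observed_watts = (1 - alpha) * self.observed_watts + alpha * sensor_watts

    def drift(self):
        # Fractional deviation from the design model; a sustained large
        # value could trigger maintenance or a cooling adjustment in the
        # physical system.
        return (self.observed_watts - self.expected_watts) / self.expected_watts

twin = RackTwin(expected_watts=5000)
for reading in [5100, 5300, 5600, 5900]:  # rack slowly drawing more power
    twin.ingest(reading)
print(f"drift: {twin.drift():.1%}")
```

The essential property is the feedback loop: sensor data flows into the model, and the model’s conclusions flow back out as operational decisions about the physical system.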
Geo-distributed workloads – Energy and water availability vary and are hyperlocal. Using optimisation algorithms, it’s possible to examine the variation in carbon intensity, water availability and energy cost in certain areas. That data can then be used to determine the best location to run a workload, like generative AI, to optimise resource usage hyperlocally and create significant savings in power and water consumption while reducing costs.
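A placement decision like the one described can be sketched as a weighted score over candidate locations. The region names, metrics and weights below are made up for illustration, and real optimisers would also account for latency, data-residency rules and capacity:

```python
def best_region(regions, w_carbon=0.5, w_water=0.3, w_cost=0.2):
    """Pick the region with the lowest weighted score. Lower values are
    better on every axis; metrics are assumed pre-normalised to 0..1."""
    def score(metrics):
        return (w_carbon * metrics["carbon"]
                + w_water * metrics["water_stress"]
                + w_cost * metrics["energy_cost"])
    return min(regions, key=lambda name: score(regions[name]))

# Hypothetical normalised metrics for three candidate sites:
candidates = {
    "site-a": {"carbon": 0.9, "water_stress": 0.2, "energy_cost": 0.4},
    "site-b": {"carbon": 0.3, "water_stress": 0.6, "energy_cost": 0.5},
    "site-c": {"carbon": 0.5, "water_stress": 0.3, "energy_cost": 0.3},
}
print(best_region(candidates))  # the site with the best overall balance
```

Because carbon intensity and energy prices vary hour by hour, such a score would be re-evaluated continuously, shifting deferrable workloads like model training to wherever conditions are currently most favourable.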
Leveraging waste heat – 100% of the electrical energy that goes into a data centre is ultimately converted to heat, which must be removed. Data centres have traditionally used air cooling, but it is less efficient and more expensive than direct liquid cooling (DLC), which has grown in favour in the age of AI. DLC is a cooling method where liquid is pumped directly into a server to absorb heat emitted by all the system’s components, including processors, GPUs and memory, and is then sent to a heat-exchange system outside the data centre. It cools more efficiently, since water has four times the heat capacity of air. Liquid is also easier to contain and transport with minimal heat loss, improving waste heat utilisation. Capturing waste heat is important because it can be used for other purposes, such as warming greenhouses to create the ideal conditions for growing tomatoes, or heating buildings.
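The heat-capacity advantage can be made concrete with the standard relation Q = ṁ·c·ΔT: for the same heat load and allowed temperature rise, the mass flow of coolant needed scales inversely with its specific heat. A rough comparison, using textbook specific-heat values and an arbitrary hypothetical heat load:

```python
def coolant_mass_flow(heat_watts, specific_heat_j_per_kg_k, delta_t_k):
    """Mass flow (kg/s) needed to carry heat_watts away at a given
    coolant temperature rise, from Q = m_dot * c * delta_T."""
    return heat_watts / (specific_heat_j_per_kg_k * delta_t_k)

HEAT_LOAD_W = 100_000   # a hypothetical 100 kW row of racks
DELTA_T_K = 10          # coolant allowed to warm by 10 K
C_WATER = 4186          # specific heat of water, J/(kg*K)
C_AIR = 1005            # specific heat of air, J/(kg*K)

water = coolant_mass_flow(HEAT_LOAD_W, C_WATER, DELTA_T_K)
air = coolant_mass_flow(HEAT_LOAD_W, C_AIR, DELTA_T_K)
print(f"water: {water:.2f} kg/s, air: {air:.2f} kg/s, "
      f"air needs ~{air / water:.1f}x the mass flow")
```

Roughly four times less water than air (by mass) is needed to move the same heat, and because the warmed water leaves as a compact, contained stream rather than diffuse exhaust air, it is far easier to pipe to a greenhouse or a building heating loop.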
As AI grows in popularity, it’s opening vast opportunities to increase creativity, unearth new applications and improve productivity across industries. But AI’s popularity also demands ever more energy to build and train the models that will change the way we all work and live. Minimising the impact of this energy use will be crucial for South African organisations as they navigate the AI era.