The past 50 years have seen a dramatic increase in the amount of computing capability per person, in particular the capabilities enabled by AI. Despite its societal benefits, AI technology can come with significant environmental implications. To scale AI sustainably, we need to make AI, and computing more broadly, efficient and flexible. Going beyond efficiency, we look into designing and optimizing AI models and computing infrastructures across the lifecycle. This repository collects resources, from our research innovations and artifacts to talks and educational materials, to sustainably advance towards artificial general intelligence.
The past 50 years have seen a dramatic increase in the amount of computing capability per person. Since the introduction of the first commercially available microprocessor -- the Intel 4004 released in 1971 -- the number of transistors on a die has increased by over a million-fold. Accelerating knowledge discovery in science and engineering demands even higher computation capability than a single microprocessor can offer.
High-performance computing infrastructures, from supercomputers tackling weather forecasting, molecular modeling, and other complex computational problems to AI training clusters, connect tens of thousands of processors using advanced networking gear to further scale up computation capability. For example, the Frontier supercomputer -- the world’s first exascale supercomputer -- comes with more than 37,000 Graphics Processing Units (GPUs) to deliver more than 1 quintillion calculations per second. To propel our reach to artificial general intelligence, AI superclusters, such as Google’s Tensor Processing Unit (TPU) based AI Hypercomputer or Meta’s Research SuperCluster (RSC), also rely on horizontal scaling to achieve high performance, scalability, and efficiency.
With data centers and domain-specific hardware specialization, the computing industry has achieved many orders of magnitude of efficiency improvement through decades of technological innovation. Such efficiency improvement is key to keeping the global energy use of data centers nearly constant. Between 2010 and 2018, the compute instances in global data centers increased by 5.5 times while the overall energy use increased by merely 0.06 times (about 6%).
AI is revolutionizing entire industries, from education and medicine to e-commerce, finance, and entertainment. OpenCatalyst uses AI to discover new electrocatalysts for efficient renewable energy storage. AlphaFold uses AI to predict protein structures rapidly, which has the potential to revolutionize the entire biological science domain. FarmBeats uses AI to improve farming efficiency. AI is profoundly changing the way we live, learn, communicate, and interact with each other. Despite these societal benefits, the development of AI technologies increases data center energy use and the associated greenhouse gas emissions, driven by the rising demand for computing resources. Between 2019 and 2021, the amount of data used for AI model training increased by more than two times, corresponding to a model size increase of more than 20 times. In fact, since the inception of AlexNet in 2012, the number of parameters of state-of-the-art AI models has been increasing exponentially, reaching the scale of trillions of parameters and requiring terabytes of memory capacity for storage. The AI scaling trend is pushing the frontier of computing infrastructures.
The first step is to understand AI’s carbon impact across its lifecycle holistically. AI’s lifecycle carbon impact comes from the use of AI systems, called the operational carbon footprint, as well as from the manufacturing of AI systems and data center construction materials, called the embodied carbon footprint.
Operational carbon footprint is defined as the carbon emissions, expressed in carbon dioxide equivalent (CO2e), associated with the electricity consumed. For a key production recommendation model, the energy consumption breakdown is roughly 30:30:40 over the key phases of Data, Experimentation / Training, and Inference. As another example, the operational carbon footprint of Llama 3 model training is estimated at 2,290 tonnes of carbon emissions, assuming the GPU Thermal Design Power (TDP) of 700W and the average emission factor of the US grids. In reality, AI workflows typically involve dozens, if not hundreds, of training runs to produce a final model. Once a model is trained to meet the desired accuracy level, it is further optimized to serve various product use cases. In the case of the language translation model, the inference footprint can be twice the training carbon footprint over the model’s lifetime.
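The operational estimate described above boils down to multiplying energy by a grid emission factor. The following is a minimal sketch of that arithmetic, not the methodology behind the numbers quoted above: the function name, the `pue` overhead parameter, and every input value other than the 700W TDP are hypothetical and chosen only for illustration.

```python
def operational_carbon_tonnes(gpu_hours, gpu_power_watts,
                              emission_factor_kg_per_kwh, pue=1.0):
    """Estimate operational emissions in tonnes CO2e.

    gpu_hours: total accelerator-hours consumed by the job.
    gpu_power_watts: assumed average draw per GPU (TDP is an upper bound).
    emission_factor_kg_per_kwh: grid carbon intensity in kg CO2e per kWh.
    pue: hypothetical data center overhead multiplier (1.0 means no overhead).
    """
    energy_kwh = gpu_hours * gpu_power_watts / 1000.0 * pue
    return energy_kwh * emission_factor_kg_per_kwh / 1000.0


# Hypothetical job: one million GPU-hours at 700 W on a 0.4 kg CO2e/kWh grid.
estimate = operational_carbon_tonnes(
    gpu_hours=1_000_000,
    gpu_power_watts=700,
    emission_factor_kg_per_kwh=0.4,
)
print(f"{estimate:.1f} tonnes CO2e")
```

Using TDP as the power draw gives a conservative upper bound; measured power would typically yield a lower figure.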
AI’s carbon impact goes beyond the emissions associated with the electricity needed to power the models (operational energy use) to include the embodied emissions of the required infrastructure, such as semiconductor manufacturing and the steel and cement used for data center construction. AI systems used for model training and inference at scale come with manufacturing carbon emissions that are produced during the production of system hardware. Carbon embodied in AI system hardware constitutes a substantial portion of AI’s overall carbon footprint. Taking the multilingual translation task as an example, the model’s overall lifecycle carbon footprint is approximately four times higher than the operational carbon footprint of model training.
For consumer electronics, the embodied carbon footprint is more significant than the operational carbon footprint over the computing device’s lifecycle, with a rough breakdown of 80 to 20. As consumer electronics evolve into wearables that usher in the next wave of computing, additional data center capacity will be required to support the new computing paradigm. We expect a similar embodied to operational carbon footprint breakdown for wearables, such as smart glasses. Tomorrow, we will have augmented reality with contextual AI capability. This is an important time for us, computer system designers, to innovate with sustainability in mind. What do sustainability-first design principles look like for the next wave of computing?
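One simple way to combine the two components is to add operational emissions to the share of embodied emissions attributed to the time the hardware was actually used. The sketch below assumes linear amortization over the hardware lifetime; the function name and all numbers are hypothetical and are not taken from the studies referenced in this repository.

```python
def lifecycle_carbon_kg(operational_kg, embodied_kg, usage_hours, lifetime_hours):
    """Operational emissions plus a time-proportional share of embodied emissions."""
    if lifetime_hours <= 0:
        raise ValueError("lifetime_hours must be positive")
    if not 0 <= usage_hours <= lifetime_hours:
        raise ValueError("usage_hours must lie within the hardware lifetime")
    amortized_embodied_kg = embodied_kg * (usage_hours / lifetime_hours)
    return operational_kg + amortized_embodied_kg


# Hypothetical server: 3,000 kg CO2e embodied, 4-year lifetime, used for 1 year
# by a workload that emitted 1,000 kg CO2e from electricity.
HOURS_PER_YEAR = 8760
total_kg = lifecycle_carbon_kg(
    operational_kg=1000.0,
    embodied_kg=3000.0,
    usage_hours=HOURS_PER_YEAR,
    lifetime_hours=4 * HOURS_PER_YEAR,
)
print(total_kg)
```

Other attribution schemes (for example, by utilization rather than wall-clock time) are possible; linear amortization is only the simplest choice.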
https://github.com/facebookresearch/CATransformers
To enable carbon-guided AI model-hardware design space exploration, we design CATransformers. CATransformers employs joint neural network and hardware architecture search and optimizes for the total carbon footprint. By incorporating both operational and embodied carbon metrics into early design space exploration of domain-specific hardware accelerators, CATransformers demonstrates that optimizing for carbon yields design choices distinct from those optimized solely for latency or energy efficiency.
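To illustrate why a carbon objective can pick a different design than a latency or energy objective, here is a toy sketch. It enumerates three made-up accelerator configurations and is not CATransformers' search algorithm or API; the `Design` class, the candidate names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    name: str
    latency_ms: float
    operational_kg: float  # lifetime operational CO2e (proportional to energy use)
    embodied_kg: float     # manufacturing CO2e

    @property
    def total_carbon_kg(self) -> float:
        return self.operational_kg + self.embodied_kg


# Hypothetical candidates: larger accelerators are faster and more energy
# efficient, but cost more carbon to manufacture.
CANDIDATES = [
    Design("large-array", latency_ms=5.0, operational_kg=40.0, embodied_kg=90.0),
    Design("mid-array", latency_ms=8.0, operational_kg=50.0, embodied_kg=45.0),
    Design("small-array", latency_ms=14.0, operational_kg=70.0, embodied_kg=30.0),
]

fastest = min(CANDIDATES, key=lambda d: d.latency_ms)
most_energy_efficient = min(CANDIDATES, key=lambda d: d.operational_kg)
lowest_carbon = min(CANDIDATES, key=lambda d: d.total_carbon_kg)

print(fastest.name, most_energy_efficient.name, lowest_carbon.name)
```

With these made-up numbers, the latency and energy objectives both select the largest design, while the total-carbon objective selects the mid-sized one because its lower embodied carbon outweighs its higher operational carbon.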
https://github.com/facebookresearch/ACT
To enable carbon-guided computer system design, we design ACT, an architectural carbon modeling tool. ACT comprises an analytical, architectural carbon footprint model and use-case dependent optimization metrics to estimate the embodied carbon footprint of hardware. ACT estimates greenhouse gas (GHG) emissions from hardware manufacturing based on workload characteristics, hardware specifications, semiconductor fab characteristics, and environmental factors.
ACT addresses a crucial gap in quantifying embodied carbon and enables sustainability-driven hardware design space exploration, so that sustainability can be considered a first-order design objective alongside performance, power, and area.
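As a rough illustration of how fab characteristics can feed an embodied carbon estimate for a chip, the sketch below scales per-area emissions (fab electricity, process gases, and materials) by die area and divides by yield. It is a simplified stand-in rather than ACT's code or API: the function name, parameter names, and all values are hypothetical.

```python
def chip_embodied_carbon_kg(die_area_cm2, fab_energy_kwh_per_cm2,
                            fab_grid_kg_per_kwh, gas_kg_per_cm2,
                            material_kg_per_cm2, fab_yield):
    """Estimate manufacturing CO2e (kg) for one good die.

    Per-area emissions combine fab electricity (energy times the fab's grid
    carbon intensity), direct process-gas emissions, and raw materials.
    Dividing by yield charges discarded dies to the good ones.
    """
    if not 0 < fab_yield <= 1:
        raise ValueError("fab_yield must be in (0, 1]")
    per_area_kg = (fab_energy_kwh_per_cm2 * fab_grid_kg_per_kwh
                   + gas_kg_per_cm2 + material_kg_per_cm2)
    return die_area_cm2 * per_area_kg / fab_yield


# Hypothetical process node and die; none of these values come from ACT.
embodied_kg = chip_embodied_carbon_kg(
    die_area_cm2=2.0,
    fab_energy_kwh_per_cm2=2.0,
    fab_grid_kg_per_kwh=0.5,
    gas_kg_per_cm2=0.25,
    material_kg_per_cm2=0.5,
    fab_yield=0.875,
)
print(embodied_kg)
```

A model of this shape makes the design levers visible: smaller dies, cleaner fab electricity, and higher yield each reduce the estimate.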
https://github.com/facebookresearch/CarbonExplorer
To enable carbon-guided datacenter computing, we design Carbon Explorer ---
To develop AI -- this century’s most important technology -- sustainably, we must make AI, and computing more broadly, efficient and flexible. Ample efficiency optimization opportunities are present across the entire AI model life cycle, spanning data, algorithms, and system hardware.
AI models train on large amounts of data. When designed well, data scaling, sampling, and selection strategies can lead to faster training and higher model quality at the same time. Realizing this data efficiency potential is not straightforward in practice, where data formats are highly fragmented, data modalities are diverse, and the quality of training samples is uncertain. This is why having a common metadata format for AI, such as Croissant, is of paramount importance to sustain the increasing infrastructure demands of data. Complementary to AI data optimization, the data storage and ingestion pipeline for AI demands significant power capacity. An optimized composite data storage infrastructure using novel application-aware cache policies can absorb more than 3 times higher IO than a baseline LRU flash cache, reducing power demand in a petabyte-scale production AI training cluster by 29%.
Making machine learning models parameter-efficient can contribute to more effective use of energy. Taking the Llama family of foundational language models as an example, LLaMA-13B outperforms GPT-3 (175B) on a variety of tasks while consuming approximately 24 times less energy. As a parameter-efficient model for language tasks, Llama is superior across the key design dimensions of accuracy, training time, energy consumption, and operational carbon footprint. Model optimization for efficient inference is fertile ground. Simple inference-time techniques, such as CHAI, which clusters redundant attention heads over tokens in Transformer-based large language models (LLMs), or LayerSkip, can improve inference efficiency significantly. Model efficiency is more important now than ever to help bend AI’s rising energy demand.
The MAD-Max performance modeling framework enables agile design space exploration for large model acceleration on distributed systems. Machine learning researchers can use the framework to navigate the design space of model parallelization and hardware deployment strategies along the Pareto frontier of training time and energy use. For the multilingual language model, a combination of data locality optimization, GPU acceleration, low-precision data formats, and algorithmic optimization can bring over 800 times energy efficiency improvement. In addition, optimization frameworks, such as Zeus, enable model developers to navigate the tradeoff between operational energy consumption and performance by automatically finding optimal job- and GPU-level configurations for recurring DNN training jobs, thus reducing inefficient energy usage in model training.
Going Beyond Efficiency
Making AI more flexible and resilient to the changing environment helps. Scale matters. As the scale of model training increases from hundreds to tens of thousands of GPUs, training workflow failures occur more frequently and substantially degrade training time. Thus, fault-tolerant, resilient distributed training frameworks are becoming more important, not only for large-scale machine learning models but also for distributed training environments, such as federated learning at the edge. The existing software system stack does not yet support flexible computation shifting. When available, such a flexibility feature can give application developers and data center operators an important lever to explicitly annotate and expose flexibility at the level of functions, programs, services, or workloads. It enables better control and management decisions in response to dynamic signals, such as compute resource availability, electricity availability, or the carbon intensity of energy, for improved resiliency and sustainability.
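To make the idea of shifting computation in response to carbon intensity concrete, the sketch below picks the start hour for a deferrable job that minimizes the summed carbon intensity over its run, subject to a deadline. It is an illustrative toy rather than an existing scheduler interface: the function name and the hourly forecast values are hypothetical.

```python
def lowest_carbon_start(carbon_intensity, duration_hours, deadline_hour):
    """Return (start_hour, summed_intensity) for the cleanest contiguous window.

    carbon_intensity: hourly forecast (e.g., g CO2e/kWh), index 0 is "now".
    duration_hours: how long the job runs without interruption.
    deadline_hour: the job must finish by this hour index (exclusive).
    """
    horizon = min(deadline_hour, len(carbon_intensity))
    latest_start = horizon - duration_hours
    if duration_hours <= 0 or latest_start < 0:
        raise ValueError("job does not fit before the deadline")

    # Sliding-window sum over all feasible start hours.
    window = sum(carbon_intensity[:duration_hours])
    best_start, best_total = 0, window
    for start in range(1, latest_start + 1):
        window += carbon_intensity[start + duration_hours - 1]
        window -= carbon_intensity[start - 1]
        if window < best_total:
            best_start, best_total = start, window
    return best_start, best_total


# Hypothetical 8-hour forecast with a midday dip from solar generation.
forecast = [500, 450, 300, 200, 250, 400, 520, 480]
start, total = lowest_carbon_start(forecast, duration_hours=3, deadline_hour=8)
print(start, total)  # prints: 2 750
```

In this example, running immediately would sum to 1250 over the three hours, while waiting two hours lowers the sum to 750. Exposing the job's duration and deadline is exactly the kind of flexibility annotation a scheduler would need to make such a decision.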
AI can be part of the solution space by accelerating grid decarbonization and advances in renewable energy storage technology. A significant fraction -- 5 to 15% -- of the renewable energy generated in grids around the world today goes to waste due to curtailment. More accurate energy demand and emission forecasting methods give power grid operators effective signals to improve grid-level demand-response management. However, emission forecasting is a dynamic and complex problem space. Predicting energy demand in a highly interconnected grid transmission network, as well as renewable energy availability under changing weather conditions, requires building and running computationally intensive physical models and simulations. This is where AI can help: predicting when and where renewable energy is stranded. The capability to accurately forecast when and where renewable surplus occurs can help improve renewable energy utilization, resulting in more effective demand-response management at the grid level and thereby accelerating grid decarbonization. What other complex climate problems can we leverage AI to solve?
Last but not least, lowering the carbon embodied in hardware helps mitigate AI’s carbon impact. Compare the Apple iPhone 3 from 2009 with the iPhone 11 released a decade later in 2019: the operational carbon footprint improved by 1.6 times while the manufacturing carbon footprint increased by 3 times. This is primarily driven by more advanced hardware architectures with a much larger collection of application-specific accelerators and a higher semiconductor manufacturing environmental footprint. More than 80% of the iPhone 11’s lifecycle carbon footprint is embodied in the hardware while less than 20% comes from operational use. This shift opens up a new design space for computer system designers. How do we minimize the lifecycle carbon emissions of AI and computing by lowering the carbon embodied in the hardware? What do sustainability-driven computer systems look like when we keep end-of-life processing in mind at design time? How do we weigh and co-design operational and embodied carbon footprints to minimize the lifecycle carbon and environmental footprint of AI and computing?
The past decades witnessed impressive technology advances that bootstrapped the AI revolution. The laser focus on performance and energy efficiency optimization has made computing capable, cost-effective, and prevalent. As AI and its applications continue to grow, we must keep focusing on efficiency optimization across the AI model development cycle as well as across the hardware-software system stack holistically.
To sustainably scale this century’s most important technology, we must go beyond efficiency. We need AI to be flexible, be part of the solution space, and come with minimal manufacturing environmental footprints to achieve an environmentally sustainable future for computing. To do so, we must be able to measure. However, it is not going to be straightforward. There is a need to develop a standard, scientifically rigorous emission accounting methodology for AI. Having a common, easy-to-use carbon accounting standard is a key step to enable systematic and transparent assessment for modern AI use cases and incentivize meaningful climate actions.
Characterizing and analyzing carbon emissions is a complex process. While there are initial efforts, such as MLPerf Power, which proposes a way to measure the power consumption of AI systems, there are not yet standard metrics or tools. This is an area where active research is needed. Carbon Explorer and ACT are just the initial steps to arm the computing industry, including Meta, and the research community with design space exploration tools that treat carbon as a first-class design principle. Considering carbon along with performance, power, and cost efficiency opens new opportunities for AI system stack design. Beyond enabling more sustainable scaling for AI, there are also plenty of impactful opportunities to deploy AI to tackle climate challenges.
- Chasing Carbon: The Elusive Environmental Footprint of Computing. Udit Gupta, Young Geun Kim, Sylvia Lee, Jordan Tse, Hsien-Hsin Lee, Gu-Yeon Wei, David Brooks, Carole-Jean Wu. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), 2021. [IEEE MICRO Top Picks]
- Sustainable AI: Environmental Implications, Challenges and Opportunities. Carole-Jean Wu, Ramya Raghavendra, Udit Gupta, Bilge Acun, Newsha Ardalani, Kiwan Maeng, Fiona Aga Behram, James Huang, Charles Bai, Michael Gschwind, Anurag Gupta, Myle Ott, Anastasia Melnikov, Salvatore Candido, David Brooks, Geeta Chauhan, Benjamin Lee, Hsien-Hsin S. Lee, Bugra Akyildiz, Maximilian Balandat, Joe Spisak, Ravi Jain, Mike Rabbat, Kim Hazelwood. In Proceedings of the Conference on Machine Learning and Systems (MLSys), 2022.
- Carbon Dependencies in Datacenter Design and Management. Bilge Acun, Benjamin C. Lee, Fiodar Kazhamiaka, Aditya Sundarrajan, Manoj Chakkaravarthy, Kiwan Maeng, David Brooks, Carole-Jean Wu. In Proceedings of the 1st Workshop on Sustainable Computer Systems Design and Implementation (HotCarbon), 2022.
- ACT: Designing Sustainable Computer Systems with an Architectural Carbon Modeling Tool. Udit Gupta, Mariam Elgamal, Gage Hills, Gu-Yeon Wei, Hsien-Hsin S. Lee, David Brooks, Carole-Jean Wu. In Proceedings of the ACM/IEEE International Symposium on Computer Architecture (ISCA), 2022. [IEEE MICRO Top Picks]
- Carbon Explorer: A Holistic Framework for Designing Carbon Aware Datacenters. Bilge Acun, Benjamin Lee, Fiodar Kazhamiaka, Kiwan Maeng, Udit Gupta, Manoj Chakkaravarthy, David Brooks, Carole-Jean Wu. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023. [IEEE MICRO Top Picks Honorable Mention]
- Towards Green, Accurate, and Efficient AI Models Through Multi-Objective Optimization. Udit Gupta, Daniel Jiang, Maximilian Balandat, Carole-Jean Wu. In the International Conference on Machine Learning (ICML) Climate Change AI Workshop, 2023.
- Carbon-Efficient Design Optimization for Computer Systems. Mariam Elgamal, Doug Carmean, Elnaz Ansari, Okay Zed, Ramesh Peri, Srilatha Manne, Udit Gupta, Gu-Yeon Wei, David Brooks, Gage Hills, Carole-Jean Wu. In Proceedings of the 2nd Workshop on Sustainable Computer Systems Design and Implementation (HotCarbon), 2023.
- Unlocking the Potential of Renewable Energy Through Curtailment Prediction. Bilge Acun, Brent Morgan, Carole-Jean Wu, Henry Richardson, Nat Steinsultz. In the Neural Information Processing Systems (NeurIPS) Climate Change AI Workshop, 2023.
- Beyond Efficiency: Scaling AI Sustainably. Carole-Jean Wu, Bilge Acun, Ramya Raghavendra, Kim Hazelwood. In Proceedings of the IEEE MICRO Special Issue on the Past, Present, and Future of Warehouse-Scale Computing, 2024.
- Scaling AI Sustainably. Carole-Jean Wu, Bilge Acun, Ramya Raghavendra, Kim Hazelwood. In the Winter Issue of the Bridge of the National Academy of Engineering.
- CORDOBA: Carbon-Efficient Optimization Framework for Computing Systems. Mariam Elgamal, Doug Carmean, Elnaz Ansari, Okay Zed, Ramesh Peri, Srilatha Manne, Udit Gupta, Gu-Yeon Wei, David Brooks, Gage Hills, Carole-Jean Wu. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), 2025. [Best Paper Award Honorable Mention]
- CATransformers: Carbon Aware Transformers Through Joint Model-Hardware Optimization. Irene Wang, Newsha Ardalani, Mostafa Elhoushi, Daniel Jiang, Samuel Hsia, Ekin Sumbul, Divya Mahajan, Carole-Jean Wu, Bilge Acun. In the Conference on Neural Information Processing Systems (NeurIPS), 2025.
We would like to thank our many colleagues at Meta and our collaborators from the academic research community and the computing industry. It is upon us, each and every one of us, to contribute to a sustainable future for computing and society.
This repository is CC-BY-NC 4.0 licensed, as found in the LICENSE file. The tools in the sub-repositories have their own individual licenses.


