Time for modelers to think about hardware as part of the model
Most of the energy/systems modeling tools I have used, from engineering school to today, have been desktop applications (mostly Windows, followed by Linux). Typically, I would define my model within a popular desktop package, then compile and run it on the same PC to get the results. I reported the numbers and never had to think about the hardware. Thinking about hardware was not part of the modeler’s job description.
Many models these days run on the vendor’s cloud, with an API linking the user to software that sits on hardware controlled by the vendor. The modeler’s PC is only a terminal. The vendor thinks about the hardware; the modeler models. Thinking about hardware is still not part of a modeler’s job description. Then again, modeling used to be a high-level endeavor in which modelers thought about the problem and its mathematical formulation within a modeling platform.
The accessibility in definition enabled by general purpose programming languages
Modeling is not merely a high-level abstraction anymore. For me, the switch to lower-level programming for modeling came with the increasing use of data science and machine learning based approaches in energy and systems planning/modeling. Using Python/R packages instead of a GUI unlocked the ability not only to build data-driven models, but also to define simple energy system models outside the confines of proprietary software. Models developed using open-source packages in Python or Julia (and to some extent R, JavaScript, and Rust, among others) have become increasingly competitive against expensive proprietary models.
The accessibility in computing enabled by availability of GPUs and HPCs
Lately, I have also been tinkering with energy modeling using a GPU. See, for example, this blog post on benchmarking power system optimization using a GPU vs a CPU. As GPUs and HPC clusters have become more accessible, they enable a big leap in computing through massive parallelization.
However, parallelization is not only about using N times the processing units simultaneously. The architecture and system design trade-offs of GPUs and HPCs force choices on the modeler. In exploiting parallelization, the GPU changes the arithmetic, more specifically the precision. CPUs typically compute in FP64 (64-bit floating point), while most GPUs are built for, and fastest at, FP32 (32-bit). For reference, FP64 can distinguish between the numbers 1.000000000000000 and 1.000000000000001, but FP32 can only distinguish between 1.000000 and 1.000001. At first glance, the difference seems negligible. But with the thousands of matrix multiplications that go into a model, the rounding errors accumulate. Slightly different gradients lead to slightly different positions in the solution space. The final optimal output provided by the model differs. Meaningfully.
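You can see the gap directly in NumPy (a quick illustrative check, not part of the experiment below): compare the spacing between adjacent representable numbers near 1.0 in each precision, and note that an update smaller than that spacing is silently absorbed.

```python
import numpy as np

# Smallest distinguishable step ("spacing") around 1.0 in each precision
print(np.spacing(np.float64(1.0)))   # ~2.2e-16: FP64 resolves ~16 decimal digits
print(np.spacing(np.float32(1.0)))   # ~1.2e-07: FP32 resolves ~7 decimal digits

# An update smaller than the spacing is silently lost in FP32
print(np.float64(1.0) + np.float64(1e-8) == np.float64(1.0))  # False: FP64 keeps it
print(np.float32(1.0) + np.float32(1e-8) == np.float32(1.0))  # True:  FP32 drops it
```

Every gradient step smaller than that FP32 spacing, relative to the magnitude of the value being updated, simply vanishes.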
For a simple demonstration, I tested this on a gas power plant model. Specifically, a heat rate curve, which is a simple formula that tells you how efficiently a plant burns fuel at different power levels. Think of it as a U-shaped curve: the plant has a sweet spot where it burns fuel most efficiently and gets less efficient as you push it harder or back it off. That sweet spot, and the shape of the curve around it, directly determines the plant’s fuel bill and its bid into the energy market. The model is a quadratic equation fit to 350 observations of simulated plant operating data. I used the same optimizer, the same loss function, the same data. The only thing I changed between the two runs was the arithmetic precision: FP64 on one run, FP32 on the other.
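The mechanics of that experiment can be sketched as follows. Everything here is a synthetic stand-in: the coefficients, noise level, learning rate, and function name are made up for illustration, so this will not reproduce the actual plant data or the numbers reported below.

```python
import numpy as np

def fit_optimum(dtype, seed=0, steps=2000, lr=0.2):
    """Fit a quadratic heat-rate curve by gradient descent in one
    floating-point precision; return the load (MW) at the fitted minimum."""
    rng = np.random.default_rng(seed)
    load = rng.uniform(200.0, 500.0, 350)                      # MW, synthetic
    # Made-up U-shaped heat rate with a sweet spot near 350 MW
    hr = 7.0 + 4e-5 * (load - 350.0) ** 2 + rng.normal(0.0, 0.05, 350)

    x = ((load - 350.0) / 150.0).astype(dtype)                 # scale to ~[-1, 1]
    y = hr.astype(dtype)
    a = b = c = dtype(0.0)
    for _ in range(steps):                                     # full-batch gradient descent
        err = a * x**2 + b * x + c - y
        a = a - dtype(lr) * dtype(2.0) * np.mean(err * x**2, dtype=dtype)
        b = b - dtype(lr) * dtype(2.0) * np.mean(err * x, dtype=dtype)
        c = c - dtype(lr) * dtype(2.0) * np.mean(err, dtype=dtype)
    return 350.0 + 150.0 * float(-b / (2 * a))                 # vertex, back in MW

print(fit_optimum(np.float64))   # optimum load under FP64
print(fit_optimum(np.float32))   # same data, same optimizer; only the dtype changed
```

The `dtype=dtype` on every reduction is the point: it keeps the accumulation itself in the chosen precision rather than letting NumPy quietly upcast to FP64.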
The two models disagreed about where the plant runs most efficiently. FP64 said 346.6 MW. FP32 said 345.4 MW. About one megawatt apart on a 500 MW plant. That sounds small until you price it out: the difference in estimated fuel cost between the two models at a typical dispatch point of 350 MW comes to $545,000 per year, assuming $4.50/MMBtu gas and 8,000 operating hours a year, for one plant.
I ran 200 bootstrap samples on each platform to check whether any of this was noise. It was not. FP32 was consistently less certain about where the optimum efficiency was: its spread of estimates across the 200 resamples was 18.1 MW wide, versus 11.9 MW for FP64. More uncertainty about the number that determines dispatch bids. The interactive demonstration, with all charts and results, is available here.
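A bootstrap check like this can be sketched as below, again on synthetic stand-in data with made-up numbers and an illustrative function name; it will not reproduce the 18.1 MW and 11.9 MW figures. Each resample refits the quadratic via the normal equations, with the linear algebra kept in the chosen precision.

```python
import numpy as np

def bootstrap_spread(dtype, n_boot=200, seed=1):
    """Range (max - min) of the fitted optimum-efficiency load across
    bootstrap resamples, with all linear algebra done in the given dtype."""
    rng = np.random.default_rng(seed)
    load = rng.uniform(200.0, 500.0, 350)                     # synthetic data
    hr = 7.0 + 4e-5 * (load - 350.0) ** 2 + rng.normal(0.0, 0.05, 350)

    x = ((load - 350.0) / 150.0).astype(dtype)                # scale to ~[-1, 1]
    y = hr.astype(dtype)
    optima = []
    for _ in range(n_boot):
        i = rng.integers(0, len(x), len(x))                   # resample with replacement
        A = np.stack([x[i] ** 2, x[i], np.ones(len(x), dtype=dtype)], axis=1)
        a, b, _ = np.linalg.solve(A.T @ A, A.T @ y[i])        # normal equations, in dtype
        optima.append(350.0 + 150.0 * float(-b / (2.0 * a)))  # vertex in MW
    return max(optima) - min(optima)

print(bootstrap_spread(np.float64))  # spread of the optimum under FP64
print(bootstrap_spread(np.float32))  # spread under FP32
```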
The accessibility in upskilling and software development enabled by agentic coding
Agentic coding has lowered the floor for building low-level software. Modelers are well placed to take advantage of this because their understanding of numbers, constraints, and system behavior translates directly into thinking like a software developer. The development cycle is shorter, upskilling is faster, and unlocking massive parallelization on GPUs and HPCs no longer requires a deep systems programming background.
But with that accessibility comes a responsibility that did not exist when modeling was a desktop endeavor. We generally think of a model as a mathematical artifact, and of the hardware as a neutral substrate: something the model runs on, not something that shapes what the model is. In practice, hardware behaves more like an implicit hyperparameter: not set, not tuned, not documented, but active in every gradient computation. The emerging paradigm asks modelers to become more aware of the speed-precision trade-off, to take reproducibility seriously across hardware environments, and to think harder about what a model output means when a different machine might return a different answer. Before we can fully specify the model, perhaps we need to specify the machine.