Skip to content

Modified _parse_gdx_results in GAMS.py to replace _parse_special_value and Updated Import Statement #3642

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jul 1, 2025

Conversation

AnhTran01
Copy link
Contributor

Fixes #3624

Summary/Motivation:

When parsing the results of a model after it has been solved, the level and dual value are obtained through a series of if statements in _parse_special_values that may cause slowdowns. This PR added GAMS existing functions to handle data parser for these special values in _parse_gdx_results.

Changes proposed in this PR:

  • Replaced _parse_special_values with GAMS special value parser in _parse_gdx_results.
  • Updated attempt_import to have fallback to pre-GAMS-45.0 API if gams.core.gdx is not available.

Legal Acknowledgement

By contributing to this software project, I have read the contribution guide and agree to the following terms and conditions for my contribution:

  1. I agree my contributions are submitted under the BSD license.
  2. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

@blnicho blnicho moved this from Todo to Review In Progress in July 2025 Release Jun 24, 2025
@blnicho blnicho requested a review from jsiirola June 24, 2025 18:39
Comment on lines 42 to 43
from pyomo.common.dependencies import attempt_import
import numpy as np
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a hard dependency on numpy. Importing numpy from dependencies will keep this as a "soft" dependency:

Suggested change
from pyomo.common.dependencies import attempt_import
import numpy as np
from pyomo.common.dependencies import attempt_import, numpy as np

It will still make numpy a hard dependency if you are going to use GAMS, but that is OK (although GAMS's available() should probably check both numpy_available and gdxcc_available)

Norman_Tran and others added 2 commits June 25, 2025 11:21
Copy link
Member

@jsiirola jsiirola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two minor nits and I think this is good to merge (once we get the testing infrastructure back up and working)!

specVals[gdxcc.GMS_SVIDX_UNDEF] = float("nan")
specVals[gdxcc.GMS_SVIDX_PINF] = float("inf")
specVals[gdxcc.GMS_SVIDX_MINF] = float("-inf")
specVals[gdxcc.GMS_SVIDX_NA] = struct.unpack(">d", bytes.fromhex("fffffffffffffffe"))[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not just float('nan')?

Copy link
Contributor Author

@AnhTran01 AnhTran01 Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adam or Artharv can correct me on this. NA special value used by GAMS means “initialized, but no numerical value assigned” has the byte pattern fffffffffffffffe above, while float('nan') and UNDEF/UNDF has this byte pattern 7ff8000000000000.

This is where I found the conversation about handling nan https://forum.gams.com/t/handling-nan-missing-values/7907/2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - I didn't realize that Python supported different "flavors" of nan. This is fine. Out of curiosity, is there a way that a user could differentiate between the two nans in Python (UNDEF vs NA)?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsiirola the special values in gdxSetSpecialValues must be unique (unique bytes). this NA value is still a nan in python so that tests like nan == nan are still False. but we abide by the rule of GDX.

Copy link
Member

@jsiirola jsiirola Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add utility methods to the GAMS interface? Something like:

def is_UNDEF(val):
    return val != val and struct.pack(">d", float('nan')) == b'\x7f\xf8\x00\x00\x00\x00\x00\x00'

def is_NA(val):
    return val != val and struct.pack(">d", float('nan')) == b'\xff\xff\xff\xff\xff\xff\xff\xfe'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I am thinking about it, it might be useful to make class attributes for the special values [UNDEV, NA, PINF, NINF, EPS]. Then this code could return those class attributes. If we do that, then the user could (legitimately) use is to distinguise NA from UNDEF

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not straightforward to test for two different nans but it is possible... it is also not guaranteed that other packages like pandas will maintain the bytes during their own operations. I believe numpy might make stronger claims about maintaining bytes.

import struct
NAN = float("nan")
NA = struct.unpack(">d", bytes.fromhex("fffffffffffffffe"))[0]
print(struct.pack('>d', NAN).hex())
print(struct.pack('>d', NA).hex())

print(NAN == NA)
print(NAN == NAN)
print(NA == NA)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add utility methods to the GAMS interface? Something like:

def is_UNDEF(val):
    return val != val and struct.pack(">d", float('nan')) == b'\x7f\xf8\x00\x00\x00\x00\x00\x00'

def is_NA(val):
    return val != val and struct.pack(">d", float('nan')) == b'\xff\xff\xff\xff\xff\xff\xff\xfe'

We have these utilities in our gamsapi... so we could always import them somewhere. We'd probably need a little guidance on where would be appropriate though.

The methods above work but can be slow for large data so we've (i.e. Steve D.) came up with a clever method to reinterpret the NA bits as a UINT64. Then we can do a straightforward integer comparison rather than a string comparison.

These methods look something like this:

import struct

import numpy as np

NA = struct.unpack(">d", bytes.fromhex("fffffffffffffffe"))[0]
_NA_INT64 = np.float64(NA).view(np.uint64)

print(f"verify same bytes: {struct.pack('>Q', _NA_INT64).hex()}")

arr = np.array([float("nan"), float("nan"), float("nan"), NA])

which gives:

In [1]: arr.view(np.uint64) == _NA_INT64
Out[1]: array([False, False, False,  True])

Copy link

@boxblox boxblox Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we are going to go down this road then it's probably worth mentioning that we also use -0.0 as our EPS... rather than sys.float_info.min (or sys.float_info.epsilon). The GAMS interpretation of EPS is mathematically zero... but we use the sign bit to detect this special value. I'm not sure if pyomo needs all this detail, might be implications elsewhere?

@AnhTran01
Copy link
Contributor Author

@boxblox
@abhosekar

@AnhTran01
Copy link
Contributor Author

The codes markdowned below are used to generate a simple Pyomo model to obtain the gdx file and test GAMS special value mapping against the _parse_special_values method.

The simplePyomoModel.py saves the gdx file to a subdirectory within the current working directory based on the number of variables (ex: 10 variables will save to ~/10, 100 variables will save to ~/100, etc).

The timing.py file will iterate through the subfolder in the current working directory to search and aggregate files ending with p.gdx, extract the time for _parse_special_value and GAMS special value parser to complete, and generate the figure below. The reading process is repeated N times to reduce the hardware noise.

The packages required to run timing.py are numpy, matplotlib.pyplot, and gams.core

performance_comparison

@AnhTran01
Copy link
Contributor Author

simplePyomoModel.py

import os
from pyomo.environ import (
    Var,
    Constraint,
    Param,
    Set,
    Objective,
    Binary,
    ConcreteModel,
    SolverFactory,
    RangeSet
)

def _createPyomoModel(nVar):
    """
    Construct a simple pyomo model with 1 variable, 1 constraint
    The model is fixed to remove the solver time and quickly generate the gdx file

    Args:
        nVar (int): Number of 

    Returns:
        None
    """

    m = ConcreteModel()

    m.dummySet1 = Set(initialize=RangeSet(nVar))

    m.dummyVar1 = Var(
        m.dummySet1,
        within=Binary,
    )

    @m.Constraint(m.dummySet1)
    def dummyConstraint2(m, i, j):
        return m.dummyVar1[i] >= 1
 
    @m.Objective()
    def dummyObj(m):
        return sum(m.dummyVar1[i] for i in m.dummySet1)
    
    m.dummyVar1.fix(1)

    # reference: https://www.gams.com/latest/docs/UG_OptionStatement.html
    opt=SolverFactory('gams')
    add_options = ['option iterLim=1;'] # iteration limit
    add_options = ['option resLim=1;'] # solver time

    status = opt.solve(m,
                tee=False,
                keepfiles=True,
                tmpdir=os.getcwd()+f'/{nVar}',
                add_options=add_options)
        
def main():
    num_vars = [10, 100, 1000, 10000, 100000, 1000000, 5000000, 10000000]
    for n in num_vars:
        _createPyomoModel(n)
        
main()

@AnhTran01
Copy link
Contributor Author

timing.py

import struct
import sys
import time
import os
import re

import numpy as np
import matplotlib.pyplot as plt
from gams.core import gdx

def gams_like(file_path):
    model_soln = {}

    system_directory = "/Library/Frameworks/GAMS.framework/Versions/49/Resources/"
    gdxHandle = gdx.new_gdxHandle_tp()
    gdx.gdxCreateD(gdxHandle, system_directory, gdx.GMS_SSSIZE)
    gdx.gdxOpenRead(gdxHandle, file_path)
    ret, symCount, _ = gdx.gdxSystemInfo(gdxHandle)

    # only setting SVs
    specVals = gdx.doubleArray(gdx.GMS_SVIDX_MAX)
    rc = gdx.gdxGetSpecialValues(gdxHandle, specVals)

    specVals[gdx.GMS_SVIDX_EPS] = sys.float_info.min
    specVals[gdx.GMS_SVIDX_UNDEF] = float("nan")
    specVals[gdx.GMS_SVIDX_PINF] = float("inf")
    specVals[gdx.GMS_SVIDX_MINF] = float("-inf")
    specVals[gdx.GMS_SVIDX_NA] = struct.unpack(">d", bytes.fromhex("fffffffffffffffe"))[
        0
    ]
    gdx.gdxSetSpecialValues(gdxHandle, specVals)
    # end set SVs

    start = time.time()
    ret, symCount, _ = gdx.gdxSystemInfo(gdxHandle)
    for n in range(1, symCount + 1):
        ret, sym_name, typ, subtype = gdx.gdxSymbolInfo(gdxHandle, n)
        if not ret:
            raise RuntimeError("GAMS GDX failure (gdxSymbolInfo).")

        ret, nrRecs = gdx.gdxDataReadRawStart(gdxHandle, n)
        if not ret:
            raise RuntimeError("GAMS GDX failure (gdxDataReadRawStart).")

        ret, keys, vals, dimfirst = gdx.gdxDataReadRaw(gdxHandle)
        if not ret:
            raise RuntimeError("GAMS GDX failure (gdxDataReadRaw).")

        ret = gdx.gdxDataReadDone(gdxHandle)
        if not ret:
            raise RuntimeError("GAMS GDX failure (gdxDataReadDone).")

        model_soln[sym_name] = (vals[0], vals[1])
    finish = time.time()

    gdx.gdxClose(gdxHandle)
    gdx.gdxFree(gdxHandle)
    gdx.gdxLibraryUnload()

    return finish - start

def _parse_special_values(value):
    if value == 1.0e300 or value == 2.0e300:
        return float("nan")
    if value == 3.0e300:
        return float("inf")
    if value == 4.0e300:
        return -float("inf")
    if value == 5.0e300:
        return sys.float_info.epsilon
    return value

def pyomo_like(file_path):
    system_directory = "/Library/Frameworks/GAMS.framework/Versions/49/Resources/"
    gdxHandle = gdx.new_gdxHandle_tp()
    gdx.gdxCreateD(gdxHandle, system_directory, gdx.GMS_SSSIZE)
    gdx.gdxOpenRead(gdxHandle, file_path)
    ret, symCount, _ = gdx.gdxSystemInfo(gdxHandle)

    i = 0
    model_soln = {}
    start = time.time()
    while True:
        i += 1
        ret = gdx.gdxDataReadRawStart(gdxHandle, i)
        if not ret[0]:
            break

        ret = gdx.gdxDataReadRaw(gdxHandle)
        if not ret[0] or len(ret[2]) < 2:
            raise RuntimeError("GAMS GDX failure (gdxDataReadRaw).")
        level = _parse_special_values(ret[2][0])
        dual = _parse_special_values(ret[2][1])

        ret = gdx.gdxSymbolInfo(gdxHandle, i)
        if not ret[0]:
            break
        if len(ret) < 2:
            raise RuntimeError("GAMS GDX failure (gdxSymbolInfo).")
        model_soln[ret[1]] = (level, dual)

    finish = time.time()

    gdx.gdxDataReadDone(gdxHandle)
    gdx.gdxClose(gdxHandle)
    gdx.gdxFree(gdxHandle)
    gdx.gdxLibraryUnload()

    return finish - start

def gdx_extractor(base_dir = os.getcwd()):
    gdx_file = []
    gms_size = []

    # Iterate through each subdirectory
    for entry in os.listdir(base_dir):
        subdir_path = os.path.join(base_dir, entry)
        model_size = re.findall('\d+',subdir_path.split('/')[-1])
        
        if os.path.isdir(subdir_path):

            # search for file name
            for file_name in os.listdir(subdir_path):    
                if file_name.endswith("p.gdx"):

                    file_path = os.path.join(subdir_path, file_name)
                    
                    gdx_file.append(file_path)
                    gms_size.append(model_size)

    # sort the gdx file in increasing number of variables
    temp = {}
    for i in range(len(gdx_file)):
        temp[float(gms_size[i][0])] = gdx_file[i]

    gdx_file = dict(sorted(temp.items()))
    return gdx_file

gdxFiles = gdx_extractor()

repeats = 4
tictoc_gams = {}
tictoc_pyomo = {}

for gmsSize, gdxPath in gdxFiles.items():
    print(gmsSize, gdxPath)
    tictoc_gams[gmsSize] = []
    tictoc_pyomo[gmsSize] = []

    # the process is repeated N times to reduce hardware noise
    for n in range(repeats):
        tictoc_gams[gmsSize].append(gams_like(gdxPath))
        tictoc_pyomo[gmsSize].append(pyomo_like(gdxPath))


plt.figure(figsize=(8, 6))

# Convert dictionary to sorted arrays
x_Gvals = np.array(sorted(tictoc_gams.keys()))
y_Gvals = np.array([np.mean(tictoc_gams[k]) for k in x_Gvals])
y_Gerrs = np.array([np.std(tictoc_gams[k]) for k in x_Gvals])

x_Pvals = np.array(sorted(tictoc_pyomo.keys()))
y_Pvals = np.array([np.mean(tictoc_pyomo[k]) for k in x_Pvals])
y_Perrs = np.array([np.std(tictoc_pyomo[k]) for k in x_Pvals])

x_log = [i+1 for i in range(len(x_Gvals))]
perf_improved = float(np.abs(y_Gvals[-1] - y_Pvals[-1]) / y_Pvals[-1]) * 100

# Plot with error bars
plt.errorbar(x_log, y_Gvals, yerr=y_Gerrs, fmt='o-', capsize=5, label='gams')
plt.errorbar(x_log, y_Pvals, yerr=y_Perrs, fmt='*-', capsize=5, label='pyomo')

plt.xticks(x_log, ['1.e+01', '1.e+02', '1.e+03', '1.e+04', '1.e+05', '1.e+06', '5.e+06', '1.e+07'])

plt.xlabel('Number of Variables')
plt.ylabel('Time (second)')
plt.title(f'Mean Reading Time of GAMS vs. Pyomo - {perf_improved:.2f}% Improved')
plt.legend()
plt.tight_layout()

plt.savefig('performance_comparison.png', dpi = 1000)

plt.show()


@mrmundt
Copy link
Contributor

mrmundt commented Jul 1, 2025

@AnhTran01 - Please run black -S -C on your files with the most recent version of black.

@github-project-automation github-project-automation bot moved this from Review In Progress to Reviewer Approved in July 2025 Release Jul 1, 2025
@blnicho blnicho merged commit 6b3d1cb into Pyomo:main Jul 1, 2025
59 of 63 checks passed
@github-project-automation github-project-automation bot moved this from Reviewer Approved to Done in July 2025 Release Jul 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Updating the Import Statement to Use GAMS new API
6 participants