@lpirl
Contributor

lpirl commented Oct 24, 2025

Querying imbalance volumes [1] can lead to a stack trace [2] if the time series in the response overlap [3].

This PR presents my hotfix for the issue. I am not sure whether it is semantically correct, nor whether this is the pandas way to do it.

Feel free to reference and close this PR in favor of a more conventional fix.

[1] query

https://web-api.tp.entsoe.eu/api?documentType=A86&controlArea_Domain=10Y1001A1001A39I&periodStart=202510222100&periodEnd=202510232100&securityToken=…

[2] stack trace
Traceback (most recent call last):
  File "/home/lukas/uni/electricity-watch/website/./electricity-watch", line 143, in <module>
    main()
    ~~~~^^
  File "/home/lukas/uni/electricity-watch/website/./electricity-watch", line 117, in main
    area_main(area)
    ~~~~~~~~~^^^^^^
  File "/home/lukas/uni/electricity-watch/website/./electricity-watch", line 68, in area_main
    rating = module.main(area)
  File "/home/lukas/uni/electricity-watch/website/src/criteria/imbalance.py", line 119, in main
    average = get_average(area)
  File "/home/lukas/uni/electricity-watch/website/src/criteria/imbalance.py", line 57, in get_average
    data = get_historic_data(area)
  File "/home/lukas/uni/electricity-watch/website/src/criteria/imbalance.py", line 31, in get_historic_data
    results = criteria.for_code_lists_by_precedence(
      area, _get_historic_data_for_code
    )
  File "/home/lukas/uni/electricity-watch/website/src/criteria/__init__.py", line 74, in for_code_lists_by_precedence
    result = func(code)
  File "/home/lukas/uni/electricity-watch/website/src/criteria/imbalance.py", line 26, in _get_historic_data_for_code
    return criteria.entsoe_query(
           ~~~~~~~~~~~~~~~~~~~~~^
      criteria.query_history_for_seasonal_data, area, days_count,
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      client.query_imbalance_volumes, code
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/lukas/uni/electricity-watch/website/src/criteria/__init__.py", line 128, in entsoe_query
    out = func(*args, **kwargs)
  File "/home/lukas/uni/electricity-watch/website/src/criteria/__init__.py", line 198, in query_history_for_seasonal_data
    out = concat(
      entsoe_query(func, *args, start=start, end=end, **kwargs)
      for start, end in intervals
    )
  File "…/venv/lib/python3.13/site-packages/pandas/core/reshape/concat.py", line 382, in concat
    op = _Concatenator(
        objs,
    ...<8 lines>...
        sort=sort,
    )
  File "…/venv/lib/python3.13/site-packages/pandas/core/reshape/concat.py", line 445, in __init__
    objs, keys = self._clean_keys_and_objs(objs, keys)
                 ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "…/venv/lib/python3.13/site-packages/pandas/core/reshape/concat.py", line 504, in _clean_keys_and_objs
    objs_list = list(objs)
  File "/home/lukas/uni/electricity-watch/website/src/criteria/__init__.py", line 199, in <genexpr>
    entsoe_query(func, *args, start=start, end=end, **kwargs)
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lukas/uni/electricity-watch/website/src/criteria/__init__.py", line 128, in entsoe_query
    out = func(*args, **kwargs)
  File "/home/lukas/uni/electricity-watch/entsoe-py/entsoe/decorators.py", line 129, in year_wrapper
    frame = func(*args, start=_start, end=_end, **kwargs)
  File "/home/lukas/uni/electricity-watch/entsoe-py/entsoe/entsoe.py", line 2020, in query_imbalance_volumes
    df = parse_imbalance_volumes_zip(zip_contents=archive, include_resolution=include_resolution)
  File "/home/lukas/uni/electricity-watch/entsoe-py/entsoe/parsers.py", line 668, in parse_imbalance_volumes_zip
    df = pd.concat(frames)
  File "…/venv/lib/python3.13/site-packages/pandas/core/reshape/concat.py", line 382, in concat
    op = _Concatenator(
        objs,
    ...<8 lines>...
        sort=sort,
    )
  File "…/venv/lib/python3.13/site-packages/pandas/core/reshape/concat.py", line 445, in __init__
    objs, keys = self._clean_keys_and_objs(objs, keys)
                 ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "…/venv/lib/python3.13/site-packages/pandas/core/reshape/concat.py", line 504, in _clean_keys_and_objs
    objs_list = list(objs)
  File "/home/lukas/uni/electricity-watch/entsoe-py/entsoe/parsers.py", line 664, in gen_frames
    frame = parse_imbalance_volumes(xml_text=arc.read(f), include_resolution=include_resolution)
  File "/home/lukas/uni/electricity-watch/entsoe-py/entsoe/parsers.py", line 310, in parse_imbalance_volumes
    df = pd.concat(frames, axis=1)
  File "…/venv/lib/python3.13/site-packages/pandas/core/reshape/concat.py", line 395, in concat
    return op.get_result()
           ~~~~~~~~~~~~~^^
  File "…/venv/lib/python3.13/site-packages/pandas/core/reshape/concat.py", line 680, in get_result
    indexers[ax] = obj_labels.get_indexer(new_labels)
                   ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "…/venv/lib/python3.13/site-packages/pandas/core/indexes/base.py", line 3892, in get_indexer
    raise InvalidIndexError(self._requires_unique_msg)
pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
[3] response

Search for quantities 15.123 and 2.897:

<?xml version="1.0" encoding="UTF-8"?>
  <Balancing_MarketDocument …>
    …
    <revisionNumber>1</revisionNumber>
    <type>A86</type>
    <process.processType>A16</process.processType>
    <sender_MarketParticipant.mRID codingScheme="A01">10X1001A1001A450</sender_MarketParticipant.mRID>
    <sender_MarketParticipant.marketRole.type>A32</sender_MarketParticipant.marketRole.type>
    <receiver_MarketParticipant.mRID codingScheme="A01">10X1001A1001A450</receiver_MarketParticipant.mRID>
    <receiver_MarketParticipant.marketRole.type>A33</receiver_MarketParticipant.marketRole.type>
    <docStatus>
      <value>A01</value>
    </docStatus>
    <controlArea_Domain.mRID codingScheme="A01">10Y1001A1001A39I</controlArea_Domain.mRID>
    <period.timeInterval>
      <start>2025-10-22T21:00Z</start>
      <end>2025-10-23T21:00Z</end>
    </period.timeInterval>
      <TimeSeries>
        <mRID>1</mRID>
        <businessType>A19</businessType>
        <flowDirection.direction>A01</flowDirection.direction>
        <quantity_Measure_Unit.name>MWH</quantity_Measure_Unit.name>
        <curveType>A03</curveType>
          …
          <Period>
            <timeInterval>
              <start>2025-10-23T17:00Z</start>
              <end>2025-10-23T21:00Z</end>
            </timeInterval>
            <resolution>PT15M</resolution>
              …
              <Point>
                <position>2</position>
                <quantity>2.897</quantity>
                <secondaryQuantity>0</secondaryQuantity>
              </Point>
          </Period>
      </TimeSeries>
      <TimeSeries>
        <mRID>2</mRID>
        <businessType>A19</businessType>
        <flowDirection.direction>A02</flowDirection.direction>
        <quantity_Measure_Unit.name>MWH</quantity_Measure_Unit.name>
        <curveType>A03</curveType>
          …
          <Period>
            <timeInterval>
              <start>2025-10-23T17:30Z</start>
              <end>2025-10-23T21:00Z</end>
            </timeInterval>
            <resolution>PT15M</resolution>
              …
              <Point>
                <position>14</position>
                <quantity>15.123</quantity>
                <secondaryQuantity>5.27</secondaryQuantity>
              </Point>
          </Period>
      </TimeSeries>
  </Balancing_MarketDocument>
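
For reference, the final error in [2] can be reproduced in isolation. The following is a minimal sketch, not the parser's code: the column names and values are hypothetical, and it only assumes that one of the frames passed to `pd.concat(..., axis=1)` ends up with a duplicated timestamp. Dropping duplicates with `Index.duplicated` is one possible workaround and not necessarily what this PR does.

```python
import pandas as pd


def concat_raises_invalid_index(frames):
    """Return True if column-wise concat fails with InvalidIndexError."""
    try:
        pd.concat(frames, axis=1)
    except pd.errors.InvalidIndexError:
        return True
    return False


def drop_duplicate_timestamps(frame):
    """Keep only the last row per timestamp (an assumed policy, not the PR's)."""
    return frame[~frame.index.duplicated(keep="last")]


# Hypothetical data: the 17:30 timestamp appears twice in the first frame.
left = pd.DataFrame(
    {"volume_a": [1.0, 2.897, 15.123]},
    index=pd.to_datetime(
        ["2025-10-23 17:15", "2025-10-23 17:30", "2025-10-23 17:30"], utc=True
    ),
)
right = pd.DataFrame(
    {"volume_b": [5.27]},
    index=pd.to_datetime(["2025-10-23 17:45"], utc=True),
)

# Concatenating as-is hits the same error as in the traceback.
failed = concat_raises_invalid_index([left, right])

# After de-duplication the index is unique and the concat succeeds.
fixed = pd.concat([drop_duplicate_timestamps(left), right], axis=1)
```

Whether "keep the last value" is semantically right for imbalance volumes is exactly the open question of this PR, so treat the policy above as a placeholder.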

@colingrd
Contributor

I ran into this problem too but have not yet taken the time to study it fully. I did see that there are overlapping values because the .zip contains two files, one with docStatus A01 and one with docStatus A02.
I found that A01 means 'intermediate' and A02 means 'final', so maybe values from the A02 file should take precedence over values from the A01 file?
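
A rough sketch of what such a precedence rule could look like in pandas, assuming the two files have already been parsed into frames with unique timestamp indexes. The function name, the `volume` column, and the values are hypothetical; `DataFrame.combine_first` is used here as one way to express "final wins, intermediate fills the gaps".

```python
import pandas as pd


def prefer_final(intermediate, final):
    """Take values from `final`; fall back to `intermediate` where `final` has none."""
    return final.combine_first(intermediate)


# Hypothetical frames standing in for the parsed A01 and A02 files.
intermediate = pd.DataFrame(
    {"volume": [1.0, 2.0]},
    index=pd.to_datetime(["2025-10-23 17:00", "2025-10-23 17:15"], utc=True),
)
final = pd.DataFrame(
    {"volume": [2.5, 3.0]},
    index=pd.to_datetime(["2025-10-23 17:15", "2025-10-23 17:30"], utc=True),
)

# 17:15 exists in both frames; the value from `final` is kept.
merged = prefer_final(intermediate, final)
```

This only covers overlaps between files with different docStatus. Overlaps inside a single file would need separate handling.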

@lpirl
Contributor Author

lpirl commented Oct 27, 2025

> I found that A01 means 'intermediate' and A02 means 'final', so maybe values from the A02 file should take precedence over values from the A01 file?

If that is the case, I agree. Will try to propose a fix.

@lpirl
Contributor Author

lpirl commented Oct 28, 2025

Thanks @colingrd. I just checked the query/response linked in the initial post. There, the overlapping time series occur within a single docStatus (A01 in that case), and the overlapping series seem to differ in flowDirection.direction. I guess this PR should thus still fix the original issue.

Nevertheless, the issue mentioned by @colingrd (overlaps with different docStatus) should probably be addressed as well, elsewhere.
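
To check a response for this kind of overlap, something along these lines might help. It is a diagnostic sketch only: the sample document is a simplified stand-in modelled on [3] (no XML namespace, no Points), and the helper names are made up for illustration. A real response would need namespace handling.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from itertools import combinations

# Simplified sample modelled on [3]; namespace and most fields are omitted.
SAMPLE = """<Balancing_MarketDocument>
  <TimeSeries>
    <mRID>1</mRID>
    <flowDirection.direction>A01</flowDirection.direction>
    <Period>
      <timeInterval>
        <start>2025-10-23T17:00Z</start>
        <end>2025-10-23T21:00Z</end>
      </timeInterval>
    </Period>
  </TimeSeries>
  <TimeSeries>
    <mRID>2</mRID>
    <flowDirection.direction>A02</flowDirection.direction>
    <Period>
      <timeInterval>
        <start>2025-10-23T17:30Z</start>
        <end>2025-10-23T21:00Z</end>
      </timeInterval>
    </Period>
  </TimeSeries>
</Balancing_MarketDocument>"""


def parse_utc(text):
    """Parse timestamps of the form 2025-10-23T17:00Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%MZ")


def period_rows(xml_text):
    """Collect (mRID, flow direction, start, end) for every Period."""
    root = ET.fromstring(xml_text)
    rows = []
    for series in root.iter("TimeSeries"):
        mrid = series.findtext("mRID")
        direction = series.findtext("flowDirection.direction")
        for period in series.iter("Period"):
            start = parse_utc(period.findtext("timeInterval/start"))
            end = parse_utc(period.findtext("timeInterval/end"))
            rows.append((mrid, direction, start, end))
    return rows


def overlapping_pairs(rows):
    """Return (mRID, direction) pairs whose period intervals intersect."""
    return [
        ((a[0], a[1]), (b[0], b[1]))
        for a, b in combinations(rows, 2)
        if a[2] < b[3] and b[2] < a[3]
    ]


pairs = overlapping_pairs(period_rows(SAMPLE))
```

For the sample, `pairs` contains the two series with mRID 1 and 2, which overlap in time and differ in flow direction.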
