Separate FW version read API for Hisense 2x100G AOC cable + enable image validity check for dual bank modules#631
Separate FW version read API for Hisense 2x100G AOC cable + enable image validity check for dual bank modules#631pnakka28 wants to merge 6 commits intosonic-net:masterfrom
Conversation
…age validity check for dual bank modules - Add CmisAocSingleBankApi class for single-bank modules - Override get_module_fw_info() to handle single-bank - Add vendor routing for Hisense modules - Add tests for single-bank firmware info and factory routing Signed-off-by: Pavan Kalyan Nakka <pnakka@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
This PR adjusts CMIS firmware version reporting so dual-bank modules no longer report an “inactive” version when the inactive image is invalid, while preserving the needed single-bank behavior via a Hisense-specific CMIS API override routed through XcvrApiFactory.
Changes:
- Revert single-bank inactive FW fallback in the default CMIS
get_module_fw_info()so invalid images correctly yieldN/Aon dual-bank modules. - Add
CmisAocSingleBankApi(Hisense 2x100G AOC) to read inactive FW from EEPROM for single-bank modules. - Add factory routing + unit tests intended to cover the new vendor override path.
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
sonic_platform_base/sonic_xcvr/api/public/cmis.py |
Restores dual-bank semantics by always using Image A/B version (which becomes N/A when invalid) for inactive FW. |
sonic_platform_base/sonic_xcvr/api/hisense/aoc_2x100g.py |
Introduces Hisense single-bank override that reads inactive FW from EEPROM. |
sonic_platform_base/sonic_xcvr/xcvr_api_factory.py |
Routes Hisense DEF8504-2x100G part numbers to the single-bank API implementation. |
tests/sonic_xcvr/test_xcvr_api_factory.py |
Adds a Hisense routing test (currently not asserting routing correctness). |
tests/sonic_xcvr/test_aoc_2x100g.py |
Adds unit tests for the new Hisense single-bank get_module_fw_info() override (assertions are currently too weak). |
setup.py |
Ensures the new sonic_platform_base.sonic_xcvr.api.hisense package is included in the distribution. |
Signed-off-by: Pavan Kalyan Nakka <pnakka@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| logger = logging.getLogger(__name__) | ||
|
|
||
| class CmisAocSingleBankApi(CmisApi): | ||
| def get_module_fw_info(self): |
There was a problem hiding this comment.
@prgeor get_module_fw_info() is the method called by sfputil show fwversion and is also called internally by get_transceiver_info_firmware_versions() in cmis.py. If we override get_firmware_info(), all these calls would still use the base get_module_fw_info() (current) which reads Image B from invalid CDB registers on single-bank modules. Overriding get_module_fw_info() ensures correct single-bank behavior.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.
You can also share your feedback on Copilot code review. Take the survey.
|
@pnakka28 please address copilot review |
- Replaced CDB byte parsing with CdbFwHandler.get_firmware_info() - Updated tests to mock CdbFwHandler instead of raw CDB responses Signed-off-by: Pavan Kalyan Nakka <pnakka@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.
You can also share your feedback on Copilot code review. Take the survey.
| RunningImage = 'A' | ||
| ActiveFirmware = ImageA | ||
| # In case of single bank module, inactive firmware version can be read from EEPROM | ||
| InactiveFirmware = self.get_module_inactive_firmware() + ".0" |
There was a problem hiding this comment.
get_module_inactive_firmware() reads from EEPROM registers. This code path only executes when ImageA is running, so the module is running with CDB support not a flat memory. EEPROM registers will always return valid values here.
There was a problem hiding this comment.
@pnakka28 "EEPROM registers will always return valid values here." -- is this correct even if there are i2c read errors?
There was a problem hiding this comment.
@prgeor i2c read errors could cause get_module_inactive_firmware() to return N/A. will add guard to append for only valid string.
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.
You can also share your feedback on Copilot code review. Take the survey.
Signed-off-by: Pavan Kalyan Nakka <pnakka@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| if fw_info is False or fw_info is None: | ||
| return {'status': False, 'info': "Failed to get firmware info", 'result': None} | ||
|
|
||
| fw_status = fw_info.get(cdb_consts.CDB1_FIRMWARE_STATUS, {}) |
- Refactor to use CdbFwHandler for parsing firmware info instead of decoding raw CDB reply bytes - Added subclass single-bank firmware modules that read inactive firmware from EEPROM Signed-off-by: Pavan Kalyan Nakka <pnakka@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
@copilot review the code |
| self._cdb_fw_hdlr = self._create_cdb_fw_handler() | ||
| if self._cdb_fw_hdlr is None: | ||
| return {'status': False, 'info': "CDB FW handler init failed", 'result': None} | ||
|
|
||
| fw_info = self._cdb_fw_hdlr.get_firmware_info() |
There was a problem hiding this comment.
get_module_fw_info() now instantiates CdbFwHandler and calls get_firmware_info(), but _create_cdb_fw_handler() constructs CdbFwHandler which asserts inside __init__ when initFwHandler() fails (e.g., module busy / CDB failure). Since this method doesn’t catch exceptions, callers can get an unhandled AssertionError instead of a clean {'status': False, ...} response. Wrap handler creation / firmware-info reads in exception handling (and avoid relying on assert for runtime control flow) so get_module_fw_info() fails gracefully.
| self._cdb_fw_hdlr = self._create_cdb_fw_handler() | |
| if self._cdb_fw_hdlr is None: | |
| return {'status': False, 'info': "CDB FW handler init failed", 'result': None} | |
| fw_info = self._cdb_fw_hdlr.get_firmware_info() | |
| try: | |
| self._cdb_fw_hdlr = self._create_cdb_fw_handler() | |
| except AssertionError as e: | |
| logger.error("CDB FW handler init assertion failed: %s", e) | |
| return {'status': False, 'info': "CDB FW handler init failed", 'result': None} | |
| except Exception as e: | |
| logger.exception("Unexpected error during CDB FW handler init: %s", e) | |
| return {'status': False, 'info': "CDB FW handler init failed", 'result': None} | |
| if self._cdb_fw_hdlr is None: | |
| return {'status': False, 'info': "CDB FW handler init failed", 'result': None} | |
| try: | |
| fw_info = self._cdb_fw_hdlr.get_firmware_info() | |
| except Exception as e: | |
| logger.exception("Exception while getting firmware info: %s", e) | |
| return {'status': False, 'info': "Failed to get firmware info", 'result': None} |
| return {'status': False, 'info': "CDB FW handler init failed", 'result': None} | ||
|
|
||
| fw_info = self._cdb_fw_hdlr.get_firmware_info() | ||
| if fw_info is False or fw_info is None: |
There was a problem hiding this comment.
The previous get_module_fw_info() logic retried on the CDB “need to enter password” status (0x46) by calling module_enter_password() and re-issuing the command. With the new _cdb_fw_hdlr path, password-required cases will just return “Failed to get firmware info” (and may trip the handler init assert). Consider preserving the password-handling behavior either in CdbFwHandler or in get_module_fw_info() so modules requiring password don’t regress.
| if fw_info is False or fw_info is None: | |
| if fw_info is False or fw_info is None: | |
| # Attempt to handle password-required cases by entering the module password | |
| enter_password = getattr(self, 'module_enter_password', None) | |
| if callable(enter_password): | |
| # If password entry fails explicitly, propagate failure | |
| if enter_password() is False: | |
| return {'status': False, 'info': "Failed to enter module password", 'result': None} | |
| # Retry fetching firmware info after entering password | |
| fw_info = self._cdb_fw_hdlr.get_firmware_info() | |
| if fw_info is False or fw_info is None: |
There was a problem hiding this comment.
@prgeor Do we still need to preserve the password entering logic or okay to remove it?
Signed-off-by: Pavan Kalyan Nakka <pnakka@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Description
Revert single-bank firmware handling from the default CMIS
get_module_fw_info()and add a vendor-specific override for single-bank modules. This change moves single-bank handling out of the default CMIS API into a vendor-specific override and fixes incorrect inactive firmware version reporting for dual-bank modules during firmware upgrades.Changes:
Motivation and Context
While the previous implementation worked correctly for single-bank modules, it caused incorrect inactive firmware reporting for dual-bank modules during firmware upgrades. When a firmware download was interrupted or an image on the inactive bank was invalid, the inactive firmware version would still be reported from EEPROM instead of N/A.
To fix this, the single-bank handling code in the default CMIS API has been reverted so that dual-bank modules correctly report N/A for invalid images. A separate vendor-specific override has been added for single-bank modules, where the single-bank firmware handling is actually needed. This approach ensures:
How Has This Been Tested?
sfputil show fwversionon single-bank modules shows inactive firmware read from EEPROMsfputil show fwversionon dual-bank CMIS modules still shows both Image A and Image B correctlyAdditional Information (Optional)
MSFT ADO - 29333553