Conversation

@willg-nv

@willg-nv willg-nv commented Dec 17, 2025

What does this PR do?

Type of change: new feature

Overview: This PR integrates an automatic QDQ (quantize/dequantize) placement tool into ModelOpt.

This PR is part 1 of 4 of the change. It contains the following changes:

  1. Defines common types: Region, RegionType, Error types
  2. Defines InsertionPoints (the logical locations at which to place QDQ pairs), InsertionScheme (a set of insertion points)
  3. Unit tests for new types

Part 1: #701
Part 2: #702
Part 3: #703
Part 4: #704

Usage

        # Region type usage:
        region = Region(region_id=1, level=0, region_type=RegionType.LEAF)
        assert region.get_id() == 1
        assert region.get_level() == 0
        region.add_node(1)  # 1 is the index of an ONNX graph node
        ...

        point = NodeInputInsertionPoint(node_index=0, input_index=2)
        assert point.node_index == 0  # node index relative to the region
        assert point.input_index == 2  # input tensor index relative to that node
        resolved = point.resolve(region, graph)
        ...
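
To make the relative-index semantics in the usage snippet concrete, here is a minimal, self-contained sketch. It is not the ModelOpt implementation: the class bodies, the `COMPOSITE` member, the insertion-ordered `nodes` list, the toy graph (a list of input-tensor-name lists, one per node), and the `(absolute_node_index, tensor_name)` return value of `resolve` are all assumptions made for illustration. Only the class and method names are taken from the snippet above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class RegionType(Enum):
    # Only LEAF appears in the PR description; COMPOSITE is an assumed extra member.
    LEAF = "leaf"
    COMPOSITE = "composite"


@dataclass
class Region:
    """Illustrative stand-in: a group of ONNX graph node indices."""

    region_id: int
    level: int
    region_type: RegionType
    # Assumption: nodes keep insertion order, so a relative index is a list position.
    nodes: list[int] = field(default_factory=list)

    def get_id(self) -> int:
        return self.region_id

    def get_level(self) -> int:
        return self.level

    def add_node(self, node_index: int) -> None:
        self.nodes.append(node_index)


@dataclass(frozen=True)
class NodeInputInsertionPoint:
    """Illustrative stand-in: an input slot of a node, addressed relative to a region."""

    node_index: int   # position of the node inside the region
    input_index: int  # position of the input tensor inside that node

    def resolve(self, region: Region, graph_inputs: list[list[str]]) -> tuple[int, str]:
        # Map the region-relative node index to an absolute graph node index,
        # then look up the tensor name at the requested input slot.
        absolute = region.nodes[self.node_index]
        return absolute, graph_inputs[absolute][self.input_index]


# Toy graph (hypothetical): node 0 has two inputs, node 1 has three.
toy_graph = [["x", "w0"], ["a", "b", "c"]]
region = Region(region_id=1, level=0, region_type=RegionType.LEAF)
region.add_node(1)
point = NodeInputInsertionPoint(node_index=0, input_index=2)
resolved = point.resolve(region, toy_graph)
```

In this sketch the same insertion point can be reused across regions with the same shape, because it stores only relative positions and defers the lookup to `resolve`.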

Testing

Unit tests are implemented for the new types, and all tests pass.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: No, documentation changes will be included in part 4.
  • Did you update Changelog?: No, this can be done once all parts of the change are merged.

Additional Information

Summary by CodeRabbit

  • New Features

    • Added foundational autotuner infrastructure for quantization optimization, including region hierarchies and insertion scheme management.
    • Introduced insertion point system for managing quantize/dequantize operation placement across ONNX graph regions.
    • Added utility functions for tensor consumer mapping and boolean operation identification.
  • Tests

    • Added comprehensive test coverage for autotuner components, insertion points, and region management.
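
As a rough illustration of what a tensor consumer mapping looks like, the sketch below maps each tensor name to the indices of the nodes that read it. The function name `build_tensor_consumer_map` and the toy input format (one list of input tensor names per node) are hypothetical and are not taken from the PR; the actual utility in `graph_utils.py` may differ.

```python
from collections import defaultdict


def build_tensor_consumer_map(node_inputs):
    """Map each tensor name to the indices of the nodes that consume it.

    `node_inputs` is a toy stand-in for a graph: one list of input tensor
    names per node, in node order.
    """
    consumers = defaultdict(list)
    for node_index, inputs in enumerate(node_inputs):
        for name in inputs:
            # ONNX marks an omitted optional input with an empty string; skip it.
            # Also record a node once even if it reads the same tensor twice.
            if name and node_index not in consumers[name]:
                consumers[name].append(node_index)
    return dict(consumers)


consumer_map = build_tensor_consumer_map(
    [["x", "w"], ["y", "y"], ["y", "", "x"]]
)
```

A map like this makes it cheap to ask which nodes would be affected by placing a QDQ pair on a given tensor.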

@willg-nv willg-nv requested a review from a team as a code owner December 17, 2025 06:18
@willg-nv willg-nv requested a review from gcunhase December 17, 2025 06:18
@copy-pr-bot

copy-pr-bot bot commented Dec 17, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part1 branch from 9c53783 to f872e70 Compare December 19, 2025 05:32
@willg-nv
Author

Hi @gcunhase, could you help review this PR? Thanks!

@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part1 branch 5 times, most recently from bbbc98b to 80792fa Compare January 7, 2026 07:12
@ajrasane
Contributor

ajrasane commented Jan 7, 2026

LGTM from my side. Will wait for @gcunhase review.

@gcunhase
Contributor

gcunhase commented Jan 8, 2026

LGTM, added a few comments, thanks.

@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part1 branch from 80792fa to 66ef3ad Compare January 9, 2026 02:30
@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part1 branch from 66ef3ad to 01b383a Compare January 12, 2026 01:30
@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part1 branch 2 times, most recently from 4545a57 to be965aa Compare January 15, 2026 02:40
@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part1 branch 4 times, most recently from 843fc1f to 5f7844b Compare January 26, 2026 05:58
@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part1 branch 2 times, most recently from 3fc1ffa to e07b7e4 Compare January 27, 2026 11:12
@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part1 branch from 64817f4 to ef95c8d Compare January 29, 2026 09:18
Contributor

@ajrasane ajrasane left a comment

LGTM

Contributor

@gcunhase gcunhase left a comment

LGTM, other changes can be done in #702.

@ajrasane ajrasane enabled auto-merge (squash) January 30, 2026 01:39
auto-merge was automatically disabled January 30, 2026 02:05

Head branch was pushed to by a user without write access

@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part1 branch from ef95c8d to 79ee540 Compare January 30, 2026 02:05
Signed-off-by: Will Guo <willg@nvidia.com>
@willg-nv willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part1 branch from 79ee540 to 09a91a8 Compare January 30, 2026 02:11
@codecov

codecov bot commented Jan 30, 2026

Codecov Report

❌ Patch coverage is 91.28065% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.16%. Comparing base (81b67dd) to head (09a91a8).
⚠️ Report is 1 commits behind head on main.

Files with missing lines | Patch % | Lines
modelopt/onnx/quantization/autotune/common.py | 81.81% | 24 Missing ⚠️
...opt/onnx/quantization/autotune/insertion_points.py | 96.61% | 7 Missing ⚠️
modelopt/onnx/quantization/graph_utils.py | 92.30% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #701      +/-   ##
==========================================
+ Coverage   73.82%   74.16%   +0.33%     
==========================================
  Files         193      195       +2     
  Lines       19745    20111     +366     
==========================================
+ Hits        14577    14915     +338     
- Misses       5168     5196      +28     
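
The figures in this report can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers that appear in the report, except for the patch line count of 367, which is inferred from the 32 missing lines and the 91.28065% patch coverage rather than stated by Codecov. Variable names are illustrative.

```python
# Per-file missing lines from the report should add up to the headline figure.
missing_per_file = [24, 7, 1]
total_missing = sum(missing_per_file)

# Base and head: hits + misses should equal lines.
base_hits, base_misses, base_lines = 14577, 5168, 19745
head_hits, head_misses, head_lines = 14915, 5196, 20111

# Project coverage is hits / lines, as a percentage.
base_pct = 100 * base_hits / base_lines
head_pct = 100 * head_hits / head_lines
delta_pct = head_pct - base_pct

# Inferred (not stated in the report): the number of lines in the patch,
# derived from 32 missing lines at 91.28065% patch coverage.
patch_lines = round(total_missing / (1 - 91.28065 / 100))
patch_pct = 100 * (patch_lines - total_missing) / patch_lines

print(f"base {base_pct:.4f}%  head {head_pct:.4f}%  delta {delta_pct:+.4f}%")
print(f"patch lines {patch_lines}, patch coverage {patch_pct:.5f}%")
```

The unrounded ratios agree with the reported 73.82%, 74.16%, and +0.33% to within the report's two-decimal display.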
