RDOQ: Effective Fine-Grained Quantization for DNNs via Rate-Distortion Optimization

Official implementation of the TPAMI submission: "RDOQ: Effective Fine-Grained Quantization for DNNs via Rate-Distortion Optimization".

This repository contains code for RDOQ, a novel fine-grained channel-wise quantization approach for DNNs via channel-wise / channel group-wise mixed-precision quantization that achieves a state-of-the-art combination of quantization quality and compression rate. RDOQ formulates as a rate-distortion optimization problem, and addresses the problem of efficiently exploring the hyperparameter space of channel-wise bit widths.

RDOQ: Effective Fine-Grained Quantization for DNNs via Rate-Distortion Optimization
Table of Contents
Technical Details
- 1. First-order Approach
- 2. Second-order Approach (*NEW*)
Project Structure
- ImageNet classification
  - First-order approach
  - Second-order approach
- 3D LiDAR based Object Detection
  - Preparation:
  - Running

Technical Details

1. First-order Approach

We discovered and utilized a first-order additivity property on DNNs, that is able to breakdown the model global output distortion into the individual contribution of weights/activations groups. Adopting the Taylor series expansion approximation of the model global output up to first-order item, the first-order additivity property can be expressed as:

which allows us to efficiently solve the fine-grained bit allocation problem using efficient integer programming algorithms. We propose two integer programming solvers to solve the above allocation problem, namely Lagrangian Formulation and Dynamic Programming.

2. Second-order Approach (NEW)

Noted that in practice, to make first-order approach workds well, we repeatedly samples the model output from all calibration images to form the distortion curves. So in our new proposed second-order approach, we managed to significantly reduce the calibration overhead, by properly approximate the real-world distortion curves required in previous first-order approach with second-order approximation. We also did some complexity optimization on the hessian-related calculation to support fast implementation, which finally gives the following formulations:

Approx. weight distortion curves:
Approx. activation distortion curves:

Project Structure

This repository contains the post-training quantization implementation of both ImageNet classification and 3D LiDAR based object detection models, organized in the subdirectory respectively:

ImageNet classification: ./classification
3D LiDAR based Object Detection: ./openpcdet

ImageNet classification

Example scripts to evaluate the accuracy on ImageNet validation dataset:

First-order approach

Generate rate-distortion curves for channel groups in ViT layer weights.

CUDA_VISIBLE_DEVICES=0 python3 generate_RD_curves_batch.py \
 --archname deit_base_patch16_224 \
 --datapath=/imagenet/ \
 --modeid=0 --gpuid=0 \
 --nchannelbatch 128 \
 --testsize=64 \
 --batchsize 64 
```-

Generate rate-distortion curves for channel groups in ViT layer activations.

CUDA_VISIBLE_DEVICES=0 python3 generate_RD_curves_act_out_nchan.py \
 --archname deit_base_patch16_224 \
 --datapath=/imagenet/ \
 --modeid=0 --gpuid=0 \
 --nchannelbatch 128 \
 --testsize=64 \
 --batchsize 64

(Option #1) Lagrangian Formulation method for bit allocation:

CUDA_VISIBLE_DEVICES=0 python3 generate_RD_frontier_batch_distortion_act_out_nchan.py \
 --archname=deit_base_patch16_224 \
 --maxslopesteps 196 \
 --pathrdcurve=deit_base_patch16_224_ndz_0010_nr_0011_ns_0064_nf_0128_rdcurves_channelwise_opt_dist/ \
 --gpuid=0 \
 --datapath=/imagenet \
 --nchannelbatch=128 \
 --maxrates=11 \
 -bcw -bca

(Option #2) Dynamic Programming method for bit allocation:

CUDA_VISIBLE_DEVICES=0 python3 generate_DP_batch_distortion_act_out_nchan.py \
 --archname=deit_base_patch16_224 \
 --maxslopesteps 196 \
 --pathrdcurve=deit_base_patch16_224_ndz_0010_nr_0011_ns_0064_nf_0128_rdcurves_channelwise_opt_dist/ \
 --gpuid=0 \
 --datapath=/imagenet \
 --nchannelbatch=128 \
 --maxrates=11 \
 --target_rates 4 6 8 \
 -bcw -bca

Second-order approach

We provide an automatic script to perform end-to-end quantization including calibration and bit allocation for second-order approach.

CUDA_VISIBLE_DEVICES=0,1,2,3 bash gen_rd_hessian_curves_deit_b_224_all.sh

USAGE: To run different settings, config the following variables in the above bash script:

NCHANNELBATCH: The number of channels in a channel-group
TESTSIZE: Calibration dataset size
BATCHSIZE: Batch size
MODELNAME: pretrained model name to load from timm.create_modal()
SOLVER: "dp" or "lg", for Dynamic Programming and Lagrangian Formulation respectively.
DATASET: path to imagenet

3D LiDAR based Object Detection

Preparation:

Follow installation instructions in OpenPCDet

Running

Generate rate-distortion curves for channel groups in detection model layer weights.

CUDA_VISIBLE_DEVICES=0 python3 generate_rd_curves_batch.py \
   --batch_size 2 --cfg_file cfgs/nuscenes_models/cbgs_voxel0075_voxelnext_vtc.yaml \
   --pretrained_model path/to/model.pth.tar \
   --workers 1

Dynamic Programming method for bit allocation:

CUDA_VISIBLE_DEVICES=0 python3 generate_DP_frontier_batch_act_nchan.py \
 --batch_size 32 --workers 4 \
 --cfg_file cfgs/nuscenes_models/cbgs_voxel0075_voxelnext_vtc.yaml \
 --pretrained_model path/to/model.pth.tar \
 --nchannelbatch 128 \
 --pathrdcurve VoxelNeXt_ndz_0010_nr_0011_ns_0001_nf_0128_rdcurves_channelwise_opt_dist/ \
 --target_rates 2 4 6 8 -bcw

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
OpenPCDet @ 4d35b59		OpenPCDet @ 4d35b59
assets		assets
classification		classification
.gitignore		.gitignore
.gitmodules		.gitmodules
LICENSE		LICENSE
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RDOQ: Effective Fine-Grained Quantization for DNNs via Rate-Distortion Optimization

Table of Contents

Technical Details

1. First-order Approach

2. Second-order Approach (NEW)

Project Structure

ImageNet classification

First-order approach

Second-order approach

3D LiDAR based Object Detection

Preparation:

Running

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

RDOQ: Effective Fine-Grained Quantization for DNNs via Rate-Distortion Optimization

Table of Contents

Technical Details

1. First-order Approach

2. Second-order Approach (*NEW*)

Project Structure

ImageNet classification

First-order approach

Second-order approach

3D LiDAR based Object Detection

Preparation:

Running

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

2. Second-order Approach (NEW)

Packages