
support layernorm int8 #4132

Open
tpoisonooo wants to merge 41 commits into Tencent:master from tpoisonooo:vit-add-layernorm-int8

Conversation

@tpoisonooo
Contributor

@tpoisonooo tpoisonooo commented Aug 11, 2022

Continuing from PR #4096.

  • Implemented int8 layernorm following the FQ-ViT paper
    • On the imagenet1k dataset, quantizing only LN costs 0.3% accuracy; quantizing LN+Conv+Gemm+MHA with bias correction costs 0.9%
    • Added a layernorm requant feature to ncnn2int8, matching the layernorm_int8 --> ( mha_int8 | conv_int8 | gemm_int8 ) pattern
    • Updated the operators.md documentation
    • Updated the corresponding testcases
  • Added a vision_transformer_int8 example; the model download path is in the comments
  • Moved the float2int8 helper, previously duplicated across multiple layers, into the shared mathfunction.h

Failed attempts, already reverted:

  1. Tried int8 binaryop; accuracy dropped severely
  2. Because innerproduct_int8 does not implement requant, the graph optimization turned out poorly and int8 binaryop gave no speedup

tpoisonooo and others added 30 commits July 28, 2022 18:22
feat(src/layer): add mha int8
@tpoisonooo tpoisonooo changed the title from "Vit add layernorm int8" to "support layernorm int8" Aug 11, 2022
@codecov-commenter

codecov-commenter commented Aug 11, 2022

Codecov Report

Merging #4132 (81a1149) into master (b4ba207) will decrease coverage by 1.91%.
The diff coverage is 94.67%.

@@             Coverage Diff             @@
##           master    #4132       +/-   ##
===========================================
- Coverage   94.42%   92.51%    -1.92%     
===========================================
  Files         747      374      -373     
  Lines      178769    66122   -112647     
===========================================
- Hits       168811    61173   -107638     
+ Misses       9958     4949     -5009     
Impacted Files Coverage Δ
src/layer/convolution.cpp 94.24% <ø> (-0.22%) ⬇️
src/layer/layernorm.cpp 93.08% <85.52%> (-5.13%) ⬇️
src/layer/multiheadattention.cpp 98.53% <98.77%> (+5.30%) ⬆️
src/layer/mathfun.h 100.00% <100.00%> (ø)
src/layer/arm/convolution_winograd_transform.h 0.00% <0.00%> (-100.00%) ⬇️
...c/layer/arm/convolution_winograd_transform_bf16s.h 0.00% <0.00%> (-98.31%) ⬇️
src/layer/arm/flatten_arm.cpp 35.74% <0.00%> (-63.46%) ⬇️
src/layer/arm/packing_arm.cpp 64.53% <0.00%> (-31.76%) ⬇️
src/layer/arm/slice_arm.cpp 80.32% <0.00%> (-17.15%) ⬇️
src/layer/arm/concat_arm.cpp 80.32% <0.00%> (-16.25%) ⬇️
... and 491 more


@lgtm-com

lgtm-com bot commented Aug 11, 2022

This pull request introduces 1 alert when merging 9277a1f into acbaaa6 - view on LGTM.com

new alerts:

  • 1 for FIXME comment

