Commit a7f06e1
Added statistics related to out variant nodes
Summary: added more statistics for static runtime
Test Plan:
caffe2/benchmarks/static_runtime:static_runtime_cpptest
Expected output example:
Static runtime ms per iter: 0.939483. Iters per second: 1064.41
Node #0: 0.195671 ms/iter, %wide_offset.1 : Tensor = aten::add(%wide.1, %self._mu, %4)
Node #1: 0.169457 ms/iter, %wide_normalized.1 : Tensor = aten::mul(%wide_offset.1, %self._sigma)
Node #2: 0.118218 ms/iter, %wide_preproc.1 : Tensor = aten::clamp(%wide_normalized.1, %5, %6)
Node #3: 0.038814 ms/iter, %user_emb_t.1 : Tensor = aten::transpose(%user_emb.1, %4, %7)
Node #4: 0.0860747 ms/iter, %dp_unflatten.1 : Tensor = aten::bmm(%ad_emb_packed.1, %user_emb_t.1)
Node #5: 0.0102666 ms/iter, %31 : Tensor = static_runtime::flatten_copy(%dp_unflatten.1, %4, %8)
Node #6: 0.000476333 ms/iter, %19 : Tensor[] = prim::ListConstruct(%31, %wide_preproc.1)
Node #7: 0.0707332 ms/iter, %input.1 : Tensor = aten::cat(%19, %4)
Node #8: 0.123695 ms/iter, %fc1.1 : Tensor = aten::addmm(%self._fc_b, %input.1, %29, %4, %4)
Node #9: 0.0309244 ms/iter, %23 : Tensor = aten::sigmoid(%fc1.1)
Node #10: 0.0046297 ms/iter, %24 : (Tensor) = prim::TupleConstruct(%23)
Time per node type:
0.195671 ms. 23.0483%. aten::add (1 nodes)
0.169457 ms. 19.9605%. aten::mul (1 nodes, out variant)
0.123695 ms. 14.5702%. aten::addmm (1 nodes, out variant)
0.118218 ms. 13.925%. aten::clamp (1 nodes, out variant)
0.0860747 ms. 10.1388%. aten::bmm (1 nodes, out variant)
0.0707332 ms. 8.33175%. aten::cat (1 nodes, out variant)
0.038814 ms. 4.57195%. aten::transpose (1 nodes)
0.0309244 ms. 3.64263%. aten::sigmoid (1 nodes, out variant)
0.0102666 ms. 1.20932%. static_runtime::flatten_copy (1 nodes, out variant)
0.0046297 ms. 0.545338%. prim::TupleConstruct (1 nodes, out variant)
0.000476333 ms. 0.0561079%. prim::ListConstruct (1 nodes, out variant)
0.848959 ms. in Total
StaticRuntime setup time: 0.018925 ms
Memory allocation time: 0.019808 ms
Memory deallocation time: 0.0120445 ms
Outputs deallocation time: 0.0864947 ms
Total memory managed: 19328 bytes
Total number of reused tensors: 3
Total number of 'out' variant nodes/total number of nodes: 9/11 (81.8182%)
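The per-type table and the out variant ratio in the output above can be reproduced from the per-node timings. The following is only a minimal sketch of that arithmetic, not the code changed in this commit: `NodeSample`, `KindStats`, `aggregate_by_kind`, `total_time_ms`, and `out_variant_percent` are hypothetical names introduced for illustration, and the sketch assumes each node carries a flag saying whether it ran through an out variant.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record for one profiled node (not the actual Static Runtime type).
struct NodeSample {
  std::string kind;      // operator name, e.g. "aten::mul"
  double ms_per_iter;    // measured time for this node
  bool has_out_variant;  // assumed flag: node executed via an out variant
};

// Hypothetical per-operator summary row, as in "Time per node type".
struct KindStats {
  double total_ms = 0.0;
  std::size_t node_count = 0;
  bool out_variant = false;
};

// Sum time and node count for each operator kind.
std::map<std::string, KindStats> aggregate_by_kind(
    const std::vector<NodeSample>& nodes) {
  std::map<std::string, KindStats> stats;
  for (const auto& n : nodes) {
    KindStats& s = stats[n.kind];
    s.total_ms += n.ms_per_iter;
    s.node_count += 1;
    s.out_variant = s.out_variant || n.has_out_variant;
  }
  return stats;
}

// Total time over all nodes; the per-type percentages are relative to this.
double total_time_ms(const std::vector<NodeSample>& nodes) {
  double total = 0.0;
  for (const auto& n : nodes) {
    total += n.ms_per_iter;
  }
  return total;
}

// Share of nodes that have an out variant, in percent.
double out_variant_percent(const std::vector<NodeSample>& nodes) {
  assert(!nodes.empty());
  std::size_t out = 0;
  for (const auto& n : nodes) {
    if (n.has_out_variant) {
      ++out;
    }
  }
  return 100.0 * static_cast<double>(out) / static_cast<double>(nodes.size());
}
```

With the 11 nodes listed above, 9 of which are marked as out variants, `out_variant_percent` gives 9/11, about 81.8182%, and a row's percentage is its `total_ms` divided by `total_time_ms` (for example 0.195671 / 0.848959, about 23.05%, for `aten::add`).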
Reviewed By: hlu1
Differential Revision: D28553029
fbshipit-source-id: 55e7eab50b4b475ae219896100bdf4f6678875a4
1 parent 056287a commit a7f06e1
3 files changed: 27 additions, 3 deletions