Commit b1343d1
Enable logging for the plan() function, ShardEstimators and TrainingPipeline class constructors (meta-pytorch#3521)
Summary:
This diff enables the static logging functionality to collect data for:
1) plan() - This will allow us to look at the inputs and outputs to the planner to help with use issue debugging
2) ShardEstimators - This will allow us to look at the inputs and outputs to the ShardEstimators, which gives us the bandwidth inputs to verify if the planner is generating expected values as well as help with debugging OOMs
3) TrainingPipeline - The class type here will be an indicator of which pipeline was used by the training job. The training pipeline has implications on the memory usage and is an important data point to collect to investigate OOMs.
Reviewed By: kausv
Differential Revision: D863179101 parent 7c7daaf commit b1343d1
File tree
6 files changed
+16
-0
lines changed- torchrec
- distributed
- planner
- train_pipeline
- modules
6 files changed
+16
-0
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
55 | 55 | | |
56 | 56 | | |
57 | 57 | | |
| 58 | + | |
58 | 59 | | |
59 | 60 | | |
60 | 61 | | |
| |||
466 | 467 | | |
467 | 468 | | |
468 | 469 | | |
| 470 | + | |
469 | 471 | | |
470 | 472 | | |
471 | 473 | | |
| |||
2021 | 2023 | | |
2022 | 2024 | | |
2023 | 2025 | | |
| 2026 | + | |
2024 | 2027 | | |
2025 | 2028 | | |
2026 | 2029 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
| 21 | + | |
21 | 22 | | |
22 | 23 | | |
23 | 24 | | |
| |||
459 | 460 | | |
460 | 461 | | |
461 | 462 | | |
| 463 | + | |
462 | 464 | | |
463 | 465 | | |
464 | 466 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
| 19 | + | |
19 | 20 | | |
20 | 21 | | |
21 | 22 | | |
| |||
955 | 956 | | |
956 | 957 | | |
957 | 958 | | |
| 959 | + | |
958 | 960 | | |
959 | 961 | | |
960 | 962 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
15 | 15 | | |
16 | 16 | | |
17 | 17 | | |
| 18 | + | |
18 | 19 | | |
19 | 20 | | |
20 | 21 | | |
| |||
146 | 147 | | |
147 | 148 | | |
148 | 149 | | |
| 150 | + | |
149 | 151 | | |
150 | 152 | | |
151 | 153 | | |
| |||
194 | 196 | | |
195 | 197 | | |
196 | 198 | | |
| 199 | + | |
197 | 200 | | |
198 | 201 | | |
199 | 202 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
| 35 | + | |
35 | 36 | | |
36 | 37 | | |
37 | 38 | | |
| |||
106 | 107 | | |
107 | 108 | | |
108 | 109 | | |
| 110 | + | |
| 111 | + | |
109 | 112 | | |
110 | 113 | | |
111 | 114 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
| 15 | + | |
15 | 16 | | |
16 | 17 | | |
17 | 18 | | |
| |||
125 | 126 | | |
126 | 127 | | |
127 | 128 | | |
| 129 | + | |
128 | 130 | | |
129 | 131 | | |
130 | 132 | | |
| |||
164 | 166 | | |
165 | 167 | | |
166 | 168 | | |
| 169 | + | |
167 | 170 | | |
168 | 171 | | |
169 | 172 | | |
| |||
0 commit comments