Description
When I run any of the binary layers ("LinearBinary", "LinearBinaryScaling"), I get the following error. I do not encounter it with any other layer using the same code. I am using the Colab notebook and followed the MASE installation steps given there.
Error:
IndexError Traceback (most recent call last)
in <cell line: 0>()
55
56 # Run the study
---> 57 study.optimize(objective, n_trials=20)
58
59 # Store raw results for this precision
35 frames
/content/mase/src/chop/nn/quantizers/utils.py in alpha(tensor)
156 def alpha(tensor): # determine batch means
157 absvalue = tensor.abs()
--> 158 alpha = absvalue.mean(dim=(1, 2, 3), keepdims=True)
159 return alpha.view(-1, 1)
160
IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
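A note on reading the message: PyTorch accepts dimension indices in the range [-ndim, ndim - 1], so "[-2, 1]" means the tensor reaching alpha() has only two dimensions, while dim=(1, 2, 3) assumes a 4-D tensor. A 2-D tensor would be consistent with a Linear weight of shape (out_features, in_features), though that is an assumption; the traceback does not show which tensor is passed. The sketch below reproduces the range check in plain Python and shows one possible rank-agnostic choice of reduction dims. The helper names are hypothetical and are not part of MASE or PyTorch, and this is not presented as the project's fix.

```python
# Hypothetical helpers for illustration only; not part of MASE or PyTorch.

def valid_dim_range(ndim):
    """Inclusive range of dim indices accepted for a tensor with `ndim` dims."""
    return (-ndim, ndim - 1)


def first_out_of_range(dims, ndim):
    """Return the first entry of `dims` outside the valid range, else None."""
    lo, hi = valid_dim_range(ndim)
    for d in dims:
        if not lo <= d <= hi:
            return d
    return None


def per_row_reduction_dims(ndim):
    """All dims except dim 0, so each row (or filter) gets its own mean.

    Assumes ndim >= 2.
    """
    return tuple(range(1, ndim))


# A 2-D tensor rejects dim=(1, 2, 3) at index 2, matching the error above.
print(valid_dim_range(2), first_out_of_range((1, 2, 3), 2))
# A 4-D tensor accepts the same dims.
print(first_out_of_range((1, 2, 3), 4))
# Rank-agnostic alternative: (1,) for 2-D input, (1, 2, 3) for 4-D input.
print(per_row_reduction_dims(2), per_row_reduction_dims(4))
```

Under that assumption, alpha() could compute absvalue.mean(dim=per_row_reduction_dims(tensor.dim()), keepdim=True), which leaves the 4-D behaviour unchanged and also handles 2-D input. Whether a per-row mean is the intended scaling for the binary linear layers is for the maintainers to confirm.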
Code:
import optuna
import torch
import matplotlib.pyplot as plt
from optuna.samplers import TPESampler

# Note: `objective` and the quantized layer classes looked up through
# globals() below are not defined in this snippet; they are expected to
# come from earlier notebook cells.

# Define all precision layers to test
precision_layers = [
    # "LinearInteger",
    # "LinearMinifloatDenorm",
    # "LinearMinifloatIEEE",
    # "LinearLog",
    # "LinearBlockFP",
    # "LinearBlockLog",
    "LinearBinary",
    "LinearBinaryScaling",
    "LinearBinaryResidualSign",
]

# Dictionary to store raw results for each precision
all_results = {}

# Loop through each precision layer and run Optuna
for precision in precision_layers:
    print()
    print(f"Running study for {precision}...")
    print()

    # Set the layer mapping for this run
    layer_mapping = {
        "torch.nn.Linear": torch.nn.Linear,
        precision: globals()[precision],  # Dynamically get the class
    }

    # Update search_space dynamically for the current precision
    search_space = {
        "linear_layer_choices": list(layer_mapping.keys()),
        "widths": [8, 16, 32],
        "frac_widths": [2, 4, 8],
        "exponent_widths": [3, 4, 5],
        "exponent_bias": [0, 1, 2],
        "exponent_bias_width": [1, 2, 3],
        "block_sizes": [[2], [4], [8]],
        "stochastic": [True, False],
        "bipolar": [True, False],
        "binary": [True, False],
    }

    # Create a new study
    sampler = TPESampler()
    study = optuna.create_study(
        direction="maximize",
        study_name=f"bert-tiny-nas-{precision}",
        sampler=sampler,
    )

    # Run the study
    study.optimize(objective, n_trials=20)

    # Store raw results for this precision
    all_results[precision] = [trial.value for trial in study.trials]

print(all_results)