P2: model_quantized.onnx filename not recognized as quantized — wrong param estimate #167

@Shreyas582

Description

The model parameter estimation logic uses filename patterns to detect quantization (e.g., q4, int8, q4f16). However, models named model_quantized.onnx (a common HuggingFace Optimum naming convention) do not match the quantization pattern, causing the parameter count to be estimated from raw file size without the quantization divisor.

This leads to incorrect capability tier classification — a quantized 0.5B model may be estimated at 1.4B parameters and still classified as Basic, or a quantized 3B model might be estimated at ~10B and classified as Strong.
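The failure mode can be sketched as a substring check over a fixed pattern list. The function name and pattern set below are illustrative assumptions, not the actual wraithrun implementation:

```rust
/// Hypothetical sketch of filename-based quantization detection:
/// quantization is inferred from filename substrings, so a quantized
/// model whose name lacks one of the listed markers falls through to
/// the unquantized size-based parameter estimate.
fn looks_quantized(filename: &str) -> bool {
    let name = filename.to_ascii_lowercase();
    // Patterns the estimator is assumed to check today (illustrative).
    ["q4f16", "q4", "int8"].iter().any(|p| name.contains(p))
}

fn main() {
    // HuggingFace Optimum's default export name is not matched:
    assert!(!looks_quantized("model_quantized.onnx"));
    // An explicitly suffixed export is:
    assert!(looks_quantized("model_q4.onnx"));
    println!("ok");
}
```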

Reproduction

wraithrun --live \
  --model C:\Models\Qwen2.5-0.5B-Instruct-ONNX\onnx\model_quantized.onnx \
  --tokenizer C:\Models\Qwen2.5-0.5B-Instruct-ONNX \
  --task ssh-keys
# Log shows: "Estimated 1.4B params" (should be ~0.5B)

Expected Behavior

Filename patterns should also match:

  • quantized / model_quantized
  • quant / model_quant
  • Or better: read ONNX metadata (if available) for actual quantization info
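A minimal patch along the lines suggested above: adding the single substring `quant` covers `quant`, `model_quant`, `quantized`, and `model_quantized`, since `quant` is a prefix of `quantized`. Function name and existing patterns are hypothetical:

```rust
/// Hypothetical fixed detector: one extra pattern, "quant", catches the
/// HuggingFace Optimum naming convention ("model_quantized.onnx") as well
/// as the shorter "model_quant.onnx" variant.
fn looks_quantized(filename: &str) -> bool {
    let name = filename.to_ascii_lowercase();
    ["q4f16", "q4", "int8", "quant"].iter().any(|p| name.contains(p))
}

fn main() {
    assert!(looks_quantized("model_quantized.onnx")); // Optimum default now matches
    assert!(looks_quantized("model_quant.onnx"));
    assert!(!looks_quantized("model.onnx")); // unquantized export still falls through
    println!("ok");
}
```

Note that a generic `quant` match only says the model is quantized, not at what bit width, so the estimator would still need a conservative default divisor, or the ONNX metadata route above, to pick the right one.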

Affected Files

  • inference_bridge/src/lib.rs or inference_bridge/src/onnx_vitis.rs (filename-based param estimation)

Metadata

Labels

  • area:inference (Inference engine, model loading, execution providers)
  • bug (Something isn't working)
  • live-testing-audit (From v1.6.0 live-mode comprehensive testing)
  • priority:p2 (Normal-priority issue)
