llama-quantize crashes with ios_base::failbit when using --tensor-type-file or --tensor-type (regression from #20503) #68
Description
Describe the bug
After PR #20503, llama-quantize (and any tool using --tensor-type-file / --tensor-type) spams the following message for many models:
str: cannot properly format tensor name position_embd with suffix=weight bid=-1 xid=-1
str: cannot properly format tensor name token_types with suffix=weight bid=-1 xid=-1
This spam eventually causes:
llama_model_quantize: failed to quantize: ios_base::failbit set: iostream stream error
main: failed to quantize model from '...'
The crash happens reliably on Qwen3.5 models (and likely other architectures that have non-standard tensors like position_embd.weight, token_types.weight, token_embd.weight etc.).
Steps to reproduce
- Use any recent build of llama.cpp after PR #20503 (including llama-cpp-turboquant).
- Create a tensor-type config (example for Qwen3.5-4B with 32 layers):
position_embd.weight=Q8_0
token_types.weight=Q8_0
token_embd.weight=Q8_0
output.weight=Q8_0
output_norm.weight=Q8_0
# Middle layers (example)
blk.2.attn_q.weight=tq4_1s
... (all the way to blk.29)
- Run:
llama-quantize --allow-requantize \
--tensor-type-file config_i.txt \
Qwen3.5-4B-Q8_0.gguf \
Qwen3.5-4B-TQ4_1S.gguf \
Q8_0
→ The output is flooded with "cannot properly format tensor name" messages → crashes with ios_base::failbit.
Even using multiple --tensor-type "blk.*.weight=..." arguments produces the same spam and crash.
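For convenience, the per-layer entries in the config above can be generated with a short loop (the file name config_i.txt and the layer range 2..29 are taken from the steps above; the loop itself is just illustrative):

```shell
# Generate a tensor-type config matching the example above
# (non-layer tensors at Q8_0, middle layers 2..29 at tq4_1s).
{
  echo "position_embd.weight=Q8_0"
  echo "token_types.weight=Q8_0"
  echo "token_embd.weight=Q8_0"
  echo "output.weight=Q8_0"
  echo "output_norm.weight=Q8_0"
  for i in $(seq 2 29); do
    echo "blk.$i.attn_q.weight=tq4_1s"
  done
} > config_i.txt
```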
Expected behavior
- Custom tensor types should be applied silently.
- Non-critical tensors (bid=-1) should not spam logs or cause iostream failure.
- Quantization should complete successfully.
Actual behavior
Massive spam + hard crash before any output file is written.
Environment
- Model: Qwen3.5-4B (and probably any Qwen 3.x)
- llama.cpp version: post #20503 (tested on latest master + llama-cpp-turboquant)
- Command:
llama-quantize with --tensor-type-file or --tensor-type
- OS: Windows / Linux (reproduced on both)
Related issues
- This is the exact same regression described in #21115 ("Eval bug: regression introduced in #20503")
- The problem is much more severe during quantization because it aborts the entire process.
Workaround (temporary)
Downgrade to a commit before #20503. Note that git checkout does not accept a bare date, so resolve one first, e.g. git checkout $(git rev-list -1 --before="2026-03-20" master).
It would be great if the tensor name formatter could gracefully skip or handle tensors with bid=-1 / xid=-1 without spamming and without breaking the iostream.