[BUG] Arrays with >= 2^31 elements fail materialization and some indexing on Metal #3327

@Aristide021

Description

I ran into this while trying to write an end-to-end regression for a Metal conv_general output-offset bug.

Reproducer

import mlx.core as mx

n = (2**31) // (64 * 64) + 2
x = mx.ones((n, 8, 8, 1), dtype=mx.float16)
w = mx.ones((1, 1, 1, 1), dtype=mx.float16)
y = mx.conv_general(x, w, input_dilation=(9, 9))
mx.eval(y)

Output shape: (524290, 64, 64, 1), total elements: 2147491840.
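For readers who want to check the output shape without allocating anything, here is a minimal sketch of the size arithmetic in plain Python. It assumes the reproducer uses the default stride of 1, no padding, and no kernel dilation; `dilated_output_size` is a hypothetical helper written for this note, not an MLX function.

```python
import math

def dilated_output_size(in_size, input_dilation, kernel=1, stride=1, padding=0):
    """Spatial output size of a convolution with input dilation (hypothetical helper)."""
    # Input dilation places (input_dilation - 1) zeros between neighbouring inputs.
    dilated = (in_size - 1) * input_dilation + 1
    # Standard convolution output-size formula applied to the dilated input.
    return (dilated + 2 * padding - kernel) // stride + 1

n = (2**31) // (64 * 64) + 2
side = dilated_output_size(8, 9)   # 8 inputs, dilation 9, 1x1 kernel
out_shape = (n, side, side, 1)
total = math.prod(out_shape)
print(out_shape, total)            # (524290, 64, 64, 1) 2147491840
```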

Expected

Either the array evaluates successfully, or MLX raises a clean size-limit error with the true requested size.

Actual

mx.eval(y) fails with a wrapped allocation size:

RuntimeError: [metal::malloc] Attempting to allocate 18446744069414600704 bytes which is greater than the maximum allowed buffer size of 86586540032 bytes.

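18446744069414600704 is 2^64 - 4294950912, and 4294950912 = 2 × 2147475456, i.e. the negative element count reported below times 2 bytes per float16. This is consistent with signed-to-unsigned overflow in size bookkeeping.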
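The wrapped byte count can be reproduced with pure integer arithmetic. The sketch below assumes the element count is first truncated to a signed 32-bit value, multiplied by the 2-byte float16 item size, and then reinterpreted as an unsigned 64-bit size; that sequence is an inference from the numbers, not a trace of the MLX code path. `wrap_int32` and `as_uint64` are hypothetical helpers.

```python
def wrap_int32(value):
    """Reinterpret an integer as a two's-complement signed 32-bit value."""
    return (value + 2**31) % 2**32 - 2**31

def as_uint64(value):
    """Reinterpret an integer as an unsigned 64-bit value."""
    return value % 2**64

true_elements = 524290 * 64 * 64               # 2147491840
wrapped_elements = wrap_int32(true_elements)   # -2147475456
wrapped_bytes = as_uint64(wrapped_elements * 2)  # float16 is 2 bytes per element
print(wrapped_elements, wrapped_bytes)
# -2147475456 18446744069414600704
```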

Related failures from the same y

mx.reshape(y, (-1,))          # ValueError: [reshape] Cannot reshape array of size 2147491840 into shape (-2147475456).
mx.take(y, mx.array([0], dtype=mx.uint32))  # ValueError: [gather] Slice sizes must be in [0, a.shape(i)]. Got (1) for array with shape (-2147475456).

Lazy slice y[-1] reports the correct shape (64, 64, 1), but mx.eval(y[-1]) triggers the same wrapped metal::malloc error.

The inferred shape -2147475456 is 2147491840 interpreted as a signed int32 (2147491840 - 2^32), which confirms that the overflow is in shape/size accounting, not just in the allocator.

Boundary

Verified on mlx 0.31.1 and local 0.31.2.dev20260325+84099a14b:

| output elements | reshape | eval | slice materialization |
| --- | --- | --- | --- |
| 2147479552 (2³¹ - 4096, n=524287) | ✓ | ✓ | ✓ |
| 2147483648 (exactly 2³¹, n=524288) | ✗ negative shape | ✗ wrapped alloc | ✗ wrapped alloc |
| 2147491840 (reproducer, n=524290) | ✗ negative shape | ✗ wrapped alloc | ✗ wrapped alloc |

The overflow triggers at exactly 2^31 elements. The last passing case, 2^31 - 4096, is the largest size below 2^31 that the (n, 64, 64, 1) shape grid can reach, since every element count on that grid is a multiple of 4096.
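The boundary cases can be checked against the signed 32-bit limit without touching MLX. This is only a sketch of the arithmetic, assuming the relevant limit is INT32_MAX = 2^31 - 1; `element_count` is a hypothetical helper for the (n, 64, 64, 1) grid.

```python
INT32_MAX = 2**31 - 1

def element_count(n):
    """Total elements of an (n, 64, 64, 1) array (hypothetical helper)."""
    return n * 64 * 64

results = {}
for n in (524287, 524288, 524290):
    count = element_count(n)
    # A count above INT32_MAX cannot be represented in a signed 32-bit integer.
    results[n] = (count, count <= INT32_MAX)
    print(n, count, "fits in int32" if count <= INT32_MAX else "overflows int32")
```

Only n=524287 fits; n=524288 lands on exactly 2^31, one past INT32_MAX.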

Root cause

The exact 2^31 boundary is consistent with ShapeElem = int32_t in mlx/array.h. The wrapped allocation and negative shapes appear to be downstream symptoms of shape/size overflow at the host level before Metal is involved. Additional affected paths likely exist.

Environment

  • Device: Apple M3 Max
  • Memory: 137438953472 bytes
  • Max buffer length: 86586540032 bytes
  • Reproduced three times in independent fresh Python processes with identical results
  • Reproduced on both installed wheel (mlx 0.31.1) and local main build
