-
Notifications
You must be signed in to change notification settings - Fork 0
Integer Word Conversions Explained
This note describes an approach to representing and optimizing the conversions between integers (and words) of different sizes in Standard ML.
Because different SML implementations can have different sizes
of integer and word types, the designers of the SML Basis Library
had to design a API for conversions that would work for any SML
implementation. The approach is based on the idea that one can
convert from IntN.int to IntM.int by first converting from
IntN.int to LargeInt.int and then converting from LargeInt.int
to IntM.int.[^1] While this approach works for source code, inside
a given compiler we want to have direct conversions.
[^1]: For more details about this design, see The History of Standard ML.
All integer/word conversion operations are expressed using five primitive conversion operators. Algebraic equations over these operators are easy to define and can be used to simplify composition of conversion operations.
The five basic conversion operators are as follows (in all cases,
we assume that
-
COPY($m,n$ ) — copy (i.e., zero-extend) an$m$ -bit value to an$n$ -bit value. The operationCOPY($n,n$ ) is the identity function on n-bit values. -
EXTEND($m,n$ ) — sign extend an$m$ -bit value to a$n$ -bit value. Note that it may be helpful to also distinguish between extending signed and unsigned values, since the former may be in 2's complement form. -
TRUNC($n,m$ ) — truncate an$n$ -bit value to an$m$ -bit value; i.e., discard the high-order$n-m$ bits. -
TEST($n,m$ ) — map an$n$ -bit, 2's complement signed value to an$m$ -bit, 2's complement signed value; this operation raisesOverflowif the input is outside the range$[-2^{m-1},2^{m-1}-1]$ . -
TESTU($n,m$ ) — map an unsigned$n$ -bit value to an$m$ -bit 2's complement value; this operation raiseOverflowif the value is is outside the range$[0,2^{m-1}-1]$ .
COPY and EXTEND are used to go from small values to large, while
TRUNC, TEST, and TESTU are are used to go from large values to
small. The operators COPY, EXTEND, and TRUNC are "pure," while
TEST and TESTU may raise Overflow. We use ∞ for IntInf.int).
(Note: the SML/NJ implementation has a second set of
primops -- TEST_INF, etc. -- for conversions involving IntInf.int)
Most conversions where the sizes are the same can be simplified to the identity:
COPY(n,n) => IDENTITY
TEST(n,n) => IDENTITY
EXTEND(n,n) => IDENTITY
TRUNC(n,n) => IDENTITY
Note, however, that this property does not hold for the TESTU operator, which
raises Overflow if the input is in the range
Assuming that LargeInt is arbitrary precision (i.e., IntInf), LargeWord
is 32 bits, and the default Int and Word types are 31 bits, then the translation
of conversion operations in the Word32 structure is given by:
toLargeInt => TESTU(32,∞)
toLargeIntX => EXTEND(32,∞)
fromLargeInt => TESTU(∞,32)
toInt => TESTU(32,31)
toIntX => TEST(32,31)
fromInt => EXTEND(31,32)
toLargeWord => COPY(32,32) => IDENTITY
toLargeWordX => EXTEND(32,32) => IDENTITY
fromLargeWord => TRUNC(32,32) => IDENTITY
And if LargeInt is Int32, then the operations in the Word8 structure are
toLargeInt => COPY(8,32)
toLargeIntX => EXTEND(8,32)
fromLargeInt => TRUNC(32,8)
toInt => COPY(8,31)
toIntX => EXTEND(8,31)
fromInt => TRUNC(31,8)
toLargeWord => COPY(8,32)
toLargeWordX => EXTEND(8,32)
fromLargeWord => TRUNC(32,8)
These operations allow for simplification via algebraic rewrites.
Each operator composed with itself is itself, but with different parameters.
In the following, we assume that
COPY(n,p) o COPY(m,n) == COPY(m,p)
EXTEND(n,p) o EXTEND(m,n) == EXTEND(m,p)
TRUNC(n,m) o TRUNC(p,n) == TRUNC(p,m)
TEST(n,m) o TEST(p,n) == TEST(p,m)
TESTU(n,m) o TESTU(p,n) == TESTU(p,m)
The composition of different operators is described by the following simple algebra:
EXTEND(n,p) o COPY(m,n) == COPY(m,p) if (n > m)
TRUNC(n,q) o COPY(m,n) == COPY(m,q) if (q >= m)
== TRUNC(m,q) if (q < m)
TEST(n,q) o COPY(m,n) == COPY(m,q) if (q >= m)
== TEST(m,q) if (q < m)
TESTU(n,q) o COPY(m,n) == COPY(m,q) if (q > m)
== TESTU(m,q) if (q <= m)
TRUNC(n,q) o EXTEND(m,n) == EXTEND(m,q) if (q > m)
== TRUNC(m,q) if (q <= m)
TEST(n,q) o EXTEND(m,n) == EXTEND(m,q) if (q >= m)
== TEST(m,q) if (q < m)
TESTU(n,q) o EXTEND(m,n) == TESTU(m,q) if (q <= m)
For example, consider:
Word.toInt o Word.fromLargeWord o Word8.toLargeWordAssuming a 31-bit default word type and 32-bit large words,
this code translates to:
TESTU(31,31) o TRUNC(32,31) o COPY(8,32)and simplifies to:
TESTU(31,31) o COPY(8,31)This expression further simplifies to:
COPY(8, 31)Since both 8-bit and 31-bit quantities are tagged the same way, this
can be translated to a MOVE. With a smart register allocator that MOVE
can be eliminated.