An unlimited-width integer type

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

An unlimited-width integer type

Kristof Beyls via cfe-dev
Hi, I'm considering writing an LLVM backend to target an abstract machine, encoding to a bytecode ISA.

This abstract machine has integers, but not in the sense of having fixed-size machine words; rather, each abstract-machine register and memory location can losslessly hold onto a distinct (non-aliasing) unlimited bit-width value, and math/bitwise ops will operate losslessly on these values to produce new values of unlimited bit-width, without any need for overflow into neighbouring registers/memory locations.

A VM implementing this abstract machine would, of course, use some kind of bignum library to implement these semantics, with each such integer value actually being a pointer to a VM-heap-allocated bignum; or, perhaps, the VM would do load-time optimization to strength-reduce some of the bignums to regular host-CPU integer types.

But, from the perspective of a compiler targeting the abstract machine itself, the ISA just "has" registers and ops of an 'iUNLIMITED' type.

Does LLVM, and specifically the LLVM IR type system, support anything like this? If not, would it be the work of days, or of months, to make it happen?

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: An unlimited-width integer type

Kristof Beyls via cfe-dev


On 11/7/19 11:41 AM, Levi Aul via cfe-dev wrote:
Hi, I'm considering writing an LLVM backend to target an abstract machine, encoding to a bytecode ISA.

This abstract machine has integers, but not in the sense of having fixed-size machine words; rather, each abstract-machine register and memory location can losslessly hold onto a distinct (non-aliasing) unlimited bit-width value, and math/bitwise ops will operate losslessly on these values to produce new values of unlimited bit-width, without any need for overflow into neighbouring registers/memory locations.

A VM implementing this abstract machine would, of course, use some kind of bignum library to implement these semantics, with each such integer value actually being a pointer to a VM-heap-allocated bignum; or, perhaps, the VM would do load-time optimization to strength-reduce some of the bignums to regular host-CPU integer types.

But, from the perspective of a compiler targeting the abstract machine itself, the ISA just "has" registers and ops of an 'iUNLIMITED' type.

Does LLVM, and specifically the LLVM IR type system, support anything like this? If not, would it be the work of days, or of months, to make it happen?


The real question is: what do you expect LLVM to do with these integers? Do you expect LLVM to optimize operations on them? Perform constant folding? If you represent them as pointers that you happen to pass to particular functions/intrinsics representing operations on them, then getting things working will be easy. Depending on the attributes you put on the functions/intrinsics, you can even get some amount of CSE. If you want more, it will likely be more work.

 -Hal



_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: An unlimited-width integer type

Kristof Beyls via cfe-dev
In reply to this post by Kristof Beyls via cfe-dev
Hi,

This is roughly the model that Smalltalk provides for integers, and I
did implement a Smalltalk compiler using LLVM some time ago...

A few thoughts come to mind:

1. C and related languages encode fixed-width types in their abstract
machine (in C, the width of types is baked by the time the preprocessor
has run for a lot of code, which is why the EFI bytecode design was such
a bad idea).  Your VM wouldn't be useable from any existing LLVM front
ends without some significant changes, which is not necessarily a reason
*not* to do this, but may limit the utility.

2. LLVM has fixed-width types and intrinsics for checking overflow.  If
you expect that *most* of the operations in your source language(s) will
fit into an LLVM fixed-width integer type that corresponds to a machine
register, then you may find that LLVM does quite a good job of
optimising your fast paths (though leaves your slow paths untouched) if
you implement a compiler for your abstract machine *to* LLVM.  Note that
deferring overflow checks for as long as possible and failing back to
the fast path to redo calculations will generate better code for this.

3. If you're willing to do some extra work, you could try introducing an
iUnlimited type and see how invasive the change is.  There are a number
of places where optimisations query bit widths and other places where
APInt and friends encode a fixed width for a particular compile-time
evaluation.  There are probably other places where known-bits analyses
could allow you to transform iUnlimited to a fixed-width type.  You
could then provide a default lowering that mapped them to pointers and
calls to a bignum library, so existing back ends could use them.  I can
imagine that this would be useful to some front ends that would prefer
to defer lowering of variable-width integers until after some
optimisations.  I can't guarantee that this would be something that
upstream would accept, but I'd love to see someone try the experiment.

4. You may find that doing this in MLIR works better.  MLIR has an open
type system, so you can add a new type for your integers quite easily,
though then you won't benefit from any LLVM optimisations.  It's not
clear that you could necessarily use the generic LLVM backend
infrastructure though, so a direct translation from MLIR to your
bytecode may preferable.

David

On 07/11/2019 17:41, Levi Aul via cfe-dev wrote:

> Hi, I'm considering writing an LLVM backend to target an abstract
> machine, encoding to a bytecode ISA.
>
> This abstract machine has integers, but not in the sense of having
> fixed-size machine words; rather, each abstract-machine register and
> memory location can losslessly hold onto a distinct (non-aliasing)
> unlimited bit-width value, and math/bitwise ops will operate losslessly
> on these values to produce new values of unlimited bit-width, without
> any need for overflow into neighbouring registers/memory locations.
>
> A VM implementing this abstract machine would, of course, use some kind
> of bignum library to implement these semantics, with each such integer
> value actually being a pointer to a VM-heap-allocated bignum; or,
> perhaps, the VM would do load-time optimization to strength-reduce some
> of the bignums to regular host-CPU integer types.
>
> But, from the perspective of a compiler targeting the abstract machine
> itself, the ISA just "has" registers and ops of an 'iUNLIMITED' type.
>
> Does LLVM, and specifically the LLVM IR type system, support anything
> like this? If not, would it be the work of days, or of months, to make
> it happen?
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: An unlimited-width integer type

Kristof Beyls via cfe-dev
In reply to this post by Kristof Beyls via cfe-dev

Fair warning, I'm answering a bit outside my comfort zone, so some of this might be wrong...

If you're wanting to have an LLVM backend that targets this abstract machine, then lowering fixed precision integers into infinite precision integers is fairly straight forward.  You just need to manually implement the twos complement wrapping behaviors in the integer space w/branching.  That probably won't be the fastest thing in the world, but if you're targeting anything other than an mature VM, you shouldn't care. 

Your challenge will be representing the infinite integer type in the backend.  You shouldn't need to change IR at all. 

The simplest, and slowest, approach would be to custom lower to a set of intrinsic calls.  If you type your "registers" as oddly sized pointers into a special address space, you can probably get that working.  Your real challenge will be representing your "integer store" memory since it is by definition not byte addressable.  I don't really have any good suggestions there.  I'd strongly suggest figuring this piece out before moving on to anything else.

The harder approach to the lowering would be introducing an integer type in into the backend.  I have no input on difficulty of that, other than to say it's probably not worth doing unless you're pushing performance fairly hard.  (I doubt you are.)

Philip

On 11/7/19 9:41 AM, Levi Aul via cfe-dev wrote:
Hi, I'm considering writing an LLVM backend to target an abstract machine, encoding to a bytecode ISA.

This abstract machine has integers, but not in the sense of having fixed-size machine words; rather, each abstract-machine register and memory location can losslessly hold onto a distinct (non-aliasing) unlimited bit-width value, and math/bitwise ops will operate losslessly on these values to produce new values of unlimited bit-width, without any need for overflow into neighbouring registers/memory locations.

A VM implementing this abstract machine would, of course, use some kind of bignum library to implement these semantics, with each such integer value actually being a pointer to a VM-heap-allocated bignum; or, perhaps, the VM would do load-time optimization to strength-reduce some of the bignums to regular host-CPU integer types.

But, from the perspective of a compiler targeting the abstract machine itself, the ISA just "has" registers and ops of an 'iUNLIMITED' type.

Does LLVM, and specifically the LLVM IR type system, support anything like this? If not, would it be the work of days, or of months, to make it happen?

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev