Hi Florian (+cfe-dev for visibility),
I was playing a little with the Clang matrix language extension and wanted to check with you whether I am missing something about matrix type conversion. The draft spec says:
"A value of matrix type can be converted to another matrix type if the number of rows and columns are the same and the value’s elements can be converted to the element type of the result type. "
I have tried a few different variants of this:
typedef char m2x8_t __attribute__((matrix_type(2, 8)));
typedef char m8x2_t __attribute__((matrix_type(8, 2)));
typedef char m2x2_char_t __attribute__((matrix_type(2, 2)));
typedef int m2x2_int_t __attribute__((matrix_type(2, 2)));
m2x2_int_t f(m2x8_t a, m8x2_t b) {
return static_cast<m2x2_int_t>(a * b);
}
but I am getting errors that this conversion is not allowed. Unless I am doing something very silly here, I am guessing this is because the matrix extension is still a work in progress?
The draft spec also says that implicit conversions don't apply, but perhaps they would be convenient? I haven't yet given much thought to whether that could be problematic.
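For readers following along, the conversion rule quoted from the draft spec can be pictured as an element-wise cast between matrices of identical shape. The following is only a portable C++ model of that rule, not Clang's matrix extension: the `Mat` alias and the `convert_elements` helper are hypothetical names introduced for illustration.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a matrix type: R rows, C columns, element type T.
// This is plain C++, not the Clang matrix_type attribute.
template <typename T, std::size_t R, std::size_t C>
using Mat = std::array<std::array<T, C>, R>;

// Hypothetical helper modelling the spec's rule: the shape (R x C) is the
// same on both sides, and each element is converted to the new element type.
template <typename To, typename From, std::size_t R, std::size_t C>
Mat<To, R, C> convert_elements(const Mat<From, R, C>& m) {
  Mat<To, R, C> out{};
  for (std::size_t i = 0; i < R; ++i)
    for (std::size_t j = 0; j < C; ++j)
      out[i][j] = static_cast<To>(m[i][j]);
  return out;
}
```

In this model, a call such as `convert_elements<int>(m)` on a 2x2 matrix of `signed char` yields a 2x2 matrix of `int`; a mismatch in rows or columns cannot be expressed, because the shape is carried in the type.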
Moving on a bit to lowering this to the matrix multiply intrinsics: I think it would be convenient if the matrix multiply could accumulate in a wider type (because that's what some instructions do). While there are probably different approaches possible,
the LLVM intrinsic has the vector type for the return value and its arguments:
vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, ...)
So perhaps we can relax this?
Cheers,
Sjoerd.
_______________________________________________ cfe-dev mailing list [hidden email] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev |
Hi,
Yes, we can certainly extend this to allow use cases to map to hardware instructions that implement an extension step, like AArch64’s udot. IIRC it extends the sums, which I think would make the most sense to use, as otherwise it should be sufficient to extend the operands/result.
I think the more interesting question here would be how this fits into the C/C++ spec. I guess it would be possible to specify it so that a multiply that gets extended lowers to the widening intrinsic, but this would seem quite surprising/awkward. In my opinion, a separate builtin would be a cleaner solution.
Cheers,
Florian
Hi,
> This should work according to the spec, but the conversion has not been implemented yet I think. I’ve created https://bugs.llvm.org/show_bug.cgi?id=47141 and linked it to https://bugs.llvm.org/show_bug.cgi?id=46163 which
should act as an umbrella issue to track the missing pieces.
Ah, I was unaware of that umbrella ticket. Thanks for that, and for raising the ticket.
> I think currently we match the behavior for vector types and only convert scalar operands for binary operators implicitly to matrices. If there’s a strong need for implicit conversions, this is certainly something that can be revisited.
Not sure if there's a strong need, but from writing my first examples yesterday, I can see that it would be convenient and possibly cleaner too (i.e. less text/clutter). I am not sure about this one, but it's also what people would
expect perhaps?
> Yes we can certainly extend this, to allow use cases to map to hardware instructions that implement an extension step, like AArch64’s udot. IIRC it extends the sums, which I think would make the most sense to use, as otherwise it should be sufficient to extend
the operands/result.
Yes, or the v8.6 matrix multiply accumulate instructions, which multiply 8-bit values and produce 32-bit results.
As I said, I haven't given this too much thought yet, so just for my understanding, what exactly is the surprising/awkward bit of the C/C++ spec here? I was guessing that the assignment of a result from a matrix operation, using an implicit/explicit conversion,
would take care of this?
Cheers,
Sjoerd.
From: Florian Hahn <[hidden email]>
Sent: 12 August 2020 18:22
To: Sjoerd Meijer <[hidden email]>
Cc: [hidden email] Developers <[hidden email]>
Subject: Re: matrix type conversion
Hi,
Yes we can certainly extend this, to allow use cases to map to hardware instructions that implement an extension step, like AArch64’s udot. IIRC it extends the sums, which I think would make the most sense to use, as otherwise it should be sufficient to
extend the operands/result.
I think the more interesting question here would be how this fits into the C/C++ spec. I guess it would be possible to specify it so a multiply that gets extended lowers to the widening intrinsic, but this would seem quite surprising/awkward. In my opinion,
a separate builtin would be a cleaner solution.
Cheers,
Florian
Oh right, I just had a look at those. It seems like the matrix multiply accumulate instructions widen the result of the matrix multiplication. I don’t think we need any changes to the intrinsic to model that: we should be able to model this by just extending the result vector of the matrix multiplication, and the extension instructions would be generated naturally from an implicit/explicit conversion to a matrix with a wider element type.
What I was referring to in the statement below concerned instructions where the results of the intermediate multiplications get widened and are then accumulated using the wider type. To model that, I think we would need a ‘widening’ version of the matrix multiply intrinsic. Mapping this extension ‘in the middle’ to an implicit/explicit conversion of the final result would be confusing/surprising IMO. But I might be missing something.
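To make the two cases concrete, here is a small portable C++ sketch, assuming unsigned 8-bit inputs and a 32-bit wide type. It does not use the Clang matrix extension or the LLVM intrinsic; the `Mat` alias and both function names are hypothetical and exist only to contrast "extend the result" with "extend in the middle".

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a matrix type (plain C++, row-by-row storage).
template <typename T, std::size_t R, std::size_t C>
using Mat = std::array<std::array<T, C>, R>;

// Case 1: multiply and accumulate in the narrow element type, then widen
// only the final result. Products and sums wrap modulo 256.
template <std::size_t R, std::size_t K, std::size_t C>
Mat<std::uint32_t, R, C> multiply_then_widen(const Mat<std::uint8_t, R, K>& a,
                                             const Mat<std::uint8_t, K, C>& b) {
  Mat<std::uint32_t, R, C> out{};
  for (std::size_t i = 0; i < R; ++i)
    for (std::size_t j = 0; j < C; ++j) {
      std::uint8_t acc = 0;
      for (std::size_t k = 0; k < K; ++k) {
        // Truncate the product back to 8 bits to model a narrow multiply.
        std::uint8_t prod = static_cast<std::uint8_t>(a[i][k] * b[k][j]);
        acc = static_cast<std::uint8_t>(acc + prod);
      }
      out[i][j] = acc;  // the widening happens only here, on the result
    }
  return out;
}

// Case 2: widen each intermediate product and accumulate in the wide type.
template <std::size_t R, std::size_t K, std::size_t C>
Mat<std::uint32_t, R, C> widening_multiply(const Mat<std::uint8_t, R, K>& a,
                                           const Mat<std::uint8_t, K, C>& b) {
  Mat<std::uint32_t, R, C> out{};
  for (std::size_t i = 0; i < R; ++i)
    for (std::size_t j = 0; j < C; ++j) {
      std::uint32_t acc = 0;
      for (std::size_t k = 0; k < K; ++k)
        acc += static_cast<std::uint32_t>(a[i][k]) *
               static_cast<std::uint32_t>(b[k][j]);
      out[i][j] = acc;
    }
  return out;
}
```

The two functions agree whenever every product and partial sum fits in 8 bits, and diverge as soon as one does not, which is why a conversion applied to the final result cannot stand in for widening in the middle.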
Cheers,
Florian