C++ Annex K safe C11 functions

C++ Annex K safe C11 functions

David Blaikie via cfe-dev
Hello

This file lists part of Annex K  "stdint.h"
https://clang.llvm.org/doxygen/stdint_8h_source.html

But main C++ page doesn't mention Annex K. Is Annex K really fully supported?

Some background
https://clang.llvm.org/compatibility.html
https://clang.llvm.org/cxx_status.html


C11 standard, ISO/IEC 9899:2011  added Annex K safe functions like strncpy_s

__STDC_LIB_EXT1__

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1967.htm

Jonny

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev
On Thu, 3 Jan 2019 at 13:44, Jonny Grant via cfe-dev <[hidden email]> wrote:
> Hello
>
> This file lists part of Annex K  "stdint.h"
> https://clang.llvm.org/doxygen/stdint_8h_source.html
>
> But main C++ page doesn't mention Annex K. Is Annex K really fully supported?

That's generally not up to us; that's part of the C standard library, not part of the compiler.

The one part of Annex K that *is* part of the compiler, according to the usual division of responsibilities, wherein the compiler provides the freestanding headers and the C standard library provides the rest, is the definition of rsize_t in <stddef.h> and the definition of RSIZE_MAX in <stdint.h>, and Clang provides those if __STDC_WANT_LIB_EXT1__ is defined. However, we do not define __STDC_LIB_EXT1__ because, as noted, that's not up to us, and we have no idea what your C standard library supports.

So in that sense, we implement the part of Annex K that is in our domain.
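
A minimal sketch of how user code can probe this split at compile time. The function names are hypothetical; only the macros __STDC_WANT_LIB_EXT1__, __STDC_LIB_EXT1__ and RSIZE_MAX come from the standard. Whether RSIZE_MAX shows up in a hosted build depends on which <stdint.h> ends up being used, so both checks are guarded rather than assumed.

```c
/* Request the Annex K declarations; this has to come before the first
   standard header is included. */
#define __STDC_WANT_LIB_EXT1__ 1
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 1 if the C library advertises Annex K, else 0.
   The compiler does not define __STDC_LIB_EXT1__; only the libc can. */
int sketch_libc_has_annex_k(void)
{
#ifdef __STDC_LIB_EXT1__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if the headers in use define RSIZE_MAX, else 0. */
int sketch_headers_define_rsize_max(void)
{
#ifdef RSIZE_MAX
    return 1;
#else
    return 0;
#endif
}
```

On a glibc system the first helper would be expected to return 0, since glibc does not implement Annex K.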


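> Some background
> https://clang.llvm.org/compatibility.html
> https://clang.llvm.org/cxx_status.html

I'm not sure what these are supposed to show: Annex K is optional in C, and not part of C++.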


Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev
Thank you for your reply Richard.

On 03/01/2019 22:04, Richard Smith wrote:

> On Thu, 3 Jan 2019 at 13:44, Jonny Grant via cfe-dev
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     Hello
>
>     This file lists part of Annex K  "stdint.h"
>     https://clang.llvm.org/doxygen/stdint_8h_source.html
>
>     But main C++ page doesn't mention Annex K. Is Annex K really fully
>     supported?
>
>
> That's generally not up to us; that's part of the C standard library,
> not part of the compiler.
>
> The one part of Annex K that *is* part of the compiler, according to the
> usual division of responsibilities, wherein the compiler provides the
> freestanding headers and the C standard library provides the rest, is
> the definition of rsize_t in <stddef.h> and the definition of RSIZE_MAX
> in <stdint.h>, and Clang provides those if __STDC_WANT_LIB_EXT1__ is
> defined. However, we do not define __STDC_LIB_EXT1__ because, as noted,
> that's not up to us, and we have no idea what your C standard library
> supports.

I use glibc, which doesn't support Annex K. We are keen to use Annex K
functionality, so I'm looking around for options.

Do you know if Clang has any plans to support a C11 libc with
Annex K?


I'm looking around, and came across this project
https://github.com/rurban/safeclib/blob/master/README


> So in that sense, we implement the part of Annex K that is in our domain.
>
>     Some background
>     https://clang.llvm.org/compatibility.html
>     https://clang.llvm.org/cxx_status.html
>
>
> I'm not sure what these are supposed to show: Annex K is optional in C,
> and not part of C++.

It would be good if what you state could be added to the compatibility
page, that Annex K is supported only for stdint.h, but that clang
requires a libc which supports C11 Annex K functions/implementation.

Cheers, Jonny



Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev
I think clang could offer builtins which provide some of the Annex K building blocks, and let libc implementations provide the rest (using the clang builtins when available).

I’m interested in implementing these builtins, unless someone beats me to it. User code often uses Annex K to provide guarantees that are now redundant with trivial automatic variable initialization (llvm.org/rL349442), and I’d like to reduce the hit they’re taking. Here are some notes I wrote for myself a little while ago:

I’ll focus on memset, but this applies to other Annex K functionality (memcpy_s, memmove_s, strcpy_s, strncpy_s, strcat_s, strncat_s, strtok_s, memset_s, strerror_s, strerrorlen_s, strnlen_s).

These functions simply perform extra checks before calling their regular equivalent, with an extra provision that the operation can’t be as-if’d away (i.e. you have to do the entire memset).

Often, custom memset_s implementations are simple loops (cast to char*, and set each byte to 0), compiled in a different TU, which, amusingly, through LTO and inlining, would totally not obey the “no as-if” rule. Other times they’re implemented in opaque assembly.

Clang doesn’t know about this function, and assumes it’s just another function call. We should tell the compiler about what these functions do so that it knows that stores prior to memset_s are dead, memset_s can’t be removed, what the extra memset_s checks are, and so that we can forward values from memset_s. We can then generate better code (small loops become stores with a loop, memset_s followed by stores get merged, etc).
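
For illustration, a hand-rolled memset_s of the kind described above might look roughly like the sketch below. The names sketch_memset_s and SKETCH_RSIZE_MAX are hypothetical, the constraint handler call is left out, and the volatile-qualified pointer is only the usual practical way to keep the stores from being removed, not something the thread prescribes.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RSIZE_MAX, which many C libraries do not define. */
#define SKETCH_RSIZE_MAX (SIZE_MAX >> 1)

/* Hypothetical name; follows the parameter order of C11 memset_s.
   Returns 0 on success and a nonzero value on a constraint violation. */
int sketch_memset_s(void *s, size_t smax, int c, size_t n)
{
    if (s == NULL || smax > SKETCH_RSIZE_MAX)
        return EINVAL;              /* nothing can safely be written */

    /* Storing through a volatile-qualified pointer is the usual way to stop
       the optimizer from treating the stores as dead. */
    volatile unsigned char *p = s;

    if (n > SKETCH_RSIZE_MAX || n > smax) {
        /* Violation: fill the whole object, then report the error. */
        for (size_t i = 0; i < smax; ++i)
            p[i] = (unsigned char)c;
        return EINVAL;
    }

    for (size_t i = 0; i < n; ++i)
        p[i] = (unsigned char)c;
    return 0;
}
```

To the compiler this is an ordinary function, which is the point made above: none of the checks or the "must not be removed" property is visible at the call site.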

A few options:

1. Make them a builtin, have libc implementations forward to the builtin.
2. Teach clang / LLVM about these functions’ semantics (i.e. if the conditions are met, they behave the same as memset).
3. Add an attribute which teaches clang about memset-like functions, and use it in libc implementations.
4. Use LTO between the projects and libc implementations, allowing clang to peek into memset_s’s implementation.

I think 1. is the best approach.
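
A rough sketch of what option 1 could look like from the libc side. The builtin name __builtin_memset_s and its signature are assumptions made purely for illustration, since no such builtin is known to exist; the __has_builtin guard means the fallback path is what compiles today. The fallback uses the common trick of calling memset through a volatile function pointer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef __has_builtin
#define __has_builtin(x) 0   /* compilers without the feature-test macro */
#endif

/* Calling memset through a volatile function pointer is a common way to
   keep the call from being optimized away in the fallback path. */
static void *(*const volatile forced_memset)(void *, int, size_t) = memset;

/* Hypothetical libc entry point; a real libc would name it memset_s. */
int sketch_libc_memset_s(void *s, size_t smax, int c, size_t n)
{
#if __has_builtin(__builtin_memset_s)   /* hypothetical builtin */
    return __builtin_memset_s(s, smax, c, n);
#else
    if (s == NULL || smax > (SIZE_MAX >> 1))
        return EINVAL;
    /* On a violation (n > smax), fill the whole object and report it. */
    forced_memset(s, c, n <= smax ? n : smax);
    return n <= smax ? 0 : EINVAL;
#endif
}
```

With this shape, the libc keeps ownership of the public symbol while the compiler, where it offers the builtin, sees the semantics directly.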




Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev

Providing the primitives for memset_s would be great.  It is a genuinely useful part of Annex K.  I will note that the majority of Annex K is of little practical use though, due to the global (not thread local) customizable constraint handler.  set_constraint_handler_s and friends don’t play nice when you don’t control the entire codebase / process, especially in multithreaded situations.
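
The handler problem can be sketched with a small mock. Everything below is hypothetical and only mirrors the shape of set_constraint_handler_s (one process-wide slot, the setter returns the previous handler); it does not use a real Annex K implementation, which glibc does not provide.

```c
#include <stddef.h>

/* Hypothetical mirror of Annex K's constraint_handler_t. */
typedef void (*sketch_handler_t)(const char *msg, void *ptr, int error);

/* One handler slot for the whole process, as in Annex K. */
static sketch_handler_t current_handler;

/* Hypothetical mirror of set_constraint_handler_s: installs a handler and
   returns the one it replaced. */
sketch_handler_t sketch_set_handler(sketch_handler_t handler)
{
    sketch_handler_t previous = current_handler;
    current_handler = handler;
    return previous;
}

/* What a checked function would do on a runtime-constraint violation. */
void sketch_report_violation(const char *msg)
{
    if (current_handler != NULL)
        current_handler(msg, NULL, 1);
}

int app_calls;      /* violations seen by the application's handler */
int plugin_calls;   /* violations seen by a third-party library's handler */

void app_handler(const char *msg, void *ptr, int error)
{
    (void)msg; (void)ptr; (void)error;
    ++app_calls;
}

void plugin_handler(const char *msg, void *ptr, int error)
{
    (void)msg; (void)ptr; (void)error;
    ++plugin_calls;
}

/* The application installs its handler, a library silently replaces it, and
   the application's next violation is routed to the library's handler. */
void sketch_demo(void)
{
    sketch_set_handler(app_handler);
    sketch_set_handler(plugin_handler);
    sketch_report_violation("memset_s: n exceeds smax");
}
```

After sketch_demo() runs, the application's handler has never been called. With several threads the situation is worse, because the single slot can change between a call and its violation.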

 


Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev
Hi! Sounds great.
How about setting uninitialised variables to 0xdeadbeef or 0xabbaabba so it's easily identifiable when they crop up in use?
As I recall, we used to clear buffers to 0x11111111 and the stack to 0x22222222.

The URL should be llvm.org/r349442  BTW

Jonny


Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev


On Jan 4, 2019, at 1:47 PM, Jonny Grant <[hidden email]> wrote:

> Hi! Sounds great
> How about setting uninitialised variables to 0xdeadbeef or 0xabbaabba so it's easily identifiable when they crop up in use?
> We used to clear buffers to 0x11111111 and stack to 0x22222222 I recall

This isn’t relevant to the Annex K discussion; let’s keep this thread focused. We discussed initialization values in the original thread as well as in the code review. It’s worth reading through those to see why I chose the values I did (mainly so that pointers are invalid, and because repeated byte values give better code generation).



Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev


On 04/01/2019 21:50, JF Bastien wrote:

>
>
>> On Jan 4, 2019, at 1:47 PM, Jonny Grant <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>> Hi! Sounds great
>> How about setting uninitialised variables to 0xdeadbeef or 0xabbaabba
>> so its easily identifiable when they crop up in use?
>> We used to clear buffers to 0x11111111 and stack to 0x22222222 I recall
>
> This isn’t relevant to the Annex K discussion, let’s keep this thread
> focused. We discussed initialization values in the original thread as
> well as the code review, it’s worth reading through that to see why I
> chose the values I did (mainly so pointers are invalid, and they’re
> repeated byte-values so the code generation is better).

OK.

BTW, I don't think Clang has its own libc, does it? Could the rest of
Annex K be added in a libc?

Or, at a push, in Clang's libc++ (I know it is meant to be C11 standard,
not C++, but C++ usually includes C functions anyway).

Jonny

Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev


> On Jan 5, 2019, at 11:35 PM, Jonny Grant <[hidden email]> wrote:
>
>
>
> On 04/01/2019 21:50, JF Bastien wrote:
>>> On Jan 4, 2019, at 1:47 PM, Jonny Grant <[hidden email] <mailto:[hidden email]>> wrote:
>>>
>>> Hi! Sounds great
>>> How about setting uninitialised variables to 0xdeadbeef or 0xabbaabba so its easily identifiable when they crop up in use?
>>> We used to clear buffers to 0x11111111 and stack to 0x22222222 I recall
>> This isn’t relevant to the Annex K discussion, let’s keep this thread focused. We discussed initialization values in the original thread as well as the code review, it’s worth reading through that to see why I chose the values I did (mainly so pointers are invalid, and they’re repeated byte-values so the code generation is better).
>
> OK.
>
> BTW, I don't think Clang has its own libc does it?

Correct, clang doesn’t have its own libc. It does have some interception headers for some things, and plenty of builtins.


> Could the other Annex K be added in a libc.

What do you mean?


> Or at a push, in Clang's libc++ (I know it is meant to be C11 standard, not C++, but C++ usually includes C functions anyway)

I’d rather not. I want to implement builtins, and let libc implementations use them as they wish.


Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev
On Sat, 5 Jan 2019 at 23:35, Jonny Grant via cfe-dev <[hidden email]> wrote:
> On 04/01/2019 21:50, JF Bastien wrote:
>>> On Jan 4, 2019, at 1:47 PM, Jonny Grant <[hidden email]
>>> <mailto:[hidden email]>> wrote:
>>>
>>> Hi! Sounds great
>>> How about setting uninitialised variables to 0xdeadbeef or 0xabbaabba
>>> so its easily identifiable when they crop up in use?
>>> We used to clear buffers to 0x11111111 and stack to 0x22222222 I recall
>>
>> This isn’t relevant to the Annex K discussion, let’s keep this thread
>> focused. We discussed initialization values in the original thread as
>> well as the code review, it’s worth reading through that to see why I
>> chose the values I did (mainly so pointers are invalid, and they’re
>> repeated byte-values so the code generation is better).
>
> OK.
>
> BTW, I don't think Clang has its own libc does it? Could the other Annex
> K be added in a libc.

Yes, a libc implementation can certainly implement Annex K (or at least the non-freestanding parts of it) itself. It would also be feasible for a third-party library to provide an implementation of Annex K, and to interject wrapper headers between user code and the underlying libc providing the extra symbols.
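
As a sketch of that third-party approach, a wrapper header could supply a missing function only when the libc does not advertise Annex K. The name shim_strnlen_s is hypothetical (a real wrapper would expose strnlen_s itself); it is prefixed here so the example cannot collide with a libc that already declares the function.

```c
/* Request Annex K declarations before including any standard header. */
#define __STDC_WANT_LIB_EXT1__ 1
#include <stddef.h>
#include <string.h>

#ifdef __STDC_LIB_EXT1__
/* The libc provides Annex K: forward to its strnlen_s. */
static inline size_t shim_strnlen_s(const char *s, size_t maxsize)
{
    return strnlen_s(s, maxsize);
}
#else
/* No Annex K in the libc: provide the documented behaviour ourselves.
   Returns 0 for a null pointer, otherwise the string length, capped at
   maxsize when no terminator is found in the first maxsize bytes. */
static inline size_t shim_strnlen_s(const char *s, size_t maxsize)
{
    if (s == NULL)
        return 0;
    size_t len = 0;
    while (len < maxsize && s[len] != '\0')
        ++len;
    return len;
}
#endif
```

User code would include this header in place of calling the libc directly, which keeps the choice of implementation in one place.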
 
> Or at a push, in Clang's libc++ (I know it is meant to be C11 standard,
> not C++, but C++ usually includes C functions anyway)

While C++ includes most of the C standard library, it excludes these parts; even if libc++ chose to provide an implementation of the functions inherited from C, it should still not include these ones.


Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev


On 07/01/2019 17:48, JF Bastien wrote:

> I’d rather not. I want to implement builtins, and let libc implementations use them as they wish.
>

Sounds good. I've not created a clang ticket on
https://bugs.llvm.org/enter_bug.cgi  to track adding the safe C11
functions. Let me know if you'd like me to file it.

Cheers, Jonny

Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev
Hi JF

I was just wondering how things are going with adding memset_s (and the others you mentioned)?

Jonny


On 04/01/2019 20:37, JF Bastien wrote:
I think clang could offer builtins which provide some of the Annex K building blocks, and let libc implementations provide the rest (using the clang builtins when available).

I’m interested in implementing these builtins, unless someone beats me to it. User code often uses Annex K to provide guarantees that are now redundant with trivial automatic variable initialization (llvm.org/rL349442), and I’d like to reduce the hit they’re taking. Here are some notes I wrote for myself a little while ago:

I’ll focus on memset, but this applies to other Annex K functionality (memcpy_s, memmove_s, strcpy_s, strncpy_s, strcat_s, strncat_s, strtok_s, memset_s, strerror_s, strerrorlen_s, strnlen_s).

These functions simply perform extra checks before calling their regular equivalent, with an extra provision that the operation can’t be as-if’d away (i.e. you have to do the entire memset).

Often, custom memset_s implementations are simple loops (cast to char*, and set each byte to 0), compiled in a different TU, which, amusingly thought LTO and inlining, would totally not obey the “no as-if” rule. Other times they’re implemented in opaque assembly.

Clang doesn’t know about this function, and assumes it’s just another function call. We should tell the compiler about what these functions do so that it knows that stores prior to memset_s are dead, memset_s can’t be removed, what the extra memset_s checks are, and so that we can forward values from memset_s. We can then generate better code (small loops become stores with a loop, memset_s followed by stores get merged, etc).

A few options:

1. Make them a builtin, have libc implementations forward to the builtin.
2. Teach clang / LLVM about these function’s semantics (i.e. if conditions met, same as memset).
3. Add an attribute which teaches clang about memset-like functions, and use it in libc implementations.
4. Use LTO between the projects and libc implementations, allowing clang to peek into memset_s’s implementation.

I think 1. is the best approach.
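Option 1 might look roughly like the following on the libc side. This is only a sketch: __builtin_memset_s is a hypothetical builtin that, as far as this thread indicates, does not exist, so the __has_builtin test is expected to fail and the fallback is what actually compiles. The fallback calls memset through a volatile function pointer, a common trick to make the call harder to elide, and its checks are deliberately simplified compared with Annex K. The names forwarding_memset_s and memset_no_elide are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Compilers without __has_builtin simply take the fallback path. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Calling memset through a volatile pointer: the compiler has to load the
   pointer at run time, so it cannot assume the callee is memset and drop
   the call as a dead store. */
static void *(*const volatile memset_no_elide)(void *, int, size_t) = memset;

/* Hypothetical libc entry point. Returns 0 on success, nonzero otherwise. */
int forwarding_memset_s(void *s, size_t smax, int c, size_t n)
{
#if __has_builtin(__builtin_memset_s)      /* hypothetical builtin */
    return __builtin_memset_s(s, smax, c, n);
#else
    /* Simplified checks: reject instead of calling a constraint handler
       or filling the buffer as Annex K specifies. */
    if (s == NULL || n > smax)
        return 1;
    memset_no_elide(s, c, n);
    return 0;
#endif
}
```

The point of the design is that the libc keeps a single source file: with a compiler that offers the builtin it gets the optimized path, and with any other compiler it keeps working through the fallback.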



On Jan 4, 2019, at 1:35 AM, Jonny Grant via cfe-dev <[hidden email]> wrote:

Thank you for your reply Richard.

On 03/01/2019 22:04, Richard Smith wrote:
On Thu, 3 Jan 2019 at 13:44, Jonny Grant via cfe-dev <[hidden email] <[hidden email]>> wrote:
   Hello
   This file lists part of Annex K  "stdint.h"
   https://clang.llvm.org/doxygen/stdint_8h_source.html
   But main C++ page doesn't mention Annex K. Is Annex K really fully
   supported?
That's generally not up to us; that's part of the C standard library, not part of the compiler.
The one part of Annex K that *is* part of the compiler, according to the usual division of responsibilities, wherein the compiler provides the freestanding headers and the C standard library provides the rest, is the definition of rsize_t in <stddef.h> and the definition of RSIZE_MAX in <stdint.h>, and Clang provides those if __STDC_WANT_LIB_EXT1__ is defined. However, we do not define __STDC_LIB_EXT1__ because, as noted, that's not up to us, and we have no idea what your C standard library supports.
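The division of responsibilities Richard describes can be probed from user code with the two macros he names. The sketch below assumes nothing beyond them; the function names libc_has_annex_k and headers_define_rsize_max are made up for illustration. __STDC_WANT_LIB_EXT1__ has to be defined before the first standard header is included for it to have any effect.

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* must precede the first standard include */
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the implementation advertises Annex K support
   (__STDC_LIB_EXT1__ is defined), 0 otherwise. */
int libc_has_annex_k(void)
{
#ifdef __STDC_LIB_EXT1__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the headers in use supplied RSIZE_MAX after the request
   above, 0 otherwise. */
int headers_define_rsize_max(void)
{
#ifdef RSIZE_MAX
    return 1;
#else
    return 0;
#endif
}
```

With Clang's own headers and a libc without Annex K, one would expect the second function to report 1 and the first to report 0, but the result depends on which headers the toolchain actually picks up.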

I use glibc, which doesn't support Annex K. We are keen to use Annex K functionality, so we are looking around for options.

Do you know if Clang has any intention of developing a C11 libc with Annex K support?


I'm looking around, and came across this project:
https://github.com/rurban/safeclib/blob/master/README


So in that sense, we implement the part of Annex K that is in our domain.
   Some background
   https://clang.llvm.org/compatibility.html
   https://clang.llvm.org/cxx_status.html
I'm not sure what these are supposed to show: Annex K is optional in C, and not part of C++.

It would be good if what you state could be added to the compatibility page: that Clang itself provides only the compiler-side parts of Annex K (rsize_t in <stddef.h> and RSIZE_MAX in <stdint.h>), and that the Annex K functions require a libc that implements them.

Cheers, Jonny



Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev

On Mar 3, 2019, at 7:46 AM, Jonny Grant <[hidden email]> wrote:
> Hi JF
>
> Was just wondering how things were going with adding memset_s (and the others you mentioned)?
>
> Jonny

I haven’t started. It’s on my short list of things to do.



Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev
Long time Annex K observer here: I would just like to add my 2 cents.

I would very much like to see some kind of minimal compiler support for the bounds-checking functions (memcpy_s, memmove_s, strcpy_s, strncpy_s, strcat_s), since overruns in their non-bounds-checked equivalents have been responsible for hundreds of security vulnerabilities over the years. It doesn't have to be fully Annex K compliant for my purposes; it just has to respect the buffer boundaries. Some kind of minimal compiler support would also make cross-platform programming much easier: right now I use the bounds-checking versions in code compiled with MSVC, and elsewhere I have to either #ifdef them out or lose the extra checking.
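The #ifdef situation described above usually ends up as a small wrapper like the following sketch. The name bounded_strcpy is hypothetical. It uses strcpy_s where the library advertises it (or under MSVC) and a plain length check elsewhere. One caveat: when strcpy_s itself detects a violation, what happens next (abort, return an error, something else) depends on the library's constraint or invalid-parameter handler, so the two paths are not guaranteed to behave identically on failure.

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* request Annex K declarations if available */
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper. Copies src into dst (capacity dstsz, including the
   terminator). Returns 0 on success, 1 if the copy was refused. */
int bounded_strcpy(char *dst, size_t dstsz, const char *src)
{
    if (dst == NULL || dstsz == 0)
        return 1;
    if (src == NULL) {
        dst[0] = '\0';
        return 1;
    }
#if defined(__STDC_LIB_EXT1__) || defined(_MSC_VER)
    /* Library-provided bounds-checked copy; failure behaviour depends on
       the installed handler. */
    return strcpy_s(dst, dstsz, src) != 0;
#else
    /* Fallback: same boundary guarantee, no handler machinery. */
    size_t len = strlen(src);
    if (len >= dstsz) {
        dst[0] = '\0';            /* leave a valid empty string behind */
        return 1;
    }
    memcpy(dst, src, len + 1);    /* includes the terminating NUL */
    return 0;
#endif
}
```

For example, copying "hello" into an 8-byte buffer succeeds, while a 16-character source is refused by the fallback path and leaves the destination as an empty string.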

Sincerely,
Alexander Riccio
--
"Change the world or go home."

If left to my own devices, I will build more.



Re: C++ Annex K safe C11 functions

David Blaikie via cfe-dev
On Mon, Mar 04, 2019 at 04:00:17PM -0500, <Alexander G. Riccio> via cfe-dev wrote:
> I would *very* much like to see some kind of minimal compiler support for
> the bounds checking functions - memcpy_s, memmove_s, strcpy_s, strncpy_s,
> strcat_s - since overruns in their non-bounds-checked equivalents have been
> responsible for hundreds of security vulnerabilities over the years. It
> doesn't have to be fully annex K compliant for my concerns, just has to
> obey the buffer boundaries. Some kind of minimal compiler support would
> also make cross platform programming much easier, since as of right now I
> use the bounds checking versions in code compiled for MSVC, and have to
> either #ifdef it out or lose the extra checking.

It has been mentioned in a recent thread about fortify, but what exactly
do you *miss* for implementing them on top of the existing
__builtin_object_size? That's what can already be used to implement
_FORTIFY_SOURCE=2, and I don't think Annex K is much different beyond all the
runtime crash junk.
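A minimal sketch of what building on __builtin_object_size could look like, assuming GCC or Clang. The names checked_memcpy and checked_memcpy_impl are hypothetical. The builtin returns (size_t)-1 when the compiler cannot determine the size of the destination object, in which case this sketch performs no check at all, and how often the size is known depends on the optimization level.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper. dst_size is the destination size as known to the
   compiler, or (size_t)-1 if unknown. Returns 0 on success, 1 if the copy
   would overrun the destination. */
static inline int checked_memcpy_impl(void *dst, size_t dst_size,
                                      const void *src, size_t n)
{
    if (dst_size != (size_t)-1 && n > dst_size)
        return 1;                 /* refuse instead of overrunning */
    memcpy(dst, src, n);
    return 0;
}

/* The macro lets the compiler see the caller's destination expression, so
   __builtin_object_size can report its size where that is known. The
   builtin does not evaluate its argument for side effects. */
#define checked_memcpy(dst, src, n) \
    checked_memcpy_impl((dst), __builtin_object_size((dst), 0), (src), (n))
```

Usage is the same as memcpy plus a return value to test, for example `if (checked_memcpy(buf, src, len)) { /* handle the refused copy */ }`. Unlike Annex K, the caller never passes the destination size explicitly, which is the main difference between the two approaches.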

Joerg