Speculative load optimization

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Speculative load optimization

Eric Fiselier via cfe-dev
This llvm patch, https://reviews.llvm.org/D37289, attempts to do an optimization
that involves speculating loads. The patch itself needs some work regardless,
but we are questioning the legality of the optimization, which is currently
performed by both gcc and icc. The desired transformation looks like this:

    typedef struct S {
      char padding[4088];
      struct S *p1;
      struct S *p2;
    } S;

    struct S* f1(struct S *s, int x)
    {
      S *r;
      if (x)
        r = s->p1;
      else
        r = s->p2;
      return r;
    }

TO

    struct S* f1(struct S *s, int x)
    {
      return (x) ? s->p1 : s->p2;
    }

The fundamental question seems to be whether loading one member of struct S
makes it valid to load other members of the same struct. Both gcc & icc
seem to think so and make a distinction between this case and a similar case
where one field is accessed through an lvalue of a different type:

    typedef struct T {
      char padding[4088];
      struct S *p1;
    } T;

    struct S* f2(struct S *s, int x)
    {
      S *r;
      if (x)
        r = ((T*)s)->p1;
      else
        r = s->p2;
      return r;
    }

Neither compiler will transform this case.

Any insight on the validity of this optimization in relation to the C/C++
standards would be appreciated.

Thanks,
Dave Kreitzer
Intel Compilers

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Speculative load optimization

Eric Fiselier via cfe-dev
I think ternary is equivalent to an if in that only one side is evaluated. But I think you're saying that gcc and icc will dereference s for both sides of the if and then select between the loaded values.  Like this

    struct S* f1(struct S *s, int x)
    {
      struct S *a = s->p1;
      struct S *b = s->p2;
      return (x) ? a : b;
    }

~Craig

On Mon, Oct 9, 2017 at 2:16 PM, Kreitzer, David L via cfe-dev <[hidden email]> wrote:
This llvm patch, https://reviews.llvm.org/D37289, attempts to do an optimization
that involves speculating loads. The patch itself needs some work regardless,
but we are questioning the legality of the optimization, which is currently
performed by both gcc and icc. The desired transformation looks like this:

    typedef struct S {
      char padding[4088];
      struct S *p1;
      struct S *p2;
    } S;

    struct S* f1(struct S *s, int x)
    {
      S *r;
      if (x)
        r = s->p1;
      else
        r = s->p2;
      return r;
    }

TO

    struct S* f1(struct S *s, int x)
    {
      return (x) ? s->p1 : s->p2;
    }

The fundamental question seems to be whether loading one member of struct S
makes it valid to load other members of the same struct. Both gcc & icc
seem to think so and make a distinction between this case and a similar case
where one field is accessed through an lvalue of a different type:

    typedef struct T {
      char padding[4088];
      struct S *p1;
    } T;

    struct S* f2(struct S *s, int x)
    {
      S *r;
      if (x)
        r = ((T*)s)->p1;
      else
        r = s->p2;
      return r;
    }

Neither compiler will transform this case.

Any insight on the validity of this optimization in relation to the C/C++
standards would be appreciated.

Thanks,
Dave Kreitzer
Intel Compilers

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Speculative load optimization

Eric Fiselier via cfe-dev
In reply to this post by Eric Fiselier via cfe-dev
I am not a language lawyer, but I'll atttempt to answer anyway.

On Mon, Oct 9, 2017 at 2:16 PM, Kreitzer, David L via cfe-dev
<[hidden email]> wrote:

> This llvm patch, https://reviews.llvm.org/D37289, attempts to do an optimization
> that involves speculating loads. The patch itself needs some work regardless,
> but we are questioning the legality of the optimization, which is currently
> performed by both gcc and icc. The desired transformation looks like this:
>
>     typedef struct S {
>       char padding[4088];
>       struct S *p1;
>       struct S *p2;
>     } S;
>
>     struct S* f1(struct S *s, int x)
>     {
>       S *r;
>       if (x)
>         r = s->p1;
>       else
>         r = s->p2;
>       return r;
>     }
>
> TO
>
>     struct S* f1(struct S *s, int x)
>     {
>       return (x) ? s->p1 : s->p2;
>     }
>
> The fundamental question seems to be whether loading one member of struct S
> makes it valid to load other members of the same struct.

Yes, I believe that's true. If we're dereferencing s on both paths, it
must point to a struct S object, and then loading from any member of
that object should be fine.

> Both gcc & icc
> seem to think so and make a distinction between this case and a similar case
> where one field is accessed through an lvalue of a different type:
>
>     typedef struct T {
>       char padding[4088];
>       struct S *p1;
>     } T;
>
>     struct S* f2(struct S *s, int x)
>     {
>       S *r;
>       if (x)
>         r = ((T*)s)->p1;
>       else
>         r = s->p2;
>       return r;
>     }
>
> Neither compiler will transform this case.

I suspect it would be within the compiler's rights, but my language
knowledge is too weak. Does the cast imply that s points to a valid
struct S object (or null, but then we couldn't dereference it)? I'm
curious to find out :-)

 - Hans
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Speculative load optimization

Eric Fiselier via cfe-dev
I suspect

>         r = ((T*)s)->p1;

Is allowed under C99 §6.5, strict aliasing rule
An object shall have its stored value accessed only by an lvalue expression that has one of the following types
...
- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union)

CMIIW, the cmov optimizer may be pessimistic here -- access to one path implies object is non-null, dereferenceable by compatible types.

-Kevin


On Mon, Oct 9, 2017 at 8:00 PM, Hans Wennborg via cfe-dev <[hidden email]> wrote:
I am not a language lawyer, but I'll atttempt to answer anyway.

On Mon, Oct 9, 2017 at 2:16 PM, Kreitzer, David L via cfe-dev
<[hidden email]> wrote:
> This llvm patch, https://reviews.llvm.org/D37289, attempts to do an optimization
> that involves speculating loads. The patch itself needs some work regardless,
> but we are questioning the legality of the optimization, which is currently
> performed by both gcc and icc. The desired transformation looks like this:
>
>     typedef struct S {
>       char padding[4088];
>       struct S *p1;
>       struct S *p2;
>     } S;
>
>     struct S* f1(struct S *s, int x)
>     {
>       S *r;
>       if (x)
>         r = s->p1;
>       else
>         r = s->p2;
>       return r;
>     }
>
> TO
>
>     struct S* f1(struct S *s, int x)
>     {
>       return (x) ? s->p1 : s->p2;
>     }
>
> The fundamental question seems to be whether loading one member of struct S
> makes it valid to load other members of the same struct.

Yes, I believe that's true. If we're dereferencing s on both paths, it
must point to a struct S object, and then loading from any member of
that object should be fine.

> Both gcc & icc
> seem to think so and make a distinction between this case and a similar case
> where one field is accessed through an lvalue of a different type:
>
>     typedef struct T {
>       char padding[4088];
>       struct S *p1;
>     } T;
>
>     struct S* f2(struct S *s, int x)
>     {
>       S *r;
>       if (x)
>         r = ((T*)s)->p1;
>       else
>         r = s->p2;
>       return r;
>     }
>
> Neither compiler will transform this case.

I suspect it would be within the compiler's rights, but my language
knowledge is too weak. Does the cast imply that s points to a valid
struct S object (or null, but then we couldn't dereference it)? I'm
curious to find out :-)

 - Hans
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Speculative load optimization

Eric Fiselier via cfe-dev
In reply to this post by Eric Fiselier via cfe-dev

[+Richard, Chandler]


On 10/09/2017 07:00 PM, Hans Wennborg via cfe-dev wrote:
I am not a language lawyer, but I'll atttempt to answer anyway.

On Mon, Oct 9, 2017 at 2:16 PM, Kreitzer, David L via cfe-dev
[hidden email] wrote:
This llvm patch, https://reviews.llvm.org/D37289, attempts to do an optimization
that involves speculating loads. The patch itself needs some work regardless,
but we are questioning the legality of the optimization, which is currently
performed by both gcc and icc. The desired transformation looks like this:

    typedef struct S {
      char padding[4088];
      struct S *p1;
      struct S *p2;
    } S;

    struct S* f1(struct S *s, int x)
    {
      S *r;
      if (x)
        r = s->p1;
      else
        r = s->p2;
      return r;
    }

TO

    struct S* f1(struct S *s, int x)
    {
      return (x) ? s->p1 : s->p2;
    }

The fundamental question seems to be whether loading one member of struct S
makes it valid to load other members of the same struct.
Yes, I believe that's true. If we're dereferencing s on both paths, it
must point to a struct S object, and then loading from any member of
that object should be fine.

I also believe that this is correct.

I think that Chandler summarized some things to be careful about in this regard here:
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/477944.html

Of the three points highlighted here, the second clearly might apply:
2) Related to #1, there are applications that rely on this memory model,
for example structures where entire regions of the structure live in
protected pages and cannot be correctly accessed.

This, however, is clearly an extension to the standard memory model, and I see no reason to support this by default. Speculatively loading cache lines under contention from other cores might not be a good thing to do for performance reasons, but that's not a correctness issue.

 -Hal


Both gcc & icc
seem to think so and make a distinction between this case and a similar case
where one field is accessed through an lvalue of a different type:

    typedef struct T {
      char padding[4088];
      struct S *p1;
    } T;

    struct S* f2(struct S *s, int x)
    {
      S *r;
      if (x)
        r = ((T*)s)->p1;
      else
        r = s->p2;
      return r;
    }

Neither compiler will transform this case.
I suspect it would be within the compiler's rights, but my language
knowledge is too weak. Does the cast imply that s points to a valid
struct S object (or null, but then we couldn't dereference it)? I'm
curious to find out :-)

 - Hans
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Speculative load optimization

Eric Fiselier via cfe-dev
On Oct 10, 2017, at 12:12 PM, Hal Finkel via cfe-dev <[hidden email]> wrote:

[+Richard, Chandler]


On 10/09/2017 07:00 PM, Hans Wennborg via cfe-dev wrote:
I am not a language lawyer, but I'll atttempt to answer anyway.

On Mon, Oct 9, 2017 at 2:16 PM, Kreitzer, David L via cfe-dev
[hidden email] wrote:
This llvm patch, https://reviews.llvm.org/D37289, attempts to do an optimization
that involves speculating loads. The patch itself needs some work regardless,
but we are questioning the legality of the optimization, which is currently
performed by both gcc and icc. The desired transformation looks like this:

    typedef struct S {
      char padding[4088];
      struct S *p1;
      struct S *p2;
    } S;

    struct S* f1(struct S *s, int x)
    {
      S *r;
      if (x)
        r = s->p1;
      else
        r = s->p2;
      return r;
    }

TO

    struct S* f1(struct S *s, int x)
    {
      return (x) ? s->p1 : s->p2;
    }

The fundamental question seems to be whether loading one member of struct S
makes it valid to load other members of the same struct.
Yes, I believe that's true. If we're dereferencing s on both paths, it
must point to a struct S object, and then loading from any member of
that object should be fine.

I also believe that this is correct.

I think that Chandler summarized some things to be careful about in this regard here:
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/477944.html

Of the three points highlighted here, the second clearly might apply:
2) Related to #1, there are applications that rely on this memory model,
for example structures where entire regions of the structure live in
protected pages and cannot be correctly accessed.

This, however, is clearly an extension to the standard memory model, and I see no reason to support this by default. Speculatively loading cache lines under contention from other cores might not be a good thing to do for performance reasons, but that's not a correctness issue.

I agree that this optimization is legal under the C and C++ specs.

John.


 -Hal


      
Both gcc & icc
seem to think so and make a distinction between this case and a similar case
where one field is accessed through an lvalue of a different type:

    typedef struct T {
      char padding[4088];
      struct S *p1;
    } T;

    struct S* f2(struct S *s, int x)
    {
      S *r;
      if (x)
        r = ((T*)s)->p1;
      else
        r = s->p2;
      return r;
    }

Neither compiler will transform this case.
I suspect it would be within the compiler's rights, but my language
knowledge is too weak. Does the cast imply that s points to a valid
struct S object (or null, but then we couldn't dereference it)? I'm
curious to find out :-)

 - Hans
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Speculative load optimization

Eric Fiselier via cfe-dev

On Oct 10, 2017, at 2:57 PM, John McCall via cfe-dev <[hidden email]> wrote:

On Oct 10, 2017, at 12:12 PM, Hal Finkel via cfe-dev <[hidden email]> wrote:

[+Richard, Chandler]


On 10/09/2017 07:00 PM, Hans Wennborg via cfe-dev wrote:
I am not a language lawyer, but I'll atttempt to answer anyway.

On Mon, Oct 9, 2017 at 2:16 PM, Kreitzer, David L via cfe-dev
[hidden email] wrote:
This llvm patch, https://reviews.llvm.org/D37289, attempts to do an optimization
that involves speculating loads. The patch itself needs some work regardless,
but we are questioning the legality of the optimization, which is currently
performed by both gcc and icc. The desired transformation looks like this:

    typedef struct S {
      char padding[4088];
      struct S *p1;
      struct S *p2;
    } S;

    struct S* f1(struct S *s, int x)
    {
      S *r;
      if (x)
        r = s->p1;
      else
        r = s->p2;
      return r;
    }

TO

    struct S* f1(struct S *s, int x)
    {
      return (x) ? s->p1 : s->p2;
    }

The fundamental question seems to be whether loading one member of struct S
makes it valid to load other members of the same struct.
Yes, I believe that's true. If we're dereferencing s on both paths, it
must point to a struct S object, and then loading from any member of
that object should be fine.

I also believe that this is correct.

I think that Chandler summarized some things to be careful about in this regard here:
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/477944.html

Of the three points highlighted here, the second clearly might apply:
2) Related to #1, there are applications that rely on this memory model,
for example structures where entire regions of the structure live in
protected pages and cannot be correctly accessed.

This, however, is clearly an extension to the standard memory model, and I see no reason to support this by default. Speculatively loading cache lines under contention from other cores might not be a good thing to do for performance reasons, but that's not a correctness issue.

I agree that this optimization is legal under the C and C++ specs.

Sorry, I'd meant to add a proviso to this.  It's legal under the C and C++
specs to the same extent that any load speculation is.

In general, load speculation is theoretically problematic in C/C++ because
it is undefined behavior to have a race even if the race is spurious.  For example,
a program containing a load that races with a store has undefined behavior even
if the result of the load is never used.  Therefore, strictly speaking, speculating a
load can introduce a race that didn't otherwise exist, meaning that it can introduce
undefined behavior to a program, meaning that it's not a legal transformation.  Now,
AFAIK LLVM doesn't have any multi-threaded analyses that would actually
miscompile such a speculation, but it can still matter because e.g. TSan emits code
that dynamically enforces the memory model, and it will dutifully report the race here.

(Or, to quote from C11 5.1.2.4p28:

  Transformations that introduce a speculative read of a potentially shared memory
  location may not preserve the semantics of the program as defined in this standard,
  since they potentially introduce a data race. However, they are typically valid in the
  context of an optimizing compiler that targets a specific machine with well-defined
  semantics for data races. They would be invalid for a hypothetical machine that is
  not tolerant of races or provides hardware race detection.

In this sense, TSan makes an arbitrary machine race-intolerant.)

One obvious-seeming idea that I haven't fully thought through is to simply give
speculated loads weaker semantics in LLVM, so that it's not undefined behavior
for a speculated load to be part of a race unless the result of the load is actually
used.  Of course, you'd have to define what "used" means.

John.



John.


 -Hal

Both gcc & icc
seem to think so and make a distinction between this case and a similar case
where one field is accessed through an lvalue of a different type:

    typedef struct T {
      char padding[4088];
      struct S *p1;
    } T;

    struct S* f2(struct S *s, int x)
    {
      S *r;
      if (x)
        r = ((T*)s)->p1;
      else
        r = s->p2;
      return r;
    }

Neither compiler will transform this case.
I suspect it would be within the compiler's rights, but my language
knowledge is too weak. Does the cast imply that s points to a valid
struct S object (or null, but then we couldn't dereference it)? I'm
curious to find out :-)

 - Hans
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Speculative load optimization

Eric Fiselier via cfe-dev
On Tue, Oct 10, 2017 at 12:23 PM, John McCall via cfe-dev <[hidden email]> wrote:

On Oct 10, 2017, at 2:57 PM, John McCall via cfe-dev <[hidden email]> wrote:

On Oct 10, 2017, at 12:12 PM, Hal Finkel via cfe-dev <[hidden email]> wrote:

[+Richard, Chandler]


On 10/09/2017 07:00 PM, Hans Wennborg via cfe-dev wrote:
I am not a language lawyer, but I'll atttempt to answer anyway.

On Mon, Oct 9, 2017 at 2:16 PM, Kreitzer, David L via cfe-dev
[hidden email] wrote:
This llvm patch, https://reviews.llvm.org/D37289, attempts to do an optimization
that involves speculating loads. The patch itself needs some work regardless,
but we are questioning the legality of the optimization, which is currently
performed by both gcc and icc. The desired transformation looks like this:

    typedef struct S {
      char padding[4088];
      struct S *p1;
      struct S *p2;
    } S;

    struct S* f1(struct S *s, int x)
    {
      S *r;
      if (x)
        r = s->p1;
      else
        r = s->p2;
      return r;
    }

TO

    struct S* f1(struct S *s, int x)
    {
      return (x) ? s->p1 : s->p2;
    }

The fundamental question seems to be whether loading one member of struct S
makes it valid to load other members of the same struct.
Yes, I believe that's true. If we're dereferencing s on both paths, it
must point to a struct S object, and then loading from any member of
that object should be fine.

I also believe that this is correct.

I think that Chandler summarized some things to be careful about in this regard here:
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/477944.html

Of the three points highlighted here, the second clearly might apply:
2) Related to #1, there are applications that rely on this memory model,
for example structures where entire regions of the structure live in
protected pages and cannot be correctly accessed.

This, however, is clearly an extension to the standard memory model, and I see no reason to support this by default. Speculatively loading cache lines under contention from other cores might not be a good thing to do for performance reasons, but that's not a correctness issue.

I agree that this optimization is legal under the C and C++ specs.

Sorry, I'd meant to add a proviso to this.  It's legal under the C and C++
specs to the same extent that any load speculation is.

In general, load speculation is theoretically problematic in C/C++ because
it is undefined behavior to have a race even if the race is spurious.  For example,
a program containing a load that races with a store has undefined behavior even
if the result of the load is never used.  Therefore, strictly speaking, speculating a
load can introduce a race that didn't otherwise exist, meaning that it can introduce
undefined behavior to a program,

This is where the argument falls down.  If there's no race in the source code, the behavior *is* defined.  It doesn't matter that the hardware might have a racy read in the implementation, so long as it doesn't affect the behavior of the abstract machine.
 
meaning that it's not a legal transformation.

There might be other reasons (maybe practical reasons) why this isn't permitted, but it's not because it introduces UB (it doesn't).
 
 Now,
AFAIK LLVM doesn't have any multi-threaded analyses that would actually
miscompile such a speculation, but it can still matter because e.g. TSan emits code
that dynamically enforces the memory model, and it will dutifully report the race here.

(Or, to quote from C11 5.1.2.4p28:

  Transformations that introduce a speculative read of a potentially shared memory
  location may not preserve the semantics of the program as defined in this standard,
  since they potentially introduce a data race.

I think this is somewhat muddy, and essentially contradicted by the text immediately following it:
 
However, they are typically valid in the
  context of an optimizing compiler that targets a specific machine with well-defined
  semantics for data races. They would be invalid for a hypothetical machine that is
  not tolerant of races or provides hardware race detection.

In this sense, TSan makes an arbitrary machine race-intolerant.)

Right, this transformation is invalid according to TSan (and a compiler targeting TSan should not apply it) .  Apart from such debug implementations, I'm not aware of an extant platform on which it's problematic.

-- James


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Speculative load optimization

Eric Fiselier via cfe-dev

On Oct 10, 2017, at 4:01 PM, James Dennett <[hidden email]> wrote:

On Tue, Oct 10, 2017 at 12:23 PM, John McCall via cfe-dev <[hidden email]> wrote:

On Oct 10, 2017, at 2:57 PM, John McCall via cfe-dev <[hidden email]> wrote:

On Oct 10, 2017, at 12:12 PM, Hal Finkel via cfe-dev <[hidden email]> wrote:

[+Richard, Chandler]


On 10/09/2017 07:00 PM, Hans Wennborg via cfe-dev wrote:
I am not a language lawyer, but I'll atttempt to answer anyway.

On Mon, Oct 9, 2017 at 2:16 PM, Kreitzer, David L via cfe-dev
[hidden email] wrote:
This llvm patch, https://reviews.llvm.org/D37289, attempts to do an optimization
that involves speculating loads. The patch itself needs some work regardless,
but we are questioning the legality of the optimization, which is currently
performed by both gcc and icc. The desired transformation looks like this:

    typedef struct S {
      char padding[4088];
      struct S *p1;
      struct S *p2;
    } S;

    struct S* f1(struct S *s, int x)
    {
      S *r;
      if (x)
        r = s->p1;
      else
        r = s->p2;
      return r;
    }

TO

    struct S* f1(struct S *s, int x)
    {
      return (x) ? s->p1 : s->p2;
    }

The fundamental question seems to be whether loading one member of struct S
makes it valid to load other members of the same struct.
Yes, I believe that's true. If we're dereferencing s on both paths, it
must point to a struct S object, and then loading from any member of
that object should be fine.

I also believe that this is correct.

I think that Chandler summarized some things to be careful about in this regard here:
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/477944.html

Of the three points highlighted here, the second clearly might apply:
2) Related to #1, there are applications that rely on this memory model,
for example structures where entire regions of the structure live in
protected pages and cannot be correctly accessed.

This, however, is clearly an extension to the standard memory model, and I see no reason to support this by default. Speculatively loading cache lines under contention from other cores might not be a good thing to do for performance reasons, but that's not a correctness issue.

I agree that this optimization is legal under the C and C++ specs.

Sorry, I'd meant to add a proviso to this.  It's legal under the C and C++
specs to the same extent that any load speculation is.

In general, load speculation is theoretically problematic in C/C++ because
it is undefined behavior to have a race even if the race is spurious.  For example,
a program containing a load that races with a store has undefined behavior even
if the result of the load is never used.  Therefore, strictly speaking, speculating a
load can introduce a race that didn't otherwise exist, meaning that it can introduce
undefined behavior to a program,

This is where the argument falls down.  If there's no race in the source code, the behavior *is* defined.  It doesn't matter that the hardware might have a racy read in the implementation, so long as it doesn't affect the behavior of the abstract machine.

Okay, sorry, let me be more precise about abstract machines.

If an implementation used the rules of the C abstract machine for its intermediate representation, it would not be allowed to speculate loads because doing so would introduce undefined behavior on the abstract machine.  It doesn't matter what the actual target machine would do.

Now, Clang and other LLVM frontends do not use the C abstract machine for their intermediate representations: they use LLVM, and LLVM is its own abstract language with its own abstract machine and its own concept of undefined behavior.  A frontend generating LLVM is doing a source-to-source transformation which is expected to maintain semantics, including the absence of undefined behavior.  LLVM's abstract machine says that racing loads do not have undefined behavior, they merely yield undef.  (I hadn't checked this specifically and didn't realize the LLVM rule was so much weaker than the C rule, for which I apologize.)  So speculating a load in LLVM is fine, at least as far as races go.

TSan appears to be assuming that the loads and stores it sees have C-machine semantics, which only works in close coordination with the frontend.

John.

meaning that it's not a legal transformation.

There might be other reasons (maybe practical reasons) why this isn't permitted, but it's not because it introduces UB (it doesn't).
 
 Now,
AFAIK LLVM doesn't have any multi-threaded analyses that would actually
miscompile such a speculation, but it can still matter because e.g. TSan emits code
that dynamically enforces the memory model, and it will dutifully report the race here.

(Or, to quote from C11 5.1.2.4p28:

  Transformations that introduce a speculative read of a potentially shared memory
  location may not preserve the semantics of the program as defined in this standard,
  since they potentially introduce a data race.

I think this is somewhat muddy, and essentially contradicted by the text immediately following it:
 
However, they are typically valid in the
  context of an optimizing compiler that targets a specific machine with well-defined
  semantics for data races. They would be invalid for a hypothetical machine that is
  not tolerant of races or provides hardware race detection.

In this sense, TSan makes an arbitrary machine race-intolerant.)

Right, this transformation is invalid according to TSan (and a compiler targeting TSan should not apply it) .  Apart from such debug implementations, I'm not aware of an extant platform on which it's problematic.



-- James



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev