[RFC] Setting dereferenceable flag on the implicit this parameter for non-static member functions

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

[RFC] Setting dereferenceable flag on the implicit this parameter for non-static member functions

Robinson, Paul via cfe-dev
 
Hello,
 
In the discussion on bugzilla 30729, it is mentioned that the 'this' pointer needs to be valid upon entry to a non-static method.  Does the standard guarantee this is non-null on entry?  If so, is there a reason we can't use that fact to mark 'this' as 'dereferenceable(sizeof(*this))'?
 
There are LICM optimizations we can do based on the knowledge that 'this' is non-null on entry to a non-static member function. 
 
eg.  For the following IR,  the two highlighted loads are not being hoisted out of the for loop because we are not able to guarantee that the pointer is non-null.  If the 'this' pointer is guaranteed to be non-null on entry and we mark it thus,  then the 2 loads within the for-loop body can then be hoisted out into the loop preheader.
 
This is of course just one example of an optimization we could perform based on this knowledge, but there are probably a number of others (i.e. anything that relies on a pointer being 'dereferenceable(N)').
 
$ cat a.ll
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%struct.S = type { <4 x i32>, <4 x i32> }
; Function Attrs: norecurse nounwind readonly
define <4 x i32> @_ZNK1S20constShouldBeHoistedEmDv4_i(%struct.S* nocapture readonly %this, i64 %n, <4 x
i32> %x) align 2 {
entry:
 %tobool9 = icmp eq i64 %n, 0
 br i1 %tobool9, label %for.end, label %for.body.lr.ph

for.body.lr.ph:                                   ; preds = %entry
 %k1 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 0
 %k2 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 1
 br label %for.body

for.body:                                         ; preds = %for.body.lr.ph, %if.end
 %n.addr.011 = phi i64 [ %n, %for.body.lr.ph ], [ %div, %if.end ]
 %x.addr.010 = phi <4 x i32> [ %x, %for.body.lr.ph ], [ %x.addr.1, %if.end ]
 %rem = and i64 %n.addr.011, 15
 %cmp = icmp eq i64 %rem, 0
 br i1 %cmp, label %if.end, label %if.then

if.then:                                          ; preds = %for.body
 
%0 = load <4 x i32>, <4 x i32>* %k1, align 16
 %add = add <4 x i32> %0, %x.addr.010
 
%1 = load <4 x i32>, <4 x i32>* %k2, align 16
 %xor = xor <4 x i32> %add, %1
 br label %if.end

if.end:                                           ; preds = %for.body, %if.then
 %x.addr.1 = phi <4 x i32> [ %xor, %if.then ], [ %x.addr.010, %for.body ]
 %div = lshr i64 %n.addr.011, 4
 %tobool = icmp eq i64 %div, 0
 br i1 %tobool, label %for.end, label %for.body

for.end:                                          ; preds = %if.end, %entry
 %x.addr.0.lcssa = phi <4 x i32> [ %x, %entry ], [ %x.addr.1, %if.end ]
 ret <4 x i32> %x.addr.0.lcssa
}

 
 

Regards,
Lei Huang
 
 
LLVM Development on POWER
Internal mail: C2/YGK/8200/MKM
Phone: (905) 413-4419
TieLine: 969-4419
E-mail: [hidden email]


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Setting dereferenceable flag on the implicit this parameter for non-static member functions

Robinson, Paul via cfe-dev


On 12/05/2017 01:47 PM, Lei Huang via cfe-dev wrote:
 
Hello,
 
In the discussion on bugzilla 30729, it is mentioned that the 'this' pointer needs to be valid upon entry to a non-static method.  Does the standard guarantee this is non-null on entry?

Yes. You have to call a non-static member function on a valid object.

 If so, is there a reason we can't use that fact to mark 'this' as 'dereferenceable(sizeof(*this))'?

Yes, this seems like a good idea.

 -Hal

 
There are LICM optimizations we can do based on the knowledge that 'this' is non-null on entry to a non-static member function. 
 
eg.  For the following IR,  the two highlighted loads are not being hoisted out of the for loop because we are not able to guarantee that the pointer is non-null.  If the 'this' pointer is guaranteed to be non-null on entry and we mark it thus,  then the 2 loads within the for-loop body can then be hoisted out into the loop preheader.
 
This is of course just one example of an optimization we could perform based on this knowledge, but there are probably a number of others (i.e. anything that relies on a pointer being 'dereferenceable(N)').
 
$ cat a.ll
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%struct.S = type { <4 x i32>, <4 x i32> }
; Function Attrs: norecurse nounwind readonly
define <4 x i32> @_ZNK1S20constShouldBeHoistedEmDv4_i(%struct.S* nocapture readonly %this, i64 %n, <4 x
i32> %x) align 2 {
entry:
 %tobool9 = icmp eq i64 %n, 0
 br i1 %tobool9, label %for.end, label %for.body.lr.ph

for.body.lr.ph:                                   ; preds = %entry
 %k1 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 0
 %k2 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 1
 br label %for.body

for.body:                                         ; preds = %for.body.lr.ph, %if.end
 %n.addr.011 = phi i64 [ %n, %for.body.lr.ph ], [ %div, %if.end ]
 %x.addr.010 = phi <4 x i32> [ %x, %for.body.lr.ph ], [ %x.addr.1, %if.end ]
 %rem = and i64 %n.addr.011, 15
 %cmp = icmp eq i64 %rem, 0
 br i1 %cmp, label %if.end, label %if.then

if.then:                                          ; preds = %for.body
 
%0 = load <4 x i32>, <4 x i32>* %k1, align 16
 %add = add <4 x i32> %0, %x.addr.010
 
%1 = load <4 x i32>, <4 x i32>* %k2, align 16
 %xor = xor <4 x i32> %add, %1
 br label %if.end

if.end:                                           ; preds = %for.body, %if.then
 %x.addr.1 = phi <4 x i32> [ %xor, %if.then ], [ %x.addr.010, %for.body ]
 %div = lshr i64 %n.addr.011, 4
 %tobool = icmp eq i64 %div, 0
 br i1 %tobool, label %for.end, label %for.body

for.end:                                          ; preds = %if.end, %entry
 %x.addr.0.lcssa = phi <4 x i32> [ %x, %entry ], [ %x.addr.1, %if.end ]
 ret <4 x i32> %x.addr.0.lcssa
}

 
 

Regards,
Lei Huang
 
 
LLVM Development on POWER
Internal mail: C2/YGK/8200/MKM
Phone: (905) 413-4419
TieLine: 969-4419
E-mail: [hidden email]



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Setting dereferenceable flag on the implicit this parameter for non-static member functions

Robinson, Paul via cfe-dev

I think that the key concept goes all the way back to the original C++ Standard (C++98), where section 5.2.2 “Function Call” states:

 

The first expression in the postfix expression is then called the object expression, and the call

is as a member of the object pointed to or referred to. In the case of an implicit class member access, the

implied object is the one pointed to by this. [Note: a member function call of the form f() is interpreted

as (*this).f() (see 9.3.1). ]

 

A NULL pointer does not point to an object, so two things make a NULL ‘this’ invalid - first the object expression does not refer to an object and is thus undefined; and second, the note which clarifies that it is equivalent to ‘(*this).’ means that if ‘this’ is NULL, then it is a NULL pointer dereference which is already undefined behaviour elsewhere in the Standard.  I can’t remember if “notes” are normative.

 

In “very old C++”, i.e. prior to the introduction of static member functions (circa 1987), the following idiom was not unusual to get the semantic intent of a static member provide that ‘this’ was neither implicitly nor explicitly used:

 

class T {

public:

  void wishIwasStatic();

};

 

...

((T*)0)->wishIwasStatic();

 

Such functions had access to all object of type ‘T’, but without the need for ‘friend’ declarations.  But this was only a stop-gap until the introduction of static member functions was devised.

 

I very much doubt that pre-C++98 code such as this is still part of any production application, and if it is, it really ought to be rewritten.

 

Making ‘this’ not de-referenceable seems to me to be a really good idea semantically, and if it yields performance advantages too, then this is a really good thing.  Perhaps it might be a good idea to ping either Bjarne Stroustrup or the C++ Standards committee to be sure - though I expect many of the committee’s members are also participants in this forum.

 

            MartinO

 

From: cfe-dev [mailto:[hidden email]] On Behalf Of Hal Finkel via cfe-dev
Sent: 06 December 2017 13:07
To: Lei Huang <[hidden email]>; [hidden email]
Cc: LLVM on Power <[hidden email]>
Subject: Re: [cfe-dev] [RFC] Setting dereferenceable flag on the implicit this parameter for non-static member functions

 

 

On 12/05/2017 01:47 PM, Lei Huang via cfe-dev wrote:

 

Hello,
 
In the discussion on bugzilla 30729, it is mentioned that the 'this' pointer needs to be valid upon entry to a non-static method.  Does the standard guarantee this is non-null on entry?


Yes. You have to call a non-static member function on a valid object.


 If so, is there a reason we can't use that fact to mark 'this' as 'dereferenceable(sizeof(*this))'?


Yes, this seems like a good idea.

 -Hal


 
There are LICM optimizations we can do based on the knowledge that 'this' is non-null on entry to a non-static member function. 
 
eg.  For the following IR,  the two highlighted loads are not being hoisted out of the for loop because we are not able to guarantee that the pointer is non-null.  If the 'this' pointer is guaranteed to be non-null on entry and we mark it thus,  then the 2 loads within the for-loop body can then be hoisted out into the loop preheader.
 
This is of course just one example of an optimization we could perform based on this knowledge, but there are probably a number of others (i.e. anything that relies on a pointer being 'dereferenceable(N)').
 
$ cat a.ll
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%struct.S = type { <4 x i32>, <4 x i32> }
; Function Attrs: norecurse nounwind readonly
define <4 x i32> @_ZNK1S20constShouldBeHoistedEmDv4_i(%struct.S* nocapture readonly %this, i64 %n, <4 x
i32> %x) align 2 {
entry:
 %tobool9 = icmp eq i64 %n, 0
 br i1 %tobool9, label %for.end, label %for.body.lr.ph

for.body.lr.ph:                                   ; preds = %entry
 %k1 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 0
 %k2 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 1
 br label %for.body

for.body:                                         ; preds = %for.body.lr.ph, %if.end
 %n.addr.011 = phi i64 [ %n, %for.body.lr.ph ], [ %div, %if.end ]
 %x.addr.010 = phi <4 x i32> [ %x, %for.body.lr.ph ], [ %x.addr.1, %if.end ]
 %rem = and i64 %n.addr.011, 15
 %cmp = icmp eq i64 %rem, 0
 br i1 %cmp, label %if.end, label %if.then

if.then:                                          ; preds = %for.body
 %0 = load <4 x i32>, <4 x i32>* %k1, align 16

 %add = add <4 x i32> %0, %x.addr.010
 %1 = load <4 x i32>, <4 x i32>* %k2, align 16

 %xor = xor <4 x i32> %add, %1
 br label %if.end

if.end:                                           ; preds = %for.body, %if.then
 %x.addr.1 = phi <4 x i32> [ %xor, %if.then ], [ %x.addr.010, %for.body ]
 %div = lshr i64 %n.addr.011, 4
 %tobool = icmp eq i64 %div, 0
 br i1 %tobool, label %for.end, label %for.body

for.end:                                          ; preds = %if.end, %entry
 %x.addr.0.lcssa = phi <4 x i32> [ %x, %entry ], [ %x.addr.1, %if.end ]
 ret <4 x i32> %x.addr.0.lcssa
}

 

 


Regards,
Lei Huang

 

 

LLVM Development on POWER

Internal mail: C2/YGK/8200/MKM
Phone: (905) 413-4419
TieLine: 969-4419
E-mail: [hidden email]





_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Setting dereferenceable flag on the implicit this parameter for non-static member functions

Robinson, Paul via cfe-dev


On 12/06/2017 01:32 PM, Martin J. O'Riordan wrote:

I think that the key concept goes all the way back to the original C++ Standard (C++98), where section 5.2.2 “Function Call” states:

 

The first expression in the postfix expression is then called the object expression, and the call

is as a member of the object pointed to or referred to. In the case of an implicit class member access, the

implied object is the one pointed to by this. [Note: a member function call of the form f() is interpreted

as (*this).f() (see 9.3.1). ]

 

A NULL pointer does not point to an object, so two things make a NULL ‘this’ invalid - first the object expression does not refer to an object and is thus undefined; and second, the note which clarifies that it is equivalent to ‘(*this).’ means that if ‘this’ is NULL, then it is a NULL pointer dereference which is already undefined behaviour elsewhere in the Standard.  I can’t remember if “notes” are normative.

 

In “very old C++”, i.e. prior to the introduction of static member functions (circa 1987), the following idiom was not unusual to get the semantic intent of a static member provide that ‘this’ was neither implicitly nor explicitly used:

 

class T {

public:

  void wishIwasStatic();

};

 

...

((T*)0)->wishIwasStatic();

 

Such functions had access to all object of type ‘T’, but without the need for ‘friend’ declarations.  But this was only a stop-gap until the introduction of static member functions was devised.

 

I very much doubt that pre-C++98 code such as this is still part of any production application, and if it is, it really ought to be rewritten.


Regardless, this is not legal C++ code, and we don't need to support it.

 

Making ‘this’ not de-referenceable seems to me to be a really good idea semantically, and if it yields performance advantages too, then this is a really good thing.  Perhaps it might be a good idea to ping either Bjarne Stroustrup or the C++ Standards committee to be sure - though I expect many of the committee’s members are also participants in this forum.


Yes, many of us are here :-)

 -Hal

 

            MartinO

 

From: cfe-dev [[hidden email]] On Behalf Of Hal Finkel via cfe-dev
Sent: 06 December 2017 13:07
To: Lei Huang [hidden email]; [hidden email]
Cc: LLVM on Power [hidden email]
Subject: Re: [cfe-dev] [RFC] Setting dereferenceable flag on the implicit this parameter for non-static member functions

 

 

On 12/05/2017 01:47 PM, Lei Huang via cfe-dev wrote:

 

Hello,
 
In the discussion on bugzilla 30729, it is mentioned that the 'this' pointer needs to be valid upon entry to a non-static method.  Does the standard guarantee this is non-null on entry?


Yes. You have to call a non-static member function on a valid object.


 If so, is there a reason we can't use that fact to mark 'this' as 'dereferenceable(sizeof(*this))'?


Yes, this seems like a good idea.

 -Hal


 
There are LICM optimizations we can do based on the knowledge that 'this' is non-null on entry to a non-static member function. 
 
eg.  For the following IR,  the two highlighted loads are not being hoisted out of the for loop because we are not able to guarantee that the pointer is non-null.  If the 'this' pointer is guaranteed to be non-null on entry and we mark it thus,  then the 2 loads within the for-loop body can then be hoisted out into the loop preheader.
 
This is of course just one example of an optimization we could perform based on this knowledge, but there are probably a number of others (i.e. anything that relies on a pointer being 'dereferenceable(N)').
 
$ cat a.ll
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%struct.S = type { <4 x i32>, <4 x i32> }
; Function Attrs: norecurse nounwind readonly
define <4 x i32> @_ZNK1S20constShouldBeHoistedEmDv4_i(%struct.S* nocapture readonly %this, i64 %n, <4 x
i32> %x) align 2 {
entry:
 %tobool9 = icmp eq i64 %n, 0
 br i1 %tobool9, label %for.end, label %for.body.lr.ph

for.body.lr.ph:                                   ; preds = %entry
 %k1 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 0
 %k2 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 1
 br label %for.body

for.body:                                         ; preds = %for.body.lr.ph, %if.end
 %n.addr.011 = phi i64 [ %n, %for.body.lr.ph ], [ %div, %if.end ]
 %x.addr.010 = phi <4 x i32> [ %x, %for.body.lr.ph ], [ %x.addr.1, %if.end ]
 %rem = and i64 %n.addr.011, 15
 %cmp = icmp eq i64 %rem, 0
 br i1 %cmp, label %if.end, label %if.then

if.then:                                          ; preds = %for.body
 %0 = load <4 x i32>, <4 x i32>* %k1, align 16

 %add = add <4 x i32> %0, %x.addr.010
 %1 = load <4 x i32>, <4 x i32>* %k2, align 16

 %xor = xor <4 x i32> %add, %1
 br label %if.end

if.end:                                           ; preds = %for.body, %if.then
 %x.addr.1 = phi <4 x i32> [ %xor, %if.then ], [ %x.addr.010, %for.body ]
 %div = lshr i64 %n.addr.011, 4
 %tobool = icmp eq i64 %div, 0
 br i1 %tobool, label %for.end, label %for.body

for.end:                                          ; preds = %if.end, %entry
 %x.addr.0.lcssa = phi <4 x i32> [ %x, %entry ], [ %x.addr.1, %if.end ]
 ret <4 x i32> %x.addr.0.lcssa
}

 

 


Regards,
Lei Huang

 

 

LLVM Development on POWER

Internal mail: C2/YGK/8200/MKM
Phone: (905) 413-4419
TieLine: 969-4419
E-mail: [hidden email]





_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Setting dereferenceable flag on the implicit this parameter for non-static member functions

Robinson, Paul via cfe-dev
In reply to this post by Robinson, Paul via cfe-dev
I am not sure that there is a requirement that *this remains a valid object.
Experimenting with the -O3 behaviour of the following program (with a powerpc64le-unknown-linux-gnu target), it seems the dereferenceable attribute is scope based?

int a[1], b[1], *cp;
void dobadstuff() { delete cp; }

__attribute__((__noinline__))
void foo(int * __restrict__ a, int * __restrict__ b, int &c, int n) {
  dobadstuff();
  for (int i = 0; i < n; ++i)
    if (a[i] > 0)
      a[i] = c * b[i];
}

int main(void) {
  cp = new int;
  foo(a, b, *cp, 1);
}

Valgrind reports an invalid read.


On Wed, Dec 6, 2017 at 8:06 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 12/05/2017 01:47 PM, Lei Huang via cfe-dev wrote:
 
Hello,
 
In the discussion on bugzilla 30729, it is mentioned that the 'this' pointer needs to be valid upon entry to a non-static method.  Does the standard guarantee this is non-null on entry?

Yes. You have to call a non-static member function on a valid object.

 If so, is there a reason we can't use that fact to mark 'this' as 'dereferenceable(sizeof(*this))'?

Yes, this seems like a good idea.

 -Hal

 
There are LICM optimizations we can do based on the knowledge that 'this' is non-null on entry to a non-static member function. 
 
eg.  For the following IR,  the two highlighted loads are not being hoisted out of the for loop because we are not able to guarantee that the pointer is non-null.  If the 'this' pointer is guaranteed to be non-null on entry and we mark it thus,  then the 2 loads within the for-loop body can then be hoisted out into the loop preheader.
 
This is of course just one example of an optimization we could perform based on this knowledge, but there are probably a number of others (i.e. anything that relies on a pointer being 'dereferenceable(N)').
 
$ cat a.ll
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%struct.S = type { <4 x i32>, <4 x i32> }
; Function Attrs: norecurse nounwind readonly
define <4 x i32> @_ZNK1S20constShouldBeHoistedEmDv4_i(%struct.S* nocapture readonly %this, i64 %n, <4 x
i32> %x) align 2 {
entry:
 %tobool9 = icmp eq i64 %n, 0
 br i1 %tobool9, label %for.end, label %for.body.lr.ph

for.body.lr.ph:                                   ; preds = %entry
 %k1 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 0
 %k2 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 1
 br label %for.body

for.body:                                         ; preds = %for.body.lr.ph, %if.end
 %n.addr.011 = phi i64 [ %n, %for.body.lr.ph ], [ %div, %if.end ]
 %x.addr.010 = phi <4 x i32> [ %x, %for.body.lr.ph ], [ %x.addr.1, %if.end ]
 %rem = and i64 %n.addr.011, 15
 %cmp = icmp eq i64 %rem, 0
 br i1 %cmp, label %if.end, label %if.then

if.then:                                          ; preds = %for.body
 
%0 = load <4 x i32>, <4 x i32>* %k1, align 16
 %add = add <4 x i32> %0, %x.addr.010
 
%1 = load <4 x i32>, <4 x i32>* %k2, align 16
 %xor = xor <4 x i32> %add, %1
 br label %if.end

if.end:                                           ; preds = %for.body, %if.then
 %x.addr.1 = phi <4 x i32> [ %xor, %if.then ], [ %x.addr.010, %for.body ]
 %div = lshr i64 %n.addr.011, 4
 %tobool = icmp eq i64 %div, 0
 br i1 %tobool, label %for.end, label %for.body

for.end:                                          ; preds = %if.end, %entry
 %x.addr.0.lcssa = phi <4 x i32> [ %x, %entry ], [ %x.addr.1, %if.end ]
 ret <4 x i32> %x.addr.0.lcssa
}

 
 

Regards,
Lei Huang
 
 
LLVM Development on POWER
Internal mail: C2/YGK/8200/MKM
Phone: <a href="tel:(905)%20413-4419" value="+19054134419" target="_blank">(905) 413-4419
TieLine: 969-4419
E-mail: [hidden email]



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Setting dereferenceable flag on the implicit this parameter for non-static member functions

Robinson, Paul via cfe-dev


On 12/06/2017 05:09 PM, Hubert Tong wrote:
I am not sure that there is a requirement that *this remains a valid object.

I don't think that there is, and, IIRC, we already have an open bug about this (the same applies to reference arguments, for which we also apply the dereferenceable attribute). The problem, as Nick Lewycky pointed out a some time ago, is that just because a pointer is dereferenceable upon entry to the function, doesn't mean that it always will remain so.

I agree that we should fix this at some point, but it's an orthogonal issue.

 -Hal

Experimenting with the -O3 behaviour of the following program (with a powerpc64le-unknown-linux-gnu target), it seems the dereferenceable attribute is scope based?

int a[1], b[1], *cp;
void dobadstuff() { delete cp; }

__attribute__((__noinline__))
void foo(int * __restrict__ a, int * __restrict__ b, int &c, int n) {
  dobadstuff();
  for (int i = 0; i < n; ++i)
    if (a[i] > 0)
      a[i] = c * b[i];
}

int main(void) {
  cp = new int;
  foo(a, b, *cp, 1);
}

Valgrind reports an invalid read.


On Wed, Dec 6, 2017 at 8:06 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 12/05/2017 01:47 PM, Lei Huang via cfe-dev wrote:
 
Hello,
 
In the discussion on bugzilla 30729, it is mentioned that the 'this' pointer needs to be valid upon entry to a non-static method.  Does the standard guarantee this is non-null on entry?

Yes. You have to call a non-static member function on a valid object.

 If so, is there a reason we can't use that fact to mark 'this' as 'dereferenceable(sizeof(*this))'?

Yes, this seems like a good idea.

 -Hal

 
There are LICM optimizations we can do based on the knowledge that 'this' is non-null on entry to a non-static member function. 
 
eg.  For the following IR,  the two highlighted loads are not being hoisted out of the for loop because we are not able to guarantee that the pointer is non-null.  If the 'this' pointer is guaranteed to be non-null on entry and we mark it thus,  then the 2 loads within the for-loop body can then be hoisted out into the loop preheader.
 
This is of course just one example of an optimization we could perform based on this knowledge, but there are probably a number of others (i.e. anything that relies on a pointer being 'dereferenceable(N)').
 
$ cat a.ll
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%struct.S = type { <4 x i32>, <4 x i32> }
; Function Attrs: norecurse nounwind readonly
define <4 x i32> @_ZNK1S20constShouldBeHoistedEmDv4_i(%struct.S* nocapture readonly %this, i64 %n, <4 x
i32> %x) align 2 {
entry:
 %tobool9 = icmp eq i64 %n, 0
 br i1 %tobool9, label %for.end, label %for.body.lr.ph

for.body.lr.ph:                                   ; preds = %entry
 %k1 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 0
 %k2 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 1
 br label %for.body

for.body:                                         ; preds = %for.body.lr.ph, %if.end
 %n.addr.011 = phi i64 [ %n, %for.body.lr.ph ], [ %div, %if.end ]
 %x.addr.010 = phi <4 x i32> [ %x, %for.body.lr.ph ], [ %x.addr.1, %if.end ]
 %rem = and i64 %n.addr.011, 15
 %cmp = icmp eq i64 %rem, 0
 br i1 %cmp, label %if.end, label %if.then

if.then:                                          ; preds = %for.body
 
%0 = load <4 x i32>, <4 x i32>* %k1, align 16
 %add = add <4 x i32> %0, %x.addr.010
 
%1 = load <4 x i32>, <4 x i32>* %k2, align 16
 %xor = xor <4 x i32> %add, %1
 br label %if.end

if.end:                                           ; preds = %for.body, %if.then
 %x.addr.1 = phi <4 x i32> [ %xor, %if.then ], [ %x.addr.010, %for.body ]
 %div = lshr i64 %n.addr.011, 4
 %tobool = icmp eq i64 %div, 0
 br i1 %tobool, label %for.end, label %for.body

for.end:                                          ; preds = %if.end, %entry
 %x.addr.0.lcssa = phi <4 x i32> [ %x, %entry ], [ %x.addr.1, %if.end ]
 ret <4 x i32> %x.addr.0.lcssa
}

 
 

Regards,
Lei Huang
 
 
LLVM Development on POWER
Internal mail: C2/YGK/8200/MKM
Phone: <a moz-do-not-send="true" href="tel:%28905%29%20413-4419" value="+19054134419" target="_blank">(905) 413-4419
TieLine: 969-4419
E-mail: [hidden email]



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________ cfe-dev mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Setting dereferenceable flag on the implicit this parameter for non-static member functions

Robinson, Paul via cfe-dev


On 12/06/2017 05:47 PM, Hal Finkel via cfe-dev wrote:


On 12/06/2017 05:09 PM, Hubert Tong wrote:
I am not sure that there is a requirement that *this remains a valid object.

I don't think that there is, and, IIRC, we already have an open bug about this (the same applies to reference arguments, for which we also apply the dereferenceable attribute). The problem, as Nick Lewycky pointed out a some time ago,

For the record, here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150202/257887.html

 -Hal

is that just because a pointer is dereferenceable upon entry to the function, doesn't mean that it always will remain so.

I agree that we should fix this at some point, but it's an orthogonal issue.

 -Hal

Experimenting with the -O3 behaviour of the following program (with a powerpc64le-unknown-linux-gnu target), it seems the dereferenceable attribute is scope based?

int a[1], b[1], *cp;
void dobadstuff() { delete cp; }

__attribute__((__noinline__))
void foo(int * __restrict__ a, int * __restrict__ b, int &c, int n) {
  dobadstuff();
  for (int i = 0; i < n; ++i)
    if (a[i] > 0)
      a[i] = c * b[i];
}

int main(void) {
  cp = new int;
  foo(a, b, *cp, 1);
}

Valgrind reports an invalid read.


On Wed, Dec 6, 2017 at 8:06 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 12/05/2017 01:47 PM, Lei Huang via cfe-dev wrote:
 
Hello,
 
In the discussion on bugzilla 30729, it is mentioned that the 'this' pointer needs to be valid upon entry to a non-static method.  Does the standard guarantee this is non-null on entry?

Yes. You have to call a non-static member function on a valid object.

 If so, is there a reason we can't use that fact to mark 'this' as 'dereferenceable(sizeof(*this))'?

Yes, this seems like a good idea.

 -Hal

 
There are LICM optimizations we can do based on the knowledge that 'this' is non-null on entry to a non-static member function. 
 
eg.  For the following IR,  the two highlighted loads are not being hoisted out of the for loop because we are not able to guarantee that the pointer is non-null.  If the 'this' pointer is guaranteed to be non-null on entry and we mark it thus,  then the 2 loads within the for-loop body can then be hoisted out into the loop preheader.
 
This is of course just one example of an optimization we could perform based on this knowledge, but there are probably a number of others (i.e. anything that relies on a pointer being 'dereferenceable(N)').
 
$ cat a.ll
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%struct.S = type { <4 x i32>, <4 x i32> }
; Function Attrs: norecurse nounwind readonly
define <4 x i32> @_ZNK1S20constShouldBeHoistedEmDv4_i(%struct.S* nocapture readonly %this, i64 %n, <4 x
i32> %x) align 2 {
entry:
 %tobool9 = icmp eq i64 %n, 0
 br i1 %tobool9, label %for.end, label %for.body.lr.ph

for.body.lr.ph:                                   ; preds = %entry
 %k1 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 0
 %k2 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 1
 br label %for.body

for.body:                                         ; preds = %for.body.lr.ph, %if.end
 %n.addr.011 = phi i64 [ %n, %for.body.lr.ph ], [ %div, %if.end ]
 %x.addr.010 = phi <4 x i32> [ %x, %for.body.lr.ph ], [ %x.addr.1, %if.end ]
 %rem = and i64 %n.addr.011, 15
 %cmp = icmp eq i64 %rem, 0
 br i1 %cmp, label %if.end, label %if.then

if.then:                                          ; preds = %for.body
 
%0 = load <4 x i32>, <4 x i32>* %k1, align 16
 %add = add <4 x i32> %0, %x.addr.010
 
%1 = load <4 x i32>, <4 x i32>* %k2, align 16
 %xor = xor <4 x i32> %add, %1
 br label %if.end

if.end:                                           ; preds = %for.body, %if.then
 %x.addr.1 = phi <4 x i32> [ %xor, %if.then ], [ %x.addr.010, %for.body ]
 %div = lshr i64 %n.addr.011, 4
 %tobool = icmp eq i64 %div, 0
 br i1 %tobool, label %for.end, label %for.body

for.end:                                          ; preds = %if.end, %entry
 %x.addr.0.lcssa = phi <4 x i32> [ %x, %entry ], [ %x.addr.1, %if.end ]
 ret <4 x i32> %x.addr.0.lcssa
}

 
 

Regards,
Lei Huang
 
 
LLVM Development on POWER
Internal mail: C2/YGK/8200/MKM
Phone: <a moz-do-not-send="true" href="tel:%28905%29%20413-4419" value="+19054134419" target="_blank">(905) 413-4419
TieLine: 969-4419
E-mail: [hidden email]



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________ cfe-dev mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Setting dereferenceable flag on the implicit this parameter for non-static member functions

Robinson, Paul via cfe-dev


On 12/06/2017 05:53 PM, Hal Finkel via cfe-dev wrote:


On 12/06/2017 05:47 PM, Hal Finkel via cfe-dev wrote:


On 12/06/2017 05:09 PM, Hubert Tong wrote:
I am not sure that there is a requirement that *this remains a valid object.

I don't think that there is, and, IIRC, we already have an open bug about this (the same applies to reference arguments, for which we also apply the dereferenceable attribute). The problem, as Nick Lewycky pointed out a some time ago,

For the record, here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150202/257887.html

Oh, and my hope is that the only thing we need to do to fix this without too many performance regressions is to add some parameter attribute, not_freed or similar, and have FunctionAttrs do bottom-up inference on it.

 -Hal


 -Hal

is that just because a pointer is dereferenceable upon entry to the function, doesn't mean that it always will remain so.

I agree that we should fix this at some point, but it's an orthogonal issue.

 -Hal

Experimenting with the -O3 behaviour of the following program (with a powerpc64le-unknown-linux-gnu target), it seems the dereferenceable attribute is scope based?

int a[1], b[1], *cp;
void dobadstuff() { delete cp; }

__attribute__((__noinline__))
void foo(int * __restrict__ a, int * __restrict__ b, int &c, int n) {
  dobadstuff();
  for (int i = 0; i < n; ++i)
    if (a[i] > 0)
      a[i] = c * b[i];
}

int main(void) {
  cp = new int;
  foo(a, b, *cp, 1);
}

Valgrind reports an invalid read.


On Wed, Dec 6, 2017 at 8:06 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 12/05/2017 01:47 PM, Lei Huang via cfe-dev wrote:
 
Hello,
 
In the discussion on bugzilla 30729, it is mentioned that the 'this' pointer needs to be valid upon entry to a non-static method.  Does the standard guarantee this is non-null on entry?

Yes. You have to call a non-static member function on a valid object.

 If so, is there a reason we can't use that fact to mark 'this' as 'dereferenceable(sizeof(*this))'?

Yes, this seems like a good idea.

 -Hal

 
There are LICM optimizations we can do based on the knowledge that 'this' is non-null on entry to a non-static member function. 
 
eg.  For the following IR,  the two highlighted loads are not being hoisted out of the for loop because we are not able to guarantee that the pointer is non-null.  If the 'this' pointer is guaranteed to be non-null on entry and we mark it thus,  then the 2 loads within the for-loop body can then be hoisted out into the loop preheader.
 
This is of course just one example of an optimization we could perform based on this knowledge, but there are probably a number of others (i.e. anything that relies on a pointer being 'dereferenceable(N)').
 
$ cat a.ll
target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

%struct.S = type { <4 x i32>, <4 x i32> }
; Function Attrs: norecurse nounwind readonly
define <4 x i32> @_ZNK1S20constShouldBeHoistedEmDv4_i(%struct.S* nocapture readonly %this, i64 %n, <4 x
i32> %x) align 2 {
entry:
 %tobool9 = icmp eq i64 %n, 0
 br i1 %tobool9, label %for.end, label %for.body.lr.ph

for.body.lr.ph:                                   ; preds = %entry
 %k1 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 0
 %k2 = getelementptr inbounds %struct.S, %struct.S* %this, i64 0, i32 1
 br label %for.body

for.body:                                         ; preds = %for.body.lr.ph, %if.end
 %n.addr.011 = phi i64 [ %n, %for.body.lr.ph ], [ %div, %if.end ]
 %x.addr.010 = phi <4 x i32> [ %x, %for.body.lr.ph ], [ %x.addr.1, %if.end ]
 %rem = and i64 %n.addr.011, 15
 %cmp = icmp eq i64 %rem, 0
 br i1 %cmp, label %if.end, label %if.then

if.then:                                          ; preds = %for.body
 
%0 = load <4 x i32>, <4 x i32>* %k1, align 16
 %add = add <4 x i32> %0, %x.addr.010
 
%1 = load <4 x i32>, <4 x i32>* %k2, align 16
 %xor = xor <4 x i32> %add, %1
 br label %if.end

if.end:                                           ; preds = %for.body, %if.then
 %x.addr.1 = phi <4 x i32> [ %xor, %if.then ], [ %x.addr.010, %for.body ]
 %div = lshr i64 %n.addr.011, 4
 %tobool = icmp eq i64 %div, 0
 br i1 %tobool, label %for.end, label %for.body

for.end:                                          ; preds = %if.end, %entry
 %x.addr.0.lcssa = phi <4 x i32> [ %x, %entry ], [ %x.addr.1, %if.end ]
 ret <4 x i32> %x.addr.0.lcssa
}

 
 

Regards,
Lei Huang
 
 
LLVM Development on POWER
Internal mail: C2/YGK/8200/MKM
Phone: <a moz-do-not-send="true" href="tel:%28905%29%20413-4419" value="+19054134419" target="_blank">(905) 413-4419
TieLine: 969-4419
E-mail: [hidden email]



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________ cfe-dev mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev