Ownership attribute for malloc etc. checking

classic Classic list List threaded Threaded
23 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Ownership attribute for malloc etc. checking

Andrew McGregor-3
This is a second try at some attributes that presently only enable the malloc checker to be smarter.  Syntactically, however, the attributes are intended to be more generally useful.

The attributes are currently used like this:

void __attribute((ownership_takes(malloc, 1))) bar(char * it) {
  free(it);
}

char * __attribute((ownership_returns(malloc, 1))) bar2(size_t i) {
  return (char *) malloc(i);
}

There is a third called ownership_holds.  The distinction is that ownership_takes does not allow the resource to be used after passing it in, while ownership_holds does.

The first argument is intended to be the name of a type of resource, in this case memory allocated through malloc.  This is the only value currently checked, any other value is silently ignored.

The second argument is an index into the function's argument list, for ownership_returns it is the size of the malloc region, for the others it is the pointer to check.

In this patch, only one argument per function can be annotated, that still has to be fixed.

Andrew



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

clang-ownership.patch (31K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Jordy Rose

I do think this is pretty cool stuff -- like Ted said, being able to
generalize the leak-checking code would be nice (perhaps even unifying
malloc/free and Cocoa retain counting someday).

I have a number of minor suggestions, such as asking the attribute which
arguments matter instead of iterating over them all and seeing if they're
important, and following the LLVM comment style (complete, capitalized
sentences ending in [.?!] whenever possible). Also using "!" instead of
"not". *grin*

I think you have a logic error dealing with held memory. You're allowing a
double-free if the current free is a hold, but I think what you want is to
allow /uses/ if the /previous/ free was a hold. Perhaps the RefState kind
needs a new enum value, Relinquished or something.

It'd be nice to write a couple of tests for this new attribute. It's
probably sufficiently new to warrant its own test file in the test/Analysis
directory.

Secretly I would love for the module name /not/ to be checked at all, only
associated with a pointer when ownership_returns is encountered. It would
then be expected that any non-escaping marked pointer is cleaned up by the
end of the function body. This would let people define their own modules
without any extra configuration at all.

const char * __attribute__((ownership_returns(my_app_intern)))
intern(const char *);
void __attribute__((ownership_takes(my_app_intern))) release(const char
*);

void useInput (const char *input) {
  const char *str = intern(input);
  // do complicated stuff with str
  release(str);
}

The downside, of course, is that /any/ use of the pointer as a function
argument would count as escaping, making it not very useful. Perhaps there
could be heuristics for this sort of thing -- a non-builtin module name
/must/ be balanced or explicitly escape, or else be passed to a free-like
function. For things like "malloc" and "fopen", we probably have to be more
lenient.

If this is extended to work with multiple owners, you'll also run into the
problem of two kinds of "hold" -- in both cases, there's a new owner, but
the current owner doesn't necessarily give up ownership. But that's a
problem for another day.

Last, as a small note, I don't think ownership_returns should require a
size argument at all -- it falls down for something as simple as this:

struct Node *allocNode() {
  return (struct Node *)malloc(sizeof(struct Node));
}

I see you're using MallocMemAux(); if you use the second form, which takes
a size SVal instead of an Expr*, then you can handle cases where the size
is unknown.

Overall, I like it! Perhaps clean it up a bit, but having a general
"ownership checker" seems like a great idea.

Jordy
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Jeffrey Yasskin
In reply to this post by Andrew McGregor-3
I'm not any sort of authority on the checker, so take this with a
grain of salt, but I had a comment on the syntax here.

Identifying a function argument by index is really error prone and
hard to use. In particular, you wind up having to tell people to use
different bases for member functions vs non-member functions, and fix
up the index when anything changes. It would be much nicer to attach
the attribute to the actual argument, like

void bar(char * it __attribute((ownership_takes(malloc)))) {
  free(it);
}

I don't know whether this placement is possible with clang's current
parser—gcc doesn't allow it—but I believe Sean's working on fixing
that over the summer.

On Wed, Jun 23, 2010 at 7:12 PM, Andrew McGregor <[hidden email]> wrote:

> This is a second try at some attributes that presently only enable the
> malloc checker to be smarter.  Syntactically, however, the attributes are
> intended to be more generally useful.
> The attributes are currently used like this:
> void __attribute((ownership_takes(malloc, 1))) bar(char * it) {
>   free(it);
> }
> char * __attribute((ownership_returns(malloc, 1))) bar2(size_t i) {
>   return (char *) malloc(i);
> }
> There is a third called ownership_holds.  The distinction is that
> ownership_takes does not allow the resource to be used after passing it in,
> while ownership_holds does.
> The first argument is intended to be the name of a type of resource, in this
> case memory allocated through malloc.  This is the only value currently
> checked, any other value is silently ignored.
> The second argument is an index into the function's argument list, for
> ownership_returns it is the size of the malloc region, for the others it is
> the pointer to check.
> In this patch, only one argument per function can be annotated, that still
> has to be fixed.
> Andrew
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Andrew McGregor-3
In reply to this post by Jordy Rose
Ok, here's an updated patch that should cover all your suggestions.

Tests are included, they pass.

It also adds a check for functions returning memory that has been freed/taken (but not held, because while this is somewhat dangerous, it is in fact what malloc itself might do).

Andrew

On Thu, Jun 24, 2010 at 4:11 PM, Jordy Rose <[hidden email]> wrote:

I do think this is pretty cool stuff -- like Ted said, being able to
generalize the leak-checking code would be nice (perhaps even unifying
malloc/free and Cocoa retain counting someday).

I have a number of minor suggestions, such as asking the attribute which
arguments matter instead of iterating over them all and seeing if they're
important, and following the LLVM comment style (complete, capitalized
sentences ending in [.?!] whenever possible). Also using "!" instead of
"not". *grin*

I think you have a logic error dealing with held memory. You're allowing a
double-free if the current free is a hold, but I think what you want is to
allow /uses/ if the /previous/ free was a hold. Perhaps the RefState kind
needs a new enum value, Relinquished or something.

It'd be nice to write a couple of tests for this new attribute. It's
probably sufficiently new to warrant its own test file in the test/Analysis
directory.

Secretly I would love for the module name /not/ to be checked at all, only
associated with a pointer when ownership_returns is encountered. It would
then be expected that any non-escaping marked pointer is cleaned up by the
end of the function body. This would let people define their own modules
without any extra configuration at all.

const char * __attribute__((ownership_returns(my_app_intern)))
intern(const char *);
void __attribute__((ownership_takes(my_app_intern))) release(const char
*);

void useInput (const char *input) {
 const char *str = intern(input);
 // do complicated stuff with str
 release(str);
}

The downside, of course, is that /any/ use of the pointer as a function
argument would count as escaping, making it not very useful. Perhaps there
could be heuristics for this sort of thing -- a non-builtin module name
/must/ be balanced or explicitly escape, or else be passed to a free-like
function. For things like "malloc" and "fopen", we probably have to be more
lenient.

If this is extended to work with multiple owners, you'll also run into the
problem of two kinds of "hold" -- in both cases, there's a new owner, but
the current owner doesn't necessarily give up ownership. But that's a
problem for another day.

Last, as a small note, I don't think ownership_returns should require a
size argument at all -- it falls down for something as simple as this:

struct Node *allocNode() {
 return (struct Node *)malloc(sizeof(struct Node));
}

I see you're using MallocMemAux(); if you use the second form, which takes
a size SVal instead of an Expr*, then you can handle cases where the size
is unknown.

Overall, I like it! Perhaps clean it up a bit, but having a general
"ownership checker" seems like a great idea.

Jordy


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

clang-ownership-withtests.patch (41K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Jordy Rose

Looking good! Some remaining comments below.

I'm also a reasonably new committer, so someone else should probably look
over this as well (particularly the Attr hunks, which I'm not very familiar
with).

Jordy


General:
- It would be nice to make sure one argument isn't both ownership_takes
and ownership_holds.
- LLVM's convention is to use spaces rather than tabs.
- There's a hunk where the only thing that changed is indentation; it'd be
nice to take that out.
- Two more useful test would be free(p); my_free(p) and free(p);
my_hold(p). Both of which should warn, of course.


(MallocChecker::EvalCallExpr)
+    for (const Attr *attr = FD->getAttrs(); attr; attr = attr->getNext())
{
+      if (const OwnershipReturnsAttr* Att =
dyn_cast<OwnershipReturnsAttr>(attr)) {
+        MallocMemReturnsAttr(C, CE, Att);
+        rv = true;
+      }
+      if (const OwnershipTakesAttr* Att = dyn_cast<OwnershipTakesAttr>
(attr)) {
+        FreeMemTakesAttr(C, CE, Att);
+        rv = true;
+      }
+      if (const OwnershipHoldsAttr* Att = dyn_cast<OwnershipHoldsAttr>
(attr)) {
+        FreeMemHoldsAttr(C, CE, Att);
+        rv = true;
+      }
+    }

Here's the dispatch from EvalCallExpr(). I think even though it's
technically a little less efficient, you should just simplify this and use
FD->getAttr<OwnershipTakesAttr>(), etc. It's a little simpler, and it lets
you order the attributes.

Why does that matter? Because an ownership_takes() or ownership_holds()
function might have a return value! If it's also ownership_returns(), then
you're fine, but otherwise you need to set a symbolic return value. (You
can see how this is done in MallocMemAux(), or in StreamChecker.cpp with
fopen.) There should probably be a test for this as well.

Actually, because these attributes don't affect the return value, they
should probably go in a PreVisitCallExpr() callback, rather than
EvalCallExpr(). But for now it's probably okay, as long as you handle that
return value.


(MallocChecker::MallocMemReturnsAttr)
+  for (const unsigned *I = Att->begin(), *E = Att->end(); I != E; ++I,
++count) {
+    const GRState *state =
+        MallocMemAux(C, CE, CE->getArg(*I), UndefinedVal(),
C.getState());
+    C.addTransition(state);
+  }
+  if (count == 0) {
+    const GRState *state = MallocMemAux(C, CE, UndefinedVal(),
UndefinedVal(),
+                                        C.getState());
+    C.addTransition(state);
+  }

What does it mean to have multiple size fields? There can still only be
one return value. Maybe you should only allow zero or one positions for
ownership_returns (which would make this section simpler).


(MallocChecker::PreVisitReturnStmt)
+  if (RS->isReleased()) {
+    ExplodedNode *N = C.GenerateSink();
+    if (!BT_UseFree)
+      BT_UseFree = new BuiltinBug("Use dynamically allocated memory
after"
+                                  " it is freed.");
 
+    BugReport *R = new BugReport(*BT_UseFree,
BT_UseFree->getDescription(),
+                                 N);
+    C.EmitReport(R);
+  }

This section checks whether the return value is a released address.
Unfortunately this isn't a 100% unambiguous case; consider the following
(contrived) example:

struct Node *freeSmallest () {
  struct Node *min;
  // find the minimum node
  free(min);
  return min; // so the caller knows the ID of the deleted node
}

As long as the caller doesn't dereference the returned pointer, it could
still be a useful value. 95% of the time, this check would be useful, but
since a false positive is worse than a false negative for the analyzer, we
should probably leave it out.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Fwd: Ownership attribute for malloc etc. checking

Andrew McGregor-3
Oops, forgot to reply-all...

---------- Forwarded message ----------
From: Andrew McGregor <[hidden email]>
Date: Mon, Jun 28, 2010 at 12:04 PM
Subject: Re: Ownership attribute for malloc etc. checking
To: Jordy Rose <[hidden email]>


Hi,

Thanks for the review.  Comments inline, new patch attached.

Andrew

On Sun, Jun 27, 2010 at 9:33 AM, Jordy Rose <[hidden email]> wrote:

Looking good! Some remaining comments below.

I'm also a reasonably new committer, so someone else should probably look
over this as well (particularly the Attr hunks, which I'm not very familiar
with).

Jordy


General:
- It would be nice to make sure one argument isn't both ownership_takes
and ownership_holds.

True, will do.
 
- LLVM's convention is to use spaces rather than tabs.
- There's a hunk where the only thing that changed is indentation; it'd be
nice to take that out.

Oops, neither of these were intentional.
 
- Two more useful test would be free(p); my_free(p) and free(p);
my_hold(p). Both of which should warn, of course.

Instead of crashing, yes.  Well spotted.
 
 (MallocChecker::EvalCallExpr)
+    for (const Attr *attr = FD->getAttrs(); attr; attr = attr->getNext())
{
+      if (const OwnershipReturnsAttr* Att =
dyn_cast<OwnershipReturnsAttr>(attr)) {
+        MallocMemReturnsAttr(C, CE, Att);
+        rv = true;
+      }
+      if (const OwnershipTakesAttr* Att = dyn_cast<OwnershipTakesAttr>
(attr)) {
+        FreeMemTakesAttr(C, CE, Att);
+        rv = true;
+      }
+      if (const OwnershipHoldsAttr* Att = dyn_cast<OwnershipHoldsAttr>
(attr)) {
+        FreeMemHoldsAttr(C, CE, Att);
+        rv = true;
+      }
+    }

Here's the dispatch from EvalCallExpr(). I think even though it's
technically a little less efficient, you should just simplify this and use
FD->getAttr<OwnershipTakesAttr>(), etc. It's a little simpler, and it lets
you order the attributes.

However, I did it this way because I wanted to allow two or more attributes of the same kind (for the benefit of macros).

In other words, __attribute((ownership_holds, 1)) __attribute((ownership_holds, 2)) should be the same as __attribute((ownership_holds, 1, 2)).
 
Why does that matter? Because an ownership_takes() or ownership_holds()
function might have a return value! If it's also ownership_returns(), then
you're fine, but otherwise you need to set a symbolic return value. (You
can see how this is done in MallocMemAux(), or in StreamChecker.cpp with
fopen.) There should probably be a test for this as well.

It isn't necessarily the case that an ownership_takes() function that returns a pointer generated that pointer from an ownership_takes argument, or allocated anything, so what would you set the region to, in the absence of other information?  They default to Unknown, yes?
 
Actually, because these attributes don't affect the return value, they
should probably go in a PreVisitCallExpr() callback, rather than
EvalCallExpr(). But for now it's probably okay, as long as you handle that
return value.


(MallocChecker::MallocMemReturnsAttr)
+  for (const unsigned *I = Att->begin(), *E = Att->end(); I != E; ++I,
++count) {
+    const GRState *state =
+        MallocMemAux(C, CE, CE->getArg(*I), UndefinedVal(),
C.getState());
+    C.addTransition(state);
+  }
+  if (count == 0) {
+    const GRState *state = MallocMemAux(C, CE, UndefinedVal(),
UndefinedVal(),
+                                        C.getState());
+    C.addTransition(state);
+  }

What does it mean to have multiple size fields? There can still only be
one return value. Maybe you should only allow zero or one positions for
ownership_returns (which would make this section simpler).

The intention here was to implement something for calloc-like functions and matrix allocators; one argument is a special case, more than one the size of the region is the product of the arguments times sizeof(pointee of the return value), taking sizeof(void) as 1.  If it isn't worth doing that, fair enough, but that's what I was thinking this could lead to.

However, the parser only allows the one argument at the moment, so this is redundant.
 
(MallocChecker::PreVisitReturnStmt)
+  if (RS->isReleased()) {
+    ExplodedNode *N = C.GenerateSink();
+    if (!BT_UseFree)
+      BT_UseFree = new BuiltinBug("Use dynamically allocated memory
after"
+                                  " it is freed.");

+    BugReport *R = new BugReport(*BT_UseFree,
BT_UseFree->getDescription(),
+                                 N);
+    C.EmitReport(R);
+  }

This section checks whether the return value is a released address.
Unfortunately this isn't a 100% unambiguous case; consider the following
(contrived) example:

struct Node *freeSmallest () {
 struct Node *min;
 // find the minimum node
 free(min);
 return min; // so the caller knows the ID of the deleted node
}

As long as the caller doesn't dereference the returned pointer, it could
still be a useful value. 95% of the time, this check would be useful, but
since a false positive is worse than a false negative for the analyzer, we
should probably leave it out.

Ok, I see.  Out it is, then.

Andrew


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

clang-ownership-withtests-r1.patch (42K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Ted Kremenek
Hi Andrew,

In addition to Jordy's excellent comments, here are my comments on your patch.  Comments inline:

Index: test/Analysis/malloc.c
===================================================================
--- test/Analysis/malloc.c (revision 106614)
+++ test/Analysis/malloc.c (working copy)
@@ -4,7 +4,13 @@
 void free(void *);
 void *realloc(void *ptr, size_t size);
 void *calloc(size_t nmemb, size_t size);
+void __attribute((ownership_returns(malloc))) *my_malloc(size_t);
+void __attribute((ownership_takes(malloc, 1))) my_free(void *);
+void __attribute((ownership_returns(malloc, 1))) *my_malloc2(size_t);
+void __attribute((ownership_holds(malloc, 1))) my_hold(void *);
+void *my_malloc3(size_t);
 

Good test cases, but also include test cases where the index isn't '1'.

Please also include test cases when multiple indices are annotated with a single attribute.

Also, what happens if the same index is specified more than once?


Index: include/clang/Basic/Attr.td
===================================================================
--- include/clang/Basic/Attr.td (revision 106614)
+++ include/clang/Basic/Attr.td (working copy)
@@ -297,6 +297,21 @@
   let Spellings = ["overloadable"];
 }
 
+def OwnershipReturns : Attr {
+  let Spellings = ["ownership_returns"];
+  let Args = [StringArgument<"Module">, IntArgument<"SizeIdx">];
+}
+
+def OwnershipTakes : Attr {
+  let Spellings = ["ownership_takes"];
+  let Args = [StringArgument<"Module">, IntArgument<"PtrIdx">];
+}
+
+def OwnershipHolds : Attr {
+  let Spellings = ["ownership_holds"];
+  let Args = [StringArgument<"Module">, IntArgument<"PtrIdx">];
+}
+
 def Packed : Attr {
   let Spellings = ["packed"];
 }

In addition to this syntax, Jeffrey Yaskin also made the excellent point of whether or not we want to allow the ownership_takes and ownership_holds attributes directly to be affixed directly to parameters instead of using indices.  Should we allow both models, and thus make the PtrIdx optional in those cases?

I'm also not certain what the SizeIdx entry is for.  Is this to explicitly model the size of the returned memory?  That's fine, but please also keep in mind how we would want to model out-parameters.  Would we want a separate attribute?


Index: include/clang/Basic/DiagnosticSemaKinds.td
===================================================================
--- include/clang/Basic/DiagnosticSemaKinds.td (revision 106614)
+++ include/clang/Basic/DiagnosticSemaKinds.td (working copy)
@@ -831,6 +831,12 @@
   "attribute may only be applied to an Objective-C interface">;
 def err_nonnull_pointers_only : Error<
   "nonnull attribute only applies to pointer arguments">;
+def err_ownership_returns_integers_only : Error<
+  "ownership_returns attribute only applies to integer arguments">;
+def err_ownership_takes_pointers_only : Error<
+  "ownership_takes attribute only applies to pointer arguments">;
+def err_ownership_holds_pointers_only : Error<
+  "ownership_holds attribute only applies to pointer arguments">;
 def err_format_strftime_third_parameter : Error<
   "strftime format attribute requires 3rd parameter to be 0">;
 def err_format_attribute_requires_variadic : Error<




Index: include/clang/AST/Attr.h
===================================================================
--- include/clang/AST/Attr.h (revision 106614)
+++ include/clang/AST/Attr.h (working copy)
@@ -351,6 +351,128 @@
   static bool classof(const NonNullAttr *A) { return true; }
 };
 
+class OwnershipReturnsAttr: public AttrWithString {
+  unsigned* ArgNums;
+  unsigned Size;
+public:
+  OwnershipReturnsAttr(ASTContext &C, unsigned* arg_nums, unsigned size,
+                       llvm::StringRef module);
+
+  virtual void Destroy(ASTContext &C);
+
+  llvm::StringRef getModule() const {
+    return getString();
+  }
+  void setModule(ASTContext &C, llvm::StringRef module);
+  typedef const unsigned *iterator;
+  iterator begin() const {
+    return ArgNums;
+  }
+  iterator end() const {
+    return ArgNums + Size;
+  }
+  unsigned size() const {
+    return Size;
+  }
+
+  bool isOwnershipReturns(unsigned arg) const {
+    return ArgNums ? std::binary_search(ArgNums, ArgNums + Size, arg) : true;
+  }
+
+  bool isModule(const char *m) const {
+    return getModule().equals(m);
+  }
+
+  virtual Attr *clone(ASTContext &C) const;
+
+  static bool classof(const Attr *A) {
+    return A->getKind() == attr::OwnershipReturns;
+  }
+  static bool classof(const OwnershipReturnsAttr *A) {
+    return true;
+  }
+};
+
+class OwnershipTakesAttr: public AttrWithString {
+  unsigned* ArgNums;
+  unsigned Size;
+public:
+  OwnershipTakesAttr(ASTContext &C, unsigned* arg_nums, unsigned size,
+                     llvm::StringRef module);
+
+  virtual void Destroy(ASTContext &C);
+
+  llvm::StringRef getModule() const {
+    return getString();
+  }
+  void setModule(ASTContext &C, llvm::StringRef module);
+  typedef const unsigned *iterator;
+  iterator begin() const {
+    return ArgNums;
+  }
+  iterator end() const {
+    return ArgNums + Size;
+  }
+  unsigned size() const {
+    return Size;
+  }
+
+  bool isOwnershipTakes(unsigned arg) const {
+    return ArgNums ? std::binary_search(ArgNums, ArgNums + Size, arg) : true;
+  }
+
+  bool isModule(const char *m) const {
+    return getModule().equals(m);
+  }
+
+  virtual Attr *clone(ASTContext &C) const;
+
+  static bool classof(const Attr *A) {
+    return A->getKind() == attr::OwnershipTakes;
+  }
+  static bool classof(const OwnershipTakesAttr *A) {
+    return true;
+  }
+};
+
+class OwnershipHoldsAttr: public AttrWithString {
+  unsigned* ArgNums;
+  unsigned Size;
+public:
+  OwnershipHoldsAttr(ASTContext &C, unsigned* arg_nums, unsigned size,
+                     llvm::StringRef module);
+
+  virtual void Destroy(ASTContext &C);
+
+  llvm::StringRef getModule() const {
+    return getString();
+  }
+  void setModule(ASTContext &C, llvm::StringRef module);
+  typedef const unsigned *iterator;
+  iterator begin() const {
+    return ArgNums;
+  }
+  iterator end() const {
+    return ArgNums + Size;
+  }
+  unsigned size() const {
+    return Size;
+  }
+
+  bool isOwnershipHolds(unsigned arg) const {
+    return ArgNums ? std::binary_search(ArgNums, ArgNums + Size, arg) : true;
+  }
+
+  bool isModule(const char *m) const {
+    return getModule().equals(m);
+  }
+
+  virtual Attr *clone(ASTContext &C) const;
+
+  static bool classof(const Attr *A) { return A->getKind() == attr::OwnershipHolds; }
+  static bool classof(const OwnershipHoldsAttr *A) { return true; }
+};

Consider having these three attribute classes subclass a common base class, e.g. OwnershipAttr.  There's a bunch of duplication here which is unneeded.  For example, setModule, getModule, etc., have identical implementations.

There is also value in having them derive from a common superclass.  From a design perspective, it clearly marks them as being related.


Index: include/clang/Parse/AttributeList.h
===================================================================
--- include/clang/Parse/AttributeList.h (revision 106614)
+++ include/clang/Parse/AttributeList.h (working copy)
@@ -97,6 +97,9 @@
     AT_ns_returns_retained,     // Clang-specific.
     AT_objc_gc,
     AT_overloadable,       // Clang-specific.
+    AT_ownership_returns,  // Clang-specific.
+    AT_ownership_takes,    // Clang-specific.
+    AT_ownership_holds,    // Clang-specific.

Please keep these sorted w.r.t. each other.

     AT_packed,
     AT_pure,
     AT_regparm,
Index: lib/Sema/SemaDeclAttr.cpp


===================================================================
--- lib/Sema/SemaDeclAttr.cpp (revision 106614)
+++ lib/Sema/SemaDeclAttr.cpp (working copy)
@@ -350,6 +350,261 @@
   d->addAttr(::new (S.Context) NonNullAttr(S.Context, start, size));
 }
 
+static void HandleOwnershipReturnsAttr(Decl *d, const AttributeList &Attr,
+                                       Sema &S) {
+  // This attribute must be applied to a function declaration.
+  // The first argument to the attribute must be a string,
+  // the name of the resource, for example "malloc".
+  // The next argument is optional, and represents the size of allocation.
+  // If present, it must be an integer less than the number of function arguments,
+  // this is an argument index, the argument must be of integer type.
+  // If not present, the allocation is of unknown size.
+
+  if (!Attr.getParameterName()) {
+    S.Diag(Attr.getLoc(), diag::err_attribute_argument_n_not_string)
+        << "ownership_returns" << 1;
+    return;
+  }
+
+  if (!isFunctionOrMethodOrBlock(d) || !hasFunctionProto(d)) {
+    S.Diag(Attr.getLoc(), diag::warn_attribute_wrong_decl_type)
+        << Attr.getName() << 0 /*function*/;
+    return;
+  }

These attributes only apply to functions, not blocks or Objective-C methods.  We can possibly extend them to support methods (and possibly even blocks), but that's not in the current design.  Your diagnostic even indicates that it only works on functions.  Use 'isFunction' instead.


+
+  unsigned NumArgs = getFunctionOrMethodNumArgs(d);
+
+  llvm::StringRef Module = Attr.getParameterName()->getName();
+
+  // Normalize the argument, __foo__ becomes foo.
+  if (Module.startswith("__") && Module.endswith("__"))
+    Module = Module.substr(2, Module.size() - 4);
+
+  llvm::SmallVector<unsigned, 10> OwnershipReturnsArgs;

Note that the small vector here is 'unsigned', but that the arguments themselves can be signed.  You should issue the appropriate diagnostic, and make sure the indices are non-negative.

+
+  if (Attr.getNumArgs() == 1) {
+    // Handle the index argument, if present.
+    Expr *IdxExpr = static_cast<Expr *> (Attr.getArg(0));
+    llvm::APSInt ArgNum(32);
+    if (IdxExpr->isTypeDependent() || IdxExpr->isValueDependent()
+        || !IdxExpr->isIntegerConstantExpr(ArgNum, S.Context)) {
+      S.Diag(Attr.getLoc(), diag::err_attribute_argument_not_int)
+          << "ownership_returns" << IdxExpr->getSourceRange();
+      return;
+    }
+
+    unsigned x = (unsigned) ArgNum.getZExtValue();
+
+    if (x < 1 || x > NumArgs) {
+      S.Diag(Attr.getLoc(), diag::err_attribute_argument_out_of_bounds)
+          << "ownership_returns" << 1 << IdxExpr->getSourceRange();
+      return;
+    }
+    --x;
+    // Is the function argument an integer type?
+    QualType T = getFunctionOrMethodArgType(d, x);
+    if (!T->isUnsignedIntegerType()) {
+      // FIXME: Should also highlight argument in decl.
+      S.Diag(Attr.getLoc(), diag::err_ownership_returns_integers_only)
+          << "ownership_returns" << IdxExpr->getSourceRange();
+      return;
+    }
+    OwnershipReturnsArgs.push_back(x);
+  }

Looks good.

+
+  unsigned* start = OwnershipReturnsArgs.data();
+  unsigned size = OwnershipReturnsArgs.size();
+  std::sort(start, start + size);

Please use array_pod_sort.  std::sort causes a bunch of stuff to get pulled in (or instantiated) that bloats the code.


+  d->addAttr(::new (S.Context) OwnershipReturnsAttr(S.Context, start, size,
+                                                    Module));
+}

Looks good.

+
+static void HandleOwnershipHoldsAttr(Decl *d, const AttributeList &AL,
+                                     Sema &S) {
+  // This attribute must be applied to a function declaration.
+  // The first argument to the attribute must be a string,
+  // the name of the resource, for example "malloc".
+  // The following arguments must be argument indexes, the arguments must be of pointer type.
+  // The difference between Holds and Takes is that a pointer may still be used
+  // after being held.  free() should be __attribute((ownership_takes)), whereas a list
+  // append function may well be __attribute((ownership_holds)).
+
+  if (!AL.getParameterName()) {
+    S.Diag(AL.getLoc(), diag::err_attribute_argument_n_not_string)
+        << "ownership_holds" << 1;
+    return;
+  }
+
+  if (AL.getNumArgs() < 1) {
+    S.Diag(AL.getLoc(), diag::err_attribute_wrong_number_arguments) << 2;
+    return;
+  }
+
+  if (!isFunctionOrMethodOrBlock(d) || !hasFunctionProto(d)) {
+    S.Diag(AL.getLoc(), diag::warn_attribute_wrong_decl_type)
+        << AL.getName() << 0 /*function*/;
+    return;
+  }

Same comment as above.  Use 'isFunction'.

+
+  unsigned NumArgs = getFunctionOrMethodNumArgs(d);
+
+  llvm::StringRef Module = AL.getParameterName()->getName();
+
+  // Normalize the argument, __foo__ becomes foo.
+  if (Module.startswith("__") && Module.endswith("__"))
+    Module = Module.substr(2, Module.size() - 4);
+
+  llvm::SmallVector<unsigned, 10> OwnershipHoldsArgs;
+
+  for (AttributeList::arg_iterator I = AL.arg_begin(), E = AL.arg_end(); I
+      != E; ++I) {
+
+    Expr *IdxExpr = static_cast<Expr *> (*I);
+    llvm::APSInt ArgNum(32);
+    if (IdxExpr->isTypeDependent() || IdxExpr->isValueDependent()
+        || !IdxExpr->isIntegerConstantExpr(ArgNum, S.Context)) {
+      S.Diag(AL.getLoc(), diag::err_attribute_argument_not_int)
+          << "ownership_holds" << IdxExpr->getSourceRange();
+      continue;
+    }
+
+    unsigned x = (unsigned) ArgNum.getZExtValue();
+
+    if (x < 1 || x > NumArgs) {
+      S.Diag(AL.getLoc(), diag::err_attribute_argument_out_of_bounds)
+          << "ownership_holds" << x << IdxExpr->getSourceRange();
+      continue;
+    }
+    --x;
+    // Is the function argument a pointer type?
+    QualType T = getFunctionOrMethodArgType(d, x);
+    if (!T->isAnyPointerType() && !T->isBlockPointerType()) {
+      // FIXME: Should also highlight argument in decl.
+      S.Diag(AL.getLoc(), diag::err_ownership_holds_pointers_only)
+          << IdxExpr->getSourceRange();
+      continue;
+    }
+    // Check we don't have a conflict with an ownership_takes attribute.
+    if (d->hasAttrs()) {
+      for (const Attr *attr = d->getAttrs(); attr; attr = attr->getNext()) {
+        if (const OwnershipTakesAttr* Att = dyn_cast<OwnershipTakesAttr> (attr)) {
+          for (const unsigned *I = Att->begin(), *E = Att->end(); I != E; ++I) {
+            if (x == *I) {
+              S.Diag(AL.getLoc(), diag::err_attributes_are_not_compatible)
+                  << "ownership_takes" << "ownership_holds";
+            }
+          }
+        }
+      }
+    }
+    OwnershipHoldsArgs.push_back(x);
+  }
+
+  if (OwnershipHoldsArgs.empty()) {
+    S.Diag(AL.getLoc(), diag::err_attribute_wrong_number_arguments) << 2;
+    return;
+  }
+
+  unsigned* start = &OwnershipHoldsArgs[0];
+  unsigned size = OwnershipHoldsArgs.size();
+  std::sort(start, start + size);
+  d->addAttr(::new (S.Context) OwnershipHoldsAttr(S.Context, start, size,
+                                                  Module));
+}

Most of this is copy-and-paste from HandleOwnershipReturnsAttr.  This looks like it could easily be refactored, with few choice 'if' statements based on the attribute type differentiating behavior.  Not only is this excessively verbose, this can lead to subtle bugs and inconsistencies.  For example, see how HandleNSReturnsRetainedAttr() is used to process four different kinds of attributes.  Please refactor the logic for these attributes into a common function.

+
+static void HandleOwnershipTakesAttr(Decl *d, const AttributeList &AL,
+                                     Sema &S) {
+  // This attribute must be applied to a function declaration.
+  // The first argument to the attribute must be a string,
+  // the name of the resource, for example "malloc".
+  // The following arguments must be argument indexes, the arguments must be of pointer type.
+  // The difference between Holds and Takes is that a pointer may still be used
+  // after being held.  free() should be __attribute((ownership_takes)), whereas a list
+  // append function may well be __attribute((ownership_holds)).
+
+  if (!AL.getParameterName()) {
+    S.Diag(AL.getLoc(), diag::err_attribute_argument_n_not_string)
+        << "ownership_takes" << 1;
+    return;
+  }
+
+  if (AL.getNumArgs() < 1) {
+    S.Diag(AL.getLoc(), diag::err_attribute_wrong_number_arguments) << 2;
+    return;
+  }
+
+  if (!isFunctionOrMethodOrBlock(d) || !hasFunctionProto(d)) {
+    S.Diag(AL.getLoc(), diag::warn_attribute_wrong_decl_type)
+        << AL.getName() << 0 /*function*/;
+    return;
+  }
+
+  unsigned NumArgs = getFunctionOrMethodNumArgs(d);
+
+  llvm::StringRef Module = AL.getParameterName()->getName();
+
+  // Normalize the argument, __foo__ becomes foo.
+  if (Module.startswith("__") && Module.endswith("__"))
+    Module = Module.substr(2, Module.size() - 4);
+
+  llvm::SmallVector<unsigned, 10> OwnershipTakesArgs;
+
+  for (AttributeList::arg_iterator I = AL.arg_begin(), E = AL.arg_end(); I
+      != E; ++I) {
+
+    Expr *IdxExpr = static_cast<Expr *> (*I);
+    llvm::APSInt ArgNum(32);
+    if (IdxExpr->isTypeDependent() || IdxExpr->isValueDependent()
+        || !IdxExpr->isIntegerConstantExpr(ArgNum, S.Context)) {
+      S.Diag(AL.getLoc(), diag::err_attribute_argument_not_int)
+          << "ownership_takes" << IdxExpr->getSourceRange();
+      continue;
+    }
+
+    unsigned x = (unsigned) ArgNum.getZExtValue();
+
+    if (x < 1 || x > NumArgs) {
+      S.Diag(AL.getLoc(), diag::err_attribute_argument_out_of_bounds)
+          << "ownership_takes" << x << IdxExpr->getSourceRange();
+      continue;
+    }
+    --x;
+    // Is the function argument a pointer type?
+    QualType T = getFunctionOrMethodArgType(d, x);
+    if (!T->isAnyPointerType() && !T->isBlockPointerType()) {
+      // FIXME: Should also highlight argument in decl.
+      S.Diag(AL.getLoc(), diag::err_ownership_holds_pointers_only)
+          << IdxExpr->getSourceRange();
+      continue;
+    }
+    // Check we don't have a conflict with an ownership_holds attribute.
+    if (d->hasAttrs()) {
+      for (const Attr *attr = d->getAttrs(); attr; attr = attr->getNext()) {
+        if (const OwnershipHoldsAttr* Att = dyn_cast<OwnershipHoldsAttr> (attr)) {
+          for (const unsigned *I = Att->begin(), *E = Att->end(); I != E; ++I) {
+            if (x == *I) {
+              S.Diag(AL.getLoc(), diag::err_attributes_are_not_compatible)
+                  << "ownership_takes" << "ownership_holds";
+            }
+          }
+        }
+      }
+    }
+    OwnershipTakesArgs.push_back(x);
+  }
+
+  if (OwnershipTakesArgs.empty()) {
+    S.Diag(AL.getLoc(), diag::err_attribute_wrong_number_arguments) << 2;
+    return;
+  }
+
+  unsigned* start = &OwnershipTakesArgs[0];
+  unsigned size = OwnershipTakesArgs.size();
+  std::sort(start, start + size);
+  d->addAttr(::new (S.Context) OwnershipTakesAttr(S.Context, start, size,
+                                                  Module));
+}
+

Same comment as above.  This is all copy-and-paste.  We can do better.


 static bool isStaticVarOrStaticFunciton(Decl *D) {
   if (VarDecl *VD = dyn_cast<VarDecl>(D))
     return VD->getStorageClass() == VarDecl::Static;
@@ -2002,6 +2257,12 @@
   case AttributeList::AT_mode:        HandleModeAttr        (D, Attr, S); break;
   case AttributeList::AT_malloc:      HandleMallocAttr      (D, Attr, S); break;
   case AttributeList::AT_nonnull:     HandleNonNullAttr     (D, Attr, S); break;
+  case AttributeList::AT_ownership_returns:
+      HandleOwnershipReturnsAttr     (D, Attr, S); break;
+  case AttributeList::AT_ownership_takes:
+      HandleOwnershipTakesAttr     (D, Attr, S); break;
+  case AttributeList::AT_ownership_holds:
+      HandleOwnershipHoldsAttr     (D, Attr, S); break;

Instead of calling three functions, probably calling one or two is fine (and looking at the Attr kind).

   case AttributeList::AT_noreturn:    HandleNoReturnAttr    (D, Attr, S); break;
   case AttributeList::AT_nothrow:     HandleNothrowAttr     (D, Attr, S); break;
   case AttributeList::AT_override:    HandleOverrideAttr    (D, Attr, S); break;


Index: lib/AST/AttrImpl.cpp
===================================================================
--- lib/AST/AttrImpl.cpp (revision 106614)
+++ lib/AST/AttrImpl.cpp (working copy)
@@ -66,6 +66,70 @@
   Attr::Destroy(C);
 }
 
+OwnershipReturnsAttr::OwnershipReturnsAttr(ASTContext &C, unsigned* arg_nums,
+                                           unsigned size,
+                                           llvm::StringRef module) :
+  AttrWithString(attr::OwnershipReturns, C, module), ArgNums(0), Size(0) {
+  if (size == 0)
+    return;
+  assert(arg_nums);
+  ArgNums = new (C) unsigned[size];
+  Size = size;
+  memcpy(ArgNums, arg_nums, sizeof(*ArgNums) * size);
+}

Looks good.

+
+void OwnershipReturnsAttr::setModule(ASTContext &C, llvm::StringRef module) {
+  ReplaceString(C, module);
+}
+
+void OwnershipReturnsAttr::Destroy(ASTContext &C) {
+  if (ArgNums)
+    C.Deallocate(ArgNums);
+  Attr::Destroy(C);
+}

Looks good.

+
+OwnershipTakesAttr::OwnershipTakesAttr(ASTContext &C, unsigned* arg_nums,
+                                       unsigned size, llvm::StringRef module) :
+  AttrWithString(attr::OwnershipTakes, C, module), ArgNums(0), Size(0) {
+  if (size == 0)
+    return;
+  assert(arg_nums);
+  ArgNums = new (C) unsigned[size];
+  Size = size;
+  memcpy(ArgNums, arg_nums, sizeof(*ArgNums) * size);
+}

Notice how this is practically identical to OwnershipReturnsAttr?  I think most of this can be merged into a common constructor of a base class.

+
+void OwnershipTakesAttr::setModule(ASTContext &C, llvm::StringRef module) {
+  ReplaceString(C, module);
+}
+
+void OwnershipTakesAttr::Destroy(ASTContext &C) {
+  if (ArgNums)
+    C.Deallocate(ArgNums);
+  Attr::Destroy(C);
+}
+
+OwnershipHoldsAttr::OwnershipHoldsAttr(ASTContext &C, unsigned* arg_nums,
+                                       unsigned size, llvm::StringRef module) :
+  AttrWithString(attr::OwnershipHolds, C, module), ArgNums(0), Size(0) {
+  if (size == 0)
+    return;
+  assert(arg_nums);
+  ArgNums = new (C) unsigned[size];
+  Size = size;
+  memcpy(ArgNums, arg_nums, sizeof(*ArgNums) * size);
+}

Same comment as with OwnershipReturnsAttr.  This is basically copy-and-paste.  The attributes are basically the same except one is "takes" instead of "holds".  We shouldn't be duplicating so much code for that.

+
+void OwnershipHoldsAttr::setModule(ASTContext &C, llvm::StringRef module) {
+  ReplaceString(C, module);
+}
+
+void OwnershipHoldsAttr::Destroy(ASTContext &C) {
+  if (ArgNums)
+    C.Deallocate(ArgNums);
+  Attr::Destroy(C);
+}
+
 #define DEF_SIMPLE_ATTR_CLONE(ATTR)                                     \
   Attr *ATTR##Attr::clone(ASTContext &C) const {                        \
     return ::new (C) ATTR##Attr;                                        \
@@ -165,6 +229,18 @@
   return ::new (C) NonNullAttr(C, ArgNums, Size);
 }
 
+Attr *OwnershipReturnsAttr::clone(ASTContext &C) const {
+  return ::new (C) OwnershipReturnsAttr(C, ArgNums, Size, getModule());
+}
+
+Attr *OwnershipTakesAttr::clone(ASTContext &C) const {
+  return ::new (C) OwnershipTakesAttr(C, ArgNums, Size, getModule());
+}
+
+Attr *OwnershipHoldsAttr::clone(ASTContext &C) const {
+  return ::new (C) OwnershipHoldsAttr(C, ArgNums, Size, getModule());
+}
+
 Attr *FormatAttr::clone(ASTContext &C) const {
   return ::new (C) FormatAttr(C, getType(), formatIdx, firstArg);
 }
Index: lib/Lex/PPMacroExpansion.cpp
===================================================================
--- lib/Lex/PPMacroExpansion.cpp (revision 106614)
+++ lib/Lex/PPMacroExpansion.cpp (working copy)
@@ -505,6 +505,9 @@
            .Case("cxx_static_assert", LangOpts.CPlusPlus0x)
            .Case("objc_nonfragile_abi", LangOpts.ObjCNonFragileABI)
            .Case("objc_weak_class", LangOpts.ObjCNonFragileABI)
+           .Case("ownership_returns", true)
+           .Case("ownership_takes", true)
+           .Case("ownership_holds", true)

Great!

          //.Case("cxx_concepts", false)
          //.Case("cxx_lambdas", false)
          //.Case("cxx_nullptr", false)
Index: lib/Checker/MallocChecker.cpp
===================================================================
--- lib/Checker/MallocChecker.cpp (revision 106614)
+++ lib/Checker/MallocChecker.cpp (working copy)
@@ -24,15 +24,17 @@
 namespace {
 
 class RefState {
-  enum Kind { AllocateUnchecked, AllocateFailed, Released, Escaped } K;
+  enum Kind { AllocateUnchecked, AllocateFailed, Released, Escaped, Relinquished } K;
   const Stmt *S;
 
 public:
   RefState(Kind k, const Stmt *s) : K(k), S(s) {}
 
   bool isAllocated() const { return K == AllocateUnchecked; }
+  bool isFailed() const { return K == AllocateFailed; }
   bool isReleased() const { return K == Released; }
   bool isEscaped() const { return K == Escaped; }
+  bool isRelinquished() const { return K == Relinquished; }
 
   bool operator==(const RefState &X) const {
     return K == X.K && S == X.S;
@@ -46,6 +48,7 @@
   }
   static RefState getReleased(const Stmt *s) { return RefState(Released, s); }
   static RefState getEscaped(const Stmt *s) { return RefState(Escaped, s); }
+  static RefState getRelinquished(const Stmt *s) { return RefState(Relinquished, s); }
 
   void Profile(llvm::FoldingSetNodeID &ID) const {
     ID.AddInteger(K);
@@ -59,12 +62,13 @@
   BuiltinBug *BT_DoubleFree;
   BuiltinBug *BT_Leak;
   BuiltinBug *BT_UseFree;
+  BuiltinBug *BT_UseRelinquished;
   BuiltinBug *BT_BadFree;
   IdentifierInfo *II_malloc, *II_free, *II_realloc, *II_calloc;
 
 public:
   MallocChecker() 
-    : BT_DoubleFree(0), BT_Leak(0), BT_UseFree(0), BT_BadFree(0),
+    : BT_DoubleFree(0), BT_Leak(0), BT_UseFree(0), BT_UseRelinquished(0), BT_BadFree(0),
       II_malloc(0), II_free(0), II_realloc(0), II_calloc(0) {}
   static void *getTag();
   bool EvalCallExpr(CheckerContext &C, const CallExpr *CE);
@@ -76,6 +80,7 @@
 
 private:
   void MallocMem(CheckerContext &C, const CallExpr *CE);
+  void MallocMemReturnsAttr(CheckerContext &C, const CallExpr *CE, const OwnershipReturnsAttr* Att);
   const GRState *MallocMemAux(CheckerContext &C, const CallExpr *CE,
                               const Expr *SizeEx, SVal Init,
                               const GRState *state) {
@@ -86,8 +91,10 @@
                               const GRState *state);
 
   void FreeMem(CheckerContext &C, const CallExpr *CE);
+  void FreeMemTakesAttr(CheckerContext &C, const CallExpr *CE, const OwnershipTakesAttr* Att);
+  void FreeMemHoldsAttr(CheckerContext &C, const CallExpr *CE, const OwnershipHoldsAttr* Att);
   const GRState *FreeMemAux(CheckerContext &C, const CallExpr *CE,
-                            const GRState *state);
+                            const GRState *state, unsigned Num, bool Hold);
 
   void ReallocMem(CheckerContext &C, const CallExpr *CE);
   void CallocMem(CheckerContext &C, const CallExpr *CE);
@@ -156,7 +163,26 @@
     return true;
   }
 
-  return false;
+  // Check all the attributes, if there are any.
+  // There can be multiple of these attributes.
+  bool rv = false;
+  if (FD->hasAttrs()) {
+    for (const Attr *attr = FD->getAttrs(); attr; attr = attr->getNext()) {
+      if (const OwnershipReturnsAttr* Att = dyn_cast<OwnershipReturnsAttr>(attr)) {
+        MallocMemReturnsAttr(C, CE, Att);
+        rv = true;
+      }
+      if (const OwnershipTakesAttr* Att = dyn_cast<OwnershipTakesAttr> (attr)) {
+        FreeMemTakesAttr(C, CE, Att);
+        rv = true;
+      }
+      if (const OwnershipHoldsAttr* Att = dyn_cast<OwnershipHoldsAttr> (attr)) {
+        FreeMemHoldsAttr(C, CE, Att);
+        rv = true;
+      }

Use 'else if' to avoid redundant checks here.

+    }
+  }
+  return rv;
 }
 
 void MallocChecker::MallocMem(CheckerContext &C, const CallExpr *CE) {
@@ -165,6 +191,23 @@
   C.addTransition(state);
 }
 
+void MallocChecker::MallocMemReturnsAttr(CheckerContext &C, const CallExpr *CE,
+                                         const OwnershipReturnsAttr* Att) {
+  if (!Att->isModule("malloc"))
+    return;
+
+  const unsigned *I = Att->begin(), *E = Att->end();
+  if (I != E) {
+    const GRState *state =
+        MallocMemAux(C, CE, CE->getArg(*I), UndefinedVal(), C.getState());
+    C.addTransition(state);
+    return;
+  }
+  const GRState *state = MallocMemAux(C, CE, UndefinedVal(), UndefinedVal(),
+                                        C.getState());
+  C.addTransition(state);
+}
+
 const GRState *MallocChecker::MallocMemAux(CheckerContext &C,  
                                            const CallExpr *CE,
                                            SVal Size, SVal Init,
@@ -188,15 +231,41 @@
 }
 
 void MallocChecker::FreeMem(CheckerContext &C, const CallExpr *CE) {
-  const GRState *state = FreeMemAux(C, CE, C.getState());
+  const GRState *state = FreeMemAux(C, CE, C.getState(), 0, false);
 
   if (state)
     C.addTransition(state);
 }
 
+void MallocChecker::FreeMemTakesAttr(CheckerContext &C, const CallExpr *CE,
+                                     const OwnershipTakesAttr* Att) {
+  if (!Att->isModule("malloc"))
+    return;
+
+  for (const unsigned *I = Att->begin(), *E = Att->end(); I != E; ++I) {
+    const GRState *state =
+        FreeMemAux(C, CE, C.getState(), *I, false);
+    if (state)
+      C.addTransition(state);
+  }
+}
+
+void MallocChecker::FreeMemHoldsAttr(CheckerContext &C, const CallExpr *CE,
+                                     const OwnershipHoldsAttr* Att) {
+  if (!Att->isModule("malloc"))
+    return;
+
+  for (const unsigned *I = Att->begin(), *E = Att->end(); I != E; ++I) {
+    const GRState *state =
+        FreeMemAux(C, CE, C.getState(), *I, true);
+    if (state)
+      C.addTransition(state);
+  }
+}
+
 const GRState *MallocChecker::FreeMemAux(CheckerContext &C, const CallExpr *CE,
-                                         const GRState *state) {
-  const Expr *ArgExpr = CE->getArg(0);
+                                         const GRState *state, unsigned Num, bool Hold) {
+  const Expr *ArgExpr = CE->getArg(Num);
   SVal ArgVal = state->getSVal(ArgExpr);
 
   // If ptr is NULL, no operation is preformed.
@@ -272,7 +341,24 @@
     return NULL;
   }
 
+  // Check free after relinquished ownership.
+    if (RS->isRelinquished()) {
+        ExplodedNode *N = C.GenerateSink();
+        if (N) {
+            if (!BT_UseRelinquished)
+                BT_UseRelinquished = new BuiltinBug("Double free",
+                    "Try to free a memory block that has been passed elsewhere");
+            // FIXME: should find where it's freed last time.
+            BugReport *R = new BugReport(*BT_UseRelinquished,
+                    BT_UseRelinquished->getDescription(), N);
+            C.EmitReport(R);
+        }
+        return NULL;
+    }
+
   // Normal free.
+  if (Hold)
+      return state->set<RegionState>(Sym, RefState::getRelinquished(CE));
   return state->set<RegionState>(Sym, RefState::getReleased(CE));
 }
 
@@ -437,13 +523,13 @@
                                       ValMgr.makeIntValWithPtrWidth(0, false));
 
     if (const GRState *stateSizeZero = stateNotEqual->Assume(SizeZero, true)) {
-      const GRState *stateFree = FreeMemAux(C, CE, stateSizeZero);
+      const GRState *stateFree = FreeMemAux(C, CE, stateSizeZero, 0, false);
       if (stateFree)
         C.addTransition(stateFree->BindExpr(CE, UndefinedVal(), true));
     }
 
     if (const GRState *stateSizeNotZero=stateNotEqual->Assume(SizeZero,false)) {
-      const GRState *stateFree = FreeMemAux(C, CE, stateSizeNotZero);
+      const GRState *stateFree = FreeMemAux(C, CE, stateSizeNotZero, 0, false);
       if (stateFree) {
         // FIXME: We should copy the content of the original buffer.
         const GRState *stateRealloc = MallocMemAux(C, CE, CE->getArg(1), 
@@ -536,7 +622,6 @@
   // FIXME: check other cases.
   if (RS->isAllocated())
     state = state->set<RegionState>(Sym, RefState::getEscaped(S));
-
   C.addTransition(state);
 }
 
Index: lib/Parse/AttributeList.cpp
===================================================================
--- lib/Parse/AttributeList.cpp (revision 106614)
+++ lib/Parse/AttributeList.cpp (working copy)
@@ -118,6 +118,9 @@
     .Case("ns_returns_retained", AT_ns_returns_retained)
     .Case("cf_returns_not_retained", AT_cf_returns_not_retained)
     .Case("cf_returns_retained", AT_cf_returns_retained)
+    .Case("ownership_returns", AT_ownership_returns)
+    .Case("ownership_holds", AT_ownership_holds)
+    .Case("ownership_takes", AT_ownership_takes)
     .Case("reqd_work_group_size", AT_reqd_wg_size)
     .Case("init_priority", AT_init_priority)
     .Case("no_instrument_function", AT_no_instrument_function)

Looks good.

On Jun 28, 2010, at 2:15 PM, Andrew McGregor wrote:

Oops, forgot to reply-all...

---------- Forwarded message ----------
From: Andrew McGregor <[hidden email]>
Date: Mon, Jun 28, 2010 at 12:04 PM
Subject: Re: Ownership attribute for malloc etc. checking
To: Jordy Rose <[hidden email]>


Hi,

Thanks for the review.  Comments inline, new patch attached.

Andrew

On Sun, Jun 27, 2010 at 9:33 AM, Jordy Rose <[hidden email]> wrote:

Looking good! Some remaining comments below.

I'm also a reasonably new committer, so someone else should probably look
over this as well (particularly the Attr hunks, which I'm not very familiar
with).

Jordy


General:
- It would be nice to make sure one argument isn't both ownership_takes
and ownership_holds.

True, will do.
 
- LLVM's convention is to use spaces rather than tabs.
- There's a hunk where the only thing that changed is indentation; it'd be
nice to take that out.

Oops, neither of these were intentional.
 
- Two more useful test would be free(p); my_free(p) and free(p);
my_hold(p). Both of which should warn, of course.

Instead of crashing, yes.  Well spotted.
 
 (MallocChecker::EvalCallExpr)
+    for (const Attr *attr = FD->getAttrs(); attr; attr = attr->getNext())
{
+      if (const OwnershipReturnsAttr* Att =
dyn_cast<OwnershipReturnsAttr>(attr)) {
+        MallocMemReturnsAttr(C, CE, Att);
+        rv = true;
+      }
+      if (const OwnershipTakesAttr* Att = dyn_cast<OwnershipTakesAttr>
(attr)) {
+        FreeMemTakesAttr(C, CE, Att);
+        rv = true;
+      }
+      if (const OwnershipHoldsAttr* Att = dyn_cast<OwnershipHoldsAttr>
(attr)) {
+        FreeMemHoldsAttr(C, CE, Att);
+        rv = true;
+      }
+    }

Here's the dispatch from EvalCallExpr(). I think even though it's
technically a little less efficient, you should just simplify this and use
FD->getAttr<OwnershipTakesAttr>(), etc. It's a little simpler, and it lets
you order the attributes.

However, I did it this way because I wanted to allow two or more attributes of the same kind (for the benefit of macros).

In other words, __attribute((ownership_holds, 1)) __attribute((ownership_holds, 2)) should be the same as __attribute((ownership_holds, 1, 2)).
 
Why does that matter? Because an ownership_takes() or ownership_holds()
function might have a return value! If it's also ownership_returns(), then
you're fine, but otherwise you need to set a symbolic return value. (You
can see how this is done in MallocMemAux(), or in StreamChecker.cpp with
fopen.) There should probably be a test for this as well.

It isn't necessarily the case that an ownership_takes() function that returns a pointer generated that pointer from an ownership_takes argument, or allocated anything, so what would you set the region to, in the absence of other information?  They default to Unknown, yes?
 
Actually, because these attributes don't affect the return value, they
should probably go in a PreVisitCallExpr() callback, rather than
EvalCallExpr(). But for now it's probably okay, as long as you handle that
return value.


(MallocChecker::MallocMemReturnsAttr)
+  for (const unsigned *I = Att->begin(), *E = Att->end(); I != E; ++I,
++count) {
+    const GRState *state =
+        MallocMemAux(C, CE, CE->getArg(*I), UndefinedVal(),
C.getState());
+    C.addTransition(state);
+  }
+  if (count == 0) {
+    const GRState *state = MallocMemAux(C, CE, UndefinedVal(),
UndefinedVal(),
+                                        C.getState());
+    C.addTransition(state);
+  }

What does it mean to have multiple size fields? There can still only be
one return value. Maybe you should only allow zero or one positions for
ownership_returns (which would make this section simpler).

The intention here was to implement something for calloc-like functions and matrix allocators; one argument is a special case, more than one the size of the region is the product of the arguments times sizeof(pointee of the return value), taking sizeof(void) as 1.  If it isn't worth doing that, fair enough, but that's what I was thinking this could lead to.

However, the parser only allows the one argument at the moment, so this is redundant.
 
(MallocChecker::PreVisitReturnStmt)
+  if (RS->isReleased()) {
+    ExplodedNode *N = C.GenerateSink();
+    if (!BT_UseFree)
+      BT_UseFree = new BuiltinBug("Use dynamically allocated memory
after"
+                                  " it is freed.");

+    BugReport *R = new BugReport(*BT_UseFree,
BT_UseFree->getDescription(),
+                                 N);
+    C.EmitReport(R);
+  }

This section checks whether the return value is a released address.
Unfortunately this isn't a 100% unambiguous case; consider the following
(contrived) example:

struct Node *freeSmallest () {
 struct Node *min;
 // find the minimum node
 free(min);
 return min; // so the caller knows the ID of the deleted node
}

As long as the caller doesn't dereference the returned pointer, it could
still be a useful value. 95% of the time, this check would be useful, but
since a false positive is worse than a false negative for the analyzer, we
should probably leave it out.

Ok, I see.  Out it is, then.

Andrew

<clang-ownership-withtests-r1.patch>


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Jordy Rose
In reply to this post by Jordy Rose

A few last comments (along with what Ted had to say). Also, I second being
able to put the attributes on the arguments themselves. (Why didn't the GCC
people do nonnull like that?)


>> Here's the dispatch from EvalCallExpr(). I think even though it's
>> technically a little less efficient, you should just simplify this and
>> use
>> FD->getAttr<OwnershipTakesAttr>(), etc. It's a little simpler, and it
>> lets
>> you order the attributes.
>>
>
> However, I did it this way because I wanted to allow two or more
attributes
> of the same kind (for the benefit of macros).
>
> In other words, __attribute((ownership_holds, 1))
> __attribute((ownership_holds, 2)) should be the same
> as __attribute((ownership_holds, 1, 2)).
>

Ah, good point. Okay, as long as return values are still handled (see
below).


>> Why does that matter? Because an ownership_takes() or ownership_holds()
>> function might have a return value! If it's also ownership_returns(),
>> then
>> you're fine, but otherwise you need to set a symbolic return value.
(You
>> can see how this is done in MallocMemAux(), or in StreamChecker.cpp
with
>> fopen.) There should probably be a test for this as well.
>
>
> It isn't necessarily the case that an ownership_takes() function that
> returns a pointer generated that pointer from an ownership_takes
argument,
> or allocated anything, so what would you set the region to, in the
absence
> of other information?  They default to Unknown, yes?

I believe so, but for return values from function calls we prefer to use a
"conjured symbol value". This is a kind of value that can be reasoned about
symbolically in later statements -- for example, we can bound its value if
it's an integer, or mark that it's null or non-null if it's a pointer.


>> What does it mean to have multiple size fields? There can still only be
>> one return value. Maybe you should only allow zero or one positions for
>> ownership_returns (which would make this section simpler).
>>
>
> The intention here was to implement something for calloc-like functions
and
> matrix allocators; one argument is a special case, more than one the
size
> of
> the region is the product of the arguments times sizeof(pointee of the
> return value), taking sizeof(void) as 1.  If it isn't worth doing that,
> fair
> enough, but that's what I was thinking this could lead to.

It's a good idea, but consider the following examples:

Node *getNode (size_t extra) {
  return malloc(sizeof(Node)+extra);
}

Node *getNode (size_t total) {
  assert(total >= extra);
  return malloc(total);
}

Node *getNodes (size_t count) {
  return malloc(sizeof(Node)*count);
}

Your proposal is to use the second or third meaning, based on whether one
or more arguments are included. I think we should just pick one meaning and
stick with it (i.e. not special-case the one-argument case). Or just take
it out, though as long as we allow the zero-argument case it doesn't hurt
anyone to handle arguments in one particular way.

> However, the parser only allows the one argument at the moment, so this
is
> redundant.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Jeffrey Yasskin
On Mon, Jun 28, 2010 at 6:03 PM, Jordy Rose <[hidden email]> wrote:
>
> A few last comments (along with what Ted had to say). Also, I second being
> able to put the attributes on the arguments themselves. (Why didn't the GCC
> people do nonnull like that?)

FWIW, Andrew replied privately to me with,

"I agree, but so far as I can see that is not possible presently."

which strikes me as a good reason not to do it until Sean's done
making it possible.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Andrew McGregor-3
I think the checker itself is general enough to handle this... if we find one attached to an argument, we could just generate the index in the attribute parser.

Hmm, but which is cleaner and clearer... have the checker know that arguments might be attributed as well, or have the parser push attributes from arguments onto the function?  The latter is probably less code, but might be rather obscure.  On the other hand, that whole attribute parser is pretty obscure, and is full of 'fixme's saying it deserves redesigned.

BTW, all the private replies are unintentional, I'm getting used to some new mail clients at present.

Andrew

On Tue, Jun 29, 2010 at 1:09 PM, Jeffrey Yasskin <[hidden email]> wrote:
On Mon, Jun 28, 2010 at 6:03 PM, Jordy Rose <[hidden email]> wrote:
>
> A few last comments (along with what Ted had to say). Also, I second being
> able to put the attributes on the arguments themselves. (Why didn't the GCC
> people do nonnull like that?)

FWIW, Andrew replied privately to me with,

"I agree, but so far as I can see that is not possible presently."

which strikes me as a good reason not to do it until Sean's done
making it possible.


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Ted Kremenek

On Jun 28, 2010, at 6:16 PM, Andrew McGregor wrote:

> Hmm, but which is cleaner and clearer... have the checker know that arguments might be attributed as well, or have the parser push attributes from arguments onto the function?  The latter is probably less code, but might be rather obscure.  On the other hand, that whole attribute parser is pretty obscure, and is full of 'fixme's saying it deserves redesigned.
>


I don't think we want to duplicate attributes; attributes on the ParmVarDecls should only be on the ParmVarDecls, unless of course we want to incorporate the the attributes into the function's type.  The AST should have high fidelity to the actual source code.

That said, I don't think the checker should be in the business of having to reason about how the function was annotated.  The checker just cares about the semantics, while it's up to the parser and the attributes to handle the syntax.

One possible solution is to provide a static function(s) to OwnershipAttr et al that allows one to query a FunctionDecl for the relevant annotations.  Then the checker just cares if a parameter is annotated, but it doesn't care how it was written.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Andrew McGregor-3


On Tue, Jun 29, 2010 at 1:27 PM, Ted Kremenek <[hidden email]> wrote:

On Jun 28, 2010, at 6:16 PM, Andrew McGregor wrote:

> Hmm, but which is cleaner and clearer... have the checker know that arguments might be attributed as well, or have the parser push attributes from arguments onto the function?  The latter is probably less code, but might be rather obscure.  On the other hand, that whole attribute parser is pretty obscure, and is full of 'fixme's saying it deserves redesigned.
>


I don't think we want to duplicate attributes; attributes on the ParmVarDecls should only be on the ParmVarDecls, unless of course we want to incorporate the the attributes into the function's type.  The AST should have high fidelity to the actual source code.

That said, I don't think the checker should be in the business of having to reason about how the function was annotated.  The checker just cares about the semantics, while it's up to the parser and the attributes to handle the syntax.

One possible solution is to provide a static function(s) to OwnershipAttr et al that allows one to query a FunctionDecl for the relevant annotations.  Then the checker just cares if a parameter is annotated, but it doesn't care how it was written.

So, thinking about this led me to this idea:

These attributes actually have minimal differences so far as the checker is concerned.  So how about there being only one attribute object  type, probably an OwnershipAttr, with an enum kind embedded in it, that can be applied to an argument.  Each one would apply to just one argument or the return value, but may have several arguments (the so far unimplemented calloc case will need this, as will pointer-to-pointer malloc).  The parser would have to construct two attribute objects for __attribute((ownership_holds, 1, 2)).  That way the dispatch can be a case statement on the enum rather than the current mess of casts and tests.  This would also save some of the methods from being templates, which I think would be unnecessary complexity.

Or, again in order to reduce the duplication, I could make templates of some of the methods and use the type to represent the attribute kind the same way as I have so far.  Disadvantage: this is probably going to bloat.

Andrew

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Ted Kremenek
On Jun 28, 2010, at 7:00 PM, Andrew McGregor wrote:

> So, thinking about this led me to this idea:
>
> These attributes actually have minimal differences so far as the checker is concerned.  So how about there being only one attribute object  type, probably an OwnershipAttr, with an enum kind embedded in it, that can be applied to an argument.  Each one would apply to just one argument or the return value, but may have several arguments (the so far unimplemented calloc case will need this, as will pointer-to-pointer malloc).  The parser would have to construct two attribute objects for __attribute((ownership_holds, 1, 2)).  That way the dispatch can be a case statement on the enum rather than the current mess of casts and tests.  This would also save some of the methods from being templates, which I think would be unnecessary complexity.
>
> Or, again in order to reduce the duplication, I could make templates of some of the methods and use the type to represent the attribute kind the same way as I have so far.  Disadvantage: this is probably going to bloat.

I'm fine with having one attribute with an enum, but I don't think it is mutually exclusive from having subclasses.  You can keep all the key logic in one class, and for clients that want to do queries that are specific to a particular attribute they can do downcast.  For example:

class OwnershipAttr {
  enum OwnershipKind { Holds, Takes, ... };
  OwnershipKind OKind;
...
};

class OwnershipReturnsAttr : public OwnershipAttr {
public:
  OwnershipReturnsAttr() : OwnershipAttr(Returns) {}

  static bool classof(OwnershipAttr *A) { return A->OKind == Returns; }
};

This is basically the same idiom we use with Stmts and Exprs in Clang's ASTS.  This gives you the best of both worlds.  You can use the 'enum' when you want switch on the ownership kind, but still can use the classes to encapsulate the relevant logic and add APIs that are only meaningful for a particular attribute.  For example, consider the main analysis function in the static analyzer.  It is littered with case statements like:

   case Stmt::BlockExprClass:
      VisitBlockExpr(cast<BlockExpr>(S), Pred, Dst);
      break;

The strength of this approach is you get strong-typing when you want it, but it doesn't add a burden if you don't need it (e.g., if you only care that something is a OwnershipAttr* instead of a OwnershipReturnsAttr*).
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Andrew McGregor-3
Ok, so I implemented what I think is all the suggestions...

The patch is around 100 lines shorter, so that's a definite improvement.

I'm not sure this is as minimal as it could be.

While I was there, I also changed the use of std::sort in the nonnull parsing, since that's where I picked it up from in the first place.

Andrew

On Tue, Jun 29, 2010 at 2:43 PM, Ted Kremenek <[hidden email]> wrote:
On Jun 28, 2010, at 7:00 PM, Andrew McGregor wrote:

> So, thinking about this led me to this idea:
>
> These attributes actually have minimal differences so far as the checker is concerned.  So how about there being only one attribute object  type, probably an OwnershipAttr, with an enum kind embedded in it, that can be applied to an argument.  Each one would apply to just one argument or the return value, but may have several arguments (the so far unimplemented calloc case will need this, as will pointer-to-pointer malloc).  The parser would have to construct two attribute objects for __attribute((ownership_holds, 1, 2)).  That way the dispatch can be a case statement on the enum rather than the current mess of casts and tests.  This would also save some of the methods from being templates, which I think would be unnecessary complexity.
>
> Or, again in order to reduce the duplication, I could make templates of some of the methods and use the type to represent the attribute kind the same way as I have so far.  Disadvantage: this is probably going to bloat.

I'm fine with having one attribute with an enum, but I don't think it is mutually exclusive from having subclasses.  You can keep all the key logic in one class, and for clients that want to do queries that are specific to a particular attribute they can do downcast.  For example:

class OwnershipAttr {
 enum OwnershipKind { Holds, Takes, ... };
 OwnershipKind OKind;
...
};

class OwnershipReturnsAttr : public OwnershipAttr {
public:
 OwnershipReturnsAttr() : OwnershipAttr(Returns) {}

 static bool classof(OwnershipAttr *A) { return A->OKind == Returns; }
};

This is basically the same idiom we use with Stmts and Exprs in Clang's ASTS.  This gives you the best of both worlds.  You can use the 'enum' when you want switch on the ownership kind, but still can use the classes to encapsulate the relevant logic and add APIs that are only meaningful for a particular attribute.  For example, consider the main analysis function in the static analyzer.  It is littered with case statements like:

  case Stmt::BlockExprClass:
     VisitBlockExpr(cast<BlockExpr>(S), Pred, Dst);
     break;

The strength of this approach is you get strong-typing when you want it, but it doesn't add a burden if you don't need it (e.g., if you only care that something is a OwnershipAttr* instead of a OwnershipReturnsAttr*).


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

clang-ownership-withtests-r2.patch (36K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Andrew McGregor-3
In reply to this post by Jordy Rose


On Tue, Jun 29, 2010 at 1:03 PM, Jordy Rose <[hidden email]> wrote:

>> Why does that matter? Because an ownership_takes() or ownership_holds()
>> function might have a return value! If it's also ownership_returns(),
>> then
>> you're fine, but otherwise you need to set a symbolic return value.
(You
>> can see how this is done in MallocMemAux(), or in StreamChecker.cpp
with
>> fopen.) There should probably be a test for this as well.
>
>
> It isn't necessarily the case that an ownership_takes() function that
> returns a pointer generated that pointer from an ownership_takes
argument,
> or allocated anything, so what would you set the region to, in the
absence
> of other information?  They default to Unknown, yes?

I believe so, but for return values from function calls we prefer to use a
"conjured symbol value". This is a kind of value that can be reasoned about
symbolically in later statements -- for example, we can bound its value if
it's an integer, or mark that it's null or non-null if it's a pointer.

So, I'm still not sure I understand what you mean me to do here... could you elaborate?

Since you seem to be the symbolic value expert, I'll ask about the case I'm trying to solve now.  I added some symbolic checks to the free handling, similar to realloc, to prevent warnings in the common case of error handling for allocation handlers.

So now this works:

char *  __attribute((ownership_returns(malloc))) foo(void) {
  char *textString = malloc(128*sizeof(char));
  if(textString == NULL)
    return NULL; // No warning here
  return textString;
}

But now I have this:

struct it {
  char * s;
};

struct it *  __attribute((ownership_returns(malloc))) foo(void) {
  struct it *rv = malloc(sizeof(struct it));
  if (!rv)
    return NULL; // Does not warn here.
  char *textString = malloc(128*sizeof(char));
  if(textString == NULL)
    free(rv);
    return NULL; // Warns about a memory leak here
  rv->s = textString;
  return rv; // Does NOT warn here
}


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Jordy Rose

On Fri, 2 Jul 2010 12:56:30 +1200, Andrew McGregor <[hidden email]>
wrote:
> On Tue, Jun 29, 2010 at 1:03 PM, Jordy Rose <[hidden email]>
wrote:
>
>>
>> >> Why does that matter? Because an ownership_takes() or
>> >> ownership_holds()
>> >> function might have a return value! If it's also
ownership_returns(),

>> >> then
>> >> you're fine, but otherwise you need to set a symbolic return value.
>> (You
>> >> can see how this is done in MallocMemAux(), or in StreamChecker.cpp
>> with
>> >> fopen.) There should probably be a test for this as well.
>> >
>> >
>> > It isn't necessarily the case that an ownership_takes() function that
>> > returns a pointer generated that pointer from an ownership_takes
>> argument,
>> > or allocated anything, so what would you set the region to, in the
>> absence
>> > of other information?  They default to Unknown, yes?
>>
>> I believe so, but for return values from function calls we prefer to
use

>> a
>> "conjured symbol value". This is a kind of value that can be reasoned
>> about
>> symbolically in later statements -- for example, we can bound its value
>> if
>> it's an integer, or mark that it's null or non-null if it's a pointer.
>
>
> So, I'm still not sure I understand what you mean me to do here... could
> you
> elaborate?
>
> Since you seem to be the symbolic value expert, I'll ask about the case
I'm
> trying to solve now.  I added some symbolic checks to the free handling,
> similar to realloc, to prevent warnings in the common case of error
> handling
> for allocation handlers.

Sorry, I should have been clearer! If you take a look at MallocMemAux,
you'll see these lines:

  unsigned Count = C.getNodeBuilder().getCurrentBlockCount();
  SVal RetVal = ValMgr.getConjuredSymbolVal(NULL, CE, CE->getType(),
Count);
  state = state->BindExpr(CE, RetVal);

This sort of code is used for any function call that returns a value. It's
better than just leaving the expression as UnknownVal because it lets us
track assumptions about the value from then on.

Consider this code:

int freeNode (Node *n) __attribute__(ownership_takes(1)) {
  int id = n->id;
  free(n);
  return id;
}

// in some other function
int x = freeNode(last);
if (x < 0) {
  log("invalid node: %d", x);
  return;
}
// from here on x is known to be positive

So all you would need to add is those three lines, for the case in which
you /are/ handling the function (returning true from EvalCallExpr), /not/
calling MallocMemAux (which is already handling the return value), and the
function's return type isn't void.

Alternately, go ahead and make the ownership_takes/ownership_holds cases
part of a new PreVisitCallExpr method. PreVisitCallExpr doesn't require you
to handle the return value, and allows you to stack multiple checks. This
is probably more correct anyway -- it would let users annotate methods that
may eventually need another checker to actually do the evaluation.

As for the example you showed, it should work, except:

> struct it *  __attribute((ownership_returns(malloc))) foo(void) {
>   struct it *rv = malloc(sizeof(struct it));
>   if (!rv)
>     return NULL; // Does not warn here.
>   char *textString = malloc(128*sizeof(char));
>   if(textString == NULL)
>     free(rv);
>     return NULL; // Warns about a memory leak here
>   rv->s = textString;
>   return rv; // Does NOT warn here
> }

...the code is just missing braces around the second if -- the second
"return NULL" is unconditional!

Clang should catch this. Filing a bug. *grin*
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Andrew McGregor-3
Ok, I get what you're saying, PreVisit seems the right answer.

On Fri, Jul 2, 2010 at 2:41 PM, Jordy Rose <[hidden email]> wrote:


> struct it *  __attribute((ownership_returns(malloc))) foo(void) {
>   struct it *rv = malloc(sizeof(struct it));
>   if (!rv)
>     return NULL; // Does not warn here.
>   char *textString = malloc(128*sizeof(char));
>   if(textString == NULL)
>     free(rv);
>     return NULL; // Warns about a memory leak here
>   rv->s = textString;
>   return rv; // Does NOT warn here
> }

...the code is just missing braces around the second if -- the second
"return NULL" is unconditional!

Clang should catch this. Filing a bug. *grin*

D'oh!

So looking at this version:

void  __attribute((ownership_returns(malloc))) foo2(void) {
  struct it *rv = malloc(sizeof(struct it));
  if (!rv)
    return NULL;
  char *textString = malloc(128*sizeof(char));
  if(textString == NULL) {
    free(rv);
    return NULL;
  }
  rv->s = textString;
  return rv; // warns of a leak here
}

How could I make the assignment before the final return relinquish ownership of the pointer?

Andrew

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Ted Kremenek
On Jul 1, 2010, at 8:08 PM, Andrew McGregor wrote:

Ok, I get what you're saying, PreVisit seems the right answer.

On Fri, Jul 2, 2010 at 2:41 PM, Jordy Rose <[hidden email]> wrote:


> struct it *  __attribute((ownership_returns(malloc))) foo(void) {
>   struct it *rv = malloc(sizeof(struct it));
>   if (!rv)
>     return NULL; // Does not warn here.
>   char *textString = malloc(128*sizeof(char));
>   if(textString == NULL)
>     free(rv);
>     return NULL; // Warns about a memory leak here
>   rv->s = textString;
>   return rv; // Does NOT warn here
> }

...the code is just missing braces around the second if -- the second
"return NULL" is unconditional!

Clang should catch this. Filing a bug. *grin*

D'oh!

So looking at this version:

void  __attribute((ownership_returns(malloc))) foo2(void) {
  struct it *rv = malloc(sizeof(struct it));
  if (!rv)
    return NULL;
  char *textString = malloc(128*sizeof(char));
  if(textString == NULL) {
    free(rv);
    return NULL;
  }
  rv->s = textString;
  return rv; // warns of a leak here
}

How could I make the assignment before the final return relinquish ownership of the pointer?

CheckerVisitor also supports PreVisitBind (which is callback that occurs before the RHS gets bound to the LHS).  You can use that to monitor ownership transfer.  We can also add PostVisitBind if that would be useful.

That said, what are the semantics of the ownership algorithm?  Does a leak get flagged here, or does the escape of the value to a field silence the warning?

FWIW, ownership checking in the presence of data containers has been researched quite a bit.  Here's some off-hand references that might be useful:

Static Detection of Leaks in Polymorphic Containers, ICSE 2006

A practical flow-sensitive and context-sensitive C and C++ memory leak detector
http://portal.acm.org/citation.cfm?doid=781131.781150


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Andrew McGregor-3
So, prior to deducing ownership annotations (which I think I see how to do now, for non-pathological code), here's my latest version of the attributes.

The PreVisitBind implements the same algorithm as already used by the Objective C ownership checker: if the pointer escaped from this scope by assignment, let it go.  However, assigning to fields of a stack-storage structure does not transfer ownership.

The remaining issue is still that void foo(void ** it) {it=malloc(42);} warns.  How would I check for assignment to a pointee of an argument in PreVisitBind?

This is a git diff, if that won't apply I have plenty of options for regenerating it. (As an aside, why isn't the project using git?)

Andrew

On Fri, Jul 2, 2010 at 5:18 PM, Ted Kremenek <[hidden email]> wrote:
On Jul 1, 2010, at 8:08 PM, Andrew McGregor wrote:

Ok, I get what you're saying, PreVisit seems the right answer.

On Fri, Jul 2, 2010 at 2:41 PM, Jordy Rose <[hidden email]> wrote:


> struct it *  __attribute((ownership_returns(malloc))) foo(void) {
>   struct it *rv = malloc(sizeof(struct it));
>   if (!rv)
>     return NULL; // Does not warn here.
>   char *textString = malloc(128*sizeof(char));
>   if(textString == NULL)
>     free(rv);
>     return NULL; // Warns about a memory leak here
>   rv->s = textString;
>   return rv; // Does NOT warn here
> }

...the code is just missing braces around the second if -- the second
"return NULL" is unconditional!

Clang should catch this. Filing a bug. *grin*

D'oh!

So looking at this version:

void  __attribute((ownership_returns(malloc))) foo2(void) {
  struct it *rv = malloc(sizeof(struct it));
  if (!rv)
    return NULL;
  char *textString = malloc(128*sizeof(char));
  if(textString == NULL) {
    free(rv);
    return NULL;
  }
  rv->s = textString;
  return rv; // warns of a leak here
}

How could I make the assignment before the final return relinquish ownership of the pointer?

CheckerVisitor also supports PreVisitBind (which is callback that occurs before the RHS gets bound to the LHS).  You can use that to monitor ownership transfer.  We can also add PostVisitBind if that would be useful.

That said, what are the semantics of the ownership algorithm?  Does a leak get flagged here, or does the escape of the value to a field silence the warning?

FWIW, ownership checking in the presence of data containers has been researched quite a bit.  Here's some off-hand references that might be useful:

Static Detection of Leaks in Polymorphic Containers, ICSE 2006

A practical flow-sensitive and context-sensitive C and C++ memory leak detector
http://portal.acm.org/citation.cfm?doid=781131.781150



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

clang-ownership-pointers.patch (43K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Ownership attribute for malloc etc. checking

Alexei Svitkine
> The remaining issue is still that void foo(void ** it) {it=malloc(42);} warns.

Do you mean void foo(void ** it) {*it=malloc(42);} ?

Because the one you mentioned should warn. ;)

-Alexei
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
12