CFGElement changes and initializers addition (with patch)

classic Classic list List threaded Threaded
36 messages Options
12
Reply | Threaded
Open this post in threaded view
|

CFGElement changes and initializers addition (with patch)

Marcin Świderski
Hi,

I've done some work on CFG. Biggest change is that CFGElement can no longer be converted to Stmt* (through static_cast nor dyn_cast). This is needed to properly handle all kinds of CFGElement: Statement, Initializer, ImplicitDtor and Scope. I've fixed some of the code that was assuming the CFGElement to represent Stmt* and left rest of it with appropriate comments.

I've also added Initializer CFGElements creation in CFG. Initializer CFGElement returns CXXBaseOrMemberInitializer object instead of Stmt.

For ImplicitDtor CFGElements I'm planning to add methods for getting:
- CXXDestructorDecl for destructor to call,
- Location of place destructor was called,
- Exact statement that created the object for which destructor is called (Expr for temporary object and VarDecl for automatic object).

I think that it would also be useful to add methods for transparently getting:
- Start location of the CFGElement,
- End location of the CFGElement.

Now a few questions:
- Do you have any suggestions on what more should be added to CFGElement interface?
- In valid AST, can the CXXBaseOrMemberInitializer::getInit() method return null pointer?

Cheers,
Marcin

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

initializers.patch (37K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Ted Kremenek
Hi Marcin,

Awesome work.  I'll comment on your specific points in your email, and then respond with comments to the patch inline below.

On Aug 16, 2010, at 4:41 PM, Marcin Świderski wrote:

> Hi,
>
> I've done some work on CFG. Biggest change is that CFGElement can no longer be converted to Stmt* (through static_cast nor dyn_cast).

I think dyn_cast<> and cast<> is reasonable).  dyn_cast is useful for clients wanting to check if the CFGElement wraps a Stmt* and then using the Stmt* directly.  e.g.:

  CFGElement E = ...
  if (Stmt *S = dyn_cast<Stmt>(E))
    ...

Although I guess with that approach we end up doing two checks; one to do the downcast itself and one to check the pointer.  With your API we will only do one check in practice.


> This is needed to properly handle all kinds of CFGElement: Statement, Initializer, ImplicitDtor and Scope. I've fixed some of the code that was assuming the CFGElement to represent Stmt* and left rest of it with appropriate comments.

Great!

> I've also added Initializer CFGElements creation in CFG. Initializer CFGElement returns CXXBaseOrMemberInitializer object instead of Stmt.

Great!

>
> For ImplicitDtor CFGElements I'm planning to add methods for getting:
> - CXXDestructorDecl for destructor to call,
> - Location of place destructor was called,
> - Exact statement that created the object for which destructor is called (Expr for temporary object and VarDecl for automatic object).

Excellent.

>
> I think that it would also be useful to add methods for transparently getting:
> - Start location of the CFGElement,
> - End location of the CFGElement.

This would be potentially useful, although I don't think we have any clients of this yet.  I'd rather we add this later if the need presents itself.  Note that the location may not even be defined, as a base or member initializer may not even be explicitly written.  Until the need presents itself, we might want clients to deliberately reason about such things, and let them decide their own policy on what locations they want to use.

>
> Now a few questions:
> - Do you have any suggestions on what more should be added to CFGElement interface?

I think this is a good start for now.  We can always extend CFGElement later.

> - In valid AST, can the CXXBaseOrMemberInitializer::getInit() method return null pointer?

I don't think so.  It looks like it is always a ParenListExpr, but I'm not 100% certain.

Comments on the patch itself.  Most of it is stylistic.  I'm very happy with most of it:

Index: include/clang/Analysis/FlowSensitive/DataflowSolver.h
===================================================================
--- include/clang/Analysis/FlowSensitive/DataflowSolver.h (revision 111178)
+++ include/clang/Analysis/FlowSensitive/DataflowSolver.h (working copy)
@@ -273,8 +273,19 @@
   void ProcessBlock(const CFGBlock* B, bool recordStmtValues,
                     dataflow::forward_analysis_tag) {
 
-    for (StmtItr I=ItrTraits::StmtBegin(B), E=ItrTraits::StmtEnd(B); I!=E;++I)
-      ProcessStmt(*I, recordStmtValues, AnalysisDirTag());
+    for (StmtItr I=ItrTraits::StmtBegin(B), E=ItrTraits::StmtEnd(B); I!=E;++I) {
+      switch (I->getKind()) {
+      case CFGElement::Statement:
+        ProcessStmt(I->getStmt(), recordStmtValues, AnalysisDirTag());
+        break;
+      // FIXME: (CFGElement) Should handle other cases.
+      case CFGElement::Initializer:
+      case CFGElement::ImplicitDtor:
+      // Handle scope just to shut up a warning.
+      case CFGElement::Scope:
+        break;
+      }
+    }
 
     TF.VisitTerminator(const_cast<CFGBlock*>(B));
   }
@@ -284,8 +295,19 @@
 
     TF.VisitTerminator(const_cast<CFGBlock*>(B));
 
-    for (StmtItr I=ItrTraits::StmtBegin(B), E=ItrTraits::StmtEnd(B); I!=E;++I)
-      ProcessStmt(*I, recordStmtValues, AnalysisDirTag());
+    for (StmtItr I=ItrTraits::StmtBegin(B), E=ItrTraits::StmtEnd(B); I!=E;++I) {
+      switch (I->getKind()) {
+      case CFGElement::Statement:
+        ProcessStmt(I->getStmt(), recordStmtValues, AnalysisDirTag());
+        break;
+      // FIXME: (CFGElement) Should handle other cases.
+      case CFGElement::Initializer:
+      case CFGElement::ImplicitDtor:
+      // Handle scope just to shut up a warning.
+      case CFGElement::Scope:
+        break;
+      }
+    }
   }


This looks fine for now.  Later we will want to extend the DataflowSolver to issue callbacks for initializers and destructors (although probably not scopes).

 
   void ProcessStmt(const Stmt* S, bool record, dataflow::forward_analysis_tag) {
Index: include/clang/Analysis/CFG.h
===================================================================
--- include/clang/Analysis/CFG.h (revision 111178)
+++ include/clang/Analysis/CFG.h (working copy)
@@ -28,8 +28,11 @@
 }
 namespace clang {
   class Decl;
+  class VarDecl;
   class Stmt;
   class Expr;
+  class CXXBaseOrMemberInitializer;
+  class CXXDestructorDecl;
   class CFG;
   class PrinterHelper;
   class LangOptions;

@@ -37,19 +40,92 @@
 
 /// CFGElement - Represents a top-level expression in a basic block.
 class CFGElement {
-  llvm::PointerIntPair<Stmt *, 2> Data;
 public:
-  enum Type { StartScope, EndScope };
-  explicit CFGElement() {}
-  CFGElement(Stmt *S, bool lvalue) : Data(S, lvalue ? 1 : 0) {}
-  CFGElement(Stmt *S, Type t) : Data(S, t == StartScope ? 2 : 3) {}
-  Stmt *getStmt() const { return Data.getPointer(); }
-  bool asLValue() const { return Data.getInt() == 1; }
-  bool asStartScope() const { return Data.getInt() == 2; }
-  bool asEndScope() const { return Data.getInt() == 3; }
-  bool asDtor() const { return Data.getInt() == 4; }
-  operator Stmt*() const { return getStmt(); }
-  operator bool() const { return getStmt() != 0; }
+  // Common interface.
+  enum Kind {
+    Statement,
+    Initializer,
+    ImplicitDtor,
+    Scope
+  };
+  
+  // Create invalid element.
+  CFGElement()
+      : PrimData(0, Statement)
+      , SecData(0, 0) {}
+  
+  Kind getKind() const { return Kind(PrimData.getInt()); }
+  
+  bool isValid() const { return getKind() != Statement || getStmt(); }
+  operator bool() const { return isValid(); }
+  
+  // Statements interface.
+  enum StmtSubKind {
+    SomeStmt,
+    LValueExpr
+  };

Instead of having a StmtSubKind and ImplicitDtorKind, why not just have one set of enums?  e.g.:

enum Kind {
   Statement,
   LvalStatment,
   BEGIN_STATEMENTS = Statement,
   END_STATEMENT = LvalStatement,
   Initializer,
   AutomaticObjectDtor
   TemporaryObjectDtor
   BEGIN_DTOR = AutomaticObjectDtor,
   END_DTOR = TemporaryObjectDtor,
   StartScope,
   EndScope,
   BEGIN_SCOPE = StartScope,
   END_SCOPE = EndScope
}

Since these are all mutually exclusive possibilities (6 in total), we only need 3 bits.  This means we can represent the CFGElement with a single llvm::PointerIntPair<void*, 3> instead of two llvm::PointerIntPair<void*, 2>.  Also, if we discard scopes, we can just represent the set of possibilities with 2 bits.

+  
+  static CFGElement MakeStatementElem(Stmt* S, bool asLValue) {
+    return CFGElement(Statement, S, asLValue ? LValueExpr : SomeStmt, 0);
+  }

How about just 'Make' instead of 'MakeStatementElem'?

+  
+  bool isStatement() const { return getKind() == Statement; }


For consistency, let's call this isStmt().

+  
+  bool asLValue() const {
+    assert (isStatement() && "CFGElement is not a Statement.");
+    return SecData.getInt() == LValueExpr;
+  }
+  Stmt *getStmt() const {
+    assert (isStatement() && "CFGElement is not a Statement.");
+    return static_cast<Stmt*>(PrimData.getPointer());
+  }

It seems a little cumbersome for users to have to check if it is a Stmt* before getting the value.  We could instead do:

  return isStmt() ? static_cast<Stmt*>(Data.getPointer()) : NULL

and have the client just use 'getStmt()' instead of 'isStmt()' + 'getStmt()'.  The first approach has one less check, while the second is simpler for clients.

@Jordy, Zhongxing: What do you prefer?

+
+  // Initializer interface.
+  static CFGElement MakeInitializerElem(CXXBaseOrMemberInitializer* I) {
+    return CFGElement(Initializer, I, 0, 0);
+  }

We can just use 'Make' instead of 'MakeInitializerElem'.

+  
+  bool isInitializer() const { return getKind() == Initializer; }
+  
+  CXXBaseOrMemberInitializer* getInitializer() const {
+    assert (isInitializer() && "CFGElement is not an Initializer");
+    return static_cast<CXXBaseOrMemberInitializer*>(PrimData.getPointer());
+  }

Same comment with 'getStmt()'.  All this is doing is requiring clients to check if it is an initializer before doing the "downcast".

+  
+  // ImplicitDtor interface.
+  enum ImplicitDtorKind {
+    AutomaticObjectDtor,
+    TemporaryObjectDtor
+  };
+  
+  bool isImplicitDtor() const { return getKind() == ImplicitDtor; }
+  
+  // Scope interface.
+  enum ScopeSubKind {
+    StartScope,
+    EndScope
+  };

Per my comment above, these extra enums seem unnecessary and expand the size requirements of CFGElement.

+  
+  static CFGElement MakeScopeElem(Stmt* S, ScopeSubKind K) {
+    return CFGElement(Scope, S, K, 0);
+  }

Again, we can just have this be 'Make'.

+  
+  bool isScope() const { return getKind() == Scope; }
+  
+  ScopeSubKind getScopeKind() const {
+    assert (isScope() && "CFGElement is not a Scope.");
+    return ScopeSubKind(SecData.getInt());
+  }
+  bool isStartScope() const { return getScopeKind() == StartScope; }
+  bool isEndScope() const { return getScopeKind() == EndScope; }
+  

+private:
+  llvm::PointerIntPair<void*, 2> PrimData;
+  llvm::PointerIntPair<void*, 2> SecData;

We only need one llvm::PointerIntPair.

+  
+  CFGElement(Kind K, void* Prim, unsigned SubK, void* Sec)
+      : PrimData(Prim, K)
+      , SecData(Sec, SubK) {}
 };
 
 /// CFGBlock - Represents a single basic block in a source-level CFG.
@@ -240,13 +316,16 @@
   }
   
   void appendStmt(Stmt* Statement, BumpVectorContext &C, bool asLValue) {
-      Stmts.push_back(CFGElement(Statement, asLValue), C);
-  }  
+    Stmts.push_back(CFGElement::MakeStatementElem(Statement, asLValue), C);
+  }
+  void appendInitializer(CXXBaseOrMemberInitializer* I, BumpVectorContext& C) {
+    Stmts.push_back(CFGElement::MakeInitializerElem(I), C);
+  }
   void StartScope(Stmt* S, BumpVectorContext &C) {
-    Stmts.push_back(CFGElement(S, CFGElement::StartScope), C);
+    Stmts.push_back(CFGElement::MakeScopeElem(S, CFGElement::StartScope), C);
   }
   void EndScope(Stmt* S, BumpVectorContext &C) {
-    Stmts.push_back(CFGElement(S, CFGElement::EndScope), C);
+    Stmts.push_back(CFGElement::MakeScopeElem(S, CFGElement::EndScope), C);
   }
 };
 
@@ -263,14 +342,20 @@
   //===--------------------------------------------------------------------===//
   // CFG Construction & Manipulation.
   //===--------------------------------------------------------------------===//
-
+  
+  enum UnstableFeaturesFlags {
+    AddScopesFlag         = 0x1,
+    AddInitializersFlag   = 0x2,
+    AddImplicitDtorsFlag  = 0x4
+  };
+  
   /// buildCFG - Builds a CFG from an AST.  The responsibility to free the
   ///   constructed CFG belongs to the caller.
   static CFG* buildCFG(const Decl *D, Stmt* AST, ASTContext *C,
                        bool pruneTriviallyFalseEdges = true,
                        bool AddEHEdges = false,
-                       bool AddScopes = false /* NOT FULLY IMPLEMENTED.
-                                                 NOT READY FOR GENERAL USE. */);
+                       int AddUnstable = 0 /* NOT FULLY IMPLEMENTED.
+                                              NOT READY FOR GENERAL USE. */);

I'd rather pass specific bools than have the client pass a bitmask.  Alternatively, pass a CFGBuilderOptions struct, whose bits are initialized to the appropriate default settings.  Let's just keep the API clean and fool proof.

 
   /// createBlock - Create a new block in the CFG.  The CFG owns the block;
   ///  the caller should not directly free it.
@@ -324,8 +409,10 @@
   void VisitBlockStmts(CALLBACK& O) const {
     for (const_iterator I=begin(), E=end(); I != E; ++I)
       for (CFGBlock::const_iterator BI=(*I)->begin(), BE=(*I)->end();
-           BI != BE; ++BI)
-        O(*BI);
+           BI != BE; ++BI) {
+        if (BI->isStatement())
+          O(BI->getStmt());
+      }
   }

How about we change this to 'VisitCFGElements', and update the clients?  For now, the clients can do the check for statements, but at least we know that specific clients are filtering the results.

 
   //===--------------------------------------------------------------------===//
@@ -398,18 +485,6 @@
 
 namespace llvm {
 
-/// Implement simplify_type for CFGElement, so that we can dyn_cast from
-/// CFGElement to a specific Stmt class.
-template <> struct simplify_type<const ::clang::CFGElement> {
-  typedef ::clang::Stmt* SimpleType;
-  static SimpleType getSimplifiedValue(const ::clang::CFGElement &Val) {
-    return Val.getStmt();
-  }
-};
-  
-template <> struct simplify_type< ::clang::CFGElement>
-  : public simplify_type<const ::clang::CFGElement> {};
-  


I really prefer to leave this in.  This leads to really concise check-and-use of CFGElements.



 // Traits for: CFGBlock
 
 template <> struct GraphTraits< ::clang::CFGBlock* > {
Index: include/clang/Analysis/ProgramPoint.h
===================================================================
--- include/clang/Analysis/ProgramPoint.h (revision 111178)
+++ include/clang/Analysis/ProgramPoint.h (working copy)
@@ -118,6 +118,7 @@
     return B->empty() ? CFGElement() : B->front();
   }
   
+  // FIXME: (CFGElement) This should be removed entirely
   const Stmt *getFirstStmt() const {
     return getFirstElement().getStmt();
   }

Since this has a single caller, I agree that the logic should eventually be moved to BugReporter (since the query is client-specific).


@@ -135,11 +136,17 @@
   const CFGBlock* getBlock() const {
     return reinterpret_cast<const CFGBlock*>(getData1());
   }
-
-  const Stmt* getLastStmt() const {
+  
+  const CFGElement getLastElement() const {
     const CFGBlock* B = getBlock();
     return B->empty() ? CFGElement() : B->back();
   }
+  
+  // FIXME: (CFGElement) This should be removed entirely
+  const Stmt* getLastStmt() const {
+    const CFGBlock* B = getBlock();
+    return B->empty() ? 0 : B->back().getStmt();
+  }

There is no caller for getLastStmt.  Just remove it; no reason to add getLastElement either.

 
   const Stmt* getTerminator() const {
     return getBlock()->getTerminator();
Index: include/clang/Checker/PathSensitive/GRCoreEngine.h
===================================================================
--- include/clang/Checker/PathSensitive/GRCoreEngine.h (revision 111178)
+++ include/clang/Checker/PathSensitive/GRCoreEngine.h (working copy)
@@ -259,7 +259,11 @@
 
   /// getStmt - Return the current block-level expression associated with
   ///  this builder.
-  const Stmt* getStmt() const { return B[Idx]; }
+  const Stmt* getStmt() const {
+    // FIXME: (CFGElement) Should this return statement for statement elements
+    // or should this return CFGElement?
+    return B[Idx].isStatement() ? B[Idx].getStmt() : 0;
+  }

I think it is fine for this to be:

  return dyn_cast<Stmt>(B[Idx]);

Why make it more complicated?

 
   /// getBlock - Return the CFGBlock associated with the block-level expression
   ///  of this builder.
Index: include/clang/AST/DeclCXX.h
===================================================================
--- include/clang/AST/DeclCXX.h (revision 111178)
+++ include/clang/AST/DeclCXX.h (working copy)
@@ -1461,7 +1461,24 @@
   init_const_iterator init_end() const {
     return BaseOrMemberInitializers + NumBaseOrMemberInitializers;
   }
-
+  
+  typedef std::reverse_iterator<init_iterator> init_reverse_iterator;
+  typedef std::reverse_iterator<init_const_iterator> init_reverse_const_iterator;
+  
+  init_reverse_iterator init_rbegin() {
+    return init_reverse_iterator(init_end());
+  }
+  init_reverse_const_iterator init_rbegin() const {
+    return init_reverse_const_iterator(init_end());
+  }
+  
+  init_reverse_iterator init_rend() {
+    return init_reverse_iterator(init_begin());
+  }
+  init_reverse_const_iterator init_rend() const {
+    return init_reverse_const_iterator(init_begin());
+  }
+  
   /// getNumArgs - Determine the number of arguments used to
   /// initialize the member or base.
   unsigned getNumBaseOrMemberInitializers() const {
Index: lib/Analysis/CFG.cpp
===================================================================
--- lib/Analysis/CFG.cpp (revision 111178)
+++ lib/Analysis/CFG.cpp (working copy)
@@ -101,7 +101,7 @@
   // buildCFG - Used by external clients to construct the CFG.
   CFG* buildCFG(const Decl *D, Stmt *Statement, ASTContext *C,
                 bool pruneTriviallyFalseEdges, bool AddEHEdges,
-                bool AddScopes);
+                int AddUnstable);


Let's just have a CFGBuilderOptions argument.  It cleans all of this up.

 
 private:
   // Visitors to walk an AST and construct the CFG.
@@ -175,11 +175,15 @@
   CFGBlock *addStmt(Stmt *S) {
     return Visit(S, AddStmtChoice::AlwaysAdd);
   }
+  CFGBlock *addInitializer(CXXBaseOrMemberInitializer* I);
 
   void AppendStmt(CFGBlock *B, Stmt *S,
                   AddStmtChoice asc = AddStmtChoice::AlwaysAdd) {
     B->appendStmt(S, cfg->getBumpVectorContext(), asc.asLValue());
   }
+  void AppendInitializer(CFGBlock* B, CXXBaseOrMemberInitializer* I) {
+    B->appendInitializer(I, cfg->getBumpVectorContext());
+  }
 
   void AddSuccessor(CFGBlock *B, CFGBlock *S) {
     B->addSuccessor(S, cfg->getBumpVectorContext());
@@ -228,6 +232,8 @@
 
   // True iff scope start and scope end notes should be added to the CFG.
   bool AddScopes;
+  bool AddInitializers;
+  bool AddImplicitDtors;
 };
 
 // FIXME: Add support for dependent-sized array types in C++?
@@ -251,7 +257,7 @@
 ///  NULL.
 CFG* CFGBuilder::buildCFG(const Decl *D, Stmt* Statement, ASTContext* C,
                           bool pruneTriviallyFalseEdges,
-                          bool addehedges, bool AddScopes) {
+                          bool addehedges, int AddUnstable) {
 
   AddEHEdges = addehedges;
   PruneTriviallyFalseEdges = pruneTriviallyFalseEdges;
@@ -261,7 +267,10 @@
   if (!Statement)
     return NULL;
 
-  this->AddScopes = AddScopes;
+  AddScopes = AddUnstable & CFG::AddScopesFlag;
+  AddInitializers = AddUnstable & CFG::AddInitializersFlag;
+  AddImplicitDtors = AddUnstable & CFG::AddImplicitDtorsFlag;
+  

No need for an 'AddUnstable' if we have a CFGBuilderOptions object.


   badCFG = false;
 
   // Create an empty block that will serve as the exit block for the CFG.  Since
@@ -275,8 +284,10 @@
   CFGBlock* B = addStmt(Statement);
 
   if (const CXXConstructorDecl *CD = dyn_cast_or_null<CXXConstructorDecl>(D)) {
-    // FIXME: Add code for base initializers and member initializers.
-    (void)CD;
+    for (CXXConstructorDecl::init_reverse_const_iterator I = CD->init_rbegin()
+        , E = CD->init_rend(); I != E; ++I) {
+      B = addInitializer(*I);
+    }

Looks nice and simple.  We should check for 'badCFG' after processed initializer and abort early if we have trouble.


   }
   if (!B)
     B = Succ;
@@ -344,6 +355,21 @@
   return true;
 }
 
+/// addInitializer - Add C++ base or member initializer element to CFG.
+CFGBlock* CFGBuilder::addInitializer(CXXBaseOrMemberInitializer* I) {
+  if (!AddInitializers)
+    return Block;
+  
+  autoCreateBlock();
+  AppendInitializer(Block, I);
+  
+  if (!I->getInit())
+    // FIXME: Remove this check if is unnecessery.
+    return Block;
+  
+  return Visit(I->getInit());

This should probably be addStmt().  We probably want the initializer values to be added to the CFGBlock as block-level expressions.  The reason is that we want:

  a(a)

to be properly sequenced in the CFG, just like we do for DeclStmts, e.g.:

  int x = x

+
 /// Visit - Walk the subtree of a statement and add extra
 ///   blocks for ternary operators, &&, and ||.  We also process "," and
 ///   DeclStmts (which may contain nested control-flow).
@@ -1834,10 +1860,10 @@
 ///  CFG is returned to the caller.
 CFG* CFG::buildCFG(const Decl *D, Stmt* Statement, ASTContext *C,
                    bool PruneTriviallyFalse,
-                   bool AddEHEdges, bool AddScopes) {
+                   bool AddEHEdges, int AddUnstable) {
   CFGBuilder Builder;
   return Builder.buildCFG(D, Statement, C, PruneTriviallyFalse,
-                          AddEHEdges, AddScopes);
+                          AddEHEdges, AddUnstable);
 }
 
 //===----------------------------------------------------------------------===//
@@ -1875,16 +1901,25 @@
   llvm::SmallPtrSet<Expr*,50> SubExprAssignments;
 
   for (CFG::iterator I=cfg.begin(), E=cfg.end(); I != E; ++I)
-    for (CFGBlock::iterator BI=(*I)->begin(), EI=(*I)->end(); BI != EI; ++BI)
-      FindSubExprAssignments(*BI, SubExprAssignments);
+    for (CFGBlock::iterator BI=(*I)->begin(), EI=(*I)->end(); BI != EI; ++BI) {
+      if (BI->isStatement()) {
+        FindSubExprAssignments(BI->getStmt(), SubExprAssignments);
+      } else if (BI->isInitializer()) {
+        Stmt* S = BI->getInitializer()->getInit();
+        FindSubExprAssignments(S, SubExprAssignments);
+      }
+    }


Looks good.

 
   for (CFG::iterator I=cfg.begin(), E=cfg.end(); I != E; ++I) {
 
     // Iterate over the statements again on identify the Expr* and Stmt* at the
     // block-level that are block-level expressions.
 
-    for (CFGBlock::iterator BI=(*I)->begin(), EI=(*I)->end(); BI != EI; ++BI)
-      if (Expr* Exp = dyn_cast<Expr>(*BI)) {
+    for (CFGBlock::iterator BI=(*I)->begin(), EI=(*I)->end(); BI != EI; ++BI) {
+      if (!BI->isStatement())
+        continue;
+      
+      if (Expr* Exp = dyn_cast<Expr>(BI->getStmt())) {
 
         if (BinaryOperator* B = dyn_cast<BinaryOperator>(Exp)) {
           // Assignment expressions that are not nested within another
@@ -1905,6 +1940,7 @@
         unsigned x = M->size();
         (*M)[Exp] = x;
       }
+    }


Looks good.

 
     // Look at terminators.  The condition is a block-level expression.
 
@@ -1967,7 +2003,8 @@
       unsigned j = 1;
       for (CFGBlock::const_iterator BI = (*I)->begin(), BEnd = (*I)->end() ;
            BI != BEnd; ++BI, ++j )
-        StmtMap[*BI] = std::make_pair((*I)->getBlockID(),j);
+        if (BI->isStatement())
+          StmtMap[BI->getStmt()] = std::make_pair((*I)->getBlockID(),j);
       }
   }
 
@@ -2095,47 +2132,70 @@
 } // end anonymous namespace
 
 
-static void print_stmt(llvm::raw_ostream &OS, StmtPrinterHelper* Helper,
+static void print_elem(llvm::raw_ostream &OS, StmtPrinterHelper* Helper,
                        const CFGElement &E) {
-  Stmt *Terminator = E;
+  switch (E.getKind()) {
+  
+  case CFGElement::Statement:{
+    Stmt *Terminator = E.getStmt();
+    
+    if (Helper) {
 
-  if (E.asStartScope()) {
-    OS << "start scope\n";
-    return;
-  }
-  if (E.asEndScope()) {
-    OS << "end scope\n";
-    return;
-  }
+      // special printing for statement-expressions.
+      if (StmtExpr* SE = dyn_cast<StmtExpr>(Terminator)) {
+        CompoundStmt* Sub = SE->getSubStmt();
 
-  if (Helper) {
-    // special printing for statement-expressions.
-    if (StmtExpr* SE = dyn_cast<StmtExpr>(Terminator)) {
-      CompoundStmt* Sub = SE->getSubStmt();
+        if (Sub->child_begin() != Sub->child_end()) {
+          OS << "({ ... ; ";
+          Helper->handledStmt(*SE->getSubStmt()->body_rbegin(),OS);
+          OS << " })\n";
+          return;
+        }
+      }
 
-      if (Sub->child_begin() != Sub->child_end()) {
-        OS << "({ ... ; ";
-        Helper->handledStmt(*SE->getSubStmt()->body_rbegin(),OS);
-        OS << " })\n";
-        return;
+      // special printing for comma expressions.
+      if (BinaryOperator* B = dyn_cast<BinaryOperator>(Terminator)) {
+        if (B->getOpcode() == BinaryOperator::Comma) {
+          OS << "... , ";
+          Helper->handledStmt(B->getRHS(),OS);
+          OS << '\n';
+          return;
+        }
       }
     }
 
-    // special printing for comma expressions.
-    if (BinaryOperator* B = dyn_cast<BinaryOperator>(Terminator)) {
-      if (B->getOpcode() == BinaryOperator::Comma) {
-        OS << "... , ";
-        Helper->handledStmt(B->getRHS(),OS);
-        OS << '\n';
-        return;
-      }
+    Terminator->printPretty(OS, Helper, PrintingPolicy(Helper->getLangOpts()));
+
+    // Expressions need a newline.
+    if (isa<Expr>(Terminator)) OS << '\n';
+    break;
+  }
+  case CFGElement::Initializer: {
+    CXXBaseOrMemberInitializer* I = E.getInitializer();
+    if (I->isBaseInitializer()) {
+      OS << "/* Base initializer */ ";
+      OS << I->getBaseClass()->getAsCXXRecordDecl()->getName();
+    } else {
+      OS << "/* Member initializer */ " << I->getMember()->getName();
     }
+    OS << "(";
+    if (Expr* IE = I->getInit())
+      // FIXME: Currently this prints nothing in most cases. It depends on
+      // StmtPrinter::VisitCXXConstructExpr method.
+      IE->printPretty(OS, Helper, PrintingPolicy(Helper->getLangOpts()));
+    OS << ")\n";
+    break;
   }
-
-  Terminator->printPretty(OS, Helper, PrintingPolicy(Helper->getLangOpts()));
-
-  // Expressions need a newline.
-  if (isa<Expr>(Terminator)) OS << '\n';
+  case CFGElement::ImplicitDtor:
+    OS << "TODO: print implicit destructor desc\n";
+    break;
+  
+  case CFGElement::Scope:
+    if (E.isStartScope())
+      OS << "/* Start scope */\n";
+    else OS << "/* End scope */\n";
+    break;
+  }
 }

Looks fine.
 
 static void print_block(llvm::raw_ostream& OS, const CFG* cfg,
@@ -2190,7 +2250,7 @@
     OS << ":\n";
   }
 
-  // Iterate through the statements in the block and print them.
+  // Iterate through the elements in the block and print them.
   unsigned j = 1;
 
   for (CFGBlock::const_iterator I = B.begin(), E = B.end() ;
@@ -2205,7 +2265,7 @@
     if (Helper)
       Helper->setStmtID(j);
 
-    print_stmt(OS,Helper,*I);
+    print_elem(OS,Helper,*I);
   }
 
   // Print the terminator of this block.
Index: lib/Analysis/ReachableCode.cpp
===================================================================
--- lib/Analysis/ReachableCode.cpp (revision 111178)
+++ lib/Analysis/ReachableCode.cpp (working copy)
@@ -31,9 +31,14 @@
   R1 = R2 = SourceRange();
 
 top:
-  if (sn < b.size())
+  if (sn < b.size()) {
+    // FIXME: (CFGElement) Should handle other CFGElement kinds here also.
+    if (!b[sn].isStatement()) {
+      ++sn;
+      goto top;
+    }
     S = b[sn].getStmt();
-  else if (b.getTerminator())
+  } else if (b.getTerminator())
     S = b.getTerminator();
   else
     return SourceLocation();
@@ -43,6 +48,7 @@
       const BinaryOperator *BO = cast<BinaryOperator>(S);
       if (BO->getOpcode() == BinaryOperator::Comma) {
         if (sn+1 < b.size())
+          // Here we can have only statement CFGElement.
           return b[sn+1].getStmt()->getLocStart();
         const CFGBlock *n = &b;
         while (1) {
@@ -54,6 +60,7 @@
           if (n->pred_size() != 1)
             return SourceLocation();
           if (!n->empty())
+            // Here we can have only statement CFGElement.
             return n[0][0].getStmt()->getLocStart();
         }
       }
Index: lib/Sema/AnalysisBasedWarnings.cpp
===================================================================
--- lib/Sema/AnalysisBasedWarnings.cpp (revision 111178)
+++ lib/Sema/AnalysisBasedWarnings.cpp (working copy)
@@ -114,6 +114,8 @@
     if (!live[B.getBlockID()])
       continue;
     if (B.size() == 0) {
+      // FIXME: (CFGElement) This should be reimplemented when (or if) scopes
+      // will be added to CFG.
       if (B.getTerminator() && isa<CXXTryStmt>(B.getTerminator())) {
         HasAbnormalEdge = true;
         continue;
@@ -123,66 +125,83 @@
       HasPlainEdge = true;
       continue;
     }
-    Stmt *S = B[B.size()-1];
-    if (isa<ReturnStmt>(S)) {
-      HasLiveReturn = true;
-      continue;
-    }
-    if (isa<ObjCAtThrowStmt>(S)) {
-      HasFakeEdge = true;
-      continue;
-    }
-    if (isa<CXXThrowExpr>(S)) {
-      HasFakeEdge = true;
-      continue;
-    }
-    if (const AsmStmt *AS = dyn_cast<AsmStmt>(S)) {
-      if (AS->isMSAsm()) {
-        HasFakeEdge = true;
+    CFGElement SE = B[B.size() - 1];
+    switch (SE.getKind()) {
+    case CFGElement::Statement: {
+      Stmt *S = SE.getStmt();
+      if (isa<ReturnStmt>(S)) {
         HasLiveReturn = true;
         continue;
       }
-    }
-    if (isa<CXXTryStmt>(S)) {
-      HasAbnormalEdge = true;
-      continue;
-    }
-
-    bool NoReturnEdge = false;
-    if (CallExpr *C = dyn_cast<CallExpr>(S)) {
-      if (std::find(B.succ_begin(), B.succ_end(), &cfg->getExit())
-            == B.succ_end()) {
-        HasAbnormalEdge = true;
+      if (isa<ObjCAtThrowStmt>(S)) {
+        HasFakeEdge = true;
         continue;
       }
-      Expr *CEE = C->getCallee()->IgnoreParenCasts();
-      if (getFunctionExtInfo(CEE->getType()).getNoReturn()) {
-        NoReturnEdge = true;
+      if (isa<CXXThrowExpr>(S)) {
         HasFakeEdge = true;
-      } else if (DeclRefExpr *DRE = dyn_cast<DeclRefExpr>(CEE)) {
-        ValueDecl *VD = DRE->getDecl();
-        if (VD->hasAttr<NoReturnAttr>()) {
-          NoReturnEdge = true;
+        continue;
+      }
+      if (const AsmStmt *AS = dyn_cast<AsmStmt>(S)) {
+        if (AS->isMSAsm()) {
           HasFakeEdge = true;
+          HasLiveReturn = true;
+          continue;
         }
       }
-    }
-    // FIXME: Remove this hack once temporaries and their destructors are
-    // modeled correctly by the CFG.
-    if (CXXExprWithTemporaries *E = dyn_cast<CXXExprWithTemporaries>(S)) {
-      for (unsigned I = 0, N = E->getNumTemporaries(); I != N; ++I) {
-        const FunctionDecl *FD = E->getTemporary(I)->getDestructor();
-        if (FD->hasAttr<NoReturnAttr>() ||
-            FD->getType()->getAs<FunctionType>()->getNoReturnAttr()) {
+      if (isa<CXXTryStmt>(S)) {
+        HasAbnormalEdge = true;
+        continue;
+      }
+
+      bool NoReturnEdge = false;
+      if (CallExpr *C = dyn_cast<CallExpr>(S)) {
+        if (std::find(B.succ_begin(), B.succ_end(), &cfg->getExit())
+              == B.succ_end()) {
+          HasAbnormalEdge = true;
+          continue;
+        }
+        Expr *CEE = C->getCallee()->IgnoreParenCasts();
+        if (getFunctionExtInfo(CEE->getType()).getNoReturn()) {
           NoReturnEdge = true;
           HasFakeEdge = true;
-          break;
+        } else if (DeclRefExpr *DRE = dyn_cast<DeclRefExpr>(CEE)) {
+          ValueDecl *VD = DRE->getDecl();
+          if (VD->hasAttr<NoReturnAttr>()) {
+            NoReturnEdge = true;
+            HasFakeEdge = true;
+          }
         }
       }
+      // FIXME: Remove this hack once temporaries and their destructors are
+      // modeled correctly by the CFG.
+      if (CXXExprWithTemporaries *E = dyn_cast<CXXExprWithTemporaries>(S)) {
+        for (unsigned I = 0, N = E->getNumTemporaries(); I != N; ++I) {
+          const FunctionDecl *FD = E->getTemporary(I)->getDestructor();
+          if (FD->hasAttr<NoReturnAttr>() ||
+              FD->getType()->getAs<FunctionType>()->getNoReturnAttr()) {
+            NoReturnEdge = true;
+            HasFakeEdge = true;
+            break;
+          }
+        }
+      }
+      // FIXME: Add noreturn message sends.
+      if (NoReturnEdge == false)
+        HasPlainEdge = true;
+      break;
     }
-    // FIXME: Add noreturn message sends.
-    if (NoReturnEdge == false)
-      HasPlainEdge = true;
+    case CFGElement::Initializer: {
+      // FIXME: (CFGElement) Handle initializer as last element.
+      break;
+    }
+    case CFGElement::ImplicitDtor: {
+      // FIXME: (CFGElement) Handle implicit destructor call as last element.
+      break;
+    }
+    case CFGElement::Scope:
+      // Add just to shut up warning.
+      break;
+    }
   }
   if (!HasPlainEdge) {
     if (HasLiveReturn)
Index: lib/Checker/UnreachableCodeChecker.cpp
===================================================================
--- lib/Checker/UnreachableCodeChecker.cpp (revision 111178)
+++ lib/Checker/UnreachableCodeChecker.cpp (working copy)
@@ -116,10 +116,12 @@
     // FIXME: This should be extended to include other unreachable markers,
     // such as llvm_unreachable.
     if (!CB->empty()) {
-      const Stmt *First = CB->front();
-      if (const CallExpr *CE = dyn_cast<CallExpr>(First)) {
-        if (CE->isBuiltinCall(Ctx) == Builtin::BI__builtin_unreachable)
-          continue;
+      CFGElement First = CB->front();
+      if (First.isStatement()) {
+        if (const CallExpr *CE = dyn_cast<CallExpr>(First.getStmt())) {
+          if (CE->isBuiltinCall(Ctx) == Builtin::BI__builtin_unreachable)
+            continue;
+        }
       }
     }
 
@@ -173,12 +175,14 @@
 
 // Find the Stmt* in a CFGBlock for reporting a warning
 const Stmt *UnreachableCodeChecker::getUnreachableStmt(const CFGBlock *CB) {
-  if (CB->size() > 0)
-    return CB->front().getStmt();
-  else if (const Stmt *S = CB->getTerminator())
+  // FIXME: (CFGElement) Should we handle other CFGElement kinds here as well?
+  for (CFGBlock::const_iterator I = CB->begin(), E = CB->end(); I != E; ++I) {
+    if (I->isStatement())
+      return I->getStmt();
+  }
+  if (const Stmt *S = CB->getTerminator())
     return S;
-  else
-    return 0;
+  return 0;
 }
 
 // Determines if the path to this CFGBlock contained an element that infers this


Overall, great work!


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Jordy Rose
In reply to this post by Marcin Świderski

Wow, good work.

I agree with most of what Ted said. A few additional comments:

- I like the single enum rather than kind and subkind, but there are 7
elements, not 6. So it won't fit in a PointerIntPair anyway, though having
a Stmt* field (or void*) and an enum field would be preferable to two
PointerIntPairs.
I'm still not totally convinced we need scope markers in the CFG, though.

- Why don't we make this a static type hierarchy like SVal or MemRegion?
We'd have CFGStmt, CFGInitializer, CFGDestructor, and CFGScope(Marker)
inheriting from CFGElement. That way we can have, among other things, a
CFGStmt that simplifies to a Stmt. Instead of asserts, we can just have the
type-specific methods on the subclasses.

- In UnreachableCodeChecker::getUnreachableStmt(), isn't the block
guaranteed to have an initial statement for the same reason as in
ReachableCode.cpp?

- Very tiny comment: most of the LLVM codebase has no indentation on blank
lines.

Great to get more C++ support in the analyzer!
Jordy
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Marcin Świderski
Hi Jordy, Ted

Thanks for the appreciation and for comments. Some comments inline:

W dniu 18 sierpnia 2010 07:47 użytkownik Jordy Rose <[hidden email]> napisał:

Wow, good work.

I agree with most of what Ted said. A few additional comments:

- I like the single enum rather than kind and subkind, but there are 7
elements, not 6. So it won't fit in a PointerIntPair anyway, though having
a Stmt* field (or void*) and an enum field would be preferable to two
PointerIntPairs.
I'm still not totally convinced we need scope markers in the CFG, though.

For both destructor kinds I will need space for two pointers. For automatic object I need it's declaration and statement that cause destructor to be called. For temporary object I need expression bound to the temporary and the full expression that contains the temporary.

I could allocate separate struct for those and point to it from CFGElement. If we drop the scopes and move information about kind of destructor to this struct, then CFGElement will need only one PointerIntPair.

If we would want scopes modeled we could use same struct as for destructors. So we would end up with something like this:

- Statement and LValStatement with pointer to Stmt*,
- Initializer with pointer to CXXBaseOrMemberInitializer*,
- Extendend CFGElement with pointer to struct with information about destructor or scope.

Our kind enum would look like this:
enum Kind {
  Statement,
  LvalStatment,
  BEGIN_STATEMENT = Statement,
  END_STATEMENT = LvalStatement,
  Initializer,

  ExtendedElement, // This would be internal, client would never get this
                              // with getKind() method.

  AutomaticObjectDtor,
  TemporaryObjectDtor,
  BEGIN_DTOR = AutomaticObjectDtor,
  END_DTOR = TemporaryObjectDtor,
  StartScope,
  EndScope,
  BEGIN_SCOPE = StartScope,
  END_SCOPE = EndScope
};

This way we will have space efficient CFG for Obj-C and C languages. For C++ CFG will occupy less memory if there will be more statement then destructor CFGElements.

- Why don't we make this a static type hierarchy like SVal or MemRegion?
We'd have CFGStmt, CFGInitializer, CFGDestructor, and CFGScope(Marker)
inheriting from CFGElement. That way we can have, among other things, a
CFGStmt that simplifies to a Stmt. Instead of asserts, we can just have the
type-specific methods on the subclasses.

Yes, this will be much better approach. I will change this.

- In UnreachableCodeChecker::getUnreachableStmt(), isn't the block
guaranteed to have an initial statement for the same reason as in
ReachableCode.cpp?

Do you mean that if block is unreachable it will start with a statement and not with initializer, destructor or scope?
 
- Very tiny comment: most of the LLVM codebase has no indentation on blank
lines.

I will try to follow this convention then.
 
Great to get more C++ support in the analyzer!
Jordy

Cheers,
Marcin

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Ted Kremenek
In reply to this post by Jordy Rose

On Aug 17, 2010, at 10:47 PM, Jordy Rose wrote:

>
> - I like the single enum rather than kind and subkind, but there are 7
> elements, not 6. So it won't fit in a PointerIntPair anyway, though having
> a Stmt* field (or void*) and an enum field would be preferable to two
> PointerIntPairs.
> I'm still not totally convinced we need scope markers in the CFG, though.

I don't think we need the scope markers either, and would actually prefer we took them out.

>
> - Why don't we make this a static type hierarchy like SVal or MemRegion?
> We'd have CFGStmt, CFGInitializer, CFGDestructor, and CFGScope(Marker)
> inheriting from CFGElement. That way we can have, among other things, a
> CFGStmt that simplifies to a Stmt. Instead of asserts, we can just have the
> type-specific methods on the subclasses.

I had this thought as well.  The approach would be closer to SVal than MemRegion; I'd prefer we didn't have virtual functions in these classes, and keep the size of CFGElement object constant.  The nice thing about this approach is that it (a) uses strong-typing to enforce the APIs and (b) we can associate accessor (e.g., 'isLValue()') with only the appropriate CFGElement sub-class (e.g., CFGStmt).


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Jordy Rose
In reply to this post by Marcin Świderski

On Wed, 18 Aug 2010 09:11:29 +0200, Marcin Świderski
<[hidden email]> wrote:
> W dniu 18 sierpnia 2010 07:47 użytkownik Jordy Rose
> <[hidden email]>napisał:
>> - I like the single enum rather than kind and subkind, but there are 7
>> elements, not 6. So it won't fit in a PointerIntPair anyway, though
>> having
>> a Stmt* field (or void*) and an enum field would be preferable to two
>> PointerIntPairs.
>> I'm still not totally convinced we need scope markers in the CFG,
though.
>>
> For both destructor kinds I will need space for two pointers. For
> automatic
> object I need it's declaration and statement that cause destructor to be
> called. For temporary object I need expression bound to the temporary
and
> the full expression that contains the temporary.

Hm. This might be true, but I'm not sure. I think the element's position
in the CFG is good enough that we don't need "the statement that causes the
destructor to be called". (Pseudo-destructors already count as statements
so we don't have to worry about those.)

The temporary one is a little more worrisome, but again you might be able
to get away with just the expression whose value is the temporary, and
leave out the context. We certainly need to keep track of both parts when
/building/ the CFG, but once it's complete we should be okay.


>> - In UnreachableCodeChecker::getUnreachableStmt(), isn't the block
>> guaranteed to have an initial statement for the same reason as in
>> ReachableCode.cpp?
>>
> Do you mean that if block is unreachable it will start with a statement
> and not with initializer, destructor or scope?

Hm, again I forget about scope. An unreachable block certainly can't start
with an initializer or destructor, but it could very well start a scope.
But this means the changes in ReachableCode.cpp have to handle this case as
well.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Ted Kremenek

On Aug 19, 2010, at 1:00 PM, Jordy Rose wrote:

>> For both destructor kinds I will need space for two pointers. For
>> automatic
>> object I need it's declaration and statement that cause destructor to be
>> called. For temporary object I need expression bound to the temporary
> and
>> the full expression that contains the temporary.
>
> Hm. This might be true, but I'm not sure. I think the element's position
> in the CFG is good enough that we don't need "the statement that causes the
> destructor to be called". (Pseudo-destructors already count as statements
> so we don't have to worry about those.)
>
> The temporary one is a little more worrisome, but again you might be able
> to get away with just the expression whose value is the temporary, and
> leave out the context. We certainly need to keep track of both parts when
> /building/ the CFG, but once it's complete we should be okay.

I also don't think we need to hyper optimize here.  We can burn two pointers; we just want it as svelt as possible.  The number of CFGElements is on order of the number of statements, not expressions, in a function.  If you think about the amount of data we keep track in the ASTs themsevles, burning a few extra bytes for a short-lived CFG is not really a big deal.  We just don't want it to be too bloated.

For destructors, I would like to retain enough information so that we can precisely emit good diagnostics in the static analyzer.  This includes being able to reference where the object was created and roughly where the object was destroyed.  I agree that we could possible not include the statement that causes the destructor be called *as long* as it is easy to recreate this information and vend it through a clean API to the client.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Ted Kremenek
In reply to this post by Marcin Świderski

On Aug 18, 2010, at 12:11 AM, Marcin Świderski wrote:

Hi Jordy, Ted

Thanks for the appreciation and for comments. Some comments inline:

W dniu 18 sierpnia 2010 07:47 użytkownik Jordy Rose <[hidden email]> napisał:

Wow, good work.

I agree with most of what Ted said. A few additional comments:

- I like the single enum rather than kind and subkind, but there are 7
elements, not 6. So it won't fit in a PointerIntPair anyway, though having
a Stmt* field (or void*) and an enum field would be preferable to two
PointerIntPairs.
I'm still not totally convinced we need scope markers in the CFG, though.

For both destructor kinds I will need space for two pointers. For automatic object I need it's declaration and statement that cause destructor to be called. For temporary object I need expression bound to the temporary and the full expression that contains the temporary.

Repeating what I said in my reply to Jordy, I think it's fine to burn an extra pointer if it gives us flexibility to track more information that would be useful for diagnostics.


I could allocate separate struct for those and point to it from CFGElement. If we drop the scopes and move information about kind of destructor to this struct, then CFGElement will need only one PointerIntPair.

That is also a possibility.  It could be allocated from the BumpPtrAllocator.


If we would want scopes modeled we could use same struct as for destructors. So we would end up with something like this:

- Statement and LValStatement with pointer to Stmt*,
- Initializer with pointer to CXXBaseOrMemberInitializer*,
- Extendend CFGElement with pointer to struct with information about destructor or scope.

Our kind enum would look like this:
enum Kind {
  Statement,
  LvalStatment,
  BEGIN_STATEMENT = Statement,
  END_STATEMENT = LvalStatement,
  Initializer,

  ExtendedElement, // This would be internal, client would never get this
                              // with getKind() method.

  AutomaticObjectDtor,
  TemporaryObjectDtor,
  BEGIN_DTOR = AutomaticObjectDtor,
  END_DTOR = TemporaryObjectDtor,
  StartScope,
  EndScope,
  BEGIN_SCOPE = StartScope,
  END_SCOPE = EndScope
};

What is the purpose of the 'ExtendedElement' kind?


This way we will have space efficient CFG for Obj-C and C languages. For C++ CFG will occupy less memory if there will be more statement then destructor CFGElements.

- Why don't we make this a static type hierarchy like SVal or MemRegion?
We'd have CFGStmt, CFGInitializer, CFGDestructor, and CFGScope(Marker)
inheriting from CFGElement. That way we can have, among other things, a
CFGStmt that simplifies to a Stmt. Instead of asserts, we can just have the
type-specific methods on the subclasses.

Yes, this will be much better approach. I will change this.

Sounds great.


- In UnreachableCodeChecker::getUnreachableStmt(), isn't the block
guaranteed to have an initial statement for the same reason as in
ReachableCode.cpp?

Do you mean that if block is unreachable it will start with a statement and not with initializer, destructor or scope?

Yes, an unreachable block could certainly start with an initializer if it had no initializer argument and the previous initializer called a no-return function.  Gross, but I guess possible.

Moreover, a call to builtin_unreachable is not guaranteed to be the first CFGElement in a CFGBlock.  I think the checker's logic is likely wrong here.  It should be iterating over the block, looking for this specific CallExpr.

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Fwd: CFGElement changes and initializers addition (with patch)

Marcin Świderski
Sorry, I still forget to choose "Replay all" option...

W dniu 20 sierpnia 2010 05:35 użytkownik Ted Kremenek <[hidden email]> napisał:


On Aug 18, 2010, at 12:11 AM, Marcin Świderski wrote:

Hi Jordy, Ted

Thanks for the appreciation and for comments. Some comments inline:

W dniu 18 sierpnia 2010 07:47 użytkownik Jordy Rose <[hidden email]> napisał:

Wow, good work.

I agree with most of what Ted said. A few additional comments:

- I like the single enum rather than kind and subkind, but there are 7
elements, not 6. So it won't fit in a PointerIntPair anyway, though having
a Stmt* field (or void*) and an enum field would be preferable to two
PointerIntPairs.
I'm still not totally convinced we need scope markers in the CFG, though.

For both destructor kinds I will need space for two pointers. For automatic object I need it's declaration and statement that cause destructor to be called. For temporary object I need expression bound to the temporary and the full expression that contains the temporary.

Repeating what I said in my reply to Jordy, I think it's fine to burn an extra pointer if it gives us flexibility to track more information that would be useful for diagnostics.

So I have three acceptable solutions.
1. Two pointers for each element.
2. One pointer for each element, but in case of destructor it points to allocated pair of pointers.
3. One pointer for each element, and location of call to destructor is calculated on demand.

1 is wasteful, because only destructors need two pointers. 3 will give as possibly unclean API for clients. 2 seams to be the best and I will implement it.

I could allocate separate struct for those and point to it from CFGElement. If we drop the scopes and move information about kind of destructor to this struct, then CFGElement will need only one PointerIntPair.

That is also a possibility.  It could be allocated from the BumpPtrAllocator.

When will be the memory allocated with BumpPtrAllocator freed? If I will allocate memory for use in one instance of CFG, can I forget about it (not call Deallocate)? 

If we would want scopes modeled we could use same struct as for destructors. So we would end up with something like this:

- Statement and LValStatement with pointer to Stmt*,
- Initializer with pointer to CXXBaseOrMemberInitializer*,
- Extendend CFGElement with pointer to struct with information about destructor or scope.

Our kind enum would look like this:
enum Kind {
  Statement,
  LvalStatment,
  BEGIN_STATEMENT = Statement,
  END_STATEMENT = LvalStatement,
  Initializer,

  ExtendedElement, // This would be internal, client would never get this
                              // with getKind() method.

  AutomaticObjectDtor,
  TemporaryObjectDtor,
  BEGIN_DTOR = AutomaticObjectDtor,
  END_DTOR = TemporaryObjectDtor,
  StartScope,
  EndScope,
  BEGIN_SCOPE = StartScope,
  END_SCOPE = EndScope
};

What is the purpose of the 'ExtendedElement' kind?

Int part of PointerIntPair where pointer part points to pointer can have only 2 bits. ExtendedElement kind marks that the kind (- int(ExtendedElement)) is stored in int part of first PointerIntPair in pointed struct. It is an implementation detail realy.

This way we will have space efficient CFG for Obj-C and C languages. For C++ CFG will occupy less memory if there will be more statement then destructor CFGElements.

- Why don't we make this a static type hierarchy like SVal or MemRegion?
We'd have CFGStmt, CFGInitializer, CFGDestructor, and CFGScope(Marker)
inheriting from CFGElement. That way we can have, among other things, a
CFGStmt that simplifies to a Stmt. Instead of asserts, we can just have the
type-specific methods on the subclasses.

Yes, this will be much better approach. I will change this.

Sounds great.
 
The hierarchy is almost ready. I'll send a patch with this and both destructors implemented after weekend.
 

- In UnreachableCodeChecker::getUnreachableStmt(), isn't the block
guaranteed to have an initial statement for the same reason as in
ReachableCode.cpp?

Do you mean that if block is unreachable it will start with a statement and not with initializer, destructor or scope?

Yes, an unreachable block could certainly start with an initializer if it had no initializer argument and the previous initializer called a no-return function.  Gross, but I guess possible.

Moreover, a call to builtin_unreachable is not guaranteed to be the first CFGElement in a CFGBlock.  I think the checker's logic is likely wrong here.  It should be iterating over the block, looking for this specific CallExpr.


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Fwd: CFGElement changes and initializers addition (with patch)

Marcin Świderski
In reply to this post by Jordy Rose
Sorry, I still forget to choose "Replay all" option.

W dniu 19 sierpnia 2010 22:00 użytkownik Jordy Rose <[hidden email]> napisał:


On Wed, 18 Aug 2010 09:11:29 +0200, Marcin Świderski
<[hidden email]> wrote:
> W dniu 18 sierpnia 2010 07:47 użytkownik Jordy Rose
> <[hidden email]>napisał:
>> - I like the single enum rather than kind and subkind, but there are 7
>> elements, not 6. So it won't fit in a PointerIntPair anyway, though
>> having
>> a Stmt* field (or void*) and an enum field would be preferable to two
>> PointerIntPairs.
>> I'm still not totally convinced we need scope markers in the CFG,
though.
>>
> For both destructor kinds I will need space for two pointers. For
> automatic
> object I need it's declaration and statement that cause destructor to be
> called. For temporary object I need expression bound to the temporary
and
> the full expression that contains the temporary.

Hm. This might be true, but I'm not sure. I think the element's position
in the CFG is good enough that we don't need "the statement that causes the
destructor to be called". (Pseudo-destructors already count as statements
so we don't have to worry about those.)

The temporary one is a little more worrisome, but again you might be able
to get away with just the expression whose value is the temporary, and
leave out the context. We certainly need to keep track of both parts when
/building/ the CFG, but once it's complete we should be okay.

Location where the destructor was called won't be needed much I suppose. Instead of explicitly giving this position we could provide means of calculating it. If this would be 'kay I think that I can squeeze everything into one PointerIntPair.

One thing that would be needed and could be questionable (I think) would be end of scope element. This element wouldn't have start of scope element counterpart, and it would be used only if sequance of destructors for automatic objects would not end with block termination statement. CFG clients would have to ignore it always as it would be of no use to them.


>> - In UnreachableCodeChecker::getUnreachableStmt(), isn't the block
>> guaranteed to have an initial statement for the same reason as in
>> ReachableCode.cpp?
>>
> Do you mean that if block is unreachable it will start with a statement
> and not with initializer, destructor or scope?

Hm, again I forget about scope. An unreachable block certainly can't start
with an initializer or destructor, but it could very well start a scope.
But this means the changes in ReachableCode.cpp have to handle this case as
well.
We could write code like this:

while (true) {
  std::string a;
  break;
  break;
}

From CFGBuilder point of view each break will cause 'a' to be destroyed, so the block with second break as terminator will start with destructor of 'a'. As for initializers I don't see any case that would produce block with it as first element, but I think that it could be handled just to be on the safe side.

- Marcin


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Ted Kremenek
On Aug 19, 2010, at 11:34 PM, Marcin Świderski wrote:

> Location where the destructor was called won't be needed much I suppose. Instead of explicitly giving this position we could provide means of calculating it. If this would be 'kay I think that I can squeeze everything into one PointerIntPair.

It will be needed by analyzer diagnostics.  As long as we can reconstruct it when necessary that's fine by me.

>
> One thing that would be needed and could be questionable (I think) would be end of scope element. This element wouldn't have start of scope element counterpart, and it would be used only if sequance of destructors for automatic objects would not end with block termination statement. CFG clients would have to ignore it always as it would be of no use to them.

I'm not quite certain what this means.  We can have a block that contains a bunch destructor calls with a single successor to the block in the outer scope.  An explicit "end scope element" isn't strictly needed.



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Zhongxing Xu
2010/8/20 Ted Kremenek <[hidden email]>:

> On Aug 19, 2010, at 11:34 PM, Marcin Świderski wrote:
>
>> Location where the destructor was called won't be needed much I suppose. Instead of explicitly giving this position we could provide means of calculating it. If this would be 'kay I think that I can squeeze everything into one PointerIntPair.
>
> It will be needed by analyzer diagnostics.  As long as we can reconstruct it when necessary that's fine by me.
>
>>
>> One thing that would be needed and could be questionable (I think) would be end of scope element. This element wouldn't have start of scope element counterpart, and it would be used only if sequance of destructors for automatic objects would not end with block termination statement. CFG clients would have to ignore it always as it would be of no use to them.
>
> I'm not quite certain what this means.  We can have a block that contains a bunch destructor calls with a single successor to the block in the outer scope.  An explicit "end scope element" isn't strictly needed.
>

Sorry for the late reply. There is a case where we need explicit scope
delimiters. We need them to detect resource leaks accurately. Consider
code:

void f() {
  {

  }

}

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Zhongxing Xu
On Tue, Aug 24, 2010 at 10:41 AM, Zhongxing Xu <[hidden email]> wrote:

> 2010/8/20 Ted Kremenek <[hidden email]>:
>> On Aug 19, 2010, at 11:34 PM, Marcin Świderski wrote:
>>
>>> Location where the destructor was called won't be needed much I suppose. Instead of explicitly giving this position we could provide means of calculating it. If this would be 'kay I think that I can squeeze everything into one PointerIntPair.
>>
>> It will be needed by analyzer diagnostics.  As long as we can reconstruct it when necessary that's fine by me.
>>
>>>
>>> One thing that would be needed and could be questionable (I think) would be end of scope element. This element wouldn't have start of scope element counterpart, and it would be used only if sequance of destructors for automatic objects would not end with block termination statement. CFG clients would have to ignore it always as it would be of no use to them.
>>
>> I'm not quite certain what this means.  We can have a block that contains a bunch destructor calls with a single successor to the block in the outer scope.  An explicit "end scope element" isn't strictly needed.
>>
>
> Sorry for the late reply. There is a case where we need explicit scope
> delimiters. We need them to detect resource leaks accurately. Consider
> code:
>
> void f() {
>  {
>
>  }
>
> }
>

Sorry, I mistakenly pressed 'send' button.

void f() {
  {
      int *p = malloc(4);
  }
  ...
}

Without a scope delimiter, we can only detect that the block pointed
to by 'p' is leaked when getting to the end of 'f'. That's far from
where it happened.

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Zhongxing Xu
In reply to this post by Ted Kremenek
> It seems a little cumbersome for users to have to check if it is a Stmt* before getting the value.  We could instead do:
>
>  return isStmt() ? static_cast<Stmt*>(Data.getPointer()) : NULL
>
> and have the client just use 'getStmt()' instead of 'isStmt()' + 'getStmt()'.  The first approach has one less check, while the second is simpler for clients.
>
> @Jordy, Zhongxing: What do you prefer?

I prefer the latter and rename it to getAsStmt(). This pattern is used
many times in the front end.

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Jordy Rose
In reply to this post by Zhongxing Xu

On Tue, 24 Aug 2010 10:44:42 +0800, Zhongxing Xu <[hidden email]>
wrote:
> On Tue, Aug 24, 2010 at 10:41 AM, Zhongxing Xu <[hidden email]>
> wrote:
>> 2010/8/20 Ted Kremenek <[hidden email]>:
>>> On Aug 19, 2010, at 11:34 PM, Marcin Świderski wrote:
>>>> One thing that would be needed and could be questionable (I think)
>>>> would be end of scope element. This element wouldn't have start of
>>>> scope element counterpart, and it would be used only if sequance of
>>>> destructors for automatic objects would not end with block
termination
>>>> statement. CFG clients would have to ignore it always as it would be
of

>>>> no use to them.
>>>
>>> I'm not quite certain what this means.  We can have a block that
>>> contains a bunch destructor calls with a single successor to the block
>>> in the outer scope.  An explicit "end scope element" isn't strictly
>>> needed.
>>>
>>
>> Sorry for the late reply. There is a case where we need explicit scope
>> delimiters. We need them to detect resource leaks accurately. Consider
>> code:
>
> void f() {
>   {
>       int *p = malloc(4);
>   }
>   ...
> }
>
> Without a scope delimiter, we can only detect that the block pointed
> to by 'p' is leaked when getting to the end of 'f'. That's far from
> where it happened.

Unless we have destructor nodes for /all/ objects, whether the destructor
is trivial or not.

(Not trying to push either side, just presenting another possibility.)
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Zhongxing Xu
On Tue, Aug 24, 2010 at 11:18 AM, Jordy Rose <[hidden email]> wrote:

>
> On Tue, 24 Aug 2010 10:44:42 +0800, Zhongxing Xu <[hidden email]>
> wrote:
>> On Tue, Aug 24, 2010 at 10:41 AM, Zhongxing Xu <[hidden email]>
>> wrote:
>>> 2010/8/20 Ted Kremenek <[hidden email]>:
>>>> On Aug 19, 2010, at 11:34 PM, Marcin Świderski wrote:
>>>>> One thing that would be needed and could be questionable (I think)
>>>>> would be end of scope element. This element wouldn't have start of
>>>>> scope element counterpart, and it would be used only if sequance of
>>>>> destructors for automatic objects would not end with block
> termination
>>>>> statement. CFG clients would have to ignore it always as it would be
> of
>>>>> no use to them.
>>>>
>>>> I'm not quite certain what this means.  We can have a block that
>>>> contains a bunch destructor calls with a single successor to the block
>>>> in the outer scope.  An explicit "end scope element" isn't strictly
>>>> needed.
>>>>
>>>
>>> Sorry for the late reply. There is a case where we need explicit scope
>>> delimiters. We need them to detect resource leaks accurately. Consider
>>> code:
>>
>> void f() {
>>   {
>>       int *p = malloc(4);
>>   }
>>   ...
>> }
>>
>> Without a scope delimiter, we can only detect that the block pointed
>> to by 'p' is leaked when getting to the end of 'f'. That's far from
>> where it happened.
>
> Unless we have destructor nodes for /all/ objects, whether the destructor
> is trivial or not.
>
> (Not trying to push either side, just presenting another possibility.)
>

You mean creating a destructor node for 'p' in this case?

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Zhongxing Xu
In reply to this post by Ted Kremenek
>
> +/// addInitializer - Add C++ base or member initializer element to CFG.
> +CFGBlock* CFGBuilder::addInitializer(CXXBaseOrMemberInitializer* I) {
> +  if (!AddInitializers)
> +    return Block;
> +
> +  autoCreateBlock();
> +  AppendInitializer(Block, I);
> +
> +  if (!I->getInit())
> +    // FIXME: Remove this check if is unnecessery.
> +    return Block;
> +
> +  return Visit(I->getInit());
>
> This should probably be addStmt().  We probably want the initializer values to be added to the CFGBlock as block-level expressions.  The reason is that we want:
>
>  a(a)
>
> to be properly sequenced in the CFG, just like we do for DeclStmts, e.g.:
>
>  int x = x
>

There is some subtleties here. Currently we don't lift initializers of
CXXConstructExpr to block-level exprs. The benefits is that by seeing
the VarDecl first we can evaluate the constructor directly into the
object being constructed. This is done with the help of
AggExprVisitor. A object pointer is passed to AggExprVisitor as
'DestPtr'.

Note that now only initializers of CallExpr is lifted to block-level
expr, because we need to place CallEnter/CallExit nodes around them.

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Jordy Rose
In reply to this post by Zhongxing Xu

On Tue, 24 Aug 2010 11:20:20 +0800, Zhongxing Xu <[hidden email]>
wrote:
> On Tue, Aug 24, 2010 at 11:18 AM, Jordy Rose <[hidden email]>
wrote:
>>
>> On Tue, 24 Aug 2010 10:44:42 +0800, Zhongxing Xu
<[hidden email]>

>> wrote:
>>> On Tue, Aug 24, 2010 at 10:41 AM, Zhongxing Xu <[hidden email]>
>>> wrote:
>>>> 2010/8/20 Ted Kremenek <[hidden email]>:
>>>>> On Aug 19, 2010, at 11:34 PM, Marcin Świderski wrote:
>>>>>> One thing that would be needed and could be questionable (I think)
>>>>>> would be end of scope element. This element wouldn't have start of
>>>>>> scope element counterpart, and it would be used only if sequance of
>>>>>> destructors for automatic objects would not end with block
>> termination
>>>>>> statement. CFG clients would have to ignore it always as it would
be
>> of
>>>>>> no use to them.
>>>>>
>>>>> I'm not quite certain what this means.  We can have a block that
>>>>> contains a bunch destructor calls with a single successor to the
block
>>>>> in the outer scope.  An explicit "end scope element" isn't strictly
>>>>> needed.
>>>>>
>>>>
>>>> Sorry for the late reply. There is a case where we need explicit
scope
>>>> delimiters. We need them to detect resource leaks accurately.
Consider

>>>> code:
>>>
>>> void f() {
>>>   {
>>>       int *p = malloc(4);
>>>   }
>>>   ...
>>> }
>>>
>>> Without a scope delimiter, we can only detect that the block pointed
>>> to by 'p' is leaked when getting to the end of 'f'. That's far from
>>> where it happened.
>>
>> Unless we have destructor nodes for /all/ objects, whether the
destructor
>> is trivial or not.
>>
>> (Not trying to push either side, just presenting another possibility.)
>>
>
> You mean creating a destructor node for 'p' in this case?

Yeah. I guess I don't mean "all objects" but "all variables plus all
temporaries with non-trivial destructors". I figure checkers are going to
be able to visit destructor nodes as well, so this would be almost as good
as a VisitEndScope. (Which itself is more powerful than VisitEndAnalysis,
as you pointed out.)
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: CFGElement changes and initializers addition (with patch)

Marcin Świderski
2010/8/24 Jordy Rose <[hidden email]>

On Tue, 24 Aug 2010 11:20:20 +0800, Zhongxing Xu <[hidden email]>
wrote:
> On Tue, Aug 24, 2010 at 11:18 AM, Jordy Rose <[hidden email]>
wrote:
>>
>> On Tue, 24 Aug 2010 10:44:42 +0800, Zhongxing Xu
<[hidden email]>
>> wrote:
>>> On Tue, Aug 24, 2010 at 10:41 AM, Zhongxing Xu <[hidden email]>
>>> wrote:
>>>> 2010/8/20 Ted Kremenek <[hidden email]>:
>>>>> On Aug 19, 2010, at 11:34 PM, Marcin Świderski wrote:
>>>>>> One thing that would be needed and could be questionable (I think)
>>>>>> would be end of scope element. This element wouldn't have start of
>>>>>> scope element counterpart, and it would be used only if sequance of
>>>>>> destructors for automatic objects would not end with block
>> termination
>>>>>> statement. CFG clients would have to ignore it always as it would
be
>> of
>>>>>> no use to them.
>>>>>
>>>>> I'm not quite certain what this means.  We can have a block that
>>>>> contains a bunch destructor calls with a single successor to the
block
>>>>> in the outer scope.  An explicit "end scope element" isn't strictly
>>>>> needed.
>>>>>
>>>>
>>>> Sorry for the late reply. There is a case where we need explicit
scope
>>>> delimiters. We need them to detect resource leaks accurately.
Consider
>>>> code:
>>>
>>> void f() {
>>>   {
>>>       int *p = malloc(4);
>>>   }
>>>   ...
>>> }
>>>
>>> Without a scope delimiter, we can only detect that the block pointed
>>> to by 'p' is leaked when getting to the end of 'f'. That's far from
>>> where it happened.
>>
>> Unless we have destructor nodes for /all/ objects, whether the
destructor
>> is trivial or not.
>>
>> (Not trying to push either side, just presenting another possibility.)
>>
>
> You mean creating a destructor node for 'p' in this case?

Yeah. I guess I don't mean "all objects" but "all variables plus all
temporaries with non-trivial destructors". I figure checkers are going to
be able to visit destructor nodes as well, so this would be almost as good
as a VisitEndScope. (Which itself is more powerful than VisitEndAnalysis,
as you pointed out.)

Implicit call to C++ destructor is part of control flow of a program and it should have representation in CFG. However variable going out of scope is not IMO, so I don't think that this would be a good idea. And for C we can jump over variable initialization. So should we create destructor node for 'p' in this case also?:

void f() {
  goto label;
  {
    int* p = malloc(4);
label:
    ...
  }
  ...
}

As I understand for delimiter approach we need StartScope and EndScope delimiters to know where in CFG we enter and leave scopes but without jump. In case of jumps it would be for client to deduce what scopes will be left and entered? I mean we don't need any more information in CFG for this?

If that is the case I understand that that what is to be done is:
- Adding implicit scopes for selection and iteration statements in case the body is not a compound statement but declares some variables,
- Not adding EndScope delimiter if scope ends with jump as it would produce unnecessary unreachable CFGBlock.

I think that while I'm working on C++ initializers/destructors I could also add this.

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Fwd: CFGElement changes and initializers addition (with patch)

Zhongxing Xu
In reply to this post by Marcin Świderski
2010/8/20 Marcin Świderski <[hidden email]>:

> Sorry, I still forget to choose "Replay all" option...
> W dniu 20 sierpnia 2010 05:35 użytkownik Ted Kremenek <[hidden email]>
> napisał:
>>
>> On Aug 18, 2010, at 12:11 AM, Marcin Świderski wrote:
>>
>> Hi Jordy, Ted
>> Thanks for the appreciation and for comments. Some comments inline:
>> W dniu 18 sierpnia 2010 07:47 użytkownik Jordy Rose
>> <[hidden email]> napisał:
>>>
>>> Wow, good work.
>>>
>>> I agree with most of what Ted said. A few additional comments:
>>>
>>> - I like the single enum rather than kind and subkind, but there are 7
>>> elements, not 6. So it won't fit in a PointerIntPair anyway, though
>>> having
>>> a Stmt* field (or void*) and an enum field would be preferable to two
>>> PointerIntPairs.
>>> I'm still not totally convinced we need scope markers in the CFG, though.
>>>
>> For both destructor kinds I will need space for two pointers. For
>> automatic object I need it's declaration and statement that cause destructor
>> to be called. For temporary object I need expression bound to the temporary
>> and the full expression that contains the temporary.
>>
>> Repeating what I said in my reply to Jordy, I think it's fine to burn an
>> extra pointer if it gives us flexibility to track more information that
>> would be useful for diagnostics.
>
> So I have three acceptable solutions.
> 1. Two pointers for each element.
> 2. One pointer for each element, but in case of destructor it points to
> allocated pair of pointers.
> 3. One pointer for each element, and location of call to destructor is
> calculated on demand.
> 1 is wasteful, because only destructors need two pointers. 3 will give as
> possibly unclean API for clients. 2 seams to be the best and I will
> implement it.
>>

I think 1 is affordable, and it would provide a clean implementation.
Suppose we analyze 1 million lines of code, roughly equivalent to 1
million stmts. Each CFGElement takes a pointer of 8 bytes. It only
consumes 7 MB extra memory.

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
12