Allowing checkers to mark symbols as live

classic Classic list List threaded Threaded
32 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Allowing checkers to mark symbols as live

Jordy Rose
The last bit of infrastructure I'd like to add for CStringChecker: the
ability for checkers to mark symbols live. This lets checkers use conjured
symbols for data that lives in the GDM, rather than in the store. This
doesn't affect EvalDeadSymbols.

I think it's pretty straightforward, but since it's another new callback I
figured it was worth waiting for comments before committing. (When we work
out a general mechanism for checker caching we can make this a cached
callback.)
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

MarkLiveSymbols.patch (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Ted Kremenek
Hi Jordy,

I have mixed feelings about adding this, and I think this needs strong justification.  Why is it needed?  What symbols do we need to keep around that aren't related to live regions, or the values bound to live regions?  Right now symbols can indicate that they are derived from other symbols (or are related to regions), and their liveness is extended if they thing they derive from is also live.  If a symbol is no longer live, why do we need to continue tracking state?

My concern about adding this is that it artificially will extend the lifetime of symbols, causing the GRState to accumulate extra state that defeats state caching.  The current set of checkers already do a bad job of removing stale data from the GDM for dead symbols.  This would exacerbate the problem.

Also, from the way you've implemented the callback, the checkers will be called before the state is crawled by RemoveDeadBindings.  This means that a Checker basically gets to unconditionally mark a symbol as live regardless of whether or not the rest of the analysis would mark the symbol live.  That seems too unconditional to me.  It also extends the lifetime of other derived symbols that would otherwise get pruned from the state.

On Aug 8, 2010, at 1:20 PM, Jordy Rose wrote:

> The last bit of infrastructure I'd like to add for CStringChecker: the
> ability for checkers to mark symbols live. This lets checkers use conjured
> symbols for data that lives in the GDM, rather than in the store. This
> doesn't affect EvalDeadSymbols.
>
> I think it's pretty straightforward, but since it's another new callback I
> figured it was worth waiting for comments before committing. (When we work
> out a general mechanism for checker caching we can make this a cached
> callback.)<MarkLiveSymbols.patch>_______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Jordy Rose

Right, here's the use case in CStringChecker. I guess I should have
explained this first.

1. Look up the strlen of some string.
2. Conjure a symbol to represent it if we don't have one.
3. At the end of the statement, symbols are cleaned. Any information about
the strlen is now lost.
4. Look up the strlen of the same string. (!)

I'll admit that marking these symbols live is a little dangerous,
especially without a callback for dead regions being sweeped. But otherwise
we can get incorrect results from cases like this:

void strlen_liveness(const char *x) {
  if (strlen(x) < 5)
    return;
  if (strlen(x) < 5)
    (void)*(char*)0; // expected-warning{{never executed}}
}

This isn't entirely contrived; if the second call to strlen is replaced
with a strcpy, we could warn about buffer overflows.

There is no reference to the symbol besides what's in the GDM. A
SymbolRegionValue is for region contents, not metadata, and a SymbolDerived
requires a symbolic parent. SymbolExtent, besides currently having a
dedicated use, has the wrong semantics in that it isn't dependent on being
at a certain place in the code.

This is intentionally before RemoveDeadBindings, because
RemoveDeadBindings wipes the constraints for a symbol, and that would
defeat the purpose here.

I guess it wasn't as simple as I thought! One possibility would be to
either change SymbolExtent or add a new symbol with path-sensitive fields
like SymbolConjured. Would that be a better solution here?


On Mon, 9 Aug 2010 10:15:51 -0700, Ted Kremenek <[hidden email]>
wrote:
> Hi Jordy,
>
> I have mixed feelings about adding this, and I think this needs strong
> justification.  Why is it needed?  What symbols do we need to keep
around
> that aren't related to live regions, or the values bound to live
regions?
> Right now symbols can indicate that they are derived from other symbols
(or
> are related to regions), and their liveness is extended if they thing
they
> derive from is also live.  If a symbol is no longer live, why do we need
to
> continue tracking state?
>
> My concern about adding this is that it artificially will extend the
> lifetime of symbols, causing the GRState to accumulate extra state that
> defeats state caching.  The current set of checkers already do a bad job
of
> removing stale data from the GDM for dead symbols.  This would
exacerbate
> the problem.
>
> Also, from the way you've implemented the callback, the checkers will be
> called before the state is crawled by RemoveDeadBindings.  This means
that
> a Checker basically gets to unconditionally mark a symbol as live
> regardless of whether or not the rest of the analysis would mark the
symbol
> live.  That seems too unconditional to me.  It also extends the lifetime
of

> other derived symbols that would otherwise get pruned from the state.
>
> On Aug 8, 2010, at 1:20 PM, Jordy Rose wrote:
>
>> The last bit of infrastructure I'd like to add for CStringChecker: the
>> ability for checkers to mark symbols live. This lets checkers use
>> conjured
>> symbols for data that lives in the GDM, rather than in the store. This
>> doesn't affect EvalDeadSymbols.
>>
>> I think it's pretty straightforward, but since it's another new
callback
>> I
>> figured it was worth waiting for comments before committing. (When we
>> work
>> out a general mechanism for checker caching we can make this a cached
>> callback.)
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Ted Kremenek

On Aug 9, 2010, at 12:31 PM, Jordy Rose wrote:

> 1. Look up the strlen of some string.
> 2. Conjure a symbol to represent it if we don't have one.
> 3. At the end of the statement, symbols are cleaned. Any information about
> the strlen is now lost.

I don't see how these symbols are getting lost.  This is the code from SymbolManager::isLive()

  if (const SymbolExtent *extent = dyn_cast<SymbolExtent>(sym)) {
    const MemRegion *Base = extent->getRegion()->getBaseRegion();
    if (const VarRegion *VR = dyn_cast<VarRegion>(Base))
      return isLive(VR);
    if (const SymbolicRegion *SR = dyn_cast<SymbolicRegion>(Base))
      return isLive(SR->getSymbol());
    return false;
  }

As long as the SymbolicRegion 'SR' is live, the symbol representing the extent should be live.  In your example, 'x' binds to a SymbolicRegion.  Since 'x' is live, so is that binding, and thus so is the symbol representing the extent.  If this not working?
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Ted Kremenek
In reply to this post by Jordy Rose
On Aug 9, 2010, at 12:31 PM, Jordy Rose wrote:

> There is no reference to the symbol besides what's in the GDM. A
> SymbolRegionValue is for region contents, not metadata, and a SymbolDerived
> requires a symbolic parent. SymbolExtent, besides currently having a
> dedicated use, has the wrong semantics in that it isn't dependent on being
> at a certain place in the code.

The extent is a property of the region.  It's only symbolic if the region is symbolic; otherwise it is a value we should be able to compute up front.

For symbolic constraints, the dependence on where we are in the code is modeled by the constraints on the symbol.  The symbol itself doesn't need to record any more context other than being based on the symbolic region (which should capture all the context we need).  Why do we need to capture any more context in the SymbolExtent?
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Ted Kremenek
In reply to this post by Jordy Rose
On Aug 9, 2010, at 12:31 PM, Jordy Rose wrote:

> I guess it wasn't as simple as I thought! One possibility would be to
> either change SymbolExtent or add a new symbol with path-sensitive fields
> like SymbolConjured. Would that be a better solution here?

I could have overlooked something here, but my impression is that I don't think any of this is needed.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Jordy Rose
In reply to this post by Ted Kremenek

On Mon, 9 Aug 2010 12:56:07 -0700, Ted Kremenek <[hidden email]>
wrote:

> On Aug 9, 2010, at 12:31 PM, Jordy Rose wrote:
>
>> 1. Look up the strlen of some string.
>> 2. Conjure a symbol to represent it if we don't have one.
>> 3. At the end of the statement, symbols are cleaned. Any information
>> about
>> the strlen is now lost.
>
> I don't see how these symbols are getting lost.  This is the code from
> SymbolManager::isLive()
>
>   if (const SymbolExtent *extent = dyn_cast<SymbolExtent>(sym)) {
>     const MemRegion *Base = extent->getRegion()->getBaseRegion();
>     if (const VarRegion *VR = dyn_cast<VarRegion>(Base))
>       return isLive(VR);
>     if (const SymbolicRegion *SR = dyn_cast<SymbolicRegion>(Base))
>       return isLive(SR->getSymbol());
>     return false;
>   }
>
> As long as the SymbolicRegion 'SR' is live, the symbol representing the
> extent should be live.  In your example, 'x' binds to a SymbolicRegion.
> Since 'x' is live, so is that binding, and thus so is the symbol
> representing the extent.  If this not working?

No, it's just that the extent usually isn't the string length. For this
definition the strlen and extent are clearly different.

char x[5] = "abc"

In addition, information about string lengths is easily invalidated (hence
the motivation for ProcessRegionChange), while extents stay constant for
the entire life of the region. So SymbolExtent would have to be modified to
be suitable for this.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Ted Kremenek
On Aug 9, 2010, at 1:14 PM, Jordy Rose wrote:

>> As long as the SymbolicRegion 'SR' is live, the symbol representing the
>> extent should be live.  In your example, 'x' binds to a SymbolicRegion.
>> Since 'x' is live, so is that binding, and thus so is the symbol
>> representing the extent.  If this not working?
>
> No, it's just that the extent usually isn't the string length.

Ah right.

> For this
> definition the strlen and extent are clearly different.
>
> char x[5] = "abc"
>
> In addition, information about string lengths is easily invalidated (hence
> the motivation for ProcessRegionChange), while extents stay constant for
> the entire life of the region. So SymbolExtent would have to be modified to
> be suitable for this.

You are right that the length of a string can change over time, but I think we can still represent them in a way similar to SymbolExtent.  I still don't believe that we need different symbols to represent the string length, but rather that the constraints on that symbol are allowed to change.

>From SymbolManager::isLive()'s perspective, how about allowing a "checker-defined" derived symbol that is either based on a symbol or a MemRegion.  It can then have an associated checker "tag" (like we do for tagging ExplodedNodes).  The tag would be used to distinguish different kinds of derived symbols, but SymbolManager would keep the symbol live as long as the thing it is derived from is live.  Would that work?

Note that this approach also allows us to model SymbolExtents as just a special case, with a reserved "tag" representing extents.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Jordy Rose

On Mon, 9 Aug 2010 13:31:51 -0700, Ted Kremenek <[hidden email]>
wrote:
> On Aug 9, 2010, at 1:14 PM, Jordy Rose wrote:
>
>>> As long as the SymbolicRegion 'SR' is live, the symbol representing
the
>>> extent should be live.  In your example, 'x' binds to a
SymbolicRegion.

>>> Since 'x' is live, so is that binding, and thus so is the symbol
>>> representing the extent.  If this not working?
>>
>> No, it's just that the extent usually isn't the string length.
>
> Ah right.
>
>> For this
>> definition the strlen and extent are clearly different.
>>
>> char x[5] = "abc"
>>
>> In addition, information about string lengths is easily invalidated
>> (hence
>> the motivation for ProcessRegionChange), while extents stay constant
for
>> the entire life of the region. So SymbolExtent would have to be
modified
>> to
>> be suitable for this.
>
> You are right that the length of a string can change over time, but I
> think we can still represent them in a way similar to SymbolExtent.  I
> still don't believe that we need different symbols to represent the
string
> length, but rather that the constraints on that symbol are allowed to
> change.
>
> From SymbolManager::isLive()'s perspective, how about allowing a
> "checker-defined" derived symbol that is either based on a symbol or a
> MemRegion.  It can then have an associated checker "tag" (like we do for
> tagging ExplodedNodes).  The tag would be used to distinguish different
> kinds of derived symbols, but SymbolManager would keep the symbol live
as
> long as the thing it is derived from is live.  Would that work?
>
> Note that this approach also allows us to model SymbolExtents as just a
> special case, with a reserved "tag" representing extents.


I've started working on this, but I'm a little worried about "constraints
on that symbol [being] allowed to change". Up till now symbols have been
immutable. While it wouldn't be too hard to add a
ClearConstraints(SymbolRef) method to ConstraintManager, I'm concerned that
it won't be clear when to clear constraints on a symbol and when it's
better to replace it with a new one.

We could restrict constraint-clearing to these region metadata symbols,
since they don't have a way to be replaced. But it still feels odd.

I'll keep going in this direction for now.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Ted Kremenek

On Aug 9, 2010, at 6:16 PM, Jordy Rose wrote:

> I've started working on this, but I'm a little worried about "constraints
> on that symbol [being] allowed to change". Up till now symbols have been
> immutable. While it wouldn't be too hard to add a
> ClearConstraints(SymbolRef) method to ConstraintManager, I'm concerned that
> it won't be clear when to clear constraints on a symbol and when it's
> better to replace it with a new one.

I was thinking about this, and this is a very valid point.  We've been treating symbols as "immutable" values that we gradually add additional constraints.

> We could restrict constraint-clearing to these region metadata symbols,
> since they don't have a way to be replaced. But it still feels odd.

Yes it does feel odd.  Odd enough that it's probably wrong.

I guess you are right that the length is a location-sensitive symbol (similar to conjured symbols, with the creation points embedded into the symbol), but that doesn't 100% feel like the right solution either.  Whenever the string length may have changed (e.g., by a store), I suppose that would be a new symbol value to track the length.  But even that is overly conservative; the length of the string doesn't change unless you insert a null character.

At any rate, if you decide to model the extent with a new symbol, the best solution to me from the liveness perspective is to use something that is similar to a derived symbol (like SymbolExtent) that is live (at least from the perspective of SymbolManager::isLive) if the MemRegion/Symbol it derives from is live.  It could include a creation point like conjured symbols if you wanted to incorporate that information in as well, but adding that location-context unless we need it seems undesirable to me (since it artificially causes path bifurcation).
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Ted Kremenek
In reply to this post by Jordy Rose
On Aug 9, 2010, at 6:16 PM, Jordy Rose wrote:

> We could restrict constraint-clearing to these region metadata symbols,
> since they don't have a way to be replaced. But it still feels odd.
>
> I'll keep going in this direction for now

You've convinced me that this isn't the right solution on its own; we should keep symbols as being immutable values.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Zhongxing Xu
In reply to this post by Ted Kremenek
I'm inclined to Jordy's original proposal that Checkers should be able
to control some symbols that only they know what the symbol means,
when the symbol dies.

The benefit is that symbols still remains what it is: a representation
of a value generated somewhere, and immutable. The checker could
generate a symbol to represent what it thinks representing the length
of the string, and next time when it change its mind, it can generate
a new symbol, and mark the old one dead. This seems very natural to
me. And it won't pollute the state with unnecessary symbols, because
it knows exactly when the symbol is live and dead.

On Tue, Aug 10, 2010 at 1:15 AM, Ted Kremenek <[hidden email]> wrote:
> Hi Jordy,
>
> I have mixed feelings about adding this, and I think this needs strong justification.  Why is it needed?  What symbols do we need to keep around that aren't related to live regions, or the values bound to live regions?  Right now symbols can indicate that they are derived from other symbols (or are related to regions), and their liveness is extended if they thing they derive from is also live.  If a symbol is no longer live, why do we need to continue tracking state?
>
> My concern about adding this is that it artificially will extend the lifetime of symbols, causing the GRState to accumulate extra state that defeats state caching.  The current set of checkers already do a bad job of removing stale data from the GDM for dead symbols.

Why? Checkers get informed by callback EvalDeadSymbols().

>
> Also, from the way you've implemented the callback, the checkers will be called before the state is crawled by RemoveDeadBindings.  This means that a Checker basically gets to unconditionally mark a symbol as live regardless of whether or not the rest of the analysis would mark the symbol live.  That seems too unconditional to me.  It also extends the lifetime of other derived symbols that would otherwise get pruned from the state.
>
> On Aug 8, 2010, at 1:20 PM, Jordy Rose wrote:
>
>> The last bit of infrastructure I'd like to add for CStringChecker: the
>> ability for checkers to mark symbols live. This lets checkers use conjured
>> symbols for data that lives in the GDM, rather than in the store. This
>> doesn't affect EvalDeadSymbols.
>>
>> I think it's pretty straightforward, but since it's another new callback I
>> figured it was worth waiting for comments before committing. (When we work
>> out a general mechanism for checker caching we can make this a cached
>> callback.)<MarkLiveSymbols.patch>_______________________________________________
>> cfe-dev mailing list
>> [hidden email]
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Ted Kremenek
On Aug 9, 2010, at 7:17 PM, Zhongxing Xu wrote:

I'm inclined to Jordy's original proposal that Checkers should be able
to control some symbols that only they know what the symbol means,
when the symbol dies.

My point was that the checker shouldn't have to do anything special to extend the lifetime of a symbol.  For symbols that annotate MemRegions, the lifetime of the symbol is inherently bound to the lifetime of the MemRegion.  It seems far more natural to me, more efficient, and far less error prone to have SymbolManager::isLive() understand that the lifetime of one symbol is tied to another (declaratively) then having the checker doing something ad hoc on the side.


The benefit is that symbols still remains what it is: a representation
of a value generated somewhere, and immutable. The checker could
generate a symbol to represent what it thinks representing the length
of the string, and next time when it change its mind, it can generate
a new symbol, and mark the old one dead. This seems very natural to
me. And it won't pollute the state with unnecessary symbols, because
it knows exactly when the symbol is live and dead.

On Tue, Aug 10, 2010 at 1:15 AM, Ted Kremenek <[hidden email]> wrote:
Hi Jordy,

I have mixed feelings about adding this, and I think this needs strong justification. Why is it needed?  What symbols do we need to keep around that aren't related to live regions, or the values bound to live regions?  Right now symbols can indicate that they are derived from other symbols (or are related to regions), and their liveness is extended if they thing they derive from is also live.  If a symbol is no longer live, why do we need to continue tracking state?

My concern about adding this is that it artificially will extend the lifetime of symbols, causing the GRState to accumulate extra state that defeats state caching.  The current set of checkers already do a bad job of removing stale data from the GDM for dead symbols.

Why? Checkers get informed by callback EvalDeadSymbols().

That requires Checkers being good citizens and implement that callback.  The only Checkers that should need to implement that callback are ones that do something special (e.g., leak detection checkers) when a symbol is no longer live.  The more we can move to a place where the management of symbols is more declarative than imperative then the performance problems involved with artificially keeping track of a symbol longer than necessary just get defined away.

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Zhongxing Xu
On Tue, Aug 10, 2010 at 11:43 AM, Ted Kremenek <[hidden email]> wrote:

> On Aug 9, 2010, at 7:17 PM, Zhongxing Xu wrote:
>
> I'm inclined to Jordy's original proposal that Checkers should be able
> to control some symbols that only they know what the symbol means,
> when the symbol dies.
>
> My point was that the checker shouldn't have to do anything special to
> extend the lifetime of a symbol.  For symbols that annotate MemRegions, the
> lifetime of the symbol is inherently bound to the lifetime of the MemRegion.
>  It seems far more natural to me, more efficient, and far less error prone
> to have SymbolManager::isLive() understand that the lifetime of one symbol
> is tied to another (declaratively) then having the checker doing something
> ad hoc on the side.
>

Make sense. I overlooked the fact that checkers alone can't decide the
liveness of a symbol. Then some kind of derived symbol is justified.
To keep symbols immutable, checkers need to make multiple symbols.
SymbolReaper needs a way to recognize the 'current' symbol.

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Ted Kremenek
On Aug 9, 2010, at 9:09 PM, Zhongxing Xu wrote:

> On Tue, Aug 10, 2010 at 11:43 AM, Ted Kremenek <[hidden email]> wrote:
>> On Aug 9, 2010, at 7:17 PM, Zhongxing Xu wrote:
>>
>> I'm inclined to Jordy's original proposal that Checkers should be able
>> to control some symbols that only they know what the symbol means,
>> when the symbol dies.
>>
>> My point was that the checker shouldn't have to do anything special to
>> extend the lifetime of a symbol.  For symbols that annotate MemRegions, the
>> lifetime of the symbol is inherently bound to the lifetime of the MemRegion.
>>  It seems far more natural to me, more efficient, and far less error prone
>> to have SymbolManager::isLive() understand that the lifetime of one symbol
>> is tied to another (declaratively) then having the checker doing something
>> ad hoc on the side.
>>
>
> Make sense. I overlooked the fact that checkers alone can't decide the
> liveness of a symbol. Then some kind of derived symbol is justified.
> To keep symbols immutable, checkers need to make multiple symbols.
> SymbolReaper needs a way to recognize the 'current' symbol.

My thoughts exactly.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Jordy Rose
In reply to this post by Ted Kremenek

On Mon, 9 Aug 2010 20:43:31 -0700, Ted Kremenek <[hidden email]>
wrote:
> On Aug 9, 2010, at 7:17 PM, Zhongxing Xu wrote:
>
>> I'm inclined to Jordy's original proposal that Checkers should be able
>> to control some symbols that only they know what the symbol means,
>> when the symbol dies.
>
> My point was that the checker shouldn't have to do anything special to
> extend the lifetime of a symbol.  For symbols that annotate MemRegions,
the
> lifetime of the symbol is inherently bound to the lifetime of the
> MemRegion.  It seems far more natural to me, more efficient, and far
less
> error prone to have SymbolManager::isLive() understand that the lifetime
of
> one symbol is tied to another (declaratively) then having the checker
doing
> something ad hoc on the side.

How can you kill such a symbol, though? If we keep immutable symbols
around until their associated regions die, and the string length gets
invalidated and requested several times, we end up with several dead
symbols that are still tied to a live region.

(I'm not objecting to finding a better solution than MarkLiveSymbols, just
trying to bring up all the requirements. The current scheme has the problem
that the checker doesn't know when the symbols /should/ die.)
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Ted Kremenek
On Aug 9, 2010, at 9:30 PM, Jordy Rose wrote:

> How can you kill such a symbol, though? If we keep immutable symbols
> around until their associated regions die, and the string length gets
> invalidated and requested several times, we end up with several dead
> symbols that are still tied to a live region.
>
> (I'm not objecting to finding a better solution than MarkLiveSymbols, just
> trying to bring up all the requirements. The current scheme has the problem
> that the checker doesn't know when the symbols /should/ die.)

Thanks Jordy, I think I understand now.   If I understand things correctly, the problem here is that RemoveDeadBindings only crawls the Environment and the Store.  This extra symbol is not referenced in either, but instead only lives in the GDM (which records the symbol currently used to represent the string length).  So the purpose of MarkLiveSymbols is to find symbols that were not covered by either.  Not having MarkLiveSymbols only worked if we only used a single symbol to represent the string length (in which case a side map in the GDM isn't necessary), instead of having different symbols.

In that case, I'd be fine with adding MarkLiveSymbols, as long as the mechanism to determine whether or not a symbol was dead was done by SymbolReaper and SymbolManager, and not logic specific to the Checker.  Basically, MarkLiveSymbols allows GRExprEngine to crawl the set of symbols that could be live/dead, and no more.  As we evolve our APIs, and possibly make the GDM a little less opaque, it would be nice that "symbol maps" in the GDM were just automatically crawled instead of the Checker having to implement boilerplate like this.

@Zhongxing: what do you think?
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Zhongxing Xu
On Tue, Aug 10, 2010 at 12:50 PM, Ted Kremenek <[hidden email]> wrote:

> On Aug 9, 2010, at 9:30 PM, Jordy Rose wrote:
>
>> How can you kill such a symbol, though? If we keep immutable symbols
>> around until their associated regions die, and the string length gets
>> invalidated and requested several times, we end up with several dead
>> symbols that are still tied to a live region.
>>
>> (I'm not objecting to finding a better solution than MarkLiveSymbols, just
>> trying to bring up all the requirements. The current scheme has the problem
>> that the checker doesn't know when the symbols /should/ die.)
>
> Thanks Jordy, I think I understand now.   If I understand things correctly, the problem here is that RemoveDeadBindings only crawls the Environment and the Store.  This extra symbol is not referenced in either, but instead only lives in the GDM (which records the symbol currently used to represent the string length).  So the purpose of MarkLiveSymbols is to find symbols that were not covered by either.  Not having MarkLiveSymbols only worked if we only used a single symbol to represent the string length (in which case a side map in the GDM isn't necessary), instead of having different symbols.
>
> In that case, I'd be fine with adding MarkLiveSymbols, as long as the mechanism to determine whether or not a symbol was dead was done by SymbolReaper and SymbolManager, and not logic specific to the Checker.  Basically, MarkLiveSymbols allows GRExprEngine to crawl the set of symbols that could be live/dead, and no more.  As we evolve our APIs, and possibly make the GDM a little less opaque, it would be nice that "symbol maps" in the GDM were just automatically crawled instead of the Checker having to implement boilerplate like this.
>
> @Zhongxing: what do you think?

Exactly.  We want to keep symbols immutable, and this requires the
checker to generate multiple symbols to associated with a region. The
liveness of these symbols are determined by two components: the
checker knows which is the current symbol; the symbol reaper knows
whether the symbol's associated region is still live.

So MarkLiveSymbols only let the checker say 'I think these symbols
should be live', nothing more. It's still the symbol reaper to decide
the symbol liveness.

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Jordy Rose

On Tue, 10 Aug 2010 13:05:28 +0800, Zhongxing Xu <[hidden email]>
wrote:
> On Tue, Aug 10, 2010 at 12:50 PM, Ted Kremenek <[hidden email]>
wrote:

>> On Aug 9, 2010, at 9:30 PM, Jordy Rose wrote:
>>
>>> How can you kill such a symbol, though? If we keep immutable symbols
>>> around until their associated regions die, and the string length gets
>>> invalidated and requested several times, we end up with several dead
>>> symbols that are still tied to a live region.
>>>
>>> (I'm not objecting to finding a better solution than MarkLiveSymbols,
>>> just
>>> trying to bring up all the requirements. The current scheme has the
>>> problem
>>> that the checker doesn't know when the symbols /should/ die.)
>>
>> Thanks Jordy, I think I understand now.   If I understand things
>> correctly, the problem here is that RemoveDeadBindings only crawls the
>> Environment and the Store.  This extra symbol is not referenced in
>> either, but instead only lives in the GDM (which records the symbol
>> currently used to represent the string length).  So the purpose of
>> MarkLiveSymbols is to find symbols that were not covered by either.
 Not
>> having MarkLiveSymbols only worked if we only used a single symbol to
>> represent the string length (in which case a side map in the GDM isn't
>> necessary), instead of having different symbols.
>>
>> In that case, I'd be fine with adding MarkLiveSymbols, as long as the
>> mechanism to determine whether or not a symbol was dead was done by
>> SymbolReaper and SymbolManager, and not logic specific to the Checker.
>>  Basically, MarkLiveSymbols allows GRExprEngine to crawl the set of
>> symbols that could be live/dead, and no more.  As we evolve our APIs,
and
>> possibly make the GDM a little less opaque, it would be nice that
"symbol

>> maps" in the GDM were just automatically crawled instead of the Checker
>> having to implement boilerplate like this.
>>
>> @Zhongxing: what do you think?
>
> Exactly.  We want to keep symbols immutable, and this requires the
> checker to generate multiple symbols to associated with a region. The
> liveness of these symbols are determined by two components: the
> checker knows which is the current symbol; the symbol reaper knows
> whether the symbol's associated region is still live.
>
> So MarkLiveSymbols only let the checker say 'I think these symbols
> should be live', nothing more. It's still the symbol reaper to decide
> the symbol liveness.

Right now if you say "this should be live", the SymbolReaper will save
that symbol unconditionally. The trouble is what Ted originally brought up:
that even when the associated region dies the strlen symbol will still be
kept around. There's no way (yet) to say "this should be live if its
associated region is live and dead otherwise".

Off-the-top-of-my-head ideas to get around this include adding a callback
for dead region sweeping and making a new symbol that lives if a parent
region lives /and/ it's been markLive()ed. Neither of these gets rid of
MarkLiveSymbols, though.

If we wanted crawlable symbol maps, we could probably work something into
GRStateTrait, much like how SVals allow you to iterate over the symbol refs
in the expression.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Allowing checkers to mark symbols as live

Zhongxing Xu
On Tue, Aug 10, 2010 at 4:14 PM, Jordy Rose <[hidden email]> wrote:

>
> On Tue, 10 Aug 2010 13:05:28 +0800, Zhongxing Xu <[hidden email]>
> wrote:
>> On Tue, Aug 10, 2010 at 12:50 PM, Ted Kremenek <[hidden email]>
> wrote:
>>> On Aug 9, 2010, at 9:30 PM, Jordy Rose wrote:
>>>
>>>> How can you kill such a symbol, though? If we keep immutable symbols
>>>> around until their associated regions die, and the string length gets
>>>> invalidated and requested several times, we end up with several dead
>>>> symbols that are still tied to a live region.
>>>>
>>>> (I'm not objecting to finding a better solution than MarkLiveSymbols,
>>>> just
>>>> trying to bring up all the requirements. The current scheme has the
>>>> problem
>>>> that the checker doesn't know when the symbols /should/ die.)
>>>
>>> Thanks Jordy, I think I understand now.   If I understand things
>>> correctly, the problem here is that RemoveDeadBindings only crawls the
>>> Environment and the Store.  This extra symbol is not referenced in
>>> either, but instead only lives in the GDM (which records the symbol
>>> currently used to represent the string length).  So the purpose of
>>> MarkLiveSymbols is to find symbols that were not covered by either.
>  Not
>>> having MarkLiveSymbols only worked if we only used a single symbol to
>>> represent the string length (in which case a side map in the GDM isn't
>>> necessary), instead of having different symbols.
>>>
>>> In that case, I'd be fine with adding MarkLiveSymbols, as long as the
>>> mechanism to determine whether or not a symbol was dead was done by
>>> SymbolReaper and SymbolManager, and not logic specific to the Checker.
>>>  Basically, MarkLiveSymbols allows GRExprEngine to crawl the set of
>>> symbols that could be live/dead, and no more.  As we evolve our APIs,
> and
>>> possibly make the GDM a little less opaque, it would be nice that
> "symbol
>>> maps" in the GDM were just automatically crawled instead of the Checker
>>> having to implement boilerplate like this.
>>>
>>> @Zhongxing: what do you think?
>>
>> Exactly.  We want to keep symbols immutable, and this requires the
>> checker to generate multiple symbols to associated with a region. The
>> liveness of these symbols are determined by two components: the
>> checker knows which is the current symbol; the symbol reaper knows
>> whether the symbol's associated region is still live.
>>
>> So MarkLiveSymbols only let the checker say 'I think these symbols
>> should be live', nothing more. It's still the symbol reaper to decide
>> the symbol liveness.
>
> Right now if you say "this should be live", the SymbolReaper will save
> that symbol unconditionally. The trouble is what Ted originally brought up:
> that even when the associated region dies the strlen symbol will still be
> kept around. There's no way (yet) to say "this should be live if its
> associated region is live and dead otherwise".

Just as your patch, before RemoveDeadBindings, checkers mark live
symbols, SymbolReaper cache them. When SymbolReaper is queried about
the liveness of those symbols, it first check if its associated region
is alive, if so, it return true. otherwise return false.

>
> Off-the-top-of-my-head ideas to get around this include adding a callback
> for dead region sweeping and making a new symbol that lives if a parent
> region lives /and/ it's been markLive()ed. Neither of these gets rid of
> MarkLiveSymbols, though.
>
> If we wanted crawlable symbol maps, we could probably work something into
> GRStateTrait, much like how SVals allow you to iterate over the symbol refs
> in the expression.
>

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
12