How to extract a symbol stored in LazyCompoundVal?

How to extract a symbol stored in LazyCompoundVal?

Torry Chen via cfe-dev, 6/25/19 2:35 AM
My project has a struct type as follows, and I'm writing a checker for some functions that take the struct value as an argument. In the checkPreCall function I see the argument is a LazyCompoundVal, not a symbol as it would be for a primitive type. I tried a few ways to extract the symbol from the LazyCompoundVal with no luck. Hope to get some help here.

struct XY {
  uint64_t X;
  uint64_t Y;
};

...
// checkBind: pos1 -> conj_$3{struct XY, LC1, S63346, #1}
struct XY pos1 = next_pos(...);

// checkPreCall: Arg0: lazyCompoundVal{0x4aa1c58,pos1}
move_to_pos(pos1);


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Re: How to extract a symbol stored in LazyCompoundVal?

Artem Dergachev via cfe-dev, Tue, 25 Jun 2019 at 14:06
The "0x4aa1c58" part of "lazyCompoundVal{0x4aa1c58,pos1}" is a Store object. You can access it with getStore() and then read it with the help of a StoreManager.

Hmm, we seem to already have a convenient API for that: you can call StoreManager::getDefaultBinding(nonloc::LazyCompoundVal) directly if all you need is a default-bound conjured symbol. But if you want to look up, say, specific fields in the structure (X and Y separately), you'll need to do getBinding() on manually constructed FieldRegions (in your case it doesn't look very useful because the whole structure is conjured anyway).

I guess at this point you might like chapter 5 of my old workbook (https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf), as for now it seems to be the only place where different kinds of values are explained.
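
To make the shape of lazyCompoundVal{store, region} concrete, here is a minimal toy model in plain C++. It does not use the Clang headers: ToyStore, ToyLazyCompoundVal, defaultBinding and fieldBinding are hypothetical stand-ins invented for illustration, and the string format of the derived value only loosely imitates the analyzer's dump output. It sketches the two lookups described above: asking for the default binding of the whole region, and asking for one field, which in this model falls back to a value derived from the default binding when the field has no binding of its own.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Toy stand-ins for illustration only; these are NOT the clang::ento classes.
struct ToyStore {
  // Default bindings: one value covering a whole region, e.g. "conj_$3".
  std::map<std::string, std::string> Defaults;
  // Direct bindings, keyed by (region, field).
  std::map<std::pair<std::string, std::string>, std::string> Directs;
};

// Mirrors the printed shape lazyCompoundVal{store, region}.
struct ToyLazyCompoundVal {
  std::shared_ptr<const ToyStore> Snapshot; // the store captured with the value
  std::string Region;                       // the region the value was read from
};

// Toy analogue of asking for the default binding of a lazy compound value.
std::optional<std::string> defaultBinding(const ToyLazyCompoundVal &LCV) {
  assert(LCV.Snapshot && "a lazy compound value always carries a store");
  auto It = LCV.Snapshot->Defaults.find(LCV.Region);
  if (It == LCV.Snapshot->Defaults.end())
    return std::nullopt;
  return It->second;
}

// Toy analogue of reading one field through the captured store.
std::optional<std::string> fieldBinding(const ToyLazyCompoundVal &LCV,
                                        const std::string &Field) {
  assert(LCV.Snapshot && "a lazy compound value always carries a store");
  auto It = LCV.Snapshot->Directs.find(std::make_pair(LCV.Region, Field));
  if (It != LCV.Snapshot->Directs.end())
    return It->second; // a direct binding wins
  if (auto Def = defaultBinding(LCV))
    return "derived{" + *Def + "," + LCV.Region + "->" + Field + "}";
  return std::nullopt;
}
```

With a snapshot whose only binding is a default binding of conj_$3 on pos1, defaultBinding yields conj_$3 and fieldBinding for X yields derived{conj_$3,pos1->X}.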


Re: How to extract a symbol stored in LazyCompoundVal?

Torry Chen via cfe-dev, 6/25/19 9:10 PM
Thank you Artem! It seems StoreManager::getDefaultBinding() won't work if the struct variable is copied. As shown below, getDefaultBinding() returns an undefined SVal.

I could go down into fields to get the derived symbols for X and Y respectively, and then use getParentSymbol() to get the symbol for the whole struct. This looks cumbersome though. Is there a more convenient way to get the symbol for the whole struct in this case?

// checkBind: pos1 -> conj_$3{struct XY, LC1, S45418, #1}
struct XY pos1 = next_pos(10, 20);

// checkBind: pos2 -> lazyCompoundVal{0x5d4bb38,pos1}
struct XY pos2 = pos1;

move_to_pos(pos2);


/** evalCall for move_to_pos():
  SVal Pos = C.getSVal(CE->getArg(0));
  ProgramStateRef State = C.getState();
  StoreManager &StoreMgr = State->getStateManager().getStoreManager();
  auto LCV = Pos.getAs<nonloc::LazyCompoundVal>();
  SVal LCSVal = *StoreMgr.getDefaultBinding(*LCV);
  LCSVal.dump(); // <- Undefined
  ...
  const Store St = LCV->getCVData()->getStore();
  const SVal FieldSVal = StoreMgr.getBinding(St, loc::MemRegionVal(FieldReg));
  FieldSVal.dump(); // <- derived_$4{conj_$3{struct XY, LC1, S45418, #1},pos1->X}

  const auto *SD = dyn_cast<SymbolDerived>(FieldSVal.getAsSymbol());
  const auto ParentSym = SD->getParentSymbol();
  ParentSym->dump(); // <- conj_$3{struct XY, LC1, S45418, #1}
**/


Re: How to extract a symbol stored in LazyCompoundVal?

Artem Dergachev via cfe-dev, Wed, 26 Jun 2019 at 12:15
Hmm, weird.

I suspect that assignment was handled with "small struct optimization", i.e. field-by-field rather than lazily (cf. RegionStoreManager::tryBindSmallStruct).

Could you do a State->dump() to verify that? If it shows that there's no default binding but instead there are two derived symbols bound to two different offsets, then the information about the "whole struct symbol" is already more or less lost: the static analyzer no longer remembers that this whole structure is the same as pos1, but it does remember that its fields, separately, are exactly the same as they were in pos1, which is what you see by looking at the fields separately.

Generally we don't have many checkers that track structures as a whole, and we don't really know what the checker API *should* look like in order to make such checkers easy to implement. The only such checker that we have is IteratorChecker, and it kinda tries to do something, but it's not very convenient. For C++ objects I'm thinking of tracking a "whole structure symbol" artificially, so that it wouldn't have anything to do with the actual contents of the structure but rather with its semantic meaning: it would be preserved by const operations (even if they mutate the memory contents of mutable fields) and through copies/moves, and additionally you would be able to attach state traits to it without thinking about manually modeling copies/moves.

I guess in your case, which seems to be more like a C world, the ad-hoc solution would be to do something like

    let's see...
    pos2.x comes from pos1...
    pos2.y also comes from pos1...
    aha, got it!
    the whole pos2 comes from pos1!

You will *anyway* have to do this because the programmer is free to copy the structure field-by-field manually instead of just assigning the structure. This would also happen in C++ if the structure has a non-trivial constructor. For the same reason it's not enough to check only 'x' but skip 'y': the programmer can easily overwrite one field but not the other field.
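
The field-by-field reasoning above can be sketched in plain C++ as follows. This is a toy model under stated assumptions, not analyzer code: ToyFieldValue and wholeStructOrigin are hypothetical names, and a real checker would obtain the per-field information from the field bindings and their derived symbols. The function reports a whole-struct origin only when every field is derived from the same parent symbol and from the field of the same name, so overwriting a single field is enough to make it give up.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Toy description of what one field of the copy currently holds.
// Hypothetical type for illustration; not clang's SymbolDerived.
struct ToyFieldValue {
  std::string Field;                       // field being inspected, e.g. "X"
  std::optional<std::string> ParentSymbol; // parent of the derived symbol, if any
  std::string ParentField;                 // field of the parent it was derived from
};

// Returns the common parent symbol if all fields come from the same parent,
// each from its own corresponding field; otherwise returns nothing.
std::optional<std::string>
wholeStructOrigin(const std::vector<ToyFieldValue> &Fields) {
  if (Fields.empty())
    return std::nullopt;
  std::optional<std::string> Common;
  for (const ToyFieldValue &F : Fields) {
    assert(!F.Field.empty() && "every inspected field needs a name");
    // The field was overwritten or came from a different field: give up.
    if (!F.ParentSymbol || F.ParentField != F.Field)
      return std::nullopt;
    // Two fields with different parents: the struct is a mix of sources.
    if (Common && *Common != *F.ParentSymbol)
      return std::nullopt;
    Common = F.ParentSymbol;
  }
  return Common;
}
```

Checking every field rather than just the first one is the point of the design: a copy whose Y was later overwritten yields no origin at all, instead of being mistaken for pos1.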

Finally, I'm surprised that it returns an UndefinedVal (i.e., in particular, it allows you to unwrap the Optional) instead of None. This sounds like a bug. But it might be because the structure does indeed have an undefined default binding (e.g., this happens when it's allocated by malloc() or operator new). It'd make sense because assigning every field wouldn't overwrite the default binding. Which, in turn, should remind you that relying on the "structure symbol" to figure out what the contents of the structure are is not a good idea unless your structure is immutable and completely opaque or you somehow know that it's freshly created. But direct bindings to fields are actually always trustworthy. That's how our memory model works.
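
A small toy model may help show why the default binding alone is not a reliable description of the contents. It is written in plain C++ with hypothetical names (ToyRegionBindings, bindField, readField) and makes no claim about how RegionStore is actually implemented. It only illustrates the rule stated above: writing a field adds a direct binding and leaves the default binding in place, and a read prefers the direct binding.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Toy bindings for a single struct region; for illustration only.
struct ToyRegionBindings {
  std::optional<std::string> Default;        // value covering the whole region
  std::map<std::string, std::string> Direct; // per-field direct bindings
};

// Writing a field records a direct binding; the default binding is untouched.
void bindField(ToyRegionBindings &B, const std::string &Field,
               const std::string &Value) {
  assert(!Field.empty() && "field name expected");
  B.Direct[Field] = Value;
}

// Reading a field: a direct binding wins, then the default binding,
// and a region with no bindings at all reads as "Undefined".
std::string readField(const ToyRegionBindings &B, const std::string &Field) {
  auto It = B.Direct.find(Field);
  if (It != B.Direct.end())
    return It->second;
  if (B.Default)
    return "derived{" + *B.Default + "," + Field + "}";
  return "Undefined";
}
```

After both X and Y have been written, the default binding in this model still holds the original symbol even though no field value comes from it any more, which is why the field bindings are the ones to trust.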


Re: How to extract a symbol stored in LazyCompoundVal?

Torry Chen via cfe-dev, 6/26/19 3:22 PM
I tried State->dump() and it shows there is no default binding for a struct variable that copies another. See below. I certainly can (and should) use the field symbols for my work. I was curious about the internals of the analyzer engine. Thank you for the detailed explanation!

struct XY pos1 = next_pos(10, 20); // Binding: pos1 -> conj_$3{struct XY, LC1, S45538, #1}
move_to_pos(pos1);
// evalCall, State->dump():
//
// Store (direct and default bindings), 0x4176b88 :
// (GlobalInternalSpaceRegion,0,default) : conj_$1{int, LC1, S45538, #1}
// (GlobalSystemSpaceRegion,0,default) : conj_$2{int, LC1, S45538, #1}
// (pos1,0,default) : conj_$3{struct XY, LC1, S45538, #1}
//
// Expressions by stack frame:
// #0 Calling main
// (LC1, S45573) move_to_pos : &code{move_to_pos}
// (LC1, S45581) pos1 : lazyCompoundVal{0x4176b88,pos1}
//
// Ranges are empty.
//
// StoreMgr.getBinding(St, loc::MemRegionVal(Arg0Reg)) --> conj_$3{struct XY, LC1, S45538, #1}

struct XY pos2 = pos1; // Binding: pos2 -> lazyCompoundVal{0x4176b88,pos1}
move_to_pos(pos2);
// evalCall, State->dump():
//
// Store (direct and default bindings), 0x4176b88 :
// (GlobalInternalSpaceRegion,0,default) : conj_$1{int, LC1, S45538, #1}
// (GlobalSystemSpaceRegion,0,default) : conj_$2{int, LC1, S45538, #1}
// (pos1,0,default) : conj_$3{struct XY, LC1, S45538, #1}
//
// Expressions by stack frame:
// #0 Calling main
// (LC1, S45618) move_to_pos : &code{move_to_pos}
// (LC1, S45626) pos2 : lazyCompoundVal{0x4182e58,pos2}
//
// Ranges are empty.
//
// StoreMgr.getBinding(St, loc::MemRegionVal(Arg0Reg)) --> Undefined

Re: How to extract a symbol stored in LazyCompoundVal?

Artem Dergachev via cfe-dev, 6/26/19 5:29 PM
You shouldn't do the "StoreMgr.getBinding(St, loc::MemRegionVal(Arg0Reg)) --> Undefined" part; it's not what I suggested, and it doesn't work because Arg0Reg is already "dead" and its value has been garbage-collected (I assume that by Arg0Reg you mean the VarRegion for pos2).

Like, if pos2 is never used later in the program, then we don't need to remember its value. So if move_to_pos(pos2) is the last use of pos2, we'll drop the binding and the Store will no longer contain it.

Accessing dead regions will produce unexpected results; the code for getSVal()/getBinding() assumes that the region is live (or that the expression is active when you try to retrieve the value of an expression).

In this case it's like "hmm, the user is reading from a local variable, but it has no bindings, which means it has never been written to (otherwise I would have remembered it), which means that it's undefined behavior and the contents of the variable are undefined". The real reason why the Store doesn't remember any bindings is that it has correctly forgotten about them.

This is why we include the (lazy) copy of the old Store with the LazyCompoundVal. In order to extract data from lazyCompoundVal{0x4182e58,pos2}, you need to load the variable from the Store 0x4182e58, *not* from the current Store. The region is still live in that old Store, but in the current Store it's no longer there.


This is why in order to obtain

Re: How to extract a symbol stored in LazyCompoundVal?

Artem Dergachev via cfe-dev
Whoops, didn't finish my email :) I mean, the thing with `getDefaultBinding(LCV)` is that it knows that it needs to look up the value in the correct Store, so that's what you should use.
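
The difference between reading from the current Store and reading from the Store captured in the value can be sketched with a toy model in plain C++. Everything here is an assumption made for illustration: SnapshotStore, SnapshotVal, load, removeDead and loadFromLazy are hypothetical names, and the real analyzer's dead-binding cleanup is far more involved. The sketch only shows that a region dropped from the current store reads as "Undefined" there, while the same region is still readable through the snapshot carried by the lazy value.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Toy store: region name -> bound value. Not the analyzer's Store type.
using SnapshotStore = std::map<std::string, std::string>;

// Toy lazy value: remembers the store it was created in plus the region.
struct SnapshotVal {
  std::shared_ptr<const SnapshotStore> Snapshot;
  std::string Region;
};

// A region without any binding reads as "Undefined" in this model.
std::string load(const SnapshotStore &S, const std::string &Region) {
  auto It = S.find(Region);
  return It == S.end() ? "Undefined" : It->second;
}

// Toy dead-binding cleanup: returns a new store without the dead region.
SnapshotStore removeDead(const SnapshotStore &S, const std::string &DeadRegion) {
  SnapshotStore Out = S;
  Out.erase(DeadRegion);
  return Out;
}

// Reads through the snapshot captured in the value, never the current store.
std::string loadFromLazy(const SnapshotVal &V) {
  assert(V.Snapshot && "a lazy value always carries its store");
  return load(*V.Snapshot, V.Region);
}
```

If pos2 is bound in the old store and then removed from the current one, load on the current store gives "Undefined", whereas loadFromLazy still returns the value bound at the time the lazy value was created.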



On 6/26/19 3:22 PM, Torry Chen wrote:
I tried State->dump() and it shows there is no default binding for a struct variable that copies another. See below. I certainly can (and should) use the field symbols for my work. I was curious about the internals of the analyzer engine. Thank you for the detailed explanation!

struct XY pos1 = next_pos(10, 20); // Binding: pos1 -> conj_$3{struct XY, LC1, S45538, #1}
move_to_pos(pos1);
// evalCall, State->dump():
//
// Store (direct and default bindings), 0x4176b88 :
// (GlobalInternalSpaceRegion,0,default) : conj_$1{int, LC1, S45538, #1}
// (GlobalSystemSpaceRegion,0,default) : conj_$2{int, LC1, S45538, #1}
// (pos1,0,default) : conj_$3{struct XY, LC1, S45538, #1}
//
// Expressions by stack frame:
// #0 Calling main                                                                                        
// (LC1, S45573) move_to_pos : &code{move_to_pos}
// (LC1, S45581) pos1 : lazyCompoundVal{0x4176b88,pos1}
//
// Ranges are empty.
//
// StoreMgr.getBinding(St, loc::MemRegionVal(Arg0Reg)) --> conj_$3{struct XY, LC1, S45538, #1}

struct XY pos2 = pos1; // Binding: pos2 -> lazyCompoundVal{0x4176b88,pos1}
move_to_pos(pos2);
// evalCall, State->dump():
//
// Store (direct and default bindings), 0x4176b88 :
// (GlobalInternalSpaceRegion,0,default) : conj_$1{int, LC1, S45538, #1}
// (GlobalSystemSpaceRegion,0,default) : conj_$2{int, LC1, S45538, #1}
// (pos1,0,default) : conj_$3{struct XY, LC1, S45538, #1}
//
// Expressions by stack frame:
// #0 Calling main
// (LC1, S45618) move_to_pos : &code{move_to_pos}
// (LC1, S45626) pos2 : lazyCompoundVal{0x4182e58,pos2}
//
// Ranges are empty.
//
// StoreMgr.getBinding(St, loc::MemRegionVal(Arg0Reg)) --> Undefined

On Wed, 26 Jun 2019 at 12:15, Artem Dergachev <[hidden email]> wrote:
Hmm, weird.

I suspect that assignment was handled with "small struct optimization", i.e. field-by-field rather than lazily (cf. RegionStoreManager::tryBindSmallStruct).

Could you do a State->dump() to verify that? If it shows that there's no default binding but instead there are two derived symbols bound to two different offsets, then the information about the "whole struct symbol" is already more or less lost: the static analyzer no longer remembers that this whole structure is the same as pos1, but it does remember that its fields, separately, are exactly the same as they were in pos1, which is what you see by looking at the fields separately.

Generally we don't have many checkers that track structures as a whole, and we don't really know what the checker API *should* look like in order to make such checkers easy to implement. The only such checker that we have is IteratorChecker, and it kinda tries to do something but it's not very convenient. For C++ objects i'm thinking of tracking a "whole structure symbol" artificially, so that it wouldn't have anything to do with the actual contents of the structure but more with its semantic meaning: it would be preserved by const operations (even if they mutate memory contents of mutable fields) or through copies/moves, and additionally you would be able to attach state traits to it without thinking about manually modeling copies/moves.

I guess in your case, which seems to be more like a C world, the ad-hoc solution would be to do something like

    let's see...
    pos2.x comes from pos1...
    pos2.y also comes from pos1...
    aha, got it!
    the whole pos2 comes from pos1!

You will *anyway* have to do this because the programmer is free to copy the structure field-by-field manually instead of just assigning the structure. This would also happen in C++ if the structure has a non-trivial constructor. For the same reason it's not enough to check only 'x' but skip 'y': the programmer can easily overwrite one field but not the other field.
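
The field-by-field reasoning above could be sketched roughly like this. It is a toy model in plain C++, not analyzer code: ToyDerivedSym, ToyFieldBinding and commonParent are hypothetical names, and symbols are represented as plain strings. The idea is just that the whole struct is said to come from a parent only if every field holds the matching field of that same parent.

```cpp
#include <optional>
#include <string>
#include <vector>

// Toy model of a derived symbol: "the value of Field inside ParentSym".
struct ToyDerivedSym {
  std::string ParentSym;
  std::string Field;
};

// One field of the inspected struct, plus the value bound to it if that
// value happens to be a derived symbol.
struct ToyFieldBinding {
  std::string Field;
  std::optional<ToyDerivedSym> Value;
};

// Returns the common parent symbol only if every field holds the matching
// field of the same parent; otherwise returns nullopt.
std::optional<std::string>
commonParent(const std::vector<ToyFieldBinding> &Fields) {
  if (Fields.empty())
    return std::nullopt;
  std::optional<std::string> Parent;
  for (const auto &F : Fields) {
    // A field that was overwritten, or that holds a different field of the
    // parent, breaks the "whole struct was copied" conclusion.
    if (!F.Value || F.Value->Field != F.Field)
      return std::nullopt;
    if (!Parent)
      Parent = F.Value->ParentSym;
    else if (*Parent != F.Value->ParentSym)
      return std::nullopt;
  }
  return Parent;
}
```

Checking every field, rather than only the first one, is what covers the case where the programmer overwrites one field but not the other.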

Finally, i'm surprised that it returns an UndefinedVal (i.e., in particular, it allows you to unwrap the Optional) instead of None. This sounds like a bug. But it might be because the structure does indeed have an undefined default binding (e.g., this happens when it's allocated by malloc() or operator new). It'd make sense because assigning every field wouldn't overwrite the default binding. Which, in turn, should remind you that relying on the "structure symbol" in order to figure out what the contents of the structure are is not a good idea unless your structure is immutable and completely opaque or you somehow know that it's freshly created. But direct bindings to fields are actually always trustworthy. That's how our memory model works.


On 6/25/19 9:10 PM, Torry Chen wrote:
Thank you Artem! It seems StoreManager::getDefaultBinding() won't work if the struct variable is copied. As shown below, getDefaultBinding() returns an undefined SVal.

I could go down into fields to get the derived symbols for X and Y respectively, and then use getParentSymbol() to get the symbol for the whole struct. This looks cumbersome though. Is there a more convenient way to get the symbol for the whole struct in this case?

// checkBind: pos1 -> conj_$3{struct XY, LC1, S45418, #1}
struct XY pos1 = next_pos(10, 20);

// checkBind: pos2 -> lazyCompoundVal{0x5d4bb38,pos1}
struct XY pos2 = pos1;

move_to_pos(pos2);


/** evalCall for move_to_pos():
  SVal Pos = C.getSVal(CE->getArg(0));
  ProgramStateRef State = C.getState();
  StoreManager &StoreMgr = State->getStateManager().getStoreManager();
  auto LCV = Pos.getAs<nonloc::LazyCompoundVal>();
  SVal LCSVal = *StoreMgr.getDefaultBinding(*LCV);
  LCSVal.dump(); // <- Undefined
  ...
  const Store St = LCV->getCVData()->getStore();
  const SVal FieldSVal = StoreMgr.getBinding(St, loc::MemRegionVal(FieldReg));
  FieldSVal.dump(); // <- derived_$4{conj_$3{struct XY, LC1, S45418, #1},pos1->X}

  const auto *SD = dyn_cast<SymbolDerived>(FieldSVal.getAsSymbol());
  const auto ParentSym = SD->getParentSymbol();
  ParentSym->dump(); // <- conj_$3{struct XY, LC1, S45418, #1}
**/


On Tue, 25 Jun 2019 at 14:06, Artem Dergachev <[hidden email]> wrote:
The "0x4aa1c58" part of "lazyCompoundVal{0x4aa1c58,pos1}" is a Store object. You can access it with getStore() and then read it with the help of a StoreManager.

Hmm, we seem to already have a convenient API for that, you can do StoreManager::getDefaultBinding(nonloc::LazyCompoundVal) directly if all you need is a default-bound conjured symbol. But if you want to lookup, say, specific fields in the structure (X and Y separately), you'll need to do getBinding() on manually constructed FieldRegions (in your case it doesn't look very useful because the whole structure is conjured anyway).

I guess at this point you might like the chapter 5 of my old workbook (https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf), as for now it seems to be the only place where different kinds of values are explained.


On 6/25/19 2:35 AM, Torry Chen via cfe-dev wrote:
My project has a struct type as follows and I'm writing a checker for some functions that take the struct value as an argument. In the checkPreCall function I see the argument is a LazyCompoundVal, not a symbol as it would be for a primitive type. I tried a few ways to extract the symbol from the LazyCompoundVal with no luck. Hope to get some help here.

struct XY {
  uint64_t X;
  uint64_t Y;
};

...
// checkBind: pos1 -> conj_$3{struct XY, LC1, S63346, #1}
struct XY pos1 = next_pos(...);  

// checkPreCall: Arg0: lazyCompoundVal{0x4aa1c58,pos1}
move_to_pos(pos1);



Re: How to extract a symbol stored in LazyCompoundVal?

Joan Lluch via cfe-dev
Sorry, I made a mistake: I was trying both methods but pasted the wrong line in the previous email. The following is what I did with getDefaultBinding; I still get an Undefined SVal when calling with pos2. Both pos1 and pos2 are used after the calls, so they shouldn't be dead.

// evalCall for move_to_pos(struct XY pos):
ProgramStateRef State = C.getState();
const SVal Pos = C.getSVal(CE->getArg(0));
StoreManager &StoreMgr = State->getStateManager().getStoreManager();
auto LCV = Pos.getAs<nonloc::LazyCompoundVal>();
auto LCVal = StoreMgr.getDefaultBinding(*LCV);
LCVal->dump(); // --> conj_$3 when calling with pos1 but Undefined for pos2

// Code to check:
struct XY pos1 = next_pos(10, 20);
move_to_pos(pos1);
struct XY pos2 = pos1;
move_to_pos(pos2);
printf("X %ld Y %ld\n", pos1.X, pos1.Y);
printf("X %ld Y %ld\n", pos2.X, pos2.Y);

On Wed, 26 Jun 2019 at 17:31, Artem Dergachev <[hidden email]> wrote:
On 6/26/19 5:29 PM, Artem Dergachev wrote:
You shouldn't do the "StoreMgr.getBinding(St, loc::MemRegionVal(Arg0Reg)) --> Undefined" part; it's not what i suggested and it doesn't work because Arg0Reg is already "dead" and its value has been garbage-collected (i assume that by Arg0Reg you mean the VarRegion for pos2).


Re: How to extract a symbol stored in LazyCompoundVal?

Joan Lluch via cfe-dev
In reply to this post by Joan Lluch via cfe-dev
Mmm, weird. I tried and for me it crashes unwrapping an empty optional. My only guess is - do you build your clang with assertions enabled? Otherwise your checker would behave in an undefined manner in this scenario. Could you check if the optional actually does contain a value?
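
For what "check if the optional actually does contain a value" means in code, here is a minimal sketch. It uses std::optional as a stand-in for the Optional that getDefaultBinding() returns, and describeDefaultBinding is a hypothetical helper, not part of any real API. Dereferencing an empty std::optional is undefined behavior, so the check has to come before the unwrap.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: describe an optional default binding without ever
// dereferencing an empty optional.
std::string describeDefaultBinding(const std::optional<std::string> &Binding) {
  if (!Binding)
    return "no default binding";  // the empty case is handled explicitly
  return *Binding;                // safe: a value is known to be present
}
```

With a check like this, the empty case shows up as an explicit result instead of whatever an unchecked dereference happens to produce in a build without assertions.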


Re: How to extract a symbol stored in LazyCompoundVal?

Joan Lluch via cfe-dev
(in my case it's None because of small struct optimization; you see the value as lazyCompoundVal{0x5d4bb38,pos1} in checkBind but during the actual bind it gets unwrapped into two symbols)

On 6/27/19 2:22 PM, Artem Dergachev wrote:
Mmm, weird. I tried and for me it crashes unwrapping an empty optional. My only guess is - do you build your clang with assertions enabled? Otherwise your checker would behave in undefined manner in this scenario. Could you check if the optional actually does contain a value?


Re: How to extract a symbol stored in LazyCompoundVal?

Joan Lluch via cfe-dev
I build my Clang in Release mode without assertions. For pos2, LCVal = StoreMgr.getDefaultBinding(*LCV) indeed returns None. I'm surprised LCVal->dump() didn't crash.

So this seems to be an expected behavior for getDefaultBinding() due to small struct optimization. Can I retrieve the two field symbols for pos2 in this case (in evalCall or checkPreCall)? Or you could point me to the code where the small struct optimization happens with unwrapped binds.

On Thu, 27 Jun 2019 at 14:32, Artem Dergachev <[hidden email]> wrote:
(in my case it's None because of small struct optimization; you see the value as lazyCompoundVal{0x5d4bb38,pos1} in checkBind but during the actual bind it gets unwrapped into two symbols)


Re: How to extract a symbol stored in LazyCompoundVal?

Joan Lluch via cfe-dev
Yes, totally, just take a MemRegionManager, construct the right FieldRegion of the LCV's parent region, and then do getBinding() of that region within the LCV's Store.

Small structures are unwrapped in RegionStoreManager::tryBindSmallStruct().
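
As a rough picture of why the default binding is missing while the fields can still be read, here is a toy model in plain C++. None of it is the real RegionStore: ToyRegionStore, getDefault, getField, bindStructCopy, the SmallLimit parameter and the "derived{...}" string format are all assumptions made up for this sketch. It only mimics the idea that a small struct copy is stored as per-field direct bindings rather than as one default binding.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy store: one default binding per region, plus direct per-field bindings.
struct ToyRegionStore {
  std::map<std::string, std::string> Default;
  std::map<std::pair<std::string, std::string>, std::string> Direct;
};

std::optional<std::string> getDefault(const ToyRegionStore &S,
                                      const std::string &Region) {
  auto It = S.Default.find(Region);
  if (It == S.Default.end())
    return std::nullopt;
  return It->second;
}

// A direct binding wins; otherwise the field value is derived from the
// region's default binding, if there is one.
std::optional<std::string> getField(const ToyRegionStore &S,
                                    const std::string &Region,
                                    const std::string &Field) {
  auto It = S.Direct.find(std::make_pair(Region, Field));
  if (It != S.Direct.end())
    return It->second;
  if (auto D = getDefault(S, Region))
    return "derived{" + *D + "," + Region + "->" + Field + "}";
  return std::nullopt;
}

// Copies Src into Dst. Structs with at most SmallLimit fields are copied
// field by field, so Dst ends up with no default binding at all.
void bindStructCopy(ToyRegionStore &S, const std::string &Dst,
                    const std::string &Src,
                    const std::vector<std::string> &Fields,
                    std::size_t SmallLimit) {
  if (Fields.size() <= SmallLimit) {
    for (const auto &F : Fields)
      if (auto V = getField(S, Src, F))
        S.Direct[std::make_pair(Dst, F)] = *V;
    return;
  }
  S.Default[Dst] = "lazyCompoundVal{" + Src + "}";
}
```

In this toy, after copying a two-field struct with a limit of two, asking for the default binding of the destination gives nothing, but asking for each field gives a value derived from the source's symbol.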

 

On 6/27/19 3:45 PM, Torry Chen wrote:
I build my Clang in Release mode without assertions. For pos2, LCVal = StoreMgr.getDefaultBinding(*LCV) indeed returns None. I'm surprised LCVal->dump() didn't crash.

So this seems to be an expected behavior for getDefaultBinding() due to small struct optimization. Can I retrieve the two field symbols for pos2 in this case (in evalCall or checkPreCall)? Or you could point me to the code where the small struct optimization happens with unwrapped binds.
