Assertion on valid use of highlighting whitespace

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Assertion on valid use of highlighting whitespace

Tom Care
Hi all,

Whilst fixing bug 7377, I ran into an assertion when trying to highlight code in the following warning:

flag ' ' results in undefined behavior in 'p' conversion specifier

triggered by: printf("% p", p);

Assertion failed: (StartColNo <= EndColNo && "Trying to highlight whitespace??"), function HighlightRange, file /Volumes/Data/Users/tcare/Projects/llvm/tools/clang/lib/Frontend/TextDiagnosticPrinter.cpp, line 138.

The assertion is triggered by trying to highlight the format specifier ("% p")

This assertion seems valid, except for this rare case. Maybe there are others?

Any ideas on how to fix this?

Tom
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Assertion on valid use of highlighting whitespace

Tom Care
I think there might be other things broken in this function too.

I'm having problems trying to get highlighting/fixits working for parts of printf statements (rather than the whole thing).

Currently, if I give a SourceLocation within a printf formatting string, it will highlight it from the beginning until the end of the format string (rather than the end I specified). For example:

/tmp/fixit.c:13:15: warning: precision used with 'n' conversion specifier,
      resulting in undefined behavior [-Wformat]
  printf("%100.100n", (int*) 0);
              ^~~~~

When what I expect is:
/tmp/fixit.c:13:15: warning: precision used with 'n' conversion specifier,
      resulting in undefined behavior [-Wformat]
  printf("%100.100n", (int*) 0);
              ^~~~

This ends up breaking the fixits, which delete the rest of the format string rather than the invalid part.

The offending line is lib/Frontend/TextDiagnosticPrinter.cpp:116
      // Add in the length of the token, so that we cover multi-char tokens.
      EndColNo += Lexer::MeasureTokenLength(End, SM, *LangOpts);

I'm sure this line has importance, but I don't think it is correct.

I'm a bit lost in this section! Does anyone have any ideas?

Tom
On Jun 17, 2010, at 12:05 PM, Tom Care wrote:

Hi all,

Whilst fixing bug 7377, I ran into an assertion when trying to highlight code in the following warning:

flag ' ' results in undefined behavior in 'p' conversion specifier

triggered by: printf("% p", p);

Assertion failed: (StartColNo <= EndColNo && "Trying to highlight whitespace??"), function HighlightRange, file /Volumes/Data/Users/tcare/Projects/llvm/tools/clang/lib/Frontend/TextDiagnosticPrinter.cpp, line 138.

The assertion is triggered by trying to highlight the format specifier ("% p")

This assertion seems valid, except for this rare case. Maybe there are others?

Any ideas on how to fix this?

Tom
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Assertion on valid use of highlighting whitespace

Douglas Gregor

On Jun 17, 2010, at 7:47 PM, Tom Care wrote:

I think there might be other things broken in this function too.

I'm having problems trying to get highlighting/fixits working for parts of printf statements (rather than the whole thing).

Currently, if I give a SourceLocation within a printf formatting string, it will highlight it from the beginning until the end of the format string (rather than the end I specified). For example:

/tmp/fixit.c:13:15: warning: precision used with 'n' conversion specifier,
      resulting in undefined behavior [-Wformat]
  printf("%100.100n", (int*) 0);
              ^~~~~

When what I expect is:
/tmp/fixit.c:13:15: warning: precision used with 'n' conversion specifier,
      resulting in undefined behavior [-Wformat]
  printf("%100.100n", (int*) 0);
              ^~~~

This ends up breaking the fixits, which delete the rest of the format string rather than the invalid part.

The offending line is lib/Frontend/TextDiagnosticPrinter.cpp:116
      // Add in the length of the token, so that we cover multi-char tokens.
      EndColNo += Lexer::MeasureTokenLength(End, SM, *LangOpts);

I'm sure this line has importance, but I don't think it is correct.

I'm a bit lost in this section! Does anyone have any ideas?

The mental model for a source range is that it is a pair of source locations [B, E), where both B and E are expected to point at the beginning of the token. That line above is relexing the token at the location E to find the end of the token. Since you've adjusted the ranges to point at specific characters, when you're getting is E adjusted to the end of the "token" where you're pointing... which is in the middle of a string literal, hence the somewhat odd highlighting behavior.

We would need to introduce a new notion (e.g., a new "CharacterSourceRange") that forgoes this adjustment and treats the ending source location in the range as character-level positions in the source rather than token-level positions. CharacterSourceRange would only be used in limited places when needed, such as FixItHint and perhaps Diagnostic, and all affected clients would need to be updated.

  - Doug


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Assertion on valid use of highlighting whitespace

Chris Lattner
I'll implement this.

On Jun 18, 2010, at 1:26 PM, Douglas Gregor wrote:


On Jun 17, 2010, at 7:47 PM, Tom Care wrote:

I think there might be other things broken in this function too.

I'm having problems trying to get highlighting/fixits working for parts of printf statements (rather than the whole thing).

Currently, if I give a SourceLocation within a printf formatting string, it will highlight it from the beginning until the end of the format string (rather than the end I specified). For example:

/tmp/fixit.c:13:15: warning: precision used with 'n' conversion specifier,
      resulting in undefined behavior [-Wformat]
  printf("%100.100n", (int*) 0);
              ^~~~~

When what I expect is:
/tmp/fixit.c:13:15: warning: precision used with 'n' conversion specifier,
      resulting in undefined behavior [-Wformat]
  printf("%100.100n", (int*) 0);
              ^~~~

This ends up breaking the fixits, which delete the rest of the format string rather than the invalid part.

The offending line is lib/Frontend/TextDiagnosticPrinter.cpp:116
      // Add in the length of the token, so that we cover multi-char tokens.
      EndColNo += Lexer::MeasureTokenLength(End, SM, *LangOpts);

I'm sure this line has importance, but I don't think it is correct.

I'm a bit lost in this section! Does anyone have any ideas?

The mental model for a source range is that it is a pair of source locations [B, E), where both B and E are expected to point at the beginning of the token. That line above is relexing the token at the location E to find the end of the token. Since you've adjusted the ranges to point at specific characters, when you're getting is E adjusted to the end of the "token" where you're pointing... which is in the middle of a string literal, hence the somewhat odd highlighting behavior.

We would need to introduce a new notion (e.g., a new "CharacterSourceRange") that forgoes this adjustment and treats the ending source location in the range as character-level positions in the source rather than token-level positions. CharacterSourceRange would only be used in limited places when needed, such as FixItHint and perhaps Diagnostic, and all affected clients would need to be updated.

  - Doug

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Assertion on valid use of highlighting whitespace

Chris Lattner
In reply to this post by Douglas Gregor
r106338 should do this.  Please let me know if you run into any problems with it.

-Chris

On Jun 18, 2010, at 1:26 PM, Douglas Gregor wrote:


On Jun 17, 2010, at 7:47 PM, Tom Care wrote:

I think there might be other things broken in this function too.

I'm having problems trying to get highlighting/fixits working for parts of printf statements (rather than the whole thing).

Currently, if I give a SourceLocation within a printf formatting string, it will highlight it from the beginning until the end of the format string (rather than the end I specified). For example:

/tmp/fixit.c:13:15: warning: precision used with 'n' conversion specifier,
      resulting in undefined behavior [-Wformat]
  printf("%100.100n", (int*) 0);
              ^~~~~

When what I expect is:
/tmp/fixit.c:13:15: warning: precision used with 'n' conversion specifier,
      resulting in undefined behavior [-Wformat]
  printf("%100.100n", (int*) 0);
              ^~~~

This ends up breaking the fixits, which delete the rest of the format string rather than the invalid part.

The offending line is lib/Frontend/TextDiagnosticPrinter.cpp:116
      // Add in the length of the token, so that we cover multi-char tokens.
      EndColNo += Lexer::MeasureTokenLength(End, SM, *LangOpts);

I'm sure this line has importance, but I don't think it is correct.

I'm a bit lost in this section! Does anyone have any ideas?

The mental model for a source range is that it is a pair of source locations [B, E), where both B and E are expected to point at the beginning of the token. That line above is relexing the token at the location E to find the end of the token. Since you've adjusted the ranges to point at specific characters, when you're getting is E adjusted to the end of the "token" where you're pointing... which is in the middle of a string literal, hence the somewhat odd highlighting behavior.

We would need to introduce a new notion (e.g., a new "CharacterSourceRange") that forgoes this adjustment and treats the ending source location in the range as character-level positions in the source rather than token-level positions. CharacterSourceRange would only be used in limited places when needed, such as FixItHint and perhaps Diagnostic, and all affected clients would need to be updated.

  - Doug

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Assertion on valid use of highlighting whitespace

Tom Care
Hi Chris,

Works great except now I'm getting off by 1 problems at the end of the string:

/tmp/fixit.c:41:12: warning: flag '0' is ignored when flag '-' is present
  printf("%0-f", 1.23);
          ~^~

Should my SourceLocation for the end byte be pointing to the 'f' or the '"'? It's currently pointing to the 'f'.

Tom

On Jun 18, 2010, at 3:45 PM, Chris Lattner wrote:

r106338 should do this.  Please let me know if you run into any problems with it.

-Chris

On Jun 18, 2010, at 1:26 PM, Douglas Gregor wrote:


On Jun 17, 2010, at 7:47 PM, Tom Care wrote:

I think there might be other things broken in this function too.

I'm having problems trying to get highlighting/fixits working for parts of printf statements (rather than the whole thing).

Currently, if I give a SourceLocation within a printf formatting string, it will highlight it from the beginning until the end of the format string (rather than the end I specified). For example:

/tmp/fixit.c:13:15: warning: precision used with 'n' conversion specifier,
      resulting in undefined behavior [-Wformat]
  printf("%100.100n", (int*) 0);
              ^~~~~

When what I expect is:
/tmp/fixit.c:13:15: warning: precision used with 'n' conversion specifier,
      resulting in undefined behavior [-Wformat]
  printf("%100.100n", (int*) 0);
              ^~~~

This ends up breaking the fixits, which delete the rest of the format string rather than the invalid part.

The offending line is lib/Frontend/TextDiagnosticPrinter.cpp:116
      // Add in the length of the token, so that we cover multi-char tokens.
      EndColNo += Lexer::MeasureTokenLength(End, SM, *LangOpts);

I'm sure this line has importance, but I don't think it is correct.

I'm a bit lost in this section! Does anyone have any ideas?

The mental model for a source range is that it is a pair of source locations [B, E), where both B and E are expected to point at the beginning of the token. That line above is relexing the token at the location E to find the end of the token. Since you've adjusted the ranges to point at specific characters, when you're getting is E adjusted to the end of the "token" where you're pointing... which is in the middle of a string literal, hence the somewhat odd highlighting behavior.

We would need to introduce a new notion (e.g., a new "CharacterSourceRange") that forgoes this adjustment and treats the ending source location in the range as character-level positions in the source rather than token-level positions. CharacterSourceRange would only be used in limited places when needed, such as FixItHint and perhaps Diagnostic, and all affected clients would need to be updated.

  - Doug

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Assertion on valid use of highlighting whitespace

Chris Lattner

On Jun 21, 2010, at 11:17 AM, Tom Care wrote:

Hi Chris,

Works great except now I'm getting off by 1 problems at the end of the string:

/tmp/fixit.c:41:12: warning: flag '0' is ignored when flag '-' is present
  printf("%0-f", 1.23);
          ~^~

Should my SourceLocation for the end byte be pointing to the 'f' or the '"'? It's currently pointing to the 'f'.

You tell me.  Right now, the ranges are set up for half open ranges [start, end).  If you want fully inclusive ranges [start, end] I can make that work, just let me know what feels most natural from a client standpoint.

-Chris



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Assertion on valid use of highlighting whitespace

Douglas Gregor

On Jun 21, 2010, at 11:36 AM, Chris Lattner wrote:


On Jun 21, 2010, at 11:17 AM, Tom Care wrote:

Hi Chris,

Works great except now I'm getting off by 1 problems at the end of the string:

/tmp/fixit.c:41:12: warning: flag '0' is ignored when flag '-' is present
  printf("%0-f", 1.23);
          ~^~

Should my SourceLocation for the end byte be pointing to the 'f' or the '"'? It's currently pointing to the 'f'.

You tell me.  Right now, the ranges are set up for half open ranges [start, end).  If you want fully inclusive ranges [start, end] I can make that work, just let me know what feels most natural from a client standpoint.

Please keep it as half-open!

- Doug


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Assertion on valid use of highlighting whitespace

Chris Lattner

On Jun 21, 2010, at 11:43 AM, Douglas Gregor wrote:


On Jun 21, 2010, at 11:36 AM, Chris Lattner wrote:


On Jun 21, 2010, at 11:17 AM, Tom Care wrote:

Hi Chris,

Works great except now I'm getting off by 1 problems at the end of the string:

/tmp/fixit.c:41:12: warning: flag '0' is ignored when flag '-' is present
  printf("%0-f", 1.23);
          ~^~

Should my SourceLocation for the end byte be pointing to the 'f' or the '"'? It's currently pointing to the 'f'.

You tell me.  Right now, the ranges are set up for half open ranges [start, end).  If you want fully inclusive ranges [start, end] I can make that work, just let me know what feels most natural from a client standpoint.

Please keep it as half-open!

Ok.  Tom, please add one! :)

-Chris



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Assertion on valid use of highlighting whitespace

Tom Care
OK no problem. I'll update SemaChecking to handle this.

Tom

On Jun 21, 2010, at 12:50 PM, Chris Lattner wrote:


On Jun 21, 2010, at 11:43 AM, Douglas Gregor wrote:


On Jun 21, 2010, at 11:36 AM, Chris Lattner wrote:


On Jun 21, 2010, at 11:17 AM, Tom Care wrote:

Hi Chris,

Works great except now I'm getting off by 1 problems at the end of the string:

/tmp/fixit.c:41:12: warning: flag '0' is ignored when flag '-' is present
  printf("%0-f", 1.23);
          ~^~

Should my SourceLocation for the end byte be pointing to the 'f' or the '"'? It's currently pointing to the 'f'.

You tell me.  Right now, the ranges are set up for half open ranges [start, end).  If you want fully inclusive ranges [start, end] I can make that work, just let me know what feels most natural from a client standpoint.

Please keep it as half-open!

Ok.  Tom, please add one! :)

-Chris




_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev