clang-format bug with u8 character literal

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

clang-format bug with u8 character literal

Richard Pennington via cfe-dev
Hi all,

clang-format does not work well with character literal u8 ' c-char '. Example

clang-format <<END
auto c1 = u8'a';
auto c2 = u'a';
END

Produces

auto c1 = u8 'a';
auto c2 = u'a';

As you can see clang-format adds an additional space between u8 and 'a'.

clang-format --version
clang-format version 4.0.0 (http://llvm.org/git/clang.git 559aa046fe3260d8640791f2249d7b0d458b5700) (http://llvm.org/git/llvm.git 4423e351176a92975739dd4ea43c2ff5877236ae)

Not sure if this is the best place to report a bug.

Thanks,
Denis Gladkikh

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: clang-format bug with u8 character literal

Richard Pennington via cfe-dev
Seems like fix is very simple, clang-format currently does not respect c++1z at all. With patch below I could verify that it can recognize u8'a' as utf8 character literal. Without it it recognize it as two tokens: identifier and character literal. 
My guess that .clang-format should allow to specify LanguageStandard: LS_Cpp17 now, does it sound good? I can try to contribute to clang-format - that will be the first time for me - will be happy to try it out. 

tools/clang                                                                                                                         13:45:12 
╰─ svn diff
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;



On Mar 10, 2017, at 11:09 AM, via cfe-dev <[hidden email]> wrote:

Hi all,

clang-format does not work well with character literal u8 ' c-char '. Example

clang-format <<END
auto c1 = u8'a';
auto c2 = u'a';
END

Produces

auto c1 = u8 'a';
auto c2 = u'a';

As you can see clang-format adds an additional space between u8 and 'a'.

clang-format --version
clang-format version 4.0.0 (http://llvm.org/git/clang.git 559aa046fe3260d8640791f2249d7b0d458b5700) (http://llvm.org/git/llvm.git 4423e351176a92975739dd4ea43c2ff5877236ae)

Not sure if this is the best place to report a bug.

Thanks,
Denis Gladkikh
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: clang-format bug with u8 character literal

Richard Pennington via cfe-dev
The patch you posted looks good to me. Can you add a test to unittests/Format/FormatTest.cpp too? (Run tests with `ninja FormatTests && ./tools/clang/unittests/Format/FormatTests`).


On Fri, Mar 10, 2017 at 4:52 PM, via cfe-dev <[hidden email]> wrote:
Seems like fix is very simple, clang-format currently does not respect c++1z at all. With patch below I could verify that it can recognize u8'a' as utf8 character literal. Without it it recognize it as two tokens: identifier and character literal. 
My guess that .clang-format should allow to specify LanguageStandard: LS_Cpp17 now, does it sound good? I can try to contribute to clang-format - that will be the first time for me - will be happy to try it out. 

tools/clang                                                                                                                         13:45:12 
╰─ svn diff
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;



On Mar 10, 2017, at 11:09 AM, via cfe-dev <[hidden email]> wrote:

Hi all,

clang-format does not work well with character literal u8 ' c-char '. Example

clang-format <<END
auto c1 = u8'a';
auto c2 = u'a';
END

Produces

auto c1 = u8 'a';
auto c2 = u'a';

As you can see clang-format adds an additional space between u8 and 'a'.

clang-format --version
clang-format version 4.0.0 (http://llvm.org/git/clang.git 559aa046fe3260d8640791f2249d7b0d458b5700) (http://llvm.org/git/llvm.git 4423e351176a92975739dd4ea43c2ff5877236ae)

Not sure if this is the best place to report a bug.

Thanks,
Denis Gladkikh
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: clang-format bug with u8 character literal

Richard Pennington via cfe-dev
Of course, I also updated documentation to include that Cpp11 modes actually enabled C++14 and C++17 as well. Added two unit tests, one where Cpp03 works as before, and Cpp11 mode recognizes utf8 literal character. 
Thank you for looking into this so quickly!
I would assume that it is too late to include this change in clang 4.0.0, but maybe later this change will be merged also into 4.0.1 or something.

Index: docs/ClangFormatStyleOptions.rst
===================================================================
--- docs/ClangFormatStyleOptions.rst (revision 297506)
+++ docs/ClangFormatStyleOptions.rst (working copy)
@@ -944,7 +944,8 @@
     Use C++03-compatible syntax.
 
   * ``LS_Cpp11`` (in configuration: ``Cpp11``)
-    Use features of C++11 (e.g. ``A<A<int>>`` instead of ``A<A<int> >``).
+    Use features of C++11, C++14 and C++1z (e.g. ``A<A<int>>`` instead of 
+    ``A<A<int> >``).
 
   * ``LS_Auto`` (in configuration: ``Auto``)
     Automatic detection based on the input.
Index: include/clang/Format/Format.h
===================================================================
--- include/clang/Format/Format.h (revision 297506)
+++ include/clang/Format/Format.h (working copy)
@@ -786,7 +786,8 @@
   enum LanguageStandard {
     /// Use C++03-compatible syntax.
     LS_Cpp03,
-    /// Use features of C++11 (e.g. ``A<A<int>>`` instead of ``A<A<int> >``).
+    /// Use features of C++11, C++14 and C++1z (e.g. ``A<A<int>>`` instead of
+    /// ``A<A<int> >``).
     LS_Cpp11,
     /// Automatic detection based on the input.
     LS_Auto
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;
Index: unittests/Format/FormatTest.cpp
===================================================================
--- unittests/Format/FormatTest.cpp (revision 297506)
+++ unittests/Format/FormatTest.cpp (working copy)
@@ -10111,6 +10111,19 @@
   EXPECT_EQ(Expected, *Result);
 }
 
+TEST_F(FormatTest, UTF8CharacterLiteralCpp03) {
+  format::FormatStyle Style = format::getLLVMStyle();
+  Style.Standard = FormatStyle::LS_Cpp03;
+  // cpp03 recognize this string as identifier u8 and literal character 'a'
+  EXPECT_EQ("auto c = u8 'a';", format("auto c = u8'a';", Style));
+}
+
+TEST_F(FormatTest, UTF8CharacterLiteralCpp11) {
+  // u8'a' is a C++17 feature, utf8 literal character, LS_Cpp11 covers
+  // all modes, including C++11, C++14 and C++17
+  EXPECT_EQ("auto c = u8'a';", format("auto c = u8'a';"));
+}
+
 } // end namespace
 } // end namespace format
 } // end namespace clang



On Mar 10, 2017, at 3:03 PM, Nico Weber <[hidden email]> wrote:

The patch you posted looks good to me. Can you add a test to unittests/Format/FormatTest.cpp too? (Run tests with `ninja FormatTests && ./tools/clang/unittests/Format/FormatTests`).


On Fri, Mar 10, 2017 at 4:52 PM, via cfe-dev <[hidden email]> wrote:
Seems like fix is very simple, clang-format currently does not respect c++1z at all. With patch below I could verify that it can recognize u8'a' as utf8 character literal. Without it it recognize it as two tokens: identifier and character literal. 
My guess that .clang-format should allow to specify LanguageStandard: LS_Cpp17 now, does it sound good? I can try to contribute to clang-format - that will be the first time for me - will be happy to try it out. 

tools/clang                                                                                                                         13:45:12 
╰─ svn diff
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;



On Mar 10, 2017, at 11:09 AM, via cfe-dev <[hidden email]> wrote:

Hi all,

clang-format does not work well with character literal u8 ' c-char '. Example

clang-format <<END
auto c1 = u8'a';
auto c2 = u'a';
END

Produces

auto c1 = u8 'a';
auto c2 = u'a';

As you can see clang-format adds an additional space between u8 and 'a'.

clang-format --version
clang-format version 4.0.0 (http://llvm.org/git/clang.git 559aa046fe3260d8640791f2249d7b0d458b5700) (http://llvm.org/git/llvm.git 4423e351176a92975739dd4ea43c2ff5877236ae)

Not sure if this is the best place to report a bug.

Thanks,
Denis Gladkikh
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev




_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: clang-format bug with u8 character literal

Richard Pennington via cfe-dev
Hi Nico,

By any chance have you had a time to merge this change?

Thanks,
Denis Gladkikh


On Fri, Mar 10, 2017, at 05:25 PM, via cfe-dev wrote:
Of course, I also updated documentation to include that Cpp11 modes actually enabled C++14 and C++17 as well. Added two unit tests, one where Cpp03 works as before, and Cpp11 mode recognizes utf8 literal character. 
Thank you for looking into this so quickly!
I would assume that it is too late to include this change in clang 4.0.0, but maybe later this change will be merged also into 4.0.1 or something.

Index: docs/ClangFormatStyleOptions.rst
===================================================================
--- docs/ClangFormatStyleOptions.rst (revision 297506)
+++ docs/ClangFormatStyleOptions.rst (working copy)
@@ -944,7 +944,8 @@
     Use C++03-compatible syntax.
 
   * ``LS_Cpp11`` (in configuration: ``Cpp11``)
-    Use features of C++11 (e.g. ``A<A<int>>`` instead of ``A<A<int> >``).
+    Use features of C++11, C++14 and C++1z (e.g. ``A<A<int>>`` instead of 
+    ``A<A<int> >``).
 
   * ``LS_Auto`` (in configuration: ``Auto``)
     Automatic detection based on the input.
Index: include/clang/Format/Format.h
===================================================================
--- include/clang/Format/Format.h (revision 297506)
+++ include/clang/Format/Format.h (working copy)
@@ -786,7 +786,8 @@
   enum LanguageStandard {
     /// Use C++03-compatible syntax.
     LS_Cpp03,
-    /// Use features of C++11 (e.g. ``A<A<int>>`` instead of ``A<A<int> >``).
+    /// Use features of C++11, C++14 and C++1z (e.g. ``A<A<int>>`` instead of
+    /// ``A<A<int> >``).
     LS_Cpp11,
     /// Automatic detection based on the input.
     LS_Auto
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;
Index: unittests/Format/FormatTest.cpp
===================================================================
--- unittests/Format/FormatTest.cpp (revision 297506)
+++ unittests/Format/FormatTest.cpp (working copy)
@@ -10111,6 +10111,19 @@
   EXPECT_EQ(Expected, *Result);
 }
 
+TEST_F(FormatTest, UTF8CharacterLiteralCpp03) {
+  format::FormatStyle Style = format::getLLVMStyle();
+  Style.Standard = FormatStyle::LS_Cpp03;
+  // cpp03 recognize this string as identifier u8 and literal character 'a'
+  EXPECT_EQ("auto c = u8 'a';", format("auto c = u8'a';", Style));
+}
+
+TEST_F(FormatTest, UTF8CharacterLiteralCpp11) {
+  // u8'a' is a C++17 feature, utf8 literal character, LS_Cpp11 covers
+  // all modes, including C++11, C++14 and C++17
+  EXPECT_EQ("auto c = u8'a';", format("auto c = u8'a';"));
+}
+
 } // end namespace
 } // end namespace format
 } // end namespace clang



On Mar 10, 2017, at 3:03 PM, Nico Weber <[hidden email]> wrote:

The patch you posted looks good to me. Can you add a test to unittests/Format/FormatTest.cpp too? (Run tests with `ninja FormatTests && ./tools/clang/unittests/Format/FormatTests`).


On Fri, Mar 10, 2017 at 4:52 PM, via cfe-dev <[hidden email]> wrote:
Seems like fix is very simple, clang-format currently does not respect c++1z at all. With patch below I could verify that it can recognize u8'a' as utf8 character literal. Without it it recognize it as two tokens: identifier and character literal. 
My guess that .clang-format should allow to specify LanguageStandard: LS_Cpp17 now, does it sound good? I can try to contribute to clang-format - that will be the first time for me - will be happy to try it out. 

tools/clang                                                                                                                         13:45:12 
╰─ svn diff
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;



On Mar 10, 2017, at 11:09 AM, via cfe-dev <[hidden email]> wrote:

Hi all,

clang-format does not work well with character literal u8 ' c-char '. Example

clang-format <<END
auto c1 = u8'a';
auto c2 = u'a';
END

Produces

auto c1 = u8 'a';
auto c2 = u'a';

As you can see clang-format adds an additional space between u8 and 'a'.

clang-format --version
clang-format version 4.0.0 (http://llvm.org/git/clang.git 559aa046fe3260d8640791f2249d7b0d458b5700) (http://llvm.org/git/llvm.git 4423e351176a92975739dd4ea43c2ff5877236ae)

Not sure if this is the best place to report a bug.

Thanks,
Denis Gladkikh
_______________________________________________
cfe-dev mailing list


_______________________________________________
cfe-dev mailing list



_______________________________________________
cfe-dev mailing list


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: clang-format bug with u8 character literal

Richard Pennington via cfe-dev
Landed in r299574: http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20170403/189525.html

Sorry for the delay, I missed your first reply back in March.

On Mon, Apr 3, 2017 at 11:51 AM, via cfe-dev <[hidden email]> wrote:
Hi Nico,

By any chance have you had a time to merge this change?

Thanks,
Denis Gladkikh


On Fri, Mar 10, 2017, at 05:25 PM, via cfe-dev wrote:
Of course, I also updated documentation to include that Cpp11 modes actually enabled C++14 and C++17 as well. Added two unit tests, one where Cpp03 works as before, and Cpp11 mode recognizes utf8 literal character. 
Thank you for looking into this so quickly!
I would assume that it is too late to include this change in clang 4.0.0, but maybe later this change will be merged also into 4.0.1 or something.

Index: docs/ClangFormatStyleOptions.rst
===================================================================
--- docs/ClangFormatStyleOptions.rst (revision 297506)
+++ docs/ClangFormatStyleOptions.rst (working copy)
@@ -944,7 +944,8 @@
     Use C++03-compatible syntax.
 
   * ``LS_Cpp11`` (in configuration: ``Cpp11``)
-    Use features of C++11 (e.g. ``A<A<int>>`` instead of ``A<A<int> >``).
+    Use features of C++11, C++14 and C++1z (e.g. ``A<A<int>>`` instead of 
+    ``A<A<int> >``).
 
   * ``LS_Auto`` (in configuration: ``Auto``)
     Automatic detection based on the input.
Index: include/clang/Format/Format.h
===================================================================
--- include/clang/Format/Format.h (revision 297506)
+++ include/clang/Format/Format.h (working copy)
@@ -786,7 +786,8 @@
   enum LanguageStandard {
     /// Use C++03-compatible syntax.
     LS_Cpp03,
-    /// Use features of C++11 (e.g. ``A<A<int>>`` instead of ``A<A<int> >``).
+    /// Use features of C++11, C++14 and C++1z (e.g. ``A<A<int>>`` instead of
+    /// ``A<A<int> >``).
     LS_Cpp11,
     /// Automatic detection based on the input.
     LS_Auto
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;
Index: unittests/Format/FormatTest.cpp
===================================================================
--- unittests/Format/FormatTest.cpp (revision 297506)
+++ unittests/Format/FormatTest.cpp (working copy)
@@ -10111,6 +10111,19 @@
   EXPECT_EQ(Expected, *Result);
 }
 
+TEST_F(FormatTest, UTF8CharacterLiteralCpp03) {
+  format::FormatStyle Style = format::getLLVMStyle();
+  Style.Standard = FormatStyle::LS_Cpp03;
+  // cpp03 recognize this string as identifier u8 and literal character 'a'
+  EXPECT_EQ("auto c = u8 'a';", format("auto c = u8'a';", Style));
+}
+
+TEST_F(FormatTest, UTF8CharacterLiteralCpp11) {
+  // u8'a' is a C++17 feature, utf8 literal character, LS_Cpp11 covers
+  // all modes, including C++11, C++14 and C++17
+  EXPECT_EQ("auto c = u8'a';", format("auto c = u8'a';"));
+}
+
 } // end namespace
 } // end namespace format
 } // end namespace clang



On Mar 10, 2017, at 3:03 PM, Nico Weber <[hidden email]> wrote:

The patch you posted looks good to me. Can you add a test to unittests/Format/FormatTest.cpp too? (Run tests with `ninja FormatTests && ./tools/clang/unittests/Format/FormatTests`).


On Fri, Mar 10, 2017 at 4:52 PM, via cfe-dev <[hidden email]> wrote:
Seems like fix is very simple, clang-format currently does not respect c++1z at all. With patch below I could verify that it can recognize u8'a' as utf8 character literal. Without it it recognize it as two tokens: identifier and character literal. 
My guess that .clang-format should allow to specify LanguageStandard: LS_Cpp17 now, does it sound good? I can try to contribute to clang-format - that will be the first time for me - will be happy to try it out. 

tools/clang                                                                                                                         13:45:12 
╰─ svn diff
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;



On Mar 10, 2017, at 11:09 AM, via cfe-dev <[hidden email]> wrote:

Hi all,

clang-format does not work well with character literal u8 ' c-char '. Example

clang-format <<END
auto c1 = u8'a';
auto c2 = u'a';
END

Produces

auto c1 = u8 'a';
auto c2 = u'a';

As you can see clang-format adds an additional space between u8 and 'a'.

clang-format --version
clang-format version 4.0.0 (http://llvm.org/git/clang.git 559aa046fe3260d8640791f2249d7b0d458b5700) (http://llvm.org/git/llvm.git 4423e351176a92975739dd4ea43c2ff5877236ae)

Not sure if this is the best place to report a bug.

Thanks,
Denis Gladkikh
_______________________________________________
cfe-dev mailing list


_______________________________________________
cfe-dev mailing list



_______________________________________________
cfe-dev mailing list


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: clang-format bug with u8 character literal

Richard Pennington via cfe-dev
Awesome, thanks!
Any possibility to port this change in 4.0.x?


On Wed, Apr 5, 2017, at 11:24 AM, Nico Weber wrote:

Sorry for the delay, I missed your first reply back in March.

On Mon, Apr 3, 2017 at 11:51 AM, via cfe-dev <[hidden email]> wrote:

Hi Nico,

By any chance have you had a time to merge this change?

Thanks,
Denis Gladkikh


On Fri, Mar 10, 2017, at 05:25 PM, via cfe-dev wrote:
Of course, I also updated documentation to include that Cpp11 modes actually enabled C++14 and C++17 as well. Added two unit tests, one where Cpp03 works as before, and Cpp11 mode recognizes utf8 literal character. 
Thank you for looking into this so quickly!
I would assume that it is too late to include this change in clang 4.0.0, but maybe later this change will be merged also into 4.0.1 or something.

Index: docs/ClangFormatStyleOptions.rst
===================================================================
--- docs/ClangFormatStyleOptions.rst (revision 297506)
+++ docs/ClangFormatStyleOptions.rst (working copy)
@@ -944,7 +944,8 @@
     Use C++03-compatible syntax.
 
   * ``LS_Cpp11`` (in configuration: ``Cpp11``)
-    Use features of C++11 (e.g. ``A<A<int>>`` instead of ``A<A<int> >``).
+    Use features of C++11, C++14 and C++1z (e.g. ``A<A<int>>`` instead of 
+    ``A<A<int> >``).
 
   * ``LS_Auto`` (in configuration: ``Auto``)
     Automatic detection based on the input.
Index: include/clang/Format/Format.h
===================================================================
--- include/clang/Format/Format.h (revision 297506)
+++ include/clang/Format/Format.h (working copy)
@@ -786,7 +786,8 @@
   enum LanguageStandard {
     /// Use C++03-compatible syntax.
     LS_Cpp03,
-    /// Use features of C++11 (e.g. ``A<A<int>>`` instead of ``A<A<int> >``).
+    /// Use features of C++11, C++14 and C++1z (e.g. ``A<A<int>>`` instead of
+    /// ``A<A<int> >``).
     LS_Cpp11,
     /// Automatic detection based on the input.
     LS_Auto
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;
Index: unittests/Format/FormatTest.cpp
===================================================================
--- unittests/Format/FormatTest.cpp (revision 297506)
+++ unittests/Format/FormatTest.cpp (working copy)
@@ -10111,6 +10111,19 @@
   EXPECT_EQ(Expected, *Result);
 }
 
+TEST_F(FormatTest, UTF8CharacterLiteralCpp03) {
+  format::FormatStyle Style = format::getLLVMStyle();
+  Style.Standard = FormatStyle::LS_Cpp03;
+  // cpp03 recognize this string as identifier u8 and literal character 'a'
+  EXPECT_EQ("auto c = u8 'a';", format("auto c = u8'a';", Style));
+}
+
+TEST_F(FormatTest, UTF8CharacterLiteralCpp11) {
+  // u8'a' is a C++17 feature, utf8 literal character, LS_Cpp11 covers
+  // all modes, including C++11, C++14 and C++17
+  EXPECT_EQ("auto c = u8'a';", format("auto c = u8'a';"));
+}
+
 } // end namespace
 } // end namespace format
 } // end namespace clang



On Mar 10, 2017, at 3:03 PM, Nico Weber <[hidden email]> wrote:

The patch you posted looks good to me. Can you add a test to unittests/Format/FormatTest.cpp too? (Run tests with `ninja FormatTests && ./tools/clang/unittests/Format/FormatTests`).


On Fri, Mar 10, 2017 at 4:52 PM, via cfe-dev <[hidden email]> wrote:
Seems like fix is very simple, clang-format currently does not respect c++1z at all. With patch below I could verify that it can recognize u8'a' as utf8 character literal. Without it it recognize it as two tokens: identifier and character literal. 
My guess that .clang-format should allow to specify LanguageStandard: LS_Cpp17 now, does it sound good? I can try to contribute to clang-format - that will be the first time for me - will be happy to try it out. 

tools/clang                                                                                                                         13:45:12 
╰─ svn diff
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;



On Mar 10, 2017, at 11:09 AM, via cfe-dev <[hidden email]> wrote:

Hi all,

clang-format does not work well with character literal u8 ' c-char '. Example

clang-format <<END
auto c1 = u8'a';
auto c2 = u'a';
END

Produces

auto c1 = u8 'a';
auto c2 = u'a';

As you can see clang-format adds an additional space between u8 and 'a'.

clang-format --version
clang-format version 4.0.0 (http://llvm.org/git/clang.git 559aa046fe3260d8640791f2249d7b0d458b5700) (http://llvm.org/git/llvm.git 4423e351176a92975739dd4ea43c2ff5877236ae)

Not sure if this is the best place to report a bug.

Thanks,
Denis Gladkikh
_______________________________________________
cfe-dev mailing list


_______________________________________________
cfe-dev mailing list



_______________________________________________
cfe-dev mailing list


_______________________________________________
cfe-dev mailing list



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: clang-format bug with u8 character literal

Richard Pennington via cfe-dev
http://clang-developers.42468.n3.nabble.com/4-0-1-Release-Schedule-Need-feedback-for-improving-stable-releases-td4056003.html documents how to request merges to 4.0.1 as far as I can tell. If you have an llvm bugzilla login, you can request a merge yourself.

On Wed, Apr 5, 2017 at 2:32 PM, <[hidden email]> wrote:
Awesome, thanks!
Any possibility to port this change in 4.0.x?


On Wed, Apr 5, 2017, at 11:24 AM, Nico Weber wrote:

Sorry for the delay, I missed your first reply back in March.

On Mon, Apr 3, 2017 at 11:51 AM, via cfe-dev <[hidden email]> wrote:

Hi Nico,

By any chance have you had a time to merge this change?

Thanks,
Denis Gladkikh


On Fri, Mar 10, 2017, at 05:25 PM, via cfe-dev wrote:
Of course, I also updated documentation to include that Cpp11 modes actually enabled C++14 and C++17 as well. Added two unit tests, one where Cpp03 works as before, and Cpp11 mode recognizes utf8 literal character. 
Thank you for looking into this so quickly!
I would assume that it is too late to include this change in clang 4.0.0, but maybe later this change will be merged also into 4.0.1 or something.

Index: docs/ClangFormatStyleOptions.rst
===================================================================
--- docs/ClangFormatStyleOptions.rst (revision 297506)
+++ docs/ClangFormatStyleOptions.rst (working copy)
@@ -944,7 +944,8 @@
     Use C++03-compatible syntax.
 
   * ``LS_Cpp11`` (in configuration: ``Cpp11``)
-    Use features of C++11 (e.g. ``A<A<int>>`` instead of ``A<A<int> >``).
+    Use features of C++11, C++14 and C++1z (e.g. ``A<A<int>>`` instead of 
+    ``A<A<int> >``).
 
   * ``LS_Auto`` (in configuration: ``Auto``)
     Automatic detection based on the input.
Index: include/clang/Format/Format.h
===================================================================
--- include/clang/Format/Format.h (revision 297506)
+++ include/clang/Format/Format.h (working copy)
@@ -786,7 +786,8 @@
   enum LanguageStandard {
     /// Use C++03-compatible syntax.
     LS_Cpp03,
-    /// Use features of C++11 (e.g. ``A<A<int>>`` instead of ``A<A<int> >``).
+    /// Use features of C++11, C++14 and C++1z (e.g. ``A<A<int>>`` instead of
+    /// ``A<A<int> >``).
     LS_Cpp11,
     /// Automatic detection based on the input.
     LS_Auto
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;
Index: unittests/Format/FormatTest.cpp
===================================================================
--- unittests/Format/FormatTest.cpp (revision 297506)
+++ unittests/Format/FormatTest.cpp (working copy)
@@ -10111,6 +10111,19 @@
   EXPECT_EQ(Expected, *Result);
 }
 
+TEST_F(FormatTest, UTF8CharacterLiteralCpp03) {
+  format::FormatStyle Style = format::getLLVMStyle();
+  Style.Standard = FormatStyle::LS_Cpp03;
+  // cpp03 recognize this string as identifier u8 and literal character 'a'
+  EXPECT_EQ("auto c = u8 'a';", format("auto c = u8'a';", Style));
+}
+
+TEST_F(FormatTest, UTF8CharacterLiteralCpp11) {
+  // u8'a' is a C++17 feature, utf8 literal character, LS_Cpp11 covers
+  // all modes, including C++11, C++14 and C++17
+  EXPECT_EQ("auto c = u8'a';", format("auto c = u8'a';"));
+}
+
 } // end namespace
 } // end namespace format
 } // end namespace clang



On Mar 10, 2017, at 3:03 PM, Nico Weber <[hidden email]> wrote:

The patch you posted looks good to me. Can you add a test to unittests/Format/FormatTest.cpp too? (Run tests with `ninja FormatTests && ./tools/clang/unittests/Format/FormatTests`).


On Fri, Mar 10, 2017 at 4:52 PM, via cfe-dev <[hidden email]> wrote:
Seems like fix is very simple, clang-format currently does not respect c++1z at all. With patch below I could verify that it can recognize u8'a' as utf8 character literal. Without it it recognize it as two tokens: identifier and character literal. 
My guess that .clang-format should allow to specify LanguageStandard: LS_Cpp17 now, does it sound good? I can try to contribute to clang-format - that will be the first time for me - will be happy to try it out. 

tools/clang                                                                                                                         13:45:12 
╰─ svn diff
Index: lib/Format/Format.cpp
===================================================================
--- lib/Format/Format.cpp (revision 297506)
+++ lib/Format/Format.cpp (working copy)
@@ -1893,6 +1893,7 @@
   LangOpts.CPlusPlus = 1;
   LangOpts.CPlusPlus11 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.CPlusPlus14 = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
+  LangOpts.CPlusPlus1z = Style.Standard == FormatStyle::LS_Cpp03 ? 0 : 1;
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.IsCpp();
   LangOpts.CXXOperatorNames = AlternativeOperators ? 1 : 0;



On Mar 10, 2017, at 11:09 AM, via cfe-dev <[hidden email]> wrote:

Hi all,

clang-format does not work well with character literal u8 ' c-char '. Example

clang-format <<END
auto c1 = u8'a';
auto c2 = u'a';
END

Produces

auto c1 = u8 'a';
auto c2 = u'a';

As you can see clang-format adds an additional space between u8 and 'a'.

clang-format --version
clang-format version 4.0.0 (http://llvm.org/git/clang.git 559aa046fe3260d8640791f2249d7b0d458b5700) (http://llvm.org/git/llvm.git 4423e351176a92975739dd4ea43c2ff5877236ae)

Not sure if this is the best place to report a bug.

Thanks,
Denis Gladkikh
_______________________________________________
cfe-dev mailing list


_______________________________________________
cfe-dev mailing list



_______________________________________________
cfe-dev mailing list


_______________________________________________
cfe-dev mailing list




_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Loading...