Source code documentation

Source code documentation

Stefan Seefeld
Hello,

I have been watching the Clang project for a while, as I'm interested in
using it for my own project. I'm developing Synopsis
(http://synopsis.fresco.org), which started as a multi-language code
documentation tool, but has since become quite a bit more powerful.

Synopsis is a very modular tool, written in Python, which loads
different parsers (Python, IDL, C, C++) to generate an internal
representation (parse tree, semantic graph, etc.), which is then further
transformed or processed, for example into API documentation or even new
source code.

I'd be very interested in using Clang as a parser backend for C and C++,
and possibly even more for AST transformations (such as code generation).

I read on http://clang.llvm.org/OpenProjects.html that there are plans
to write a code documentation tool based on Clang, so I'd like to know
whether any such work has already started, so as to avoid duplicating
effort.

Also, would anybody be interested in bindings between Synopsis and
Clang, even to the point of - gasp - helping? :-)

Thanks,
         Stefan


--

       ...ich hab' noch einen Koffer in Berlin...

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

Re: Source code documentation

Douglas Gregor

On Mar 9, 2010, at 8:56 AM, Stefan Seefeld wrote:

> Hello,
>
> I have been watching the CLang project for a while, as I'm interested in
> using it for my own project. I'm developing Synopsis
> (http://synopsis.fresco.org), which started as a multi-language code
> documentation tool, but became actually quite a bit more powerful.
>
> Synopsis is a very modular tool, written in Python, which loads
> different parsers (Python, IDL, C, C++) to generate an internal
> representation (parse tree, semantic graph, etc.), which then is further
> transformed or processed, such as into API documentation, or even new
> source code.
>
> I'd be very interested in using CLang as parser backend for C and C++,
> and possibly even more for AST transformations (such as code generation).

That would be *great*.

> I read on http://clang.llvm.org/OpenProjects.html that there are plans
> to write a code documentation tool based on CLang, so I'd like to know
> whether any such work has already started, so as to avoid duplicating
> effort.

No, there hasn't been any work in this area. It's a long-standing wish.

> Also, would anybody be interested in bindings between Synopsis and
> CLang, even to the point of - gasp - helping ? :-)


You should check out the Python bindings we have for the "CIndex" library. They'll obviously need extensions to capture enough of the AST for C++, but that's in the grand plan anyway: to provide a stable interface to explore (but not transform or modify) Clang's AST. If a documentation tool like Synopsis can't use the CIndex library for some reason, CIndex should be extended.

        - Doug



Re: Source code documentation

Stefan Seefeld
On 03/09/2010 12:03 PM, Douglas Gregor wrote:

> On Mar 9, 2010, at 8:56 AM, Stefan Seefeld wrote:
>
>    
>> Hello,
>>
>> I have been watching the CLang project for a while, as I'm interested in
>> using it for my own project. I'm developing Synopsis
>> (http://synopsis.fresco.org), which started as a multi-language code
>> documentation tool, but became actually quite a bit more powerful.
>>
>> Synopsis is a very modular tool, written in Python, which loads
>> different parsers (Python, IDL, C, C++) to generate an internal
>> representation (parse tree, semantic graph, etc.), which then is further
>> transformed or processed, such as into API documentation, or even new
>> source code.
>>
>> I'd be very interested in using CLang as parser backend for C and C++,
>> and possibly even more for AST transformations (such as code generation).
>>      
> That would be *great*.
>    

Glad to hear that you agree. :-)

Is LLVM participating in GSoC this year? If so, could we formulate a
project that helps with this (quite substantial) work?


>> I read on http://clang.llvm.org/OpenProjects.html that there are plans
>> to write a code documentation tool based on CLang, so I'd like to know
>> whether any such work has already started, so as to avoid duplicating
>> effort.
>>      
> No, there hasn't been any work in this area. It's a long-standing wish.
>    

OK.


>> Also, would anybody be interested in bindings between Synopsis and
>> CLang, even to the point of - gasp - helping ? :-)
>>      
>
> You should check out the Python bindings we have for the "CIndex" library. They'll obviously need extensions to capture enough of the AST for C++, but that's in the grand plan anyway: to provide a stable interface to explore (but not transform or modify) Clang's AST. If a documentation tool like Synopsis can't use the CIndex library for some reason, CIndex should be extended.
>    

OK, I will have a look. Given that Synopsis has its own representation
(an ASG), I think a first step would be to translate the CIndex-based
representation produced by Clang into the ASG, so as not to disrupt too
much at once.
Then we can compare the two representations to see whether the copy /
translation step can be avoided without breaking other features (such as
Synopsis' support for other languages).
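
To make the translation step concrete, here is a minimal sketch of such a copy, under loud assumptions: AsgNode is a hypothetical stand-in and not the real Synopsis ASG class, and FakeCursor is a test double so the example runs without libclang. The walker only relies on `kind`, `spelling` and `get_children()`, which are the names used by cursors in the clang.cindex Python bindings.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable, List


@dataclass
class AsgNode:
    """Hypothetical stand-in for a Synopsis ASG node (not the real class)."""
    kind: str
    name: str
    children: List["AsgNode"] = field(default_factory=list)


def to_asg(cursor: Any) -> AsgNode:
    """Recursively copy a cursor-like tree into AsgNode objects.

    `cursor` only needs `kind`, `spelling` and `get_children()`.
    """
    # Real cursor kinds expose a `.name`; plain strings are used as-is.
    kind = getattr(cursor.kind, "name", str(cursor.kind))
    node = AsgNode(kind=kind, name=cursor.spelling)
    for child in cursor.get_children():
        node.children.append(to_asg(child))
    return node


@dataclass
class FakeCursor:
    """Test double so the sketch runs without libclang installed."""
    kind: str
    spelling: str
    kids: List["FakeCursor"] = field(default_factory=list)

    def get_children(self) -> Iterable["FakeCursor"]:
        return iter(self.kids)


tree = FakeCursor("TRANSLATION_UNIT", "demo.c", [
    FakeCursor("STRUCT_DECL", "point", [
        FakeCursor("FIELD_DECL", "x"),
        FakeCursor("FIELD_DECL", "y"),
    ]),
    FakeCursor("FUNCTION_DECL", "norm"),
])

asg = to_asg(tree)
print([child.name for child in asg.children])
```

Because the walker is duck-typed, the same function should in principle accept a real translation-unit cursor, though that path is untested here.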

Thanks,
         Stefan

--

       ...ich hab' noch einen Koffer in Berlin...


Re: Source code documentation

Anton Korobeynikov
Hello, Stefan

> Glad to hear that you agree. :-)
> Does LLVM participate in GSoC this year ? If so, could we formulate a
> project that helps with this (quite substantial) work ?
Yes and yes :)

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

Re: Source code documentation

Douglas Gregor
In reply to this post by Stefan Seefeld

On Mar 9, 2010, at 12:20 PM, Stefan Seefeld wrote:

> On 03/09/2010 12:03 PM, Douglas Gregor wrote:
>> On Mar 9, 2010, at 8:56 AM, Stefan Seefeld wrote:
>>
>>  
>>> Hello,
>>>
>>> I have been watching the CLang project for a while, as I'm interested in
>>> using it for my own project. I'm developing Synopsis
>>> (http://synopsis.fresco.org), which started as a multi-language code
>>> documentation tool, but became actually quite a bit more powerful.
>>>
>>> Synopsis is a very modular tool, written in Python, which loads
>>> different parsers (Python, IDL, C, C++) to generate an internal
>>> representation (parse tree, semantic graph, etc.), which then is further
>>> transformed or processed, such as into API documentation, or even new
>>> source code.
>>>
>>> I'd be very interested in using CLang as parser backend for C and C++,
>>> and possibly even more for AST transformations (such as code generation).
>>>    
>> That would be *great*.
>>  
>
> Glad to hear that you agree. :-)
>
> Does LLVM participate in GSoC this year ? If so, could we formulate a project that helps with this (quite substantial) work ?

Yes and yes!

>>> Also, would anybody be interested in bindings between Synopsis and
>>> CLang, even to the point of - gasp - helping ? :-)
>>>    
>>
>> You should check out the Python bindings we have for the "CIndex" library. They'll obviously need extensions to capture enough of the AST for C++, but that's in the grand plan anyway: to provide a stable interface to explore (but not transform or modify) Clang's AST. If a documentation tool like Synopsis can't use the CIndex library for some reason, CIndex should be extended.
>>  
>
> OK, I will have a look. Given that Synopsis has its own representation (an ASG), I think a first step would be to translate the CIndex-based representation produced by CLang into ASG, so as not to disrupt too much at once.
> Then we can look into the two representations to see whether a copy / translation can be avoided without breaking other features (such as Synopsis' support for other languages).


That makes sense. Comment parsing will all be done within Synopsis, I assume?

        - Doug

Re: Source code documentation

Stefan Seefeld
On 03/09/2010 04:17 PM, Douglas Gregor wrote:

> On Mar 9, 2010, at 12:20 PM, Stefan Seefeld wrote:
>
>    
>> On 03/09/2010 12:03 PM, Douglas Gregor wrote:
>>      
>>> On Mar 9, 2010, at 8:56 AM, Stefan Seefeld wrote:
>>>
>>>        
>> Does LLVM participate in GSoC this year ? If so, could we formulate a project that helps with this (quite substantial) work ?
>>      
> Yes and yes!
>    

OK, great. Let me play with the code a bit, then we can talk about how
this project could shape up.

> That makes sense. Comment parsing will all be done within Synopsis, I assume?
>    

Yes. At present the parser attaches comments to the next declaration it
finds, from where Synopsis picks them up for further processing
(extracting processing instructions, documentation, and so on).
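
As an illustration of the "attach to the next declaration" rule, here is a toy sketch. It is not the Synopsis parser: the function name attach_comments is made up for this example, only `//` line comments are recognised, and any non-blank, non-comment line is treated as a declaration.

```python
from typing import Dict, List


def attach_comments(source: str) -> Dict[str, List[str]]:
    """Map each declaration line to the comment lines that precede it.

    Simplifications: only `//` comments are handled, blank lines are
    skipped, and every other line counts as a "declaration".
    """
    attached: Dict[str, List[str]] = {}
    pending: List[str] = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines do not detach pending comments
        if line.startswith("//"):
            pending.append(line[2:].strip())
        else:
            attached[line] = pending  # hand the comments to this declaration
            pending = []
    return attached


SOURCE = """\
// Distance from the origin.
// @param p the point to measure
double norm(struct point p);

int counter;
"""

docs = attach_comments(SOURCE)
for decl, comments in docs.items():
    print(decl, comments)
```

A later pass can then interpret the attached text, for example by treating lines that start with `@` as processing instructions.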

Also, in one mode of operation Synopsis wants to get a position-correct
picture of the entire preprocessed source file, so it can generate a
hyperlinked and otherwise styled version of it. Does Clang provide this
level of detail?

Thanks,
         Stefan


--

       ...ich hab' noch einen Koffer in Berlin...


Re: Source code documentation

Douglas Gregor

On Mar 9, 2010, at 4:39 PM, Stefan Seefeld wrote:

> On 03/09/2010 04:17 PM, Douglas Gregor wrote:
>> On Mar 9, 2010, at 12:20 PM, Stefan Seefeld wrote:
>>
>>  
>>> On 03/09/2010 12:03 PM, Douglas Gregor wrote:
>>>    
>>>> On Mar 9, 2010, at 8:56 AM, Stefan Seefeld wrote:
>>>>
>>>>      
>>> Does LLVM participate in GSoC this year ? If so, could we formulate a project that helps with this (quite substantial) work ?
>>>    
>> Yes and yes!
>>  
>
> OK, great. Let me play with the code a bit, then we may talk about how this project could shape up.

Sounds good.

>> That makes sense. Comment parsing will all be done within Synopsis, I assume?
>>  
>
> Yes. At present the parser attaches comments to the next declaration it finds, from where Synopsis then picks it up to process it further (extract processing instructions, documentation, whatever).

Okay. We don't really have this functionality in Clang yet. Comments are passed through to the AST consumer, and we have a hack that tries to find the comment associated with a declaration after the fact, but this will need work.

> Also, in one mode of operation Synopsis wants to get a position-correct picture of the entire preprocessed source file, so it can generate a hyperlinked and otherwise styled version of it. Does CLang provide this level of detail ?


Internally, yes. There isn't enough information exposed via the CIndex interface to do this (but I'd support extending CIndex in this direction).

        - Doug

Re: Source code documentation

Stefan Seefeld
On 03/09/2010 04:42 PM, Douglas Gregor wrote:
>> Yes. At present the parser attaches comments to the next declaration it finds, from where Synopsis then picks it up to process it further (extract processing instructions, documentation, whatever).
>>      
> Okay. We don't really have this functionality in Clang yet. Comments are passed through to the AST consumer, and we have a hack that tries to find the comment associated with a declaration after the fact, but this will need work.
>    

OK.

>> Also, in one mode of operation Synopsis wants to get a position-correct picture of the entire preprocessed source file, so it can generate a hyperlinked and otherwise styled version of it. Does CLang provide this level of detail ?
>>      
>
> Internally, yes. There isn't enough information exposed via the CIndex interface to do this (but I'd support extending CIndex in this direction).
>    

OK. Without knowing CIndex, I'm not sure how useful it is to support
such different levels of detail through the same representation. For
example, to generate a hyperlinked source tree, I'd operate on something
close to the parse tree, i.e. individual tokens.

But for the documentation, a much more high-level view is useful, such
as a syntax tree or even a semantic graph.

Do you think all of those will eventually be represented by CIndex?

Thanks,
         Stefan

--

       ...ich hab' noch einen Koffer in Berlin...


Re: Source code documentation

Douglas Gregor

On Mar 9, 2010, at 4:55 PM, Stefan Seefeld wrote:

> On 03/09/2010 04:42 PM, Douglas Gregor wrote:
>>> Yes. At present the parser attaches comments to the next declaration it finds, from where Synopsis then picks it up to process it further (extract processing instructions, documentation, whatever).
>>>    
>> Okay. We don't really have this functionality in Clang yet. Comments are passed through to the AST consumer, and we have a hack that tries to find the comment associated with a declaration after the fact, but this will need work.
>>  
>
> OK.
>
>>> Also, in one mode of operation Synopsis wants to get a position-correct picture of the entire preprocessed source file, so it can generate a hyperlinked and otherwise styled version of it. Does CLang provide this level of detail ?
>>>    
>>
>> Internally, yes. There isn't enough information exposed via the CIndex interface to do this (but I'd support extending CIndex in this direction).
>>  
>
> OK. Without knowing CIndex, I'm not sure how useful it is to support such different levels of details through the same representation. For example, to generate a hyperlinked source tree, I'd operate on something close to the parse tree, i.e. individual tokens.

Check out clang_annotateTokens() at http://clang.llvm.org/doxygen/group__CINDEX__LEX.html . It maps from tokens (which you can get from clang_tokenize()) to the AST entities those tokens refer to. A "cursor" in CIndex parlance represents an AST element.
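
A rough sketch of how a tool might consume such a token-to-entity mapping to produce hyperlinked source follows. It does not call libclang: the (offset, text, target) tuples are a hypothetical shape standing in for tokens from clang_tokenize() paired with the cursors that clang_annotateTokens() reports, and the `#name` anchors are an assumption about how the output pages would be linked.

```python
import html
from typing import List, Optional, Tuple

# (byte offset, token text, referenced entity or None). Hypothetical shape,
# standing in for tokens paired with the AST entities they refer to.
Token = Tuple[int, str, Optional[str]]


def render(source: str, tokens: List[Token]) -> str:
    """Return HTML for `source`, linking every token that has a target."""
    out: List[str] = []
    pos = 0
    for offset, text, target in sorted(tokens, key=lambda t: t[0]):
        # Copy whatever lies between tokens (whitespace, punctuation) as-is.
        out.append(html.escape(source[pos:offset]))
        escaped = html.escape(text)
        if target is None:
            out.append(escaped)
        else:
            out.append('<a href="#{}">{}</a>'.format(html.escape(target), escaped))
        pos = offset + len(text)
    out.append(html.escape(source[pos:]))  # trailing text after the last token
    return "".join(out)


SOURCE = "int n = size(v);"
TOKENS = [(0, "int", None), (4, "n", "n"), (8, "size", "size"), (13, "v", "v")]

page = render(SOURCE, TOKENS)
print(page)
```

Working from exact offsets is what keeps the output position-correct: text between tokens is copied through unchanged, so the rendered page matches the original file character for character apart from the added markup.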

> But for the documentation, a much more high-level view is useful, such as a syntax tree or even a semantic graph.

Sure. Cursors point into the AST, which contains much semantic information.

> Do you think all of those will be represented by CIndex, eventually ?


I think so. The goal of CIndex is to support various tools (documentation generators, IDEs, syntax highlighters, whatever) without forcing those tools to deal with the ever-changing Clang ASTs directly. So if Synopsis needs something CIndex doesn't provide, it's probably a CIndex bug. We're not there yet, and it will probably be *more* work right now to use CIndex than it would be to grok Clang ASTs directly, but the end result will be better if Synopsis can go through CIndex because many other tools will benefit from the CIndex improvements that Synopsis will need.

        - Doug