I had an interesting discussion with a co-worker yesterday about how to grab all the comments from an Objective-C source file. We thought about doing it with a regex, which probably would have worked OK, but definitely wouldn’t have handled all the edge cases easily.
I’ve been using clang’s awesome LibTooling library for writing frontend actions recently, so I thought I’d take a quick pass at writing a frontend action that can grab all the comments from a source file.
Setting up the environment for building frontend actions is covered in clang’s awesome “Tutorial for building tools using LibTooling and LibASTMatchers” documentation, so go do that if you haven’t already!
Next, I set up a new frontend action. I cloned LLVM into the ~/clang-llvm folder:
Then the CMakeLists.txt should look like this:
set(LLVM_LINK_COMPONENTS support)

add_clang_executable(commentparser
  CommentParser.cpp
  )
target_link_libraries(commentparser
  clangTooling
  clangBasic
  clangASTMatchers
  )
Now we can start writing our frontend action. We need a few different elements: command line parsing in the main function, an ASTFrontendAction, and an ASTConsumer. Starting with the command line parsing and setup:
Nothing too interesting is going on here yet, although note that we set up a new OptionCategory. This lets us hide all the default clang options displayed in the help, since we don’t want those for our tool.
Next is the ASTFrontendAction. All we are going to do here is create our ASTConsumer and forward the compiler’s ASTContext to it. Here’s that code:
Now for the interesting part: our ASTConsumer. This is the class in which we can visit AST nodes. If we wanted to, we could perform an action on every IfStmt, ForStmt, EnumDecl, etc. All we need to visit to handle comments is the translation unit. A TranslationUnitDecl is always the top-level node in the Abstract Syntax Tree.[1]
That’s all we need for this frontend action. We are just going to print out the comments that we find in each translation unit.
For testing, I’m going to use a simple source file, test.cpp:
Compiling and running the frontend action:
ninja commentparser
./commentparser test.cpp --
/** Other comment */
/// Documentation comment
/** this is a block comment */
Finished parsing for comments
Wait, what happened to the other comments in the file? Clang treats some comments differently from others. Comments that start with /** or /// are treated as documentation comments. By default, clang only parses documentation comments.[2]
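To make that rule concrete, here is a rough, standalone sketch of the distinction. It is not clang's actual classification code: looksLikeDocComment, keptByDefault and selfCheck are hypothetical helpers, and they only check the two prefixes mentioned above, whereas clang's real logic lives in its lexer and handles more cases. The comment strings in selfCheck are the six that appear in the output of this post's test run.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true if `text` begins with `prefix`.
bool startsWith(const std::string &text, const std::string &prefix) {
    return text.compare(0, prefix.size(), prefix) == 0;
}

// Simplified assumption: a comment counts as documentation if it starts
// with "///" or "/**". Hypothetical helper, not part of clang's API.
bool looksLikeDocComment(const std::string &comment) {
    return startsWith(comment, "///") || startsWith(comment, "/**");
}

// Models the default behaviour described above: only documentation
// comments are kept, ordinary ones are dropped.
std::vector<std::string> keptByDefault(const std::vector<std::string> &comments) {
    std::vector<std::string> kept;
    for (const std::string &comment : comments) {
        if (looksLikeDocComment(comment)) {
            kept.push_back(comment);
        }
    }
    return kept;
}

// Replays the six comments from the test run against the sketch.
void selfCheck() {
    const std::vector<std::string> all = {
        "/** Other comment */",
        "// This is the main method",
        "/// Documentation comment",
        "/* block comment */",
        "// end of line comment",
        "/** this is a block comment */",
    };
    const std::vector<std::string> kept = keptByDefault(all);
    assert(kept.size() == 3);
    assert(kept[0] == "/** Other comment */");
    assert(kept[1] == "/// Documentation comment");
    assert(kept[2] == "/** this is a block comment */");
}
```

Under this simplified rule, the three comments that survive are the same three printed by the first run, and the plain // and /* comments are the ones that went missing.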
Luckily, there’s a command line flag to override this and parse all comments, -fparse-all-comments:
./commentparser test.cpp -- -fparse-all-comments
/** Other comment */
// This is the main method
/// Documentation comment
/* block comment */
// end of line comment
/** this is a block comment */
Finished parsing for comments
That gives us all the comments in the translation unit! Note that if we wanted to compile a more complicated class with imports or a project that has multiple files, we’d have to use a compilation database.[3] Check out the full source code for this frontend action in CommentParser.cpp.