I had an interesting discussion with a co-worker yesterday about how to grab all the comments from an Objective-C source file. We thought about doing it with a regex, which probably would have worked OK for simple files, but definitely wouldn’t have handled all the edge cases easily.
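To make the edge-case problem concrete, here is a minimal sketch of the kind of regex we might have reached for. The helper name `naiveLineComments` and the pattern are my own illustration, not something we actually shipped; it simply treats everything from `//` to the end of the line as a comment.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect everything that looks like a "//" comment.
// It knows nothing about string literals, so "//" inside a string is
// matched just like a real comment.
std::vector<std::string> naiveLineComments(const std::string &source) {
  static const std::regex pattern("//[^\\n]*");
  std::vector<std::string> found;
  for (std::sregex_iterator it(source.begin(), source.end(), pattern), end;
       it != end; ++it) {
    found.push_back(it->str());
  }
  return found;
}
```

Given the line `const char *url = "http://example.com"; // real comment`, this reports a single "comment" that starts inside the string literal at `//example.com` and swallows the real comment along with it. A tool that understands the language's lexing rules avoids that class of mistake.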

I’ve been using clang’s awesome LibTooling library for writing frontend actions recently, so I thought I’d take a quick pass at writing a frontend action that can grab all the comments from a source file.

Setting up the environment for building frontend actions is covered in clang’s awesome “Tutorial for building tools using LibTooling and LibASTMatchers” documentation, so go do that if you haven’t already!

Next, I set up a new frontend action. I cloned down llvm into the ~/clang-llvm folder:

Then the CMakeLists.txt should look like this:

set(LLVM_LINK_COMPONENTS support)

add_clang_executable(commentparser
  CommentParser.cpp
  )
target_link_libraries(commentparser
  clangTooling
  clangBasic
  clangASTMatchers
  )


Now we can start writing our frontend action. We need a few different elements: command line parsing in the main function, an ASTFrontendAction, and an ASTConsumer. Starting with the command line parsing and setup:

Nothing too interesting going on here yet, although note that we set up a new OptionCategory. This lets us hide all the default clang options displayed in the help, since we don’t want those for our tool.

Next is the ASTFrontendAction. All we are going to do here is create our ASTConsumer and forward the compiler’s ASTContext to it. Here’s that code:

Now for the interesting part: our ASTConsumer. This is the class in which we can visit AST nodes. If we wanted to, we could perform an action on every IfStmt, ForStmt, EnumDecl, etc. All we need to visit to handle comments is the translation unit. A TranslationUnitDecl is always the top-level node in the Abstract Syntax Tree.1

That’s all we need for this frontend action. We are just going to print out the comments that we find in each translation unit.

For testing, I’m going to use a simple source file test.cpp:

Compiling and running the frontend action:

ninja commentparser
./commentparser test.cpp --

/** Other comment */
/// Documentation comment
/** this is a block comment */


Wait, what happened to the other comments in the file? Clang treats some comments differently to others. Comments that start with /** or /// are treated as documentation comments. By default, clang only parses documentation comments.2
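The rule described above can be sketched in a few lines of plain C++. This is only my simplified illustration of the prefix check as stated here, with hypothetical helper names `looksLikeDocComment` and `keepDocComments`; it is not clang's implementation, and clang's real classification recognises further markers (such as `/*!` and `//!`) and corner cases that this sketch ignores.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: true when the comment text starts with "/**" or "///",
// the two documentation-comment prefixes mentioned in the text.
bool looksLikeDocComment(const std::string &comment) {
  return comment.rfind("/**", 0) == 0 || comment.rfind("///", 0) == 0;
}

// Hypothetical helper: keep only the comments the simplified rule treats
// as documentation comments, preserving their original order.
std::vector<std::string> keepDocComments(const std::vector<std::string> &all) {
  std::vector<std::string> docs;
  for (const std::string &comment : all) {
    if (looksLikeDocComment(comment)) {
      docs.push_back(comment);
    }
  }
  return docs;
}
```

Applied to the six comments in the test file, this check keeps the same three that the default run printed and drops the `//` and `/* */` ones.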

Luckily, there’s a command line flag to override this and parse all comments, -fparse-all-comments:

./commentparser test.cpp -- -fparse-all-comments

/** Other comment */
// This is the main method
/// Documentation comment
/* block comment */
// end of line comment
/** this is a block comment */