Compilation database¶

What is a compilation database?¶

A compilation database is a database for compile options. It records which compile options are used to build the files in a project. A compilation database can be, but is not limited to, a JSON Compilation Database. Refer to the clang::tooling::CompilationDatabase class for the full interface.

A real-world JSON Compilation Database entry, generated by CMake, looks like this:

{
  "directory": "/home/user/dev/llvm/build",
  "file": "/home/user/dev/llvm/llvm/lib/Support/APFloat.cpp",
  "command": "/usr/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/Support -I/home/user/dev/llvm/llvm/lib/Support -Iinclude -I/home/user/dev/llvm/llvm/include -fPIC -fvisibility-inlines-hidden -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wcovered-switch-default -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Werror=date-time -std=c++11 -fcolor-diagnostics -ffunction-sections -fdata-sections -O3 -UNDEBUG -fno-exceptions -fno-rtti -o lib/Support/CMakeFiles/LLVMSupport.dir/APFloat.cpp.o -c /home/user/dev/llvm/llvm/lib/Support/APFloat.cpp"
}

An entry generated by clang -MJ looks like this:

{
  "directory": "/home/user/dev/llvm/build",
  "file": "/tmp/foo.cpp",
  "output": "foo.o",
  "arguments": ["/usr/bin/clang-5.0", "-xc++", "/tmp/foo.cpp", "--driver-mode=g++", "-Wall", "-I", "/home/user/dev/libcpp/libcpp/include", "-c", "--target=x86_64-unknown-linux-gnu"]
}
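The two entries above store the compiler invocation differently: the CMake entry uses a single shell-quoted "command" string, while the clang -MJ entry uses an "arguments" list. A tool that reads a database has to accept both forms. The following is a minimal Python sketch; entry_arguments and load_database are hypothetical helper names, and the file is assumed to be a JSON array of such entries.

```python
import json
import shlex


def entry_arguments(entry):
    """Return the compiler invocation of one entry as a list of arguments."""
    if "arguments" in entry:
        # Already split, as in the clang -MJ entry.
        return list(entry["arguments"])
    # A "command" string is shell-quoted, as in the CMake entry.
    return shlex.split(entry["command"])


def load_database(path):
    """Load a JSON Compilation Database, assumed to be a JSON array of entries."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Normalizing every entry to a list early avoids re-quoting problems later, for example when escaped double quotes appear in -D options.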

Note

A good introduction to compilation databases is available on Eli Bendersky’s blog:

Compilation databases for Clang-based tools

What is it good for?¶

You might wonder what a compilation database is good for. This section lists various tools that may benefit from a compilation database.

Clang tools ¶

Core Clang tools and extra Clang tools:

A few other tools seem to be available, but they aren’t officially documented:

clang-refactor
clang-reorder-fields
clang-change-namespace
clang-move

Text editors and IDEs ¶

To bring basic IDE-like features to a text editor, you need two things:

a text editor plugin that integrates libclang
a compilation database to feed to libclang

With this, you can have features such as semantic code completion and on-the-fly syntax checking.

LSP ¶

LSP stands for Language Server Protocol; see Microsoft/language-server-protocol on GitHub.

Atom ¶

GNU Emacs ¶

Vim ¶

Other tools ¶

scan-build, the Clang Static Analyzer CLI, generates and uses compilation databases.
Ericsson/codechecker generates and uses compilation databases.
Include What You Use: https://github.com/include-what-you-use/include-what-you-use
OCLint: http://docs.oclint.org/en/stable/manual/oclint-json-compilation-database.html
With little effort the Kythe indexer can be run on a compilation database.
Clang’s LibTooling based tools:
- clang-expand
PVS-Studio on Linux [1]
cc_driver.pl from the Mo’ Static article.
Sourcetrail
CLion
kompiledb, Kotlin bindings for the compilation database format.

How to generate a JSON Compilation Database?¶

Build systems and compilers ¶

This section describes build tools which natively support the generation of a compilation database.

Bazel ¶

github.com/google/kythe: tools/cpp/generate_compilation_database.sh

Uses experimental_action_listener to produce a compilation database.

github.com/hedronvision/bazel-compile-commands-extractor: Hedron’s Compile Commands Extractor for Bazel

Faster, does not require a full build (it is based on “Action Graph Query (aquery)”).

github.com/grailbio/bazel-compilation-database

Also faster than Kythe’s experimental_action_listener, easier to set up, and does not require a full build, at the cost of being less accurate.

github.com/stackb/bazel-stack-vscode-cc

Clang ¶

Clang’s -MJ option generates a compilation database entry per input (requires Clang >= 5.0).

Usage:

clang++ -MJ a.o.json -Wall -std=c++11 -o a.o -c a.cpp
clang++ -MJ b.o.json -Wall -std=c++11 -o b.o -c b.cpp

To merge the compilation database entries into a valid compilation database, it is possible to use (GNU) sed:

sed -e '1s/^/[\n/' -e '$s/,$/\n]/' *.o.json > compile_commands.json

Or, using any sed under Bash, Zsh or ksh:

sed -e '1s/^/[\'$'\n''/' -e '$s/,$/\'$'\n'']/' *.o.json > compile_commands.json

This sed invocation does the following:

insert the opening bracket: [
concatenate the entries
remove the trailing comma of the last entry (to be JSON compliant)
insert the closing bracket: ]
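Where sed is not available, the same merge can be sketched in Python. This assumes each *.o.json fragment holds one JSON object followed by a trailing comma, as produced by -MJ; merge_fragments and write_database are hypothetical helper names.

```python
import json
from pathlib import Path


def merge_fragments(paths):
    """Combine clang -MJ fragments into a list of entries."""
    entries = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8").strip()
        # Drop the trailing comma that follows the JSON object.
        if text.endswith(","):
            text = text[:-1]
        # Wrapping in brackets also accepts several comma-separated objects.
        entries.extend(json.loads("[" + text + "]"))
    return entries


def write_database(fragment_paths, output="compile_commands.json"):
    """Write the merged entries as a JSON array."""
    entries = merge_fragments(fragment_paths)
    Path(output).write_text(json.dumps(entries, indent=2) + "\n", encoding="utf-8")
```

Calling write_database(sorted(Path(".").glob("*.o.json"))) from the build directory would mirror the sed commands above. Unlike the sed version, parsing each fragment rejects malformed input instead of copying it into the output.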

CMake ¶

To generate a JSON compilation database with CMake, enable the CMAKE_EXPORT_COMPILE_COMMANDS option (requires CMake >= 2.8.5).

For example, in an existing build directory, type:

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .

This will create a file named compile_commands.json in the build directory.

Ninja ¶

To generate a JSON compilation database with Ninja, use the -t compdb option (requires Ninja >= 1.2). This option takes a list of rules as argument.

Usage:

ninja -t compdb > compile_commands.json

Note

Ninja versions before 1.10 were more difficult to use: they required a rule name to be specified as an argument.

Qbs ¶

qbs natively supports the generation of a compilation database.

Usage:

qbs generate --generator clangdb

waf ¶

waf supports the generation of a JSON Compilation Database by adding the following lines to the wscript:

def configure(conf):
    conf.load('compiler_cxx')
    ...
    conf.load('clang_compilation_database')

Specialized tools ¶

Some build systems do not support generating a compilation database.

A non-exhaustive list includes:

the GNU Build System (autotools): ./configure and friends
KBuild, the Linux Kernel Makefiles

For this reason, a few tools have emerged to respond to this issue.

bear and intercept-build ¶

Bear and intercept-build (from scan-build) are two tools from László Nagy that collect the compile options by intercepting calls to the compiler during the build. A full build is required to obtain a complete compilation database.

The scan-build tool has been included in the Clang tree since release 3.8.0, as a replacement for the Perl implementation of scan-build. It’s reasonable to think that distributions will someday offer it as a package. scan-build can already be installed easily with pip:

pip install scan-build

Usage:

<bear|intercept-build> BUILD_COMMAND

Example:

bear make -B -j9
intercept-build ./build.sh

A file named compile_commands.json is created in the current directory.

cdcd ¶

cdcc uses a compiler wrapper to write an sqlite3 database, from which compile_commands.json files can be generated.

The tool can be used to generate a compilation database for the JHBuild tool.
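The wrapper-plus-SQLite approach can be illustrated with a short Python sketch. It does not reproduce the tool's own schema or commands: the invocations table, the record and export functions, and the suffix-based detection of source files are all assumptions made for the example.

```python
import json
import os
import sqlite3

# Assumption: source files are recognized by these suffixes only.
SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".cxx")


def record(db_path, argv, directory=None):
    """Store one compiler invocation; a wrapper would call this, then run the real compiler."""
    directory = directory or os.getcwd()
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits the transaction on success
            conn.execute(
                "CREATE TABLE IF NOT EXISTS invocations (directory TEXT, arguments TEXT)"
            )
            conn.execute(
                "INSERT INTO invocations VALUES (?, ?)", (directory, json.dumps(argv))
            )
    finally:
        conn.close()


def export(db_path):
    """Turn the recorded invocations into compilation database entries."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT directory, arguments FROM invocations ORDER BY rowid"
        ).fetchall()
    finally:
        conn.close()
    entries = []
    for directory, raw in rows:
        arguments = json.loads(raw)
        for arg in arguments[1:]:
            if arg.endswith(SOURCE_SUFFIXES):
                entries.append(
                    {"directory": directory, "file": arg, "arguments": arguments}
                )
    return entries
```

Keeping the raw invocations in a database and exporting later means the JSON file can be regenerated at any time without rebuilding.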

CodeChecker log ¶

The ld logger tool from codechecker has an implementation of a build interceptor similar to bear and intercept-build.

CodeChecker favors intercept-build [2] when available, but falls back to the ld logger tool when needed.

The ld logger tool can be invoked with a build command, for example:

CodeChecker log -o compile_commands.json -b "make -B"

However, in version 5.6, the resulting compilation database is surprising:

Escaping of double quotes is not handled properly, for example it produces:
```
-DIRONY_PACKAGE_VERSION=\"0.2.2-cvs\"
```
instead of:
```
-DIRONY_PACKAGE_VERSION=\\\"0.2.2-cvs\\\"
```

There are compile commands not only for the compilation step, but also for linking:

{
        "directory": "/home/user/build-irony/src",
        "command": "c++ -I<...> ...Irony.cpp.o ...main.cpp.o -o ...irony-server <ldflags...>",
        "file": "/home/user/build-irony/srcCMakeFiles/irony-server.dir/Irony.cpp.o"
}

Luckily, with intercept-build, these issues are fixed.
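For a database that already contains link entries like the one above, a post-processing filter is one possible workaround. The sketch below relies on a heuristic that is an assumption, not something the tools guarantee: a compile step passes -c and a link step does not. The function names are hypothetical.

```python
import shlex


def is_compile_entry(entry):
    """Heuristic: a compile step passes -c, a link step does not."""
    arguments = entry.get("arguments") or shlex.split(entry["command"])
    return "-c" in arguments


def drop_link_entries(entries):
    """Keep only the entries that look like compilation steps."""
    return [entry for entry in entries if is_compile_entry(entry)]
```

The heuristic also drops preprocess-only (-E) and assemble-only (-S) invocations, which is usually acceptable for tools that expect one entry per translation unit.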

compdb ¶

compdb is a tool to manipulate compilation databases. It can generate a compilation database for header files.
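How compdb derives header entries is not described here. As an illustration only, one simple heuristic is to reuse the options of a source file that shares the header's path and stem; header_entry is a hypothetical name and the sketch is not compdb's algorithm.

```python
import os
import shlex


def header_entry(header, entries):
    """Return an entry for `header` derived from a same-stem source file, or None."""
    stem = os.path.splitext(header)[0]
    for entry in entries:
        if os.path.splitext(entry["file"])[0] != stem:
            continue
        arguments = entry.get("arguments") or shlex.split(entry["command"])
        return {
            "directory": entry["directory"],
            "file": header,
            # Point the invocation at the header instead of the source file.
            "arguments": [header if arg == entry["file"] else arg for arg in arguments],
        }
    return None
```

Headers without a matching source file get no entry, so a real tool needs a fallback, such as borrowing options from a file that includes the header.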

compiledb-generator ¶

compiledb-generator is a tool to generate a compilation database for make-based build systems. It works by parsing the output of commands like make --dry-run.

Usage:

compiledb-make all > compile_commands.json

To parse an existing build log:

compiledb-parser . < build-log.txt

There is also a specialized command compiledb-aosp, to deal with AOSP.
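The idea of parsing build output can be sketched in a few lines of Python. This is not compiledb-generator's parser: the list of compiler names, the source suffixes, and the parse_build_log function are assumptions chosen for the example, and directory changes reported by make are ignored.

```python
import re
import shlex

# Assumption: only these compiler names, optionally with a path and version suffix.
COMPILER = re.compile(r"^(?:\S*/)?(?:gcc|g\+\+|cc|c\+\+|clang|clang\+\+)(?:-[\d.]+)?$")
SOURCE = re.compile(r"\.(?:c|cc|cpp|cxx)$")


def parse_build_log(lines, directory):
    """Extract compilation database entries from build output lines."""
    entries = []
    for line in lines:
        try:
            arguments = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: not a plain command line
        if not arguments or not COMPILER.match(arguments[0]):
            continue
        if "-c" not in arguments:
            continue  # skip link steps
        for arg in arguments[1:]:
            if SOURCE.search(arg):
                entries.append(
                    {"directory": directory, "file": arg, "arguments": arguments}
                )
    return entries
```

Because the log comes from a dry run or an earlier build, this approach needs no compilation, but it only sees commands that the build system prints.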

Sourcetrail Extension for Visual Studio ¶

The Sourcetrail Extension for Visual Studio is a GUI tool that generates JSON Compilation Databases from VS Solutions. A wide range of VS versions seems to be supported.

sw-btrace ¶

sourceweb’s btrace tool, also known as sw-btrace, uses the same principle as bear and intercept-build.

The generation is done in 2 steps:

Run sw-btrace BUILD_COMMAND to log the compilation.
Call sw-btrace-to-compiledb to generate a JSON compilation database out of the compilation log.

Example:

sw-btrace make -B
sw-btrace-to-compiledb

A file named compile_commands.json is created in the current directory.

tee3/commands_to_compilation_database ¶

tee3/commands_to_compilation_database can generate compilation databases for Boost.Build, make, and potentially other tools by means of regular expressions that match the build output.

It also provides a tool to generate a compilation database from files specified on the standard input and compile options specified on the command line.
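The second mode, a list of files combined with fixed options, is simple enough to sketch in Python. The entries_for_files function and its default compiler name are assumptions for illustration and do not reflect the project's actual interface.

```python
def entries_for_files(files, options, directory, compiler="c++"):
    """Build one entry per file, all sharing the same compile options."""
    entries = []
    for name in files:
        name = name.strip()
        if not name:
            continue  # ignore blank lines
        entries.append(
            {
                "directory": directory,
                "file": name,
                "arguments": [compiler, *options, "-c", name],
            }
        )
    return entries
```

Passing sys.stdin as files and sys.argv[1:] as options, then serializing the result with json.dump, would reproduce the stdin-driven workflow.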

xcpretty ¶

xcpretty can generate a compilation database for Xcode projects. To do so, it uses the xcodebuild output.

Usage:

xcodebuild | xcpretty -r json-compilation-database

Other compilation databases and tools ¶

This section shows that people invented their own compilation database formats, either because no standard existed yet or because of specialized needs.

cc_args.py ¶

The cc_args.py script from the Vim plugin clang_complete.

This script generates a .clang_complete configuration file.

Usage:

make CC='~/.vim/bin/cc_args.py gcc' CXX='~/.vim/bin/cc_args.py g++' -B

gccrec ¶

The gccrec tool from the now unmaintained gccsense project.

The tool records the compile options in an SQLite database.

Links to the manual for reference:

txt
HTML

rtags ¶

The rtags project has a gcc wrapper named gcc-rtags-wrapper.sh to help feed its internal compilation database.

Description here:

YCM-Generator ¶

YCM-Generator works differently than bear and intercept-build. It builds a project using a fake toolchain. This is faster than doing a full build, because the fake toolchain is composed of trivial programs.

The tool does not actually generate a “JSON Compilation Database”; instead, it creates a configuration file for YouCompleteMe.

Case studies on a few open source projects ¶

This section describes how to generate a compilation database for a few open source projects. Depending on the project, the method to generate a compilation database can differ.

The result should preferably be:

correct

Some tools guess the compile options. If they guess wrong, the compile command entry is not useful.

complete

A compilation database should be as exhaustive as possible. Any file on which a tool can be run needs to have compile options.

For example, a compilation database usually lacks compile options for headers, even though they would be useful to things like text editors. Or compile options for unit tests may not be available, if tests aren’t built by default.

fast

Between 2 or more correct and complete methods, one should favor the fastest.

Tools that require a full project build to generate the database can easily become a hindrance on big projects. Imagine adding a new file to a big project. When you have to do a full rebuild just to make the file show up in the database, it’s not pleasant.
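The "complete" criterion can be checked mechanically. The sketch below compares a list of candidate files against the database and reports those without an entry; missing_from_database is a hypothetical helper, and candidate paths are assumed to be absolute.

```python
import os


def missing_from_database(entries, candidates):
    """Return the candidate files that have no entry, compared by normalized path."""
    known = {
        # A relative "file" is resolved against the entry's "directory".
        os.path.normpath(os.path.join(entry["directory"], entry["file"]))
        for entry in entries
    }
    return sorted(
        path for path in candidates if os.path.normpath(path) not in known
    )
```

Running it over every source and header file in the tree gives a quick view of what a given generation method leaves out, such as headers or tests that are not built by default.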

git ¶

git uses a custom Makefile and a configure script for the build. The build system does not seem to have native support for compilation database generation. We will use bear and intercept-build to generate one.

From a quick glimpse at the Makefile and documentation, we can see there is a special DEVELOPER setting to enable stricter compilation options. This is used in this example to match the developer workflow better.

This example has been tested on git 2.9.2.

Compilation database generation with bear:

echo DEVELOPER=1 >> config.mak
make configure
bear make -j9

With intercept-build, replace the last line with:

intercept-build make -j9

Footnotes