Compilation database

What is a compilation database?

A compilation database is a database for compile options. It records which compile options are used to build the files in a project. A compilation database can be, but is not limited to, a JSON Compilation Database. Refer to the clang::tooling::CompilationDatabase class for the full interface.

A real-world JSON Compilation Database entry, generated by CMake, looks like this:

{
  "directory": "/home/user/dev/llvm/build",
  "file": "/home/user/dev/llvm/llvm/lib/Support/APFloat.cpp",
  "command": "/usr/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/Support -I/home/user/dev/llvm/llvm/lib/Support -Iinclude -I/home/user/dev/llvm/llvm/include -fPIC -fvisibility-inlines-hidden -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wcovered-switch-default -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Werror=date-time -std=c++11 -fcolor-diagnostics -ffunction-sections -fdata-sections -O3 -UNDEBUG -fno-exceptions -fno-rtti -o lib/Support/CMakeFiles/LLVMSupport.dir/APFloat.cpp.o -c /home/user/dev/llvm/llvm/lib/Support/APFloat.cpp"
}

An entry generated by clang -MJ looks like this:

{
  "directory": "/home/user/dev/llvm/build",
  "file": "/tmp/foo.cpp",
  "output": "foo.o",
  "arguments": ["/usr/bin/clang-5.0", "-xc++", "/tmp/foo.cpp", "--driver-mode=g++", "-Wall", "-I", "/home/user/dev/libcpp/libcpp/include", "-c", "--target=x86_64-unknown-linux-gnu"]
}
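The two entry shapes above can be consumed uniformly. The following Python sketch is illustrative only: the names entry_arguments and load_database are hypothetical, and it assumes that "command" uses POSIX shell quoting and that "arguments" should be preferred when present.

```python
import json
import shlex


def entry_arguments(entry):
    """Return the compile command of one database entry as a list of arguments."""
    if "arguments" in entry:
        # Already split into a list, as in the clang -MJ style entry.
        return list(entry["arguments"])
    # A single shell-quoted string, as in the CMake style entry.
    # Assumption: POSIX quoting rules apply to the "command" field.
    return shlex.split(entry["command"])


def load_database(path):
    """Load a JSON Compilation Database: a JSON array of entry objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

With this, a tool can loop over load_database("compile_commands.json") and call entry_arguments on each entry without caring which generator produced the file.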

Note

A good introduction to compilation databases is available on Eli Bendersky’s blog:

What is it good for?

You might wonder what a compilation database is good for. This section lists various tools that may benefit from a compilation database.

Clang tools

Core Clang tools and extra Clang tools:

A few other tools seem to be available, but they aren’t officially documented:

  • clang-refactor

  • clang-reorder-fields

  • clang-change-namespace

  • clang-move

See also

Text editors and IDEs

To bring basic IDE-like features to a text editor, you need two things:

  1. a text editor plugin that integrates libclang

  2. a compilation database, to feed to libclang

With this, you can have features such as semantic code completion and on-the-fly syntax checking.

LSP

LSP stands for Language Server Protocol; see Microsoft/language-server-protocol on GitHub.

Atom

GNU Emacs

Vim

Other tools

See also

Some of the tools listed here:

How to generate a JSON Compilation Database?

Build systems and compilers

This section describes build tools which natively support the generation of a compilation database.

Bazel

Clang

Clang’s -MJ option generates a compilation database entry per input (requires Clang >= 5.0).

Usage:

clang++ -MJ a.o.json -Wall -std=c++11 -o a.o -c a.cpp
clang++ -MJ b.o.json -Wall -std=c++11 -o b.o -c b.cpp

To merge the compilation database entries into a valid compilation database, it is possible to use (GNU) sed:

sed -e '1s/^/[\n/' -e '$s/,$/\n]/' *.o.json > compile_commands.json

Or, using any sed under Bash, Zsh or ksh:

sed -e '1s/^/[\'$'\n''/' -e '$s/,$/\'$'\n'']/' *.o.json > compile_commands.json

This sed invocation does the following:

  • insert the opening bracket: [

  • concatenate the entries

  • remove the trailing comma of the last entry (to be JSON compliant)

  • insert the closing bracket: ]
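Where sed is not available, the same merge steps can be sketched in Python. This is a minimal sketch, not part of Clang: the function names are hypothetical, and it assumes each fragment file holds one JSON object followed by a trailing comma, as described above.

```python
import json


def merge_fragments(paths):
    """Merge clang -MJ fragments into a list of compilation database entries."""
    pieces = []
    for path in sorted(paths):
        with open(path, encoding="utf-8") as f:
            # Assumption: one JSON object per fragment, followed by a comma.
            text = f.read().strip().rstrip(",")
        if text:
            pieces.append(text)
    # Joining with commas and adding the brackets yields a JSON array;
    # parsing it also validates the result.
    return json.loads("[" + ",".join(pieces) + "]")


def write_database(entries, out_path="compile_commands.json"):
    """Write the merged entries as a JSON Compilation Database."""
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(entries, f, indent=2)
```

For the two fragments generated earlier, write_database(merge_fragments(["a.o.json", "b.o.json"])) would write the merged file to the current directory.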

CMake

To generate a JSON compilation database with CMake, enable the CMAKE_EXPORT_COMPILE_COMMANDS option (requires CMake >= 2.8.5).

For example, in an existing build directory, type:

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .

This will create a file named compile_commands.json in the build directory.

Ninja

To generate a JSON compilation database with Ninja, use the -t compdb option (requires Ninja >= 1.2). This option takes a list of rules as argument.

Usage:

ninja -t compdb > compile_commands.json

Note

Ninja versions before 1.10 were more difficult to use: they required a rule name to be specified as an argument.

Qbs

qbs natively supports the generation of a compilation database.

Usage:

qbs generate --generator clangdb

waf

waf supports the generation of a JSON Compilation Database by adding the following lines to the wscript:

def configure(conf):
    conf.load('compiler_cxx')
    ...
    conf.load('clang_compilation_database')

Specialized tools

Some build systems do not support generating a compilation database.

A non-exhaustive list includes:

  • the GNU Build System (autotools): ./configure and friends

  • KBuild, the Linux Kernel Makefiles

For this reason, a few tools have emerged to respond to this issue.

bear and intercept-build

Bear and intercept-build (from scan-build) are two tools by László Nagy that collect the compile options by intercepting calls to the compiler during the build. A full build is required to obtain a complete compilation database.

The scan-build tool has been included in the Clang tree since release 3.8.0, as a replacement for the Perl implementation of scan-build. It’s reasonable to think that someday distributions will offer it as a package. scan-build can already be installed easily with pip:

pip install scan-build

Usage:

<bear|intercept-build> BUILD_COMMAND

Example:

bear make -B -j9
intercept-build ./build.sh

A file named compile_commands.json is created in the current directory.

cdcd

The cdcc uses a compiler wrapper to write an sqlite3 database, from which compile_commands.json files can be generated.

The tools can be used to generate a compilation database for the JHBuild tool.
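The general idea of recording invocations in an SQLite database and exporting them later can be sketched as follows. This is not cdcd's actual schema or code: the table layout and the function names are assumptions made for illustration.

```python
import json
import sqlite3
from contextlib import closing


def record_invocation(db_path, directory, source_file, arguments):
    """Store one compiler invocation; a wrapper would call this, then run the real compiler."""
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS commands ("
                "directory TEXT, file TEXT, arguments TEXT, "
                "PRIMARY KEY (directory, file))"
            )
            # Recompiling the same file replaces its previous entry.
            conn.execute(
                "INSERT OR REPLACE INTO commands VALUES (?, ?, ?)",
                (directory, source_file, json.dumps(arguments)),
            )


def export_database(db_path):
    """Return the recorded invocations as JSON Compilation Database entries."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT directory, file, arguments FROM commands ORDER BY file"
        ).fetchall()
    return [
        {"directory": d, "file": f, "arguments": json.loads(a)}
        for d, f, a in rows
    ]
```

One benefit of this design is that the database is updated incrementally: each compilation refreshes only its own row, so the export does not require a full rebuild.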

CodeChecker log

The ld logger tool from CodeChecker has an implementation of a build interceptor similar to bear and intercept-build.

They favor intercept-build [2] when available, but fall back to the ld logger tool when needed.

The ld logger tool can be invoked with a build command, for example:

CodeChecker log -o compile_commands.json -b "make -B"

However, in version 5.6, the resulting compilation database is surprising:

  • Escaping of double quotes is not handled properly, for example it produces:

    -DIRONY_PACKAGE_VERSION=\"0.2.2-cvs\"
    

    instead of:

    -DIRONY_PACKAGE_VERSION=\\\"0.2.2-cvs\\\"
    
  • There are compile commands not only for the compilation step, but also for linking:

    {
            "directory": "/home/user/build-irony/src",
            "command": "c++ -I<...> ...Irony.cpp.o ...main.cpp.o -o ...irony-server <ldflags...>",
            "file": "/home/user/build-irony/srcCMakeFiles/irony-server.dir/Irony.cpp.o"
    }
    

Luckily, with intercept-build, these issues are fixed.
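If a database does contain linking entries like the one above, one possible workaround is to filter them out afterwards. The sketch below is a heuristic, not something CodeChecker provides: the function name and the extension list are assumptions, and it only looks at the "file" field.

```python
import os

# Assumption: these extensions identify C-family source files.
SOURCE_EXTENSIONS = {".c", ".cc", ".cpp", ".cxx", ".m", ".mm"}


def drop_non_source_entries(entries):
    """Keep only the entries whose "file" looks like a C-family source file."""
    return [
        entry
        for entry in entries
        if os.path.splitext(entry["file"])[1].lower() in SOURCE_EXTENSIONS
    ]
```

An entry whose "file" is an object file such as Irony.cpp.o has the extension .o and is dropped, while the entry for Irony.cpp is kept.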

compdb

compdb is a tool to manipulate compilation databases. It can generate a compilation database for header files.

compiledb-generator

compiledb-generator is a tool to generate a compilation database for make-based build systems. It works by parsing the output of commands like make --dry-run.

Usage:

compiledb-make all > compile_commands.json

To parse an existing build log:

compiledb-parser . < build-log.txt

There is also a specialized command, compiledb-aosp, to deal with AOSP.
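The principle of parsing build output can be illustrated with a much simplified Python sketch. It is not how compiledb-generator is implemented: the regular expressions and the function name are assumptions, and it ignores directory changes such as make -C.

```python
import re
import shlex

# Assumption: compilers are recognized by executable name only.
COMPILER_RE = re.compile(
    r"^(?:.*/)?(?:gcc|g\+\+|cc|c\+\+|clang|clang\+\+)(?:-[\d.]+)?$"
)
SOURCE_RE = re.compile(r"\.(?:c|cc|cpp|cxx)$")


def parse_build_log(lines, directory):
    """Extract compile commands from build output such as that of make --dry-run."""
    entries = []
    for line in lines:
        try:
            args = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: not a simple command line
        # Only keep compiler invocations that compile (-c) rather than link.
        if not args or not COMPILER_RE.match(args[0]) or "-c" not in args:
            continue
        for arg in args[1:]:
            if not arg.startswith("-") and SOURCE_RE.search(arg):
                entries.append(
                    {"directory": directory, "file": arg, "arguments": args}
                )
    return entries
```

Because nothing is actually compiled, this approach is fast, but its results are only as good as the patterns used to recognize compiler invocations.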

Sourcetrail Extension for Visual Studio

The Sourcetrail Extension for Visual Studio is a GUI tool that generates JSON Compilation Databases from VS Solutions. A wide range of VS versions seems to be supported.

sw-btrace

sourceweb’s btrace tool, aka sw-btrace, uses the same principle as bear and intercept-build.

The generation is done in 2 steps:

  1. Run sw-btrace BUILD_COMMAND to log the compilation.

  2. Call sw-btrace-to-compiledb to generate a JSON compilation database out of the compilation log.

Example:

sw-btrace make -B
sw-btrace-to-compiledb

A file named compile_commands.json is created in the current directory.

tee3/commands_to_compilation_database

tee3/commands_to_compilation_database can generate compilation databases for Boost.Build, make, and potentially other tools, by means of regular expressions that match the build output.

It also provides a tool to generate a compilation database from files specified on the standard input and compile options specified on the command line.

xcpretty

xcpretty can generate a compilation database for Xcode projects. To do so, it uses the xcodebuild output.

Usage:

xcodebuild | xcpretty -r json-compilation-database

Other compilation databases and tools

This section shows that people invented their own versions of a compilation database, either because no standard existed yet or because of specialized needs.

cc_args.py

The cc_args.py script from the Vim plugin clang_complete.

This script generates a .clang_complete configuration file.

Usage:

make CC='~/.vim/bin/cc_args.py gcc' CXX='~/.vim/bin/cc_args.py g++' -B

gccrec

The gccrec tool from the now unmaintained gccsense project.

The tool records the compile options in an SQLite database.

Links to the manual for reference:

rtags

The rtags project has a gcc wrapper named gcc-rtags-wrapper.sh to help feed its internal compilation database.

Description here:

YCM-Generator

YCM-Generator works differently than bear and intercept-build. It builds a project using a fake toolchain. This is faster than doing a full build, because the fake toolchain is composed of trivial programs.

The tool does not actually generate a “JSON Compilation Database”, instead it creates a configuration file for YouCompleteMe.

Case studies on a few open source projects

This section describes how to generate a compilation database for a few open source projects. Depending on the project, the method to generate a compilation database can differ.

The result should preferably be:

correct

Some tools guess the compile options. If they guess wrong, the compile command entry is not useful.

complete

A compilation database should be as exhaustive as possible. Any file on which a tool can be run needs to have compile options.

For example, a compilation database usually lacks compile options for headers, even though they would be useful to things like text editors. Or compile options for unit tests may not be available, if tests aren’t built by default.

fast

Between 2 or more correct and complete methods, one should favor the fastest.

Tools that require a full project build to generate the database can easily become a hindrance on big projects. Imagine adding a new file to a big project. When you have to do a full rebuild just to make the file show up in the database, it’s not pleasant.
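The "complete" criterion can be checked mechanically. The sketch below lists the files of a source tree that have no entry in a database; the function name and the default extension list are assumptions, and it presumes that a relative "file" is relative to the entry's "directory".

```python
import os


def missing_files(entries, root,
                  extensions=(".c", ".cc", ".cpp", ".cxx", ".h", ".hpp")):
    """Return the files under root that have no entry in the database."""
    covered = set()
    for entry in entries:
        # os.path.join keeps "file" unchanged when it is already absolute.
        path = os.path.join(entry["directory"], entry["file"])
        covered.add(os.path.realpath(path))

    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(tuple(extensions)):
                path = os.path.realpath(os.path.join(dirpath, name))
                if path not in covered:
                    missing.append(path)
    return sorted(missing)
```

On a typical database, headers are expected to show up in the result, since they usually have no entries of their own.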

git

git uses a custom Makefile and a configure script for the build. The build system does not seem to have native support for compilation database generation. We will use bear and intercept-build to generate one.

From a quick glimpse at the Makefile and documentation, we can see there is a special DEVELOPER setting to enable stricter compilation options. This is used in this example to match the developer workflow better.

This example has been tested on git 2.9.2.

Compilation database generation with bear:

echo DEVELOPER=1 >> config.mak
make configure
bear make -j9

With intercept-build, replace the last line with:

intercept-build make -j9

Footnotes