Compilation database

What is a compilation database?

A compilation database is a database for compile options. It records which compile options are used to build the files in a project. A compilation database can be, but is not limited to, a JSON Compilation Database. Refer to the clang::tooling::CompilationDatabase class for the full interface.

A real-world JSON Compilation Database entry, generated by CMake, looks like this:

{
  "directory": "/home/user/dev/llvm/build",
  "file": "/home/user/dev/llvm/llvm/lib/Support/APFloat.cpp",
  "command": "/usr/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/Support -I/home/user/dev/llvm/llvm/lib/Support -Iinclude -I/home/user/dev/llvm/llvm/include -fPIC -fvisibility-inlines-hidden -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wcovered-switch-default -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Werror=date-time -std=c++11 -fcolor-diagnostics -ffunction-sections -fdata-sections -O3 -UNDEBUG -fno-exceptions -fno-rtti -o lib/Support/CMakeFiles/LLVMSupport.dir/APFloat.cpp.o -c /home/user/dev/llvm/llvm/lib/Support/APFloat.cpp"
}

An entry generated by Clang’s -MJ option looks like this:

{
  "directory": "/home/user/dev/llvm/build",
  "file": "/tmp/foo.cpp",
  "output": "foo.o",
  "arguments": ["/usr/bin/clang-5.0", "-xc++", "/tmp/foo.cpp", "--driver-mode=g++", "-Wall", "-I", "/home/user/dev/libcpp/libcpp/include", "-c", "--target=x86_64-unknown-linux-gnu"]
}

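The two entries above use the two forms a JSON Compilation Database allows for the compile command: a single shell-escaped "command" string, or an "arguments" list. As a minimal sketch, assuming Python 3 and POSIX shell quoting, a consumer could normalize both forms to a list of arguments roughly like this. The helper names (load_database, entry_arguments) and the example data are made up for illustration.

```python
import json
import shlex


def load_database(text):
    """Parse the text of a compile_commands.json file into a list of entries."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("a JSON compilation database is a top-level array")
    return entries


def entry_arguments(entry):
    """Return the compile command of an entry as a list of arguments."""
    if "arguments" in entry:
        return list(entry["arguments"])
    # "command" is one shell-escaped string: split it like a POSIX shell would.
    return shlex.split(entry["command"])


# Illustrative data only, not taken from a real project.
example = """[
  {"directory": "/tmp", "file": "/tmp/foo.cpp",
   "command": "/usr/bin/clang++ -DNAME=\\"a b\\" -c /tmp/foo.cpp"},
  {"directory": "/tmp", "file": "/tmp/bar.cpp", "output": "bar.o",
   "arguments": ["/usr/bin/clang++", "-Wall", "-c", "/tmp/bar.cpp"]}
]"""

for entry in load_database(example):
    print(entry["file"], entry_arguments(entry))
```

Working on the argument list avoids re-implementing shell quoting in every tool that reads the database.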

A good introduction to compilation databases is available on Eli Bendersky’s blog.

What is it good for?

You might wonder what a compilation database is good for. This section lists various tools that may benefit from a compilation database.

Clang tools

Core Clang tools and extra Clang tools:

A few other tools seem to be available, but they aren’t officially documented:

  • clang-reorder-fields
  • clang-change-namespace
  • clang-move

It’s possible these tools will be merged into one, be it called clang-refactor or not.

See also

Other tools

See also

Some of the tools listed here:

How to generate a JSON Compilation Database?

Build systems and compilers

This section describes build tools which natively support the generation of a compilation database.


Clang’s -MJ option generates a compilation database entry per input (requires Clang >= 5.0).


clang++ -MJ a.o.json -Wall -std=c++11 -o a.o -c a.cpp
clang++ -MJ b.o.json -Wall -std=c++11 -o b.o -c b.cpp

To merge the compilation database entries into a valid compilation database, it is possible to use sed (GNU sed is assumed here, because \n in the replacement text is a GNU extension):

sed -e '1s/^/[\n/' -e '$s/,$/\n]/' *.o.json

This sed invocation does the following:

  • insert the opening bracket: [
  • concatenate the entries
  • remove the trailing comma of the last entry (to be JSON compliant)
  • insert the closing bracket: ]
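The same merge can be sketched in Python, which has the advantage of validating the result as JSON. This is only an illustration of the steps listed above, assuming each fragment ends with a trailing comma as -MJ writes it. The function name merge_fragments and the demo data are hypothetical.

```python
import glob
import json
import os
import tempfile


def merge_fragments(paths):
    """Concatenate -MJ fragments and return the parsed list of entries."""
    parts = []
    for path in sorted(paths):
        with open(path, encoding="utf-8") as fragment:
            parts.append(fragment.read())
    body = "".join(parts).rstrip()
    # Remove the trailing comma of the last entry to be JSON compliant.
    if body.endswith(","):
        body = body[:-1]
    # Wrap the entries in brackets; json.loads fails loudly on invalid input.
    return json.loads("[" + body + "]")


# Demo with two hand-written fragments in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a", "b"):
        entry = {
            "directory": tmp,
            "file": name + ".cpp",
            "output": name + ".o",
            "arguments": ["clang++", "-c", name + ".cpp"],
        }
        with open(os.path.join(tmp, name + ".o.json"), "w", encoding="utf-8") as f:
            f.write(json.dumps(entry) + ",\n")
    merged = merge_fragments(glob.glob(os.path.join(tmp, "*.o.json")))

print(json.dumps(merged, indent=2))
```

In a real build directory, the merged list would be written to compile_commands.json with json.dump.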


To generate a JSON compilation database with CMake, enable the CMAKE_EXPORT_COMPILE_COMMANDS option (requires CMake >= 2.8.5).

For example, in an existing build directory, type:

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .

This will create a file named compile_commands.json in the build directory.


To generate a JSON compilation database with Ninja, use the -t compdb option (requires Ninja >= 1.2). This option takes a list of rules as argument.


ninja -t compdb [RULES...]

This works well with projects containing one rule for C++ files, such as Ninja itself:

ninja -t compdb cxx > compile_commands.json

However, it gets ugly if the Ninja build files contain a lot of rules, because you have to find a way to list all of them. For example, as of version 3.6.1, CMake generates a lot of rules. To generate a compilation database of Clang using CMake’s Ninja generator (cmake -G Ninja <...>):

ninja -t compdb $(awk '/^rule (C|CXX)_COMPILER__/ { print $2 }' rules.ninja) > compile_commands.json

This method is not ideal: the awk one-liner is not a real parser for the Ninja syntax. To improve the situation, there is an issue on the Ninja bug tracker with an associated pull request.
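For readers who prefer a script over a one-liner, the rule filtering can be sketched in Python. It applies the same heuristic as the awk filter and is no more a Ninja parser than the awk version. The function name compiler_rules and the sample rule names are made up for illustration.

```python
import re

# Same heuristic as the awk filter: rule names starting with
# C_COMPILER__ or CXX_COMPILER__ at the beginning of a "rule" line.
RULE_LINE = re.compile(r"^rule\s+((?:C|CXX)_COMPILER__\S*)")


def compiler_rules(ninja_text):
    """Return the names of the C and C++ compile rules found in ninja_text."""
    rules = []
    for line in ninja_text.splitlines():
        match = RULE_LINE.match(line)
        if match:
            rules.append(match.group(1))
    return rules


# Hypothetical excerpt of a generated Ninja rules file.
sample = """\
rule CXX_COMPILER__clangBasic
  command = ...
rule C_COMPILER__clangBasic
  command = ...
rule CXX_STATIC_LIBRARY_LINKER__clangBasic
  command = ...
rule CUSTOM_COMMAND
  command = ...
"""

print(" ".join(compiler_rules(sample)))
```

The resulting names can then be passed as the RULES arguments of ninja -t compdb.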

Specialized tools

Some build systems do not support generating a compilation database.

A non-exhaustive list includes:

  • the GNU Build System (autotools): ./configure and friends
  • KBuild, the Linux Kernel Makefiles

To address this, a few tools have emerged.

bear and intercept-build

Bear and intercept-build (from scan-build) are two tools from László Nagy that collect the compile options by intercepting calls to the compiler during the build. A full build is required to obtain a complete compilation database.

The scan-build tool has been included in the Clang tree since release 3.8.0, as a replacement for the Perl implementation of scan-build. It’s reasonable to think that distributions will someday offer it as a package. scan-build can already be installed easily with pip:

pip install scan-build


<bear|intercept-build> BUILD_COMMAND


bear make -B -j9
intercept-build ./

A file named compile_commands.json is created in the current directory.


cdcc uses a compiler wrapper to write an SQLite3 database, from which compile_commands.json files can be generated.

The tools can be used to generate a compilation database for the JHBuild tool.


commands_to_compilation_database can generate compilation databases for Boost.Build, make, and potentially other tools by means of regular expressions that match the build output.

It also provides a tool to generate a compilation database from files specified on the standard input and compile options specified on the command line.
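To give an idea of the regular-expression approach, here is a minimal sketch that turns a build log into compilation database entries. It is not the implementation of commands_to_compilation_database: the pattern, the list of compiler names, the function name entries_from_log, and the sample log are all assumptions made for the example.

```python
import os
import re
import shlex

# Hypothetical pattern: a line that starts with a common C/C++ compiler name,
# optionally prefixed by a path and suffixed by a version (e.g. clang-5.0).
COMPILER_LINE = re.compile(
    r"^\s*(?:\S*/)?(?:gcc|g\+\+|cc|c\+\+|clang|clang\+\+)(?:-[\d.]+)?\s"
)
SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".cxx")


def entries_from_log(log_text, directory):
    """Build compilation database entries from the lines of a build log."""
    entries = []
    for line in log_text.splitlines():
        if not COMPILER_LINE.match(line):
            continue
        try:
            args = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: not a command we can use
        if "-c" not in args:
            continue  # most likely a link step, not a compilation
        for arg in args[1:]:
            if arg.endswith(SOURCE_SUFFIXES) and not arg.startswith("-"):
                entries.append({
                    "directory": directory,
                    "file": os.path.join(directory, arg),
                    "arguments": args,
                })
    return entries


# Made-up build output for illustration.
log = """\
make: Entering directory '/src/demo'
gcc -Wall -Iinclude -c main.c -o main.o
g++ -std=c++11 -c util.cpp -o util.o
gcc -o demo main.o util.o
"""

for entry in entries_from_log(log, "/src/demo"):
    print(entry["file"])
```

Such a parser only sees what the build prints, so silent builds, response files, and directory changes in recursive makes need extra handling.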


compdb is a tool to manipulate compilation databases. It can generate a compilation database for header files.

CodeChecker log

The ld logger tool from CodeChecker implements a build interceptor similar to bear and intercept-build.

CodeChecker favors intercept-build [2] when available, but falls back to the ld logger tool when needed.

The ld logger tool can be invoked with a build command, for example:

CodeChecker log -o compile_commands.json -b "make -B"

However, in version 5.6, the resulting compilation database is surprising:

  • Escaping of double quotes is not handled properly, for example it produces:


    instead of:

  • There are compile commands not only for the compilation step, but also for linking:

            "directory": "/home/user/build-irony/src",
            "command": "c++ -I<...> ...Irony.cpp.o ...main.cpp.o -o ...irony-server <ldflags...>",
            "file": "/home/user/build-irony/srcCMakeFiles/irony-server.dir/Irony.cpp.o"

Luckily, with intercept-build, these issues are fixed.
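If a database containing link entries has to be used anyway, they can be filtered out after the fact. The following sketch keeps only the entries whose command contains -c. This is a heuristic chosen for the example, not something CodeChecker provides, and the function names and sample entries are hypothetical.

```python
import shlex


def is_compile_entry(entry):
    """Heuristic: a compilation step passes -c to the compiler driver."""
    args = entry.get("arguments") or shlex.split(entry["command"])
    return "-c" in args


def drop_link_entries(entries):
    """Return only the entries that look like compilation steps."""
    return [entry for entry in entries if is_compile_entry(entry)]


# Made-up entries: one compilation step and one link step.
entries = [
    {"directory": "/build/src",
     "command": "c++ -Iinclude -o main.cpp.o -c /src/main.cpp",
     "file": "/src/main.cpp"},
    {"directory": "/build/src",
     "command": "c++ main.cpp.o -o server -lpthread",
     "file": "/build/src/main.cpp.o"},
]

for entry in drop_link_entries(entries):
    print(entry["file"])
```

The heuristic ignores other compile-only modes such as -S or -E, which would need to be added if the build uses them.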


sourceweb’s btrace tool, aka sw-btrace, uses the same principle as bear and intercept-build.

The generation is done in 2 steps:

  1. Run sw-btrace BUILD_COMMAND to log the compilation.
  2. Call sw-btrace-to-compiledb to generate a JSON compilation database out of the compilation log.


sw-btrace make -B
sw-btrace-to-compiledb

A file named compile_commands.json is created in the current directory.


xcpretty can generate a compilation database for Xcode projects. To do so, it uses the xcodebuild output.


xcodebuild | xcpretty -r json-compilation-database

Other compilation databases and tools

This section shows that people invented their own versions of the compilation database, either because no standard existed yet or because of specialized needs.

The script from the Vim plugin clang_complete.

This script generates a .clang_complete configuration file.


make CC='~/.vim/bin/ gcc' CXX='~/.vim/bin/ g++' -B


The gccrec tool from the now unmaintained gccsense project.

The tool records the compile options in an SQLite database.

Links to the manual for reference:


The rtags project has a gcc wrapper named to help feed its internal compilation database.

Description here:


YCM-Generator works differently than bear and intercept-build. It builds a project using a fake toolchain. This is faster than doing a full build, because the fake toolchain is composed of trivial programs.

The tool does not actually generate a “JSON Compilation Database”; instead, it creates a configuration file for YouCompleteMe.

Case studies on a few open source projects

This section describes how to generate a compilation database for a few open source projects. Depending on the project, the method to generate a compilation database can differ.

The result should preferably be:

Some tools guess the compile options. If they guess wrong, the compile command entry is not useful.

A compilation database should be as exhaustive as possible. Any file on which a tool can be run needs to have compile options.

For example, a compilation database usually lacks compile options for headers, even though they would be useful to things like text editors. Or compile options for unit tests may not be available, if tests aren’t built by default.
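One way to get a feel for how exhaustive a database is: compare the files it covers with the files present in the source tree. The sketch below does this for a list of candidate paths. The function names and the sample paths are made up, and POSIX-style paths are assumed.

```python
import os


def covered_files(entries):
    """Return the normalized absolute paths of the files in the database."""
    covered = set()
    for entry in entries:
        # "file" may be relative to the entry's "directory".
        path = os.path.join(entry["directory"], entry["file"])
        covered.add(os.path.normpath(path))
    return covered


def missing_files(entries, candidates):
    """Return the candidate paths that have no entry in the database."""
    covered = covered_files(entries)
    return sorted(
        path for path in map(os.path.normpath, candidates)
        if path not in covered
    )


# Made-up database and source tree for illustration.
entries = [
    {"directory": "/src/demo/build", "file": "../main.c",
     "arguments": ["cc", "-c", "../main.c"]},
]
candidates = [
    "/src/demo/main.c",
    "/src/demo/util.h",
    "/src/demo/tests/test_main.c",
]

for path in missing_files(entries, candidates):
    print(path)
```

In this made-up example, the header and the unit test are reported as missing, which matches the two typical gaps mentioned above.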


Between 2 or more correct and complete methods, one should favor the fastest.

Tools that require a full project build to generate the database can easily become a hindrance on big projects. Imagine adding a new file to a big project. When you have to do a full rebuild just to make the file show up in the database, it’s not pleasant.


git uses a custom Makefile and a configure script for the build. The build system does not seem to have native support for compilation database generation. We will use bear and intercept-build to generate one.

From a quick glimpse at the Makefile and documentation, we can see there is a special DEVELOPER setting to enable stricter compilation options. This is used in this example to match the developer workflow better.

This example has been tested on git 2.9.2.

Compilation database generation with bear:

echo DEVELOPER=1 >> config.mak
make configure
bear make -j9

With intercept-build, replace the last line with:

intercept-build make -j9