Cling and Precompiled Headers

Open Questions

We are investigating the use of precompiled headers (PCH) as a replacement for CINT's (and ROOT's) current dictionaries. They contain far more information than the dictionaries, everything we would ever want. The items to be studied are:

  • PCH implementation for C++
  • speed when "opening" (attaching) the PCH
  • speed when "reading" (extracting an entity from) the PCH
  • memory size
  • disk size
  • loading multiple PCHs

Speed

Zdenek has studied the performance of PCHs and pre-tokenized headers (PTH). Two timings are given for each: creating the PCH (PTH) and using it. Both are compared to a traditional run of the compiler without PCH / PTH. The traditional run and the read runs used -fsyntax-only, i.e. only the parsing part of clang was exercised and no output was produced. -fsyntax-only does more than -E in that it also performs syntactic analysis. Because PCH support for C++ is minimal (nearly absent), Zdenek decided to parse gtk.h as C code.

Traditional: no PCH, no PTH

time ./clr -fsyntax-only  \
      -femit-all-decls  \
      -I /usr/include/gtk-2.0  \
      -I /usr/lib/gtk-2.0/include  \
      -I /usr/include/atk-1.0  \
      -I /usr/include/cairo  \
      -I /usr/include/pango-1.0  \
      -I /usr/include/glib-2.0  \
      -I /usr/lib/glib-2.0/include  \
      -I /usr/include/pixman-1  \
      -I /usr/include/freetype2  \
      -I /usr/include/libpng12  \
      -I/opt/root/include  \
      -x c /usr/include/gtk-2.0/gtk/gtk.h 
real	0m0.665s
user	0m0.495s
sys	0m0.031s

PTH

Using a PTH makes no real difference in speed: reading with the PTH takes 0.796s real (0.391s user), against 0.665s real (0.495s user) for the traditional run.

Write

time ./clr -emit-pth -o output/gtk.pth  \
      -femit-all-decls  \
      -I /usr/include/gtk-2.0  \
      -I /usr/lib/gtk-2.0/include  \
      -I /usr/include/atk-1.0  \
      -I /usr/include/cairo  \
      -I /usr/include/pango-1.0  \
      -I /usr/include/glib-2.0  \
      -I /usr/lib/glib-2.0/include  \
      -I /usr/include/pixman-1  \
      -I /usr/include/freetype2  \
      -I /usr/include/libpng12  \
      -I/opt/root/include  \
      -x c /usr/include/gtk-2.0/gtk/gtk.h 
real	0m0.451s
user	0m0.257s
sys	0m0.043s

Read

time ./clr -fsyntax-only -include-pth output/gtk.pth  \
      -femit-all-decls  \
      -I /usr/include/gtk-2.0  \
      -I /usr/lib/gtk-2.0/include  \
      -I /usr/include/atk-1.0  \
      -I /usr/include/cairo  \
      -I /usr/include/pango-1.0  \
      -I /usr/include/glib-2.0  \
      -I /usr/lib/glib-2.0/include  \
      -I /usr/include/pixman-1  \
      -I /usr/include/freetype2  \
      -I /usr/include/libpng12  \
      -I/opt/root/include  \
      -x c output/empty.cpp 
real	0m0.796s
user	0m0.391s
sys	0m0.029s

PCH

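Precompiled headers significantly increase the parsing speed: reading the PCH takes 0.121s real, against 0.665s real for the traditional run, which is roughly 5.5 times faster.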

Write

time ./clr -emit-pch -o output/gtk.pch  \
      -femit-all-decls  \
      -I /usr/include/gtk-2.0  \
      -I /usr/lib/gtk-2.0/include  \
      -I /usr/include/atk-1.0  \
      -I /usr/include/cairo  \
      -I /usr/include/pango-1.0  \
      -I /usr/include/glib-2.0  \
      -I /usr/lib/glib-2.0/include  \
      -I /usr/include/pixman-1  \
      -I /usr/include/freetype2  \
      -I /usr/include/libpng12  \
      -I/opt/root/include  \
      -x c /usr/include/gtk-2.0/gtk/gtk.h 
real	0m0.850s
user	0m0.614s
sys	0m0.049s

Read

time ./clr -fsyntax-only -include-pch output/gtk.pch  \
      -femit-all-decls  \
      -I /usr/include/gtk-2.0  \
      -I /usr/lib/gtk-2.0/include  \
      -I /usr/include/atk-1.0  \
      -I /usr/include/cairo  \
      -I /usr/include/pango-1.0  \
      -I /usr/include/glib-2.0  \
      -I /usr/lib/glib-2.0/include  \
      -I /usr/include/pixman-1  \
      -I /usr/include/freetype2  \
      -I /usr/include/libpng12  \
      -I/opt/root/include  \
      -x c output/empty.cpp
real	0m0.121s
user	0m0.095s
sys	0m0.018s
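
The timings above can be put side by side with a few lines of Python. This is only a sketch for comparing the numbers: the "real" values are copied from the runs on this page, while the dictionary keys and the helper name ratio_to_baseline are made up for illustration.

```python
# Wall-clock ("real") times in seconds, copied from the runs above.
# The key names are illustrative labels, not clang terminology.
real_seconds = {
    "traditional": 0.665,
    "pth_write": 0.451,
    "pth_read": 0.796,
    "pch_write": 0.850,
    "pch_read": 0.121,
}


def ratio_to_baseline(name, baseline="traditional"):
    """Return baseline time divided by the named run's time.

    A value above 1 means the run is faster than the baseline,
    a value below 1 means it is slower.
    """
    return real_seconds[baseline] / real_seconds[name]


if __name__ == "__main__":
    for name, seconds in real_seconds.items():
        print(f"{name:12s} {seconds:.3f}s  ratio {ratio_to_baseline(name):.2f}")
```

With these numbers, reading the PCH comes out at roughly 5.5 times faster than the traditional run, while reading the PTH is slightly slower than it. These are single measurements, so small differences should not be over-interpreted.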

File sizes

For gtk.h, the PTH is 3.9MB, PCH is 3.4MB. A preprocessed gtk.h is 1.4MB.

Analysis

The speed improvement is significant. The PCH is much larger than the original preprocessed sources (3.4MB versus 1.4MB); this needs to be understood. Gzipping it brings it down to 2.2MB. According to the "strings" tool, the PCH contains 0.6MB of strings. It is presumably still a lot smaller than the dictionaries' shared libraries, thus reducing the costs. Again, this needs to be looked at.
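
The size figures quoted above can be related to each other as follows. This is a small sketch using only the sizes given on this page; the key names and the helper relative_to_preprocessed are illustrative assumptions, and no size is available here for the dictionaries' shared libraries, so that comparison is left out.

```python
# File sizes in MB for gtk.h, as quoted on this page.
# Key names are illustrative labels chosen for this sketch.
sizes_mb = {
    "preprocessed": 1.4,
    "pth": 3.9,
    "pch": 3.4,
    "pch_gzip": 2.2,
    "pch_strings": 0.6,
}


def relative_to_preprocessed(name):
    """Return the size of the named file as a multiple of the preprocessed header."""
    return sizes_mb[name] / sizes_mb["preprocessed"]


def string_share_of_pch():
    """Return the fraction of the PCH that the "strings" tool reports as strings."""
    return sizes_mb["pch_strings"] / sizes_mb["pch"]


if __name__ == "__main__":
    for name in ("pth", "pch", "pch_gzip"):
        print(f"{name:9s} {relative_to_preprocessed(name):.1f}x preprocessed size")
    print(f"strings   {string_share_of_pch():.0%} of the PCH")
```

By this arithmetic the PCH is about 2.4 times the size of the preprocessed header, and the strings account for a bit under a fifth of it, so most of the size must come from something else.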

References

Zdenek's original files are at http://koala.fjfi.cvut.cz/clr-0.2.3/output-clang-only/.