GCC front-end (2): language-specific files

This post is part of a series about GCC internals and specifically about how-to create a new language front-end for GCC. For a list of related posts, please check this page.

The last post gave a short introduction into the differences between compiler and (compilation) driver and how-to control the different phases of the gcc/g++ drivers. This post now explains the basic file and directory structure of a GCC front-end (the file and directory structure of the GCC project in general will be shown where appropriate in the context of the front-end explanation).

Except the C compiler, the source code for each GCC front-end is located in a separate directory, namely gcc-x.y.z/gcc/any_name, e.g. gcc-x.y.z/gcc/any_sample_fe. Just as a note in this context, to configure GCC for a new front-end with the –enable-languages configure option, don’t use the directory name, but the language name configured in gcc-x.y.z/gcc/any_sample_fe/config-lang.in as described below.

After the extraction of the sample, minimal front-end as described here, the following files shall be typically found in the front-end directory:

config-lang.in
General front-end language configuration used by the configure/make build process, like the name of the language (parameter: language) or the file name of the compiler (parameter: compilers) among others. This file gets read by the configure scripts (gcc-x.y.z/configure, gcc-x.y.z/gcc/configure, etc.) and the contents get incorporated into the generated makefiles. For a detailed description of the single parameters, please check the GCC internals manual (Chapter 6.3.8.1) available here.

lang.opt
Language-specific driver and compiler options which are automatically parsed by the GCC common code and passed to a front-end specific function for final analysis and internal processing. For a sample file layout, check out the files gcc-x.y.z/gcc/c.opt and gcc-x.y.z/gcc/common.opt

lang-specs.h
This is the specification used by the GCC driver infrastructure to handle the specific phases of the compilation process. As described in the last post, GCC selects the phases and corresponding tools (e.g. compiler) based on file extensions. Though assume, as an example, you want to create a new language which shouldn’t be directly translated into machine code, but beforehand into C/C++ code. Assume furthermore that your new language files end in ‘.my_c_ext’. By using the lang-specs.h file, you could then instruct your language driver to pass your new language file to your compiler which creates a ‘.c’ file. This file could/will then further be processed by the common tools, e.g. cc1, as, ld, etc. For a sample file layout, please check the file gcc-x.y.z/gcc/gcc.c which greatly explains the syntax of the lang-specs.h file.

Make-lang.in
Language-specific makefile fragment. This fragments gets included into the GCC Makefile (builddir/gcc/Makefile). This makefile fragment should contain the instructions to build and install the language-specific driver, compiler, man pages and documentation.

lang-tree.def, e.g. sample_fe-tree.def
To simplify the creation of new front-ends and the interaction of the front-end with the middle-end/back-end, GCC provides a language-independent abstract-syntax tree (AST) named GENERIC. GENERIC is a tree-based representation while each tree node has a unique tree code. All the tree codes available by GCC are listed in the file gcc-x.y.z/gcc/tree.def. Most of these tree codes suffice the purposes of a new language front-end, but you might need additional ones for specific language constructs. These extra tree codes are put into a language-specific tree definition file. The naming convention for this file is to use the front-end directory as the first part of the name instead of the language parameter in the config-lang.in file. For example, the C++ front-end – located in the directory gcc-x.y.z/gcc/cp – defines the file gcc-x.y.z/gcc/cp/cp-tree.def. For a sample layout, please check the file gcc-x.y.z/gcc/tree.def. But please keep in mind, the GCC middle-end expects GIMPLE – three-address code tuples – as input (in earlier version of GCC, GIMPLE was a tree-based intermediate representation based on GENERIC). The tree codes in gcc-x.y.z/gcc/tree.def could be automatically transformed into GIMPLE by the GCC gimplifier, but additional tree codes must be manually transformed into GIMPLE code.

driver-specific source files
compiler-specific files
All the source files for the driver, compiler or whatever tools are required for the new language front-end.