
Preprocessor directives

A look at C & C++ Preprocessor Directives

My goal is to develop a deeper understanding of the entire C & C++ Compilation Process.

I’m going to begin with the first phase of the compilation process, the preprocessor stage, starting with a look at some C & C++ preprocessor directives.

---
title: Compilation Pipeline
---
graph LR
    ip([source code]) --> id1(Preprocessor)
    id1 --> id2(Parsing)
    id2 --> id3(IR Generation)
    id3 --> id4(Compiler Backend)
    id4 --> id5(Assembler)
    id5 --> id6(Linker)
    id6 --> op([executable])
    style ip fill:#285943,color:#DDDD
    style op fill:#285943,color:#DDDD
    style id1 stroke:#BA160C, stroke-width:2px, stroke-dasharray: 5 5

Macro Definitions: #define

To examine #define, we’ll compare two variations of the same file, one with and one without a #define directive.
We’ll compile with the default clang and use the -E option, which tells the toolchain to stop after the preprocessor phase of compilation, redirecting the result to a text file for comparison.

main.c with #define

#define size 10
int main()
{
  int i = size;
  return 0;
}

$ clang -E main.c > main_with_define.txt

main.c without #define

int main()
{
  int i = 10;
  return 0;
}

$ clang -E main.c > main_without_define.txt

Here are the results:

main_with_define.txt

# 1 "main.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 382 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "main.c" 2

int main()
{
  int i = 10;
  return 0;
}

main_without_define.txt

# 1 "main.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 382 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "main.c" 2
int main()
{
  int i = 10;
  return 0;
}

Notice the only difference is an additional blank line (line 8) in main_with_define.txt. This corresponds to where the #define was in the source file: the preprocessor records that size means 10, removes the directive itself (leaving an empty line in its place), and then replaces the token size with 10 wherever it appears later in the source code.

#define is simply equivalent to a “replace” statement that follows the format:

#define <thing to replace> <replace with>

A caveat to #define is that it acts as a “dumb” replacement, meaning that there is no check to ensure that whatever is being replaced results in legal C/C++ code.

We can see how simple this is by replacing #define size 10 with #define size ten in main.c and analyzing the output as before, demonstrating that it really is just a “replace x with y”.

# 1 "main.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 382 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "main.c" 2

int main()
{
  int i = ten;
  return 0;
}

Notice that the output now contains int i = ten;. This doesn’t make sense, since ten is never declared and so there is nothing valid to assign to the int, yet the compiler didn’t throw an error… why?

This is because the -E option in our compile command tells clang to perform only the preprocessor stage of compilation and then stop.

This highlights that in the early stages of the compilation process pretty much anything goes: the preprocessor knows nothing about the validity of the source code it is rewriting.

Another fun thing we can look at is built-in macros. Above we looked at macros we defined ourselves; however, the compiler also predefines a number of macros that we can use. Using the same method as before to inspect the result of the preprocessor stage, we can get some interesting information. Note that the file contains only the lines shown, not even a main() function. As previously mentioned, nothing at this stage of compilation checks for valid C/C++.

main.c

__FILE_NAME__
__TIMESTAMP__
__APPLE__
_WIN32
__FreeBSD__

main.c after preprocessing

"main.c"
"Wed May  3 19:04:52 2023"
1
_WIN32
__FreeBSD__

Notice that __FILE_NAME__, __TIMESTAMP__ and __APPLE__ were all replaced with their predefined values. __APPLE__ was replaced with 1, indicating true, whereas _WIN32 and __FreeBSD__ were left untouched because they are not defined by clang on my system. In a conditional directive such as #ifdef or #if, an undefined macro counts as false.

Built-in macros are particularly useful when developing cross-platform applications, where they can be combined with conditional inclusions to select platform-specific code.


Conditional Inclusions: #ifdef

TBC