How the Compilation Process Works for C Programs

Preprocessing:

Preprocessing is the first step. The preprocessor strips comments and obeys commands that begin with # (known as directives). Its main tasks are:

  • removing comments
  • expanding macros
  • expanding included files

If you include a header file, for example with #include <stdio.h>, the preprocessor looks for the stdio.h file and copies its contents into the source code at that point.

The preprocessor also expands macros and replaces symbolic constants defined using #define with their values.

Compiling:

Compiling is the second step. It takes the output of the preprocessor and generates assembly language, an intermediate, human-readable language specific to the target processor.

Assembly:

Assembly is the third step of compilation. The assembler converts the assembly code into machine code (zeros and ones). This code is also known as object code.

Linking:

Linking is the final step of compilation. The linker merges the object code from multiple modules into a single executable. If we use a function from a library, the linker links our code with that library function's code.

In static linking, the linker copies all used library functions into the executable file. In dynamic linking, the code is not copied; instead, the name of the library is recorded in the binary file.

Let’s now compile! For our example, we’ll use “main.c” as our source file.

#include <stdio.h>
int main(void)
{
    printf("Hello, World!\n");
    return (0);
}

At the shell prompt, enter the command “gcc main.c” and hit Enter. If the file compiles successfully, the shell prompt is displayed again. If it does not compile, gcc displays one or more error messages.

vagrant@vagrant-ubuntu-trusty-64:~$ gcc main.c
vagrant@vagrant-ubuntu-trusty-64:~$ ls
a.out main.c

After main.c is compiled, type the command “ls” to list your directory contents and you will see an executable file named a.out. To run the program, type “./a.out” at the shell prompt and hit Enter. Yay, we see the correct output “Hello, World!” followed by a newline.

vagrant@vagrant-ubuntu-trusty-64:~$ ./a.out
Hello, World!
vagrant@vagrant-ubuntu-trusty-64:~$

If you don’t want your output file to be named a.out, which is the default output filename, you can specify a different output filename with the -o option.

gcc -o <desired_output_filename> <source filename>

Let’s see an example below where we want the output file to be named main.

vagrant@vagrant-ubuntu-trusty-64:~$ gcc -o main main.c
vagrant@vagrant-ubuntu-trusty-64:~$ ls
main main.c
vagrant@vagrant-ubuntu-trusty-64:~$

To run the main program, we type “./main” into the terminal.

vagrant@vagrant-ubuntu-trusty-64:~$ ./main
Hello, World!
vagrant@vagrant-ubuntu-trusty-64:~$

If you make any changes to your source file, you will need to save it and recompile.

In some instances, the C source file will not successfully compile.

Below, we left out the semicolon at the end of the printf statement.

#include <stdio.h>
int main(void)
{
    printf("Hello, World!\n")
    return (0);
}

The compiler reports that it expected a semicolon before return.

vagrant@vagrant-ubuntu-trusty-64:~$ gcc -o main main.c
main.c: In function 'main':
main.c:5:7: error: expected ';' before 'return'
return (0);
^
vagrant@vagrant-ubuntu-trusty-64:~$

If we add the semicolon back…

#include <stdio.h>
int main(void)
{
    printf("Hello, World!\n");
    return (0);
}

…and then recompile, the error message will be gone.

vagrant@vagrant-ubuntu-trusty-64:~$ gcc -o main main.c
vagrant@vagrant-ubuntu-trusty-64:~$ ./main
Hello, World!
vagrant@vagrant-ubuntu-trusty-64:~$

Even if your source file compiles, check that the output is correct. The program below compiles without errors…

#include <stdio.h>
int main(void)
{
    printf("The sum of 9+2 is: %i\n", 10);
    return (0);
}

…but the output is not correct.

vagrant@vagrant-ubuntu-trusty-64:~$ ./main2
The sum of 9+2 is: 10
vagrant@vagrant-ubuntu-trusty-64:~$

Oops, we have an error due to a typo in the source code.

printf("The sum of 9+2 is: %i\n", 10);

We know that 9+2 = 11, but the program did exactly what the code in the above line instructed. It substituted the number 10 for the format specifier %i.

The correct code would have been:

#include <stdio.h>
int main(void)
{
    printf("The sum of 9+2 is: %i\n", 11);
    return (0);
}

To sum up, the four steps of compilation are: preprocessing, compiling, assembly, linking.