The C++ Compiler
A Brief Introduction
C++ is a general-purpose programming language with object-oriented, generic, and functional features in addition to facilities for low-level memory manipulation.
The source code, shown in the snippet below, must be compiled before it can be executed. There are many steps and intricacies to the compilation process, and this post was a personal exercise to learn and remember as much information as I can.
Compiling C++ projects is a frustrating task most days. Seemingly nonexistent errors keeping your program from successfully compiling can be annoying (especially since you know you wrote it perfectly the first time, right?).
I'm learning more and more about C++ these days and decided to write this concept down so that I can cement it even further in my own head. C++ is not the only compiled language, though. Check out the Wikipedia entry for compiled languages for more examples.
I'll start with a wonderful, graphical way to conceptualize the C++ compiler. View "The C++ compilation process" by Kurt MacMahon, an NIU professor, to see the graphic and an explanation. The goal of the compilation process is to take the C++ code and produce a shared (dynamic) library or an executable file.
Let's break down the compilation process. There are four major steps to compiling C++ code.
The first step is preprocessing, which expands the source code file so that all of its dependencies are met. The C++ preprocessor copies in the code from every header file named in an #include directive, such as #include <iostream>. Now, what does that mean? That directive pulls in the iostream header, which is part of the C++ standard library, a collection of classes and functions written in the core language. This specific header lets you work with input/output streams. After all this, you'll end up with a temporary file that contains the expanded source code.
In the Hello, world! example, the contents of the iostream header would be included in the expanded code.
After the code is expanded, the compiler comes into play. The compiler takes the preprocessed C++ code and converts it into assembly language for the target platform. You can see this in action if you head over to the Godbolt Compiler Explorer, which shows C++ being converted into assembly dynamically.
For example, the Hello, world! code snippet compiles into the following assembly code:

    .LC0:
            .string "Hello, world!\n"
    main:
            push    rbp
            mov     rbp, rsp
            mov     esi, OFFSET FLAT:.LC0
            mov     edi, OFFSET FLAT:_ZSt4cout
            call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
            mov     eax, 0
            pop     rbp
            ret
    __static_initialization_and_destruction_0(int, int):
            push    rbp
            mov     rbp, rsp
            sub     rsp, 16
            mov     DWORD PTR [rbp-4], edi
            mov     DWORD PTR [rbp-8], esi
            cmp     DWORD PTR [rbp-4], 1
            jne     .L5
            cmp     DWORD PTR [rbp-8], 65535
            jne     .L5
            mov     edi, OFFSET FLAT:_ZStL8__ioinit
            call    std::ios_base::Init::Init() [complete object constructor]
            mov     edx, OFFSET FLAT:__dso_handle
            mov     esi, OFFSET FLAT:_ZStL8__ioinit
            mov     edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
            call    __cxa_atexit
    .L5:
            nop
            leave
            ret
    _GLOBAL__sub_I_main:
            push    rbp
            mov     rbp, rsp
            mov     esi, 65535
            mov     edi, 1
            call    __static_initialization_and_destruction_0(int, int)
            pop     rbp
            ret
Third, the assembly code generated by the compiler is assembled into object code for the platform. Essentially, this is when the assembler takes the assembly code and turns it into machine code in a binary format. After researching this online, I figured out that a lot of compilers will allow you to stop compilation at this step. This is useful for compiling each source code file separately, which saves time later: if a single file changes, only that file needs to be recompiled.
Finally, the object code file generated by the assembler is linked together with the object code files for any library functions used, producing a shared (dynamic) library or an executable file. The linker replaces all references to undefined symbols with the correct addresses.