Before main runs in a C program, a set of tasks has to set up the runtime environment. In C and C++, the Crt objects do the setup and teardown of that environment. Crt stands for “C(++) RunTime”. The setup tasks include initializing the stack, IRQs (interrupt requests), registers, and memory segments, and passing argc/argv to main. C/C++ compilers automatically link the Crt objects into executables, so most developers building executables are probably unaware that the Crt objects exist and are always linked in.
Initially, I wanted to find tangible real-world examples of each Crt file. My goal was to do a deep dive into their implementation. However, the more I dug, the less attainable that goal became.
That pursuit started with a solid grip on the middle of the rope of truth and understanding. Then, as I went deeper into each Crt implementation, I found myself holding on to its frayed ends. During this deep dive the rope frayed into three separate tendrils:
- Many parts of Crt files are not necessary anymore and are purely there for backwards compatibility.
- Crts do different jobs on different OS implementations.
- The ELF file format renders many functions of the Crts moot.
Also, to be honest, there are resources for learning about the Crt out there that are better than anything I could hope to make.
Really good write up on c-runtime
Instead, I want to experiment by removing the Crt from a simple program and seeing what functionality is missing. This is a simple experiment. It follows the logic of the legendary country artist George Strait’s song “You Don’t Know What You’re Missing.” In the song George says
You don’t know what you’re missing
‘Till it’s gone
So will we miss the Crts? Will we figure out what they do?
Example: C Program
#include <stdio.h>

int main(void) {
    printf("hello\n");
    return 0;
}
Compiling without Crt
To build without the Crts we add the -nostartfiles and -nostdlib options. -nostartfiles stops the crt files from being linked. -nostdlib is not strictly needed for this experiment, but it also removes the standard libraries from the link (on its own it already implies -nostartfiles).
gcc main.c -o mainc -nostartfiles -nostdlib

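This does not build. There are two issues:
- the _start symbol does not exist
- puts does not exist (GCC compiles printf("hello\n") into a call to puts("hello"), so the missing symbol is puts rather than printf)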
Issue 1: start symbol
The _start symbol is an assembly label that marks the program's entry point; its code eventually calls main. On Linux it is normally provided by crt1.o (crt0.o on some other systems), which is usually written in assembly. Since we are no longer using the Crts, we need to compile main differently and link it by hand, something gcc normally does automatically. First we compile main.c to a plain object file using the -c option. Then we link it with an assembly file that declares _start and calls main. We will call this assembly file CrtStub.s. Now we begin finding out what we miss, and we start implementing pieces of the Crt in CrtStub.s.
.globl _start
_start:
// assembly here
call main

That gets us further: one new warning, plus the old puts issue.
Issue 2: puts
Let's add a label/symbol for puts. We can get the program to link if we just add a symbol for the linker to find.
.globl _start
.globl puts
_start:
// assembly here
call main
puts:

The program now builds, but it crashes at runtime. There are many possible reasons for this. We will start by implementing puts; once puts is implemented, we should see some output.
To implement puts we have two questions to answer:
- How are the arguments passed into puts? (What calling convention does GCC use?)
- How do we make system calls?
Question 1 Answered by disassembling a.out
./a.out: file format elf64-x86-64
Disassembly of section .text:
0000000000401000 <main>:
401000: 48 83 ec 08 sub $0x8,%rsp
401004: 48 8d 3d f5 0f 00 00 lea 0xff5(%rip),%rdi # 402000 <puts+0xfe4>
40100b: e8 0c 00 00 00 call 40101c <puts>
401010: 31 c0 xor %eax,%eax
401012: 48 83 c4 08 add $0x8,%rsp
401016: c3 ret
0000000000401017 <_start>:
401017: e8 e4 ff ff ff call 401000 <main>
Two lines of the disassembly show how the argument is passed: the lea and call instructions. rip is the instruction pointer register and rdi is the first argument register (in the calling convention GCC follows on x86-64 Linux). In plain English, rip + 0xff5 (the value 0x402000) is loaded into rdi, and then puts is called.
The value 0x402000 is probably the address of the string. This can be verified by dumping the contents of a.out.
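The address arithmetic can be checked by hand. With RIP-relative addressing the displacement is added to the address of the next instruction, so the lea at 0x401004, which is 7 bytes long, resolves to 0x401004 + 7 + 0xff5 = 0x402000. A small sketch of that calculation (rip_relative_target is a hypothetical helper name, not something from the toolchain):

```c
#include <stdint.h>

/* Hypothetical helper: computes the address a RIP-relative operand refers to.
   The CPU adds the displacement to the address of the *next* instruction,
   which is the instruction's own address plus its length in bytes. */
uint64_t rip_relative_target(uint64_t insn_addr, uint64_t insn_len, int32_t disp)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

Feeding it the numbers from the listing, rip_relative_target(0x401004, 7, 0xff5) gives 0x402000 for the lea, and the same rule applies to the relative call instructions: the 5-byte call at 0x40100b with displacement 0xc lands on 0x40101c, the address of puts.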
objdump -s ./a.out
./a.out: file format elf64-x86-64
Contents of section .note.gnu.property:
400190 04000000 20000000 05000000 474e5500 .... .......GNU.
4001a0 010001c0 04000000 01000000 00000000 ................
4001b0 020001c0 04000000 00000000 00000000 ................
Contents of section .text:
401000 4883ec08 488d3df5 0f0000e8 0c000000 H...H.=.........
401010 31c04883 c408c3e8 e4ffffff 1.H.........
Contents of section .rodata:
402000 68656c6c 6f00 hello.
Contents of section .eh_frame:
402008 14000000 00000000 017a5200 01781001 .........zR..x..
402018 1b0c0708 90010000 14000000 1c000000 ................
402028 d8efffff 17000000 00440e10 520e0800 .........D..R...
Contents of section .comment:
0000 4743433a 2028474e 55292031 332e332e GCC: (GNU) 13.3.
0010 3000 0.
We can see that hello lives in the read-only data section (.rodata) at address 0x402000, which verifies the claim above. puts always takes a single argument, the address of a string. From the disassembly and the function signature of puts, we have worked out how the first argument is passed in the calling convention gcc uses.
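As a quick cross-check of the hex dump, the six bytes at 0x402000 can be compared against the string in C. This is only a sketch: the array below is typed in by hand from the dump, and rodata_is_hello is a made-up name. Note that there is no 0x0a newline byte in the dump, which fits puts being called with "hello" rather than printf with "hello\n".

```c
#include <string.h>

/* Bytes copied by hand from the .rodata dump at 0x402000: 68 65 6c 6c 6f 00 */
static const unsigned char rodata_bytes[] = { 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x00 };

/* Hypothetical check: returns 1 when the dumped bytes are exactly the
   NUL-terminated C string "hello" (5 characters plus the terminator). */
int rodata_is_hello(void)
{
    return sizeof rodata_bytes == sizeof "hello"
        && memcmp(rodata_bytes, "hello", sizeof rodata_bytes) == 0;
}
```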
Calling Conventions
A calling convention defines which registers and stack slots carry a function's arguments and which carry its return value. On x86-64 Linux the convention is set by the platform's System V AMD64 ABI rather than by each compiler, which is why C code built with gcc and clang can normally be linked together. Other platforms use the stack and registers differently; 64-bit Windows, for instance, passes the first integer argument in rcx instead of rdi.
We know the address of the string must be in register rdi for puts to work. After that, the string needs to be written to stdout, which is done with a system call. Now on to Question 2.
Question 2 Answered by more assembly code
.section .text
.globl _start
.globl puts
_start:
# assembly here
call main
ret
# IN: rdi points to NUL-terminated string
# OUT: rax contains string length
strlen:
mov %rdi, %rax
dec %rax
.top:
inc %rax
cmpb $0, (%rax)
jne .top
sub %rdi, %rax
ret
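# IN: rdi points to NUL-terminated string
puts:
call strlen        # rax = length; strlen leaves rdi untouched
mov %rdi, %rsi     # write arg 2 (buf): address of the string
mov %rax, %rdx     # write arg 3 (count): number of bytes to write
mov $1, %rax       # system call number 1 = write
mov $1, %rdi       # write arg 1 (fd): 1 = stdout
syscall
ret                # return to main instead of running off the end of puts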
CrtStub.s has grown. puts was implemented right under _start, and puts needs strlen, so strlen was added. strlen computes the length of a string by counting characters until it hits a NUL character. The write system call uses four registers: rax holds the system call number, and rdi, rsi, and rdx hold its three arguments.
- rsi: the address of the string
- rdx: the number of bytes in the string
- rax: the system call number
- rdi: the file descriptor to write to
The write system call needs all four values to work correctly. The string's address goes into rsi and the string length goes into rdx. rax holds a constant that identifies the system call; that value needs to be looked up, and for write on x86-64 Linux it is 1. stdout is file descriptor 1, so we mov that constant into rdi.
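For comparison, here is roughly the same logic in C. It is a sketch that assumes Linux with glibc and uses the libc syscall() wrapper, so it is not Crt-free; the names stub_strlen and stub_puts are made up for this example. The wrapper moves its arguments into the rax, rdi, rsi, and rdx registers described above.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical C twin of the assembly strlen: walk forward until the NUL byte
   and return how many characters came before it. */
size_t stub_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Hypothetical C twin of the assembly puts: measure the string, then ask the
   kernel to write it to file descriptor 1 (stdout). Like the assembly version
   it does not append a newline. Returns the number of bytes written, or -1. */
long stub_puts(const char *s)
{
    size_t len = stub_strlen(s);
    return syscall(SYS_write, 1, s, len);
}
```

Calling stub_puts("hello") should print hello and return 5 when stdout is writable.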

We still have a core dump, but at least we get the expected output, “hello”. Apart from the segmentation fault, it looks like we only need three functions to run a C program. I need to end this post here.
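For anyone who wants to chase the remaining segmentation fault: one likely contributor is that nothing ever calls _start, so its final ret has no return address to pop. A process normally ends by making the exit system call with main's return value instead of returning. The sketch below shows that call from C, under the assumption of Linux with glibc; it runs the raw exit in a forked child so the effect can be observed, and exit_status_via_raw_syscall is a made-up name.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child that ends itself with the raw exit system
   call, then report the exit status the parent observes (-1 on failure). */
int exit_status_via_raw_syscall(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        syscall(SYS_exit, code);  /* the kernel ends the child here */
        _exit(127);               /* not reached; keeps the control flow explicit */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In CrtStub.s the equivalent would be to place main's return value in rdi and the exit system call number in rax before a syscall instruction, in place of the ret at the end of _start. I have not tested that change against the program in this post.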