Friday, January 29, 2016

Who calls main()... and can we do more before calling main()?

When you write your application in C, your code isn't the only thing that gets programmed. Before your application can perform its first action, the C Runtime Environment startup code must configure the device to run code produced by a C compiler.


There are several things the C Runtime Environment startup code must do before your application's code can run.
  • Allocate space for a software stack and initialize the stack pointer
    On 8-bit devices that have a hardware-based return address stack, the software stack is mostly used for passing parameters to and from functions. On 16- and 32-bit devices the software stack also stores the return address for each function call and interrupt.
  • Allocate space for a heap (if used)
    A heap is a block of RAM set aside as a sort of scratchpad for your application. C can create variables dynamically at runtime (for example with malloc()), and that storage comes from the heap.
  • Copy values from Flash into variables declared with initial values
    Variables declared with initial values (e.g. int x=10;) must have those initial values loaded into memory before the program can use them. The initial values are stored in flash program memory (so they will be available after the device is power cycled) and are copied into each RAM location allocated to an initialized variable for its storage.
  • Clear uninitialized RAM
    Any RAM (file register) not allocated to a specific purpose (variable storage, stack, heap, etc.) is cleared so that it will be in a known state.
  • Disable all interrupts
  • Call main(), where your application code starts.
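
The copy and clear steps in the list above can be imitated on a desktop machine. What follows is only a rough sketch, not real startup code: the arrays flash_init_values, ram_data and ram_uninit, and the functions simulate_power_on(), simulated_startup() and app_main(), are all hypothetical stand-ins for regions and symbols that a real linker script and crt0 would provide. Stack, heap and interrupt setup are target-specific and are left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins: on real hardware the linker script defines these regions. */
static const int flash_init_values[3] = {10, 20, 30}; /* initial values kept in flash */
static int ram_data[3];   /* RAM for variables declared with initial values */
static int ram_uninit[4]; /* RAM not holding initialized variables */

/* Stand-in for the application's main(); it reads one variable of each kind. */
static int app_main(void)
{
    return ram_data[0] + ram_uninit[0];
}

/* Fill RAM with a junk pattern to imitate its unknown state at power-up. */
void simulate_power_on(void)
{
    memset(ram_data, 0xA5, sizeof ram_data);
    memset(ram_uninit, 0xA5, sizeof ram_uninit);
}

/* The order of work a startup routine performs, minus the target-specific parts. */
int simulated_startup(void)
{
    /* Stack pointer, heap and interrupt setup need assembly or device
       registers, so they are omitted from this sketch. */
    memcpy(ram_data, flash_init_values, sizeof ram_data); /* copy initial values */
    memset(ram_uninit, 0, sizeof ram_uninit);             /* clear the rest */
    assert(ram_uninit[0] == 0);                           /* RAM is now in a known state */
    return app_main();                                    /* hand control to the application */
}
```

Calling simulate_power_on() and then simulated_startup() from a test program shows that app_main() only sees sensible values because the copy and clear steps ran first.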

The runtime environment setup code is automatically linked into your application. It usually comes from a file with a name like crt0.s (assembly source) or crt0.o (object code).

The runtime startup code can be modified if necessary. In fact, the source file provides hooks for "user initialization" where you can run code that must execute before the main application begins, such as initializing some external hardware immediately after power is applied. Details on runtime startup code modification will be covered in the compiler specific classes.

crt0.o is an object file with code that is prepended to object code supplied by the user to make an executable. It initializes variables and the stack, and starts the user's program, among other things.

The simplest C runtime code would be:
.text    // Select .text section
 b main  // Branch to main() in C source

C runtime

  • crt1.o
    This object file defines the _start symbol. The manner in which this code handles program bootstrap is highly dependent on the particular C library implementation. Some systems use crt0.o while others may even specify crt2.o or higher. Ultimately, whatever gcc has encoded should correspond to the C library in use.
  • crti.o and crtn.o
    crti.o defines the _init and _fini function prologs for the .init and .fini sections, respectively. crtn.o defines the corresponding function epilogs. When the static linker eventually merges all .init and .fini sections of its input object files, the DT_INIT and DT_FINI tags in the dynamic section of its output object file will correspond to the addresses of the complete _init and _fini symbols, respectively.
    During run-time, _start sets up some way that the _init and _fini symbols will get invoked, e.g. via the __libc_csu_init and __libc_csu_fini symbols, respectively, of the C library.
  • crtbegin.o and crtend.o
    The details of the symbols and sections defined in these files vary among architectures. With the Ubuntu 12.04 AMD64 toolchain, these include legacy code that GCC used to find the constructors and destructors i.e. __do_global_dtors_aux and __do_global_ctors_aux.
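
For application code, one way to run something before main() with GCC or Clang is the constructor function attribute, which registers a function with this constructor machinery. This is a different technique from editing the startup file itself. A minimal sketch follows; the names init_log, init_count, init_board() and init_drivers() are made up for illustration.

```c
#include <assert.h>

/* Hypothetical log of the order in which the hooks below ran. */
int init_log[4];
int init_count = 0;

/* Priorities 0-100 are reserved for the implementation, so start at 101.
   A lower number runs earlier. */
__attribute__((constructor(101)))
void init_board(void)
{
    init_log[init_count++] = 101;
}

__attribute__((constructor(102)))
void init_drivers(void)
{
    assert(init_count == 1); /* init_board() has already run */
    init_log[init_count++] = 102;
}
```

By the time main() is entered, both functions have already run, in priority order. The matching destructor attribute registers functions that run after main() returns or exit() is called.
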
Some more general information
Some definitions:
PIC - position independent code (-fPIC)
PIE - position independent executable (-fPIE -pie)
crt - C runtime

crt0.o crt1.o etc...
  Some systems use crt0.o, while some use crt1.o (and a few even use crt2.o
  or higher).  Most likely due to a transitionary phase that some targets
  went through.  The specific number is otherwise entirely arbitrary -- look
  at the internal gcc port code to figure out what your target expects.  All
  that matters is that whatever gcc has encoded, your C library better use
  the same name.

  This object is expected to contain the _start symbol which takes care of
  bootstrapping the initial execution of the program.  What exactly that
  entails is highly libc dependent and as such, the object is provided by
  the C library and cannot be mixed with other ones.

  On uClibc/glibc systems, this object initializes very early ABI requirements
  (like the stack or frame pointer), sets up the argc/argv/env values, and
  then passes pointers to the init/fini/main funcs to the internal libc main,
  which in turn does more general bootstrapping before finally calling the real
  main function.

  glibc ports call this file 'start.S' while uClibc ports call this crt0.S or
  crt1.S (depending on what their gcc expects).
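
The hand-off described above, where _start passes pointers to the init/fini/main funcs to an internal libc main, can be pictured with a small model. This is only a sketch of the call order under simplifying assumptions: sketch_libc_start_main() and the other sketch_* names are hypothetical, and a real C library does considerably more (and finishes by calling exit() rather than returning).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical call log so the order can be inspected afterwards. */
char call_log[8];
int call_count = 0;

static void log_call(char tag)
{
    if (call_count < 8)
        call_log[call_count++] = tag;
}

void sketch_init(void) { log_call('i'); } /* plays the role of the init func */
void sketch_fini(void) { log_call('f'); } /* plays the role of the fini func */

/* Plays the role of the program's real main(). */
int sketch_main(int argc, char **argv, char **envp)
{
    (void)argv;
    (void)envp;
    log_call('m');
    return argc;
}

/* Hypothetical stand-in for the internal libc main: run init, then the
   real main with argc/argv/env, then fini. */
int sketch_libc_start_main(int (*main_fn)(int, char **, char **),
                           int argc, char **argv, char **envp,
                           void (*init)(void), void (*fini)(void))
{
    int status;

    if (init != NULL)
        init();
    status = main_fn(argc, argv, envp);
    if (fini != NULL)
        fini();
    return status;
}
```

Passing the functions in as pointers mirrors the description above: the startup object knows where init, fini and main live, while the C library decides when to call them.
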

crti.o
  Defines the function prologs for the .init and .fini sections (with the _init
  and _fini symbols respectively).  This way they can be called directly.  These
  symbols also trigger the linker to generate DT_INIT/DT_FINI dynamic ELF tags.

  These are to support the old style constructor/destructor system where all
  .init/.fini sections get concatenated at link time.  Not to be confused with
  newer prioritized constructor/destructor .init_array/.fini_array sections and
  DT_INIT_ARRAY/DT_FINI_ARRAY ELF tags.

  glibc ports used to call this 'initfini.c', but now use 'crti.S'.  uClibc
  also uses 'crti.S'.

crtn.o
  Defines the function epilogs for the .init/.fini sections.  See crti.o.

  glibc ports used to call this 'initfini.c', but now use 'crtn.S'.  uClibc
  also uses 'crtn.S'.

For statically linked applications, the load process only requires the kernel to make the binary available at its fixed load address before initializing the Program Counter (PC) for the process with the address of the _start symbol. On the other hand, for dynamically linked applications, the kernel first transfers control to the dynamic linker. In turn, the dynamic linker loads the required shared object dependencies and performs any immediate relocations (by default, lazy relocations for function references are performed later on, when the symbols are actually referenced). It then methodically runs the initialization code for the loaded shared objects before handing control over to the executable's _start.
Entering the executable's _start concludes the application's load process and control proceeds to the executable's C run-time code before eventually reaching main.
