Home Practical linkage types ๐Ÿ”—
Post
Cancel

Practical linkage types ๐Ÿ”—

The standard way to define a functions is as follows: one writes the declaration in an .h-file (so that many translation units get aware of its existence) and the definition in a .cpp-file.

To fully define the function (declaration+definition) in an .h-file there is the inline keyword used, but one can use the static keyword as well.

Letโ€™s see the LLVM IR (the intermediate representation of the C++-code) for int sum1(), static int sum2(), inline int sum3(): link to godbolt.

The LLVM IR specifies the interesting list of symbolsโ€™ linkage types, there are many types. The linkage type defines how the symbol behaves when the linker is working. (A symbol is either a function or a global variable)

  • sum1 has the default linkage type, external - there is exactly 1 definition of the symbol is possible. The linker fails if the program contains 0 or >1 definitions.
  • sum2 has the internal linkage type - the symbol is reachable only in the translation unit where it was defined. The linker just renames the internal symbol if there is names collision.
  • sum3 has the linkonce_odr linkage type. This is a typical weak symbol, similar to some other types (linkonce, weak). The program can have multiple weak definitions of the same symbol. If all the definitions are weak, the linker takes a random weak definition. If one of the definitions is strong (like sum1), the linker takes the strong definition.

What is the difference between linkonce and linkonce_odr?

  • Weak definitions might differ from each other, therefore, for example, the compiler has no right to inline a call of a weak function (the function may point to another definition after the linker stage).
  • But the C++ Standard requires that the programmer secures that inline functions have exactly one definition (this is usually true because the definition is placed in an .h-file).
  • Therefore the compiler has the right to inline a call to sum3 - nothing could break because of this.

Letโ€™s compiler the code into an object file:

1
clang++ -c link.cpp

Weโ€™ll get an ELF file on Linux. Letโ€™s use the readelf tool to observer the symbol table. To print human-readable function names (not mangled ones) the c++filt tool can be used:

1
readelf -s link.o | c++filt

We get kind of this (I omitted other symbols):

1
2
3
4
5
Symbol table '.symtab' contains 21 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     6: 00000000000000c0    18 FUNC    LOCAL  DEFAULT    2 sum2(int, int)
    14: 0000000000000000    18 FUNC    GLOBAL DEFAULT    2 sum1(int, int)
    20: 0000000000000000    18 FUNC    WEAK   DEFAULT    7 sum3(int, int)

In general, it is rather better to use inline functions than static functions. This way the executable wonโ€™t contain many copies of the same function. Also, this way the executable will have exactly one instance of static variables inside the function.

1
2
3
4
5
6
7
8
inline int* get_address() {
    // `dummy` takes sizeof(int) memory
    // it could be N*sizeof(int) if it was static function
    static int dummy;
    // inline-function: returns the same pointer everywhere
    // static-function: returns a unique pointer for every translation unit
    return &dummy;
}
This post is licensed under CC BY 4.0 by the author.

Two-faced "inline" for function

Wrapper for output streams ๐ŸŒŠ