Data alignment in C/C++

Environment

OS: macOS Big Sur 11.4
CPU: Apple M1
Compiler: Apple clang version 12.0.5 (clang-1205.0.22.11)
Target: arm64-apple-darwin20.5.0

Problem:

struct A {
    uint8_t data8;
    uint64_t data64;
    uint32_t data32;
};
/* What's the difference and why? */
struct B {
    uint64_t data64;
    uint32_t data32;
    uint8_t data8;
};

Difference between these structures is in data alignment which depends mostly on your compiler configuration.

Example (and header) how to conveniently check similar structs for education purpose
(Example uses tuples to make code smaller)

Consider following code (it uses function from the link above):

printTupleInfo(std::tuple<uint8_t, uint64_t, uint32_t>());
printTupleInfo(std::tuple<uint64_t, uint32_t, uint8_t>());

The output of the code above:

[T = std::__1::tuple<unsigned char, unsigned long long, unsigned int>]
   sizeof = 24
   alignment_of = 8
   [t = unsigned char]
      sizeof = 1
      value = 0
   [t = unsigned long long]
      sizeof = 8
      value = 0
   [t = unsigned int]
      sizeof = 4
      value = 0

[T = std::__1::tuple<unsigned long long, unsigned int, unsigned char>]
   sizeof = 16
   alignment_of = 8
   [t = unsigned long long]
      sizeof = 8
      value = 0
   [t = unsigned int]
      sizeof = 4
      value = 0
   [t = unsigned char]
      sizeof = 1
      value = 0

Despite that structures have exactly the same amount and types of fields the sizes of the structs are different.

What’s going on?

Short answer is because CPUs are word oriented (not byte oriented).

Long answer

On the 64-bit CPU sizeof(word) will be 8 bytes and for 32-bit is 4 bytes accordingly, so accessing aligned data is simply saying more optimized (more on this here https://developer.ibm.com/technologies/systems/articles/pa-dalign/).
Data in structures is aligned to the data's size
(data_address % sizeof(data) == 0)
and depending on the order there can be different amount of pads. For example `char` should be aligned to 1, `wchar` should be aligned to 2, `float` should aligned to 4 and `long long` to 8. Alignment in array is explained further.

Alignment of a `struct A` object

Layout of a struct A object
Hence long long should be aligned to a multiple of 8 we get a 7 byte pad before it. Second pad is inserted in order not to loose alignment when used in array. For example the long long field should be aligned to a multiple of 8 and without second pad second element in array won’t have a long long field aligned to a multiple of 8.

Alignment of an array of `struct A` objects

Layout of an array of struct A objects
Thanks to padding long long haven’t lost its alignment.

Alignment of a `struct B` object

Layout of a struct A object This is the same object, but due to alignment it will weight less than struct A object. So different fields order will result to different memory object representation. This can lead to unobvious errors. For example to be as fast as possible you send a struct as a piece of memory over the network (ignore Endian problems 😄) and after a while occasionally (or intentionally) you change the order of fields and get corrupted values because now mapping is broken.

Alignment of an array of `struct B` objects

Layout of an array of struct A objects
So in this particular case 3 objects of struct B fits to 48 bytes, while only 2 objects of struct A fits to the same 48 bytes. Quite good optimization for a large amount of objects.

As a rule of thumb place fields in decreasing order to get rid of extra padding.

You can force your compiler not to align data, but then the processor will have to perform more instructions to access not aligned fields (by combining first part from first word and second part from the second word).

P.S. If you want to get an alignment of your structure here is an example:

    class Test
    {
        int a;
        char b;
    };
    /* C++11 way */
    std::cout << "Alignment of [T = Test] : " << std::alignment_of<Test>::value << '\n';
    /* C++17 way */
    std::cout << "Alignment of [T = Test] : " << std::alignment_of_v<Test> << '\n';
    /* alignof way, but can't be used in templates */
    std::cout << "Alignment of [T = Test] : " << alignof(test) << '\n';

TODO

Perform perf tests on accessing aligned/unaligned data
Checkout alignas
Checkout std::aligned_storage
Checkout std::aligned_alloc
Checkout std::aligned_union
Checkout std::max_align_t
Checkout compiler specific align operations
Checkout assume_aligned
Checkout std::align
Checkout std::launder

Data alignment in C/C++

Problem:

TODO

References