User:Pancakes/Sandbox
Unaligned Memory Access And Byte Order
The accuracy of this section has not been generally verified, and it could be inaccurate.
It is also recommended to have a look at this article on Endianness, which helps clarify the topic and serves as a second source of information.
Let us imagine we have a board with 8 bytes of RAM, as depicted below:
Value | A | B | C | D | E | F | G | H |
---|---|---|---|---|---|---|---|---|
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
A little endian machine stores the least significant byte (LSB) of a value at the lowest address. So, if we made a 32-bit (word sized) read at address 0 on the memory above, that would yield "DCBA", written most significant byte first. On a big endian machine, which stores the most significant byte (MSB) at the lowest address, you would get "ABCD". It all depends on how you visualize memory addresses: whether you picture them increasing to the right (big endian) or to the left (little endian).
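The two readings can be reproduced on any host by assembling the word from individual bytes, which avoids depending on the host's own byte order. This is only a minimal sketch: the names `read_le32` and `read_be32` are made up for illustration, and a plain byte array stands in for the board's RAM.

```c
#include <stdint.h>

/* Assemble a 32-bit word the way a little endian machine would:
 * the byte at the lowest address becomes the least significant byte. */
uint32_t read_le32(const uint8_t *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Big endian: the byte at the lowest address is the most significant byte. */
uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

With the array `{'A','B','C','D','E','F','G','H'}`, `read_le32` at offset 0 returns 0x44434241 (the ASCII codes for "DCBA") and `read_be32` returns 0x41424344 ("ABCD").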
Some CPU platforms even support either mode. For example, the ARM7TDMI-S has a pin BIGEND to switch from little to big endian. (Although I could not find the QEMU option to enable this feature.)
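Code that has to run in either mode can check the active byte order at run time instead of assuming it. The following is a rough sketch, assuming a C99 compiler; `is_little_endian` is a hypothetical helper name, not something taken from the data sheet or from QEMU.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 when the code is running little endian, 0 otherwise.
 * The probe value 1 has its only non-zero byte in the LSB, so the
 * byte found at the lowest address tells the two orders apart. */
int is_little_endian(void)
{
    const uint32_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);   /* copy the byte at the lowest address */
    return first == 1;
}
```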
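The ARM7TDMI-S has two other properties worth knowing when coming from x86/x64:
- It does not handle unaligned memory accesses in any defined manner, which x86/x64 generally tolerates. Page 4-32 of the ARM7TDMI-S Data Sheet specifies that if bit 0 of the memory address is high on a half-word access, the results are undefined.
- Accessing half-words in little endian mode returns the two halves of a word in swapped order relative to how the word is written in hexadecimal: the low half comes first. x86/x64 is also little endian and behaves the same way; big endian mode does not swap them.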
An example for the latter:
    uint32 *a = (uint32*)0;  /* word view of address 0 */
    uint16 *b = (uint16*)0;  /* half-word view of the same bytes */
    a[0] = 0x12345678;
    printf("%#06x --> %#06x\n", b[0], b[1]);
This gives 0x5678 --> 0x1234, while in big endian mode you would get 0x1234 --> 0x5678.
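The snippet above writes through address 0 and reads the same bytes through two pointer types, which only makes sense on the bare board. A hosted variant could look like the sketch below, which uses `memcpy` to pick out the half-words instead. The function names are invented for this example, and the result still depends on the byte order of whatever machine runs it.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Return the idx-th (0 or 1) 16-bit half-word of `word`, counted by
 * increasing address, exactly as the running machine stores it. */
uint16_t halfword_at(uint32_t word, unsigned idx)
{
    uint16_t half;

    memcpy(&half, (const unsigned char *)&word + 2u * idx, sizeof half);
    return half;
}

/* Print both half-words in address order. */
void print_halfwords(uint32_t word)
{
    printf("%#06x --> %#06x\n",
           (unsigned)halfword_at(word, 0),
           (unsigned)halfword_at(word, 1));
}
```

Calling `print_halfwords(0x12345678)` on a little endian host prints the low half first, and on a big endian host the high half first.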
Word Size (32Bit)

Value | ED | CB | A9 | 87 | 78 | 9A | BC | DE |
---|---|---|---|---|---|---|---|---|
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |

Offset (binary) | Register (Little Endian) | Register (Big Endian) |
---|---|---|
00 | 0x87A9CBED | 0xEDCBA987 |
01 | UNALIGNED - UNDEFINED | UNALIGNED - UNDEFINED |
10 | UNALIGNED - UNDEFINED | UNALIGNED - UNDEFINED |
11 | UNALIGNED - UNDEFINED | UNALIGNED - UNDEFINED |
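Half-Word Size (16Bit)

Value | ED | CB | A9 | 87 | 78 | 9A | BC | DE |
---|---|---|---|---|---|---|---|---|
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |

Offset (binary) | Register (Little Endian) | Register (Big Endian) |
---|---|---|
00 | 0xCBED | 0xEDCB |
01 | UNALIGNED - UNDEFINED | UNALIGNED - UNDEFINED |
10 | 0x87A9 | 0xA987 |
11 | UNALIGNED - UNDEFINED | UNALIGNED - UNDEFINED |

In little endian mode the half-word at offset 00 is the low half of the word 0x87A9CBED. In big endian mode it is the high half of the word 0xEDCBA987.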
Byte Size (8Bit)

Value | ED | CB | A9 | 87 | 78 | 9A | BC | DE |
---|---|---|---|---|---|---|---|---|
Address | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |

Offset (binary) | Register (Little Endian) | Register (Big Endian) |
---|---|---|
00 | 0xED | 0xED |
01 | 0xCB | 0xCB |
10 | 0xA9 | 0xA9 |
11 | 0x87 | 0x87 |
From reading the data sheet, big endian mode places the most significant half-word at the lower address, so the two half-words come back in the same order as the hexadecimal value is written. Little endian mode returns them in the opposite order, which is also what x86/x64 does, since x86/x64 is little endian. I can not actually test this at the moment, so I am hoping I got it right.
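The word access with an offset of 10b (0x2) may be defined, but I am not sure, because the data sheet does not state it clearly. As far as I know, ARMv4 cores handle an unaligned word load by loading the aligned word and rotating it right by 8 bits per byte of offset, so that the addressed byte lands in bits 0 to 7. That would make the result defined, and it resembles the mechanism used for loading half-words. (Someone needs to check this and correct it if it is wrong.)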
Memory accesses have to be aligned because an unaligned access requires additional access cycles, due to the way memory modules work (see the DDR datasheet below), and supporting it would increase the complexity of load / store operations and of the processor core.
You could simulate an unaligned memory access on the ARM7TDMI-S, but you would have to make separate loads from two memory locations and combine the results. A compiler could probably emit code to do this automatically, but checking whether each access is aligned would slow down all memory accesses. The ARM7TDMI-S Data Sheet provides some code on page 4-35 for such "checked" memory access, for when you do not know whether the address will be aligned.
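The idea of building one unaligned load out of two aligned ones can be sketched in C as follows. This is not the code from the data sheet. It is an illustration under stated assumptions: memory is modelled as an array of aligned 32-bit words holding little endian data, and `load_u32_any` is a hypothetical name.

```c
#include <stdint.h>

/* Load a 32-bit little endian value from any byte address using only
 * aligned word loads. `mem` is the word-aligned memory, `addr` is a
 * byte address into it. The caller must make sure the following word
 * exists whenever `addr` is not a multiple of 4. */
uint32_t load_u32_any(const uint32_t *mem, uint32_t addr)
{
    uint32_t index = addr >> 2;         /* aligned word containing addr */
    uint32_t shift = (addr & 3u) * 8u;  /* misalignment in bits */
    uint32_t lo = mem[index];

    if (shift == 0)
        return lo;                      /* aligned: a single load is enough */

    uint32_t hi = mem[index + 1];       /* unaligned: second aligned load */
    return (lo >> shift) | (hi << (32u - shift));
}
```

The `shift == 0` test is the kind of check mentioned above: every access pays for it, even the aligned ones. It also keeps the code from shifting a 32-bit value by 32 bits, which C leaves undefined.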
- http://lists.gnu.org/archive/html/qemu-devel/2004-12/msg00206.html - patch for QEMU to support big-endian ARM
- http://download.micron.com/pdf/datasheets/dram/ddr/256MBDDRx4x8x16.pdf - datasheet for DDR memory