How does the CPU know how many bytes it should read for the next instruction, considering instructions have different lenghts?

OneAndOnly

So i was reading a paper, and in it, they said that statically disassembling the code of a binary is undecidable, because a series of bytes could be represented as many possible ways as shown in picture ( its x86 )

disassembling

so my question is :

  1. how does the CPU execute this then? for example in the picture, when we reach after C3, how does it know how many bytes it should read for the next instruction?

  2. how does the CPU know how much it should increment the PC after executing one instruction? does it somehow store the size of the current instruction and add that when it wants to increment PC?

  3. if the CPU can somehow know how many bytes it should read for the next instruction or basically how to interpret the next instruction, why cant we do it statically?

Ross Ridge

The simple way is to just read one byte, decode it and then determine if it's a complete instruction. If not read another byte, decode it if necessary and then determine if a complete instruction has been read. If not continue reading/decoding bytes until the complete instruction is read.

This means that if the instruction pointer is pointing at a given sequence of bytes there's only possible way to decode that first instruction of that sequence of bytes. The ambiguity only comes because the next instruction to be executed might not be located at the bytes immediately that follow the first instruction. That's because the first instruction in the sequence of bytes may change the instruction pointer so some other instruction other than the following one gets executed.

The RET (retn) instruction in your example could be the end of a function. Functions often end with e RET instruction, but not necessarily so. It's possible for a function to have multiple RET instructions, none of which are at the end of the function. Instead the last instruction will be some sort of JMP instruction that jumps back to some location in the function, or to another function entirely.

That means in your example code, without more context, it's impossible to know if any of the bytes following the RET instruction will be ever be executed, and if so, which of the bytes would be the first instruction of the following function. There could be data in between functions or this RET instruction could be the end of the last function in the program.


The x86 instruction set in particular has a pretty complex format of optional prefix bytes, one or more opcode bytes, one or two possible addressing form bytes, and then possible displacement and immediate bytes. The prefix bytes can be prepended to just about any instruction. The opcode bytes determine how many opcode bytes there are and whether the instruction can have a operand bytes and immediate bytes. The opcode may also indicate that there's displacement bytes. The first operand byte determines if there's a second operand byte and if there's displacement bytes.

The Intel 64 and IA-32 Architectures Software Developer's Manual has this figure showing the format of x86 instructions:

X86 Instruction Format

Python-like pseudo-code for decoding x86 instructions would look something like this:

# read possible prefixes

prefixes = []
while is_prefix(memory[IP]):
    prefixes.append(memory[IP))
    IP += 1

# read the opcode 

opcode = [memory[IP]]
IP += 1
while not is_opcode_complete(opcode):
    opcode.append(memory[IP])
    IP += 1

# read addressing form bytes, if any

modrm = None
addressing_form = []    
if opcode_has_modrm_byte(opcode):
    modrm = memory[IP]
    IP += 1
    if modrm_has_sib_byte(modrm):
        addressing_form = [modrm, memory[IP]]
        IP += 1
    else:
        addressing_form = [modrm]

# read displacement bytes, if any

displacement = []
if (opcode_has_displacement_bytes(opcode)
    or modrm_has_displacement_bytes(modrm)):
    length = determine_displacement_length(prefixes, opcode, modrm)
    displacement = memory[IP : IP + length]
    IP += length

# read immediate bytes, if any

immediate = []
if opcode_has_immediate_bytes(opcode):
    length = determine_immediate_length(prefixes, opcode)
    immediate = memory[IP : IP + length]
    IP += length

# the full instruction

instruction = prefixes + opcode + addressing_form + displacement + immediate

One important detail left out of the pseudo-code above is that instructions are limited to 15 bytes in length. It's possible to construct otherwise valid x86 instructions that are 16 bytes or longer, but such instruction will generate an undefined opcode CPU exception if executed. (There are other details I've left out, like how part of the opcode can be encoded inside of the Mod R/M byte, but I don't think this affects the length of instructions.)


However x86 CPUs don't actually decode instructions like how I've described above, they only decode instructions as if they read each byte one at a time. Instead modern CPUs will read an entire 15 bytes in to a buffer and then decode bytes in parallel, usually in a single cycle. When it fully decodes the instruction, determining its length, and is ready to read the next instruction it shifts over the remaining bytes in the buffer that weren't part of the instruction. It then reads more bytes to fill the buffer to 15 bytes again and starts decoding the next instruction.

Another thing modern CPUs will do that's isn't implied by what I've written above, is speculatively execute instructions. This means the CPU will decode instructions and tentatively try to execute them even before it has finished executing the previous instructions. This in turn means that the CPU may end up decoding the instructions following the RET instruction, but only if it can't determine where the RET will return to. Because there can be performance penalties from trying to decode and tentatively execute random data that's not intended to be executed, compilers don't normally put data in between functions. Though they may pad this space with NOP instructions that will never be executed in order to align functions for performance reasons.

(They used to put read-only data in between functions long ago, but this was before x86 CPUs that could speculatively execute instructions became common place.)

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

From Dev

How does the CPU/assembler know the size of the next instruction?

From Dev

How does Perl know how many bytes to read in a handle from IO::Select::->can_read?

From Java

How many bytes does a string have

From Dev

How does a CPU 'know' what commands and instructions actually mean?

From Dev

How does MySQL varchar know how many bytes indicate the length?

From Dev

How many bytes should I read/write to a socket?

From Dev

How to scrape headers as a different column from paragraphs with rvest asuming they have different lenghts?

From Dev

How many CPU do I have and how many jobs should I submit?

From Java

How many interrupts does my cpu have per second?

From Dev

C# - How to know how many bytes are left in TcpClient after perform SslStream.Read

From Dev

Bare metal RISC-V CPU - how does the processor know which address to start fetching instructions from?

From Dev

How does C++ know how many bytes to copy when retuning an object from a function?

From Dev

When adding a *void item into an array of *void, how does compiler know how many bytes to use in memory?

From Dev

How does CPU differentiate REP and REPE instructions?

From Dev

How does the linux kernel know how many threads/cpu exists on a particular HW platform

From Dev

How does c# decoder know the exact number of bytes it should use for one char?

From Dev

How does the Go compiler know which bytes in a byte slice should be grouped together into one rune?

From Dev

How does an instruction in a CPU report that is has finished?

From Dev

How is a 15 bytes instruction transferred form memory to CPU?

From Dev

Is there a way to check how many bytes are available to be read?

From Linux

How does kernel know how many cores there are

From Dev

Trying to know how many rows I have from the different fleets in django

From Dev

Is there any way to seach for coincidences in Excel, in order to know how many times two different values have the same ID

From Dev

How does a CPU know there is IO pending?

From Dev

How many bytes does \n use in a file?

From Dev

How many NSOperationQueues should an app have?

From Dev

how to handle cases that should have many exceptions?

From Dev

How many confirmations should I have on ethereum?

From Dev

How should I handle filling multiple textboxes when I don't know how many of them will have data?

Related Related

  1. 1

    How does the CPU/assembler know the size of the next instruction?

  2. 2

    How does Perl know how many bytes to read in a handle from IO::Select::->can_read?

  3. 3

    How many bytes does a string have

  4. 4

    How does a CPU 'know' what commands and instructions actually mean?

  5. 5

    How does MySQL varchar know how many bytes indicate the length?

  6. 6

    How many bytes should I read/write to a socket?

  7. 7

    How to scrape headers as a different column from paragraphs with rvest asuming they have different lenghts?

  8. 8

    How many CPU do I have and how many jobs should I submit?

  9. 9

    How many interrupts does my cpu have per second?

  10. 10

    C# - How to know how many bytes are left in TcpClient after perform SslStream.Read

  11. 11

    Bare metal RISC-V CPU - how does the processor know which address to start fetching instructions from?

  12. 12

    How does C++ know how many bytes to copy when retuning an object from a function?

  13. 13

    When adding a *void item into an array of *void, how does compiler know how many bytes to use in memory?

  14. 14

    How does CPU differentiate REP and REPE instructions?

  15. 15

    How does the linux kernel know how many threads/cpu exists on a particular HW platform

  16. 16

    How does c# decoder know the exact number of bytes it should use for one char?

  17. 17

    How does the Go compiler know which bytes in a byte slice should be grouped together into one rune?

  18. 18

    How does an instruction in a CPU report that is has finished?

  19. 19

    How is a 15 bytes instruction transferred form memory to CPU?

  20. 20

    Is there a way to check how many bytes are available to be read?

  21. 21

    How does kernel know how many cores there are

  22. 22

    Trying to know how many rows I have from the different fleets in django

  23. 23

    Is there any way to seach for coincidences in Excel, in order to know how many times two different values have the same ID

  24. 24

    How does a CPU know there is IO pending?

  25. 25

    How many bytes does \n use in a file?

  26. 26

    How many NSOperationQueues should an app have?

  27. 27

    how to handle cases that should have many exceptions?

  28. 28

    How many confirmations should I have on ethereum?

  29. 29

    How should I handle filling multiple textboxes when I don't know how many of them will have data?

HotTag

Archive