diff options
Diffstat (limited to 'docs/assembly.rst')
-rw-r--r-- | docs/assembly.rst | 130 |
1 files changed, 38 insertions, 92 deletions
diff --git a/docs/assembly.rst b/docs/assembly.rst index 443cb7da..2cbad06f 100644 --- a/docs/assembly.rst +++ b/docs/assembly.rst @@ -21,10 +21,9 @@ often hard to address the correct stack slot and provide arguments to opcodes at point on the stack. Solidity's inline assembly tries to facilitate that and other issues arising when writing manual assembly by the following features: -* functional-style opcodes: ``mul(1, add(2, 3))`` instead of ``push1 3 push1 2 add push1 1 mul`` +* functional-style opcodes: ``mul(1, add(2, 3))`` * assembly-local variables: ``let x := add(2, 3) let y := mload(0x40) x := add(x, y)`` * access to external variables: ``function f(uint x) public { assembly { x := sub(x, 1) } }`` -* labels: ``let x := 10 repeat: x := sub(x, 1) jumpi(repeat, eq(x, 0))`` * loops: ``for { let i := 0 } lt(i, x) { i := add(i, 1) } { y := mul(2, y) }`` * if statements: ``if slt(x, 0) { x := sub(0, x) }`` * switch statements: ``switch x case 0 { y := mul(x, 2) } default { y := 0 }`` @@ -54,7 +53,7 @@ idea is that assembly libraries will be used to enhance the language in such way pragma solidity ^0.4.0; library GetCode { - function at(address _addr) public view returns (bytes o_code) { + function at(address _addr) public view returns (bytes memory o_code) { assembly { // retrieve the size of the code, this needs assembly let size := extcodesize(_addr) @@ -83,7 +82,7 @@ you really know what you are doing. library VectorSum { // This function is less efficient because the optimizer currently fails to // remove the bounds checks in array access. - function sumSolidity(uint[] _data) public view returns (uint o_sum) { + function sumSolidity(uint[] memory _data) public pure returns (uint o_sum) { for (uint i = 0; i < _data.length; ++i) o_sum += _data[i]; } @@ -91,7 +90,7 @@ you really know what you are doing. // We know that we only access the array in bounds, so we can avoid the check. // 0x20 needs to be added to an array because the first slot contains the // array length. - function sumAsm(uint[] _data) public view returns (uint o_sum) { + function sumAsm(uint[] memory _data) public pure returns (uint o_sum) { for (uint i = 0; i < _data.length; ++i) { assembly { o_sum := add(o_sum, mload(add(add(_data, 0x20), mul(i, 0x20)))) @@ -100,7 +99,7 @@ you really know what you are doing. } // Same as above, but accomplish the entire code within inline assembly. - function sumPureAsm(uint[] _data) public view returns (uint o_sum) { + function sumPureAsm(uint[] memory _data) public pure returns (uint o_sum) { assembly { // Load the length (first 32 bytes) let len := mload(_data) @@ -115,7 +114,7 @@ you really know what you are doing. // Iterate until the bound is not met. for - { let end := add(data, len) } + { let end := add(data, mul(len, 0x20)) } lt(data, end) { data := add(data, 0x20) } { @@ -134,7 +133,6 @@ usual ``//`` and ``/* */`` comments. Inline assembly is marked by ``assembly { . these curly braces, the following can be used (see the later sections for more details) - literals, i.e. ``0x123``, ``42`` or ``"abc"`` (strings up to 32 characters) - - opcodes (in "instruction style"), e.g. ``mload sload dup1 sstore``, for a list see below - opcodes in functional style, e.g. ``add(1, mlod(0))`` - labels, e.g. ``name:`` - variable declarations, e.g. ``let x := 7``, ``let x := add(y, 3)`` or ``let x`` (initial value of empty (0) is assigned) @@ -212,16 +210,14 @@ In the grammar, opcodes are represented as pre-defined identifiers. +-------------------------+-----+---+-----------------------------------------------------------------+ | sar(x, y) | | C | arithmetic shift right y by x bits | +-------------------------+-----+---+-----------------------------------------------------------------+ -| addmod(x, y, m) | | F | (x + y) % m with arbitrary precision arithmetics | +| addmod(x, y, m) | | F | (x + y) % m with arbitrary precision arithmetic | +-------------------------+-----+---+-----------------------------------------------------------------+ -| mulmod(x, y, m) | | F | (x * y) % m with arbitrary precision arithmetics | +| mulmod(x, y, m) | | F | (x * y) % m with arbitrary precision arithmetic | +-------------------------+-----+---+-----------------------------------------------------------------+ | signextend(i, x) | | F | sign extend from (i*8+7)th bit counting from least significant | +-------------------------+-----+---+-----------------------------------------------------------------+ | keccak256(p, n) | | F | keccak(mem[p...(p+n))) | +-------------------------+-----+---+-----------------------------------------------------------------+ -| sha3(p, n) | | F | keccak(mem[p...(p+n))) | -+-------------------------+-----+---+-----------------------------------------------------------------+ | jump(label) | `-` | F | jump to label / code position | +-------------------------+-----+---+-----------------------------------------------------------------+ | jumpi(label, cond) | `-` | F | jump to label if cond is nonzero | @@ -230,13 +226,13 @@ In the grammar, opcodes are represented as pre-defined identifiers. +-------------------------+-----+---+-----------------------------------------------------------------+ | pop(x) | `-` | F | remove the element pushed by x | +-------------------------+-----+---+-----------------------------------------------------------------+ -| dup1 ... dup16 | | F | copy ith stack slot to the top (counting from top) | +| dup1 ... dup16 | | F | copy nth stack slot to the top (counting from top) | +-------------------------+-----+---+-----------------------------------------------------------------+ -| swap1 ... swap16 | `*` | F | swap topmost and ith stack slot below it | +| swap1 ... swap16 | `*` | F | swap topmost and nth stack slot below it | +-------------------------+-----+---+-----------------------------------------------------------------+ -| mload(p) | | F | mem[p..(p+32)) | +| mload(p) | | F | mem[p...(p+32)) | +-------------------------+-----+---+-----------------------------------------------------------------+ -| mstore(p, v) | `-` | F | mem[p..(p+32)) := v | +| mstore(p, v) | `-` | F | mem[p...(p+32)) := v | +-------------------------+-----+---+-----------------------------------------------------------------+ | mstore8(p, v) | `-` | F | mem[p] := v & 0xff (only modifies a single byte) | +-------------------------+-----+---+-----------------------------------------------------------------+ @@ -274,16 +270,16 @@ In the grammar, opcodes are represented as pre-defined identifiers. +-------------------------+-----+---+-----------------------------------------------------------------+ | returndatacopy(t, f, s) | `-` | B | copy s bytes from returndata at position f to mem at position t | +-------------------------+-----+---+-----------------------------------------------------------------+ -| create(v, p, s) | | F | create new contract with code mem[p..(p+s)) and send v wei | +| create(v, p, s) | | F | create new contract with code mem[p...(p+s)) and send v wei | | | | | and return the new address | +-------------------------+-----+---+-----------------------------------------------------------------+ -| create2(v, n, p, s) | | C | create new contract with code mem[p..(p+s)) at address | -| | | | keccak256(<address> . n . keccak256(mem[p..(p+s))) and send v | +| create2(v, n, p, s) | | C | create new contract with code mem[p...(p+s)) at address | +| | | | keccak256(<address> . n . keccak256(mem[p...(p+s))) and send v | | | | | wei and return the new address | +-------------------------+-----+---+-----------------------------------------------------------------+ -| call(g, a, v, in, | | F | call contract at address a with input mem[in..(in+insize)) | +| call(g, a, v, in, | | F | call contract at address a with input mem[in...(in+insize)) | | insize, out, outsize) | | | providing g gas and v wei and output area | -| | | | mem[out..(out+outsize)) returning 0 on error (eg. out of gas) | +| | | | mem[out...(out+outsize)) returning 0 on error (eg. out of gas) | | | | | and 1 on success | +-------------------------+-----+---+-----------------------------------------------------------------+ | callcode(g, a, v, in, | | F | identical to ``call`` but only use the code from a and stay | @@ -295,23 +291,23 @@ In the grammar, opcodes are represented as pre-defined identifiers. | staticcall(g, a, in, | | B | identical to ``call(g, a, 0, in, insize, out, outsize)`` but do | | insize, out, outsize) | | | not allow state modifications | +-------------------------+-----+---+-----------------------------------------------------------------+ -| return(p, s) | `-` | F | end execution, return data mem[p..(p+s)) | +| return(p, s) | `-` | F | end execution, return data mem[p...(p+s)) | +-------------------------+-----+---+-----------------------------------------------------------------+ -| revert(p, s) | `-` | B | end execution, revert state changes, return data mem[p..(p+s)) | +| revert(p, s) | `-` | B | end execution, revert state changes, return data mem[p...(p+s)) | +-------------------------+-----+---+-----------------------------------------------------------------+ | selfdestruct(a) | `-` | F | end execution, destroy current contract and send funds to a | +-------------------------+-----+---+-----------------------------------------------------------------+ | invalid | `-` | F | end execution with invalid instruction | +-------------------------+-----+---+-----------------------------------------------------------------+ -| log0(p, s) | `-` | F | log without topics and data mem[p..(p+s)) | +| log0(p, s) | `-` | F | log without topics and data mem[p...(p+s)) | +-------------------------+-----+---+-----------------------------------------------------------------+ -| log1(p, s, t1) | `-` | F | log with topic t1 and data mem[p..(p+s)) | +| log1(p, s, t1) | `-` | F | log with topic t1 and data mem[p...(p+s)) | +-------------------------+-----+---+-----------------------------------------------------------------+ -| log2(p, s, t1, t2) | `-` | F | log with topics t1, t2 and data mem[p..(p+s)) | +| log2(p, s, t1, t2) | `-` | F | log with topics t1, t2 and data mem[p...(p+s)) | +-------------------------+-----+---+-----------------------------------------------------------------+ -| log3(p, s, t1, t2, t3) | `-` | F | log with topics t1, t2, t3 and data mem[p..(p+s)) | +| log3(p, s, t1, t2, t3) | `-` | F | log with topics t1, t2, t3 and data mem[p...(p+s)) | +-------------------------+-----+---+-----------------------------------------------------------------+ -| log4(p, s, t1, t2, t3, | `-` | F | log with topics t1, t2, t3, t4 and data mem[p..(p+s)) | +| log4(p, s, t1, t2, t3, | `-` | F | log with topics t1, t2, t3, t4 and data mem[p...(p+s)) | | t4) | | | | +-------------------------+-----+---+-----------------------------------------------------------------+ | origin | | F | transaction sender | @@ -382,23 +378,13 @@ used ``x_slot`` and to retrieve the byte-offset you used ``x_offset``. In assignments (see below), we can even use local Solidity variables to assign to. -Functions external to inline assembly can also be accessed: The assembly will -push their entry label (with virtual function resolution applied). The calling semantics -in solidity are: - - - the caller pushes ``return label``, ``arg1``, ``arg2``, ..., ``argn`` - - the call returns with ``ret1``, ``ret2``, ..., ``retm`` - -This feature is still a bit cumbersome to use, because the stack offset essentially -changes during the call, and thus references to local variables will be wrong. - .. code:: pragma solidity ^0.4.11; contract C { uint b; - function f(uint x) public returns (uint r) { + function f(uint x) public view returns (uint r) { assembly { r := mul(x, sload(b_slot)) // ignore the offset, we know it is zero } @@ -418,57 +404,8 @@ changes during the call, and thus references to local variables will be wrong. Labels ------ -.. note:: - Labels are deprecated. Please use functions, loops, if or switch statements instead. - -Another problem in EVM assembly is that ``jump`` and ``jumpi`` use absolute addresses -which can change easily. Solidity inline assembly provides labels to make the use of -jumps easier. Note that labels are a low-level feature and it is possible to write -efficient assembly without labels, just using assembly functions, loops, if and switch instructions -(see below). The following code computes an element in the Fibonacci series. - -.. code:: - - { - let n := calldataload(4) - let a := 1 - let b := a - loop: - jumpi(loopend, eq(n, 0)) - a add swap1 - n := sub(n, 1) - jump(loop) - loopend: - mstore(0, a) - return(0, 0x20) - } - -Please note that automatically accessing stack variables can only work if the -assembler knows the current stack height. This fails to work if the jump source -and target have different stack heights. It is still fine to use such jumps, but -you should just not access any stack variables (even assembly variables) in that case. - -Furthermore, the stack height analyser goes through the code opcode by opcode -(and not according to control flow), so in the following case, the assembler -will have a wrong impression about the stack height at label ``two``: - -.. code:: - - { - let x := 8 - jump(two) - one: - // Here the stack height is 2 (because we pushed x and 7), - // but the assembler thinks it is 1 because it reads - // from top to bottom. - // Accessing the stack variable x here will lead to errors. - x := 9 - jump(three) - two: - 7 // push something onto the stack - jump(one) - three: - } +Support for labels has been removed in version 0.5.0 of Solidity. +Please use functions, loops, if or switch statements instead. Declaring Assembly-Local Variables ---------------------------------- @@ -662,12 +599,21 @@ first. Solidity manages memory in a very simple way: There is a "free memory pointer" at position ``0x40`` in memory. If you want to allocate memory, just use the memory -from that point on and update the pointer accordingly. +starting from where this pointer points at and update it accordingly. +There is no built-in mechanism to release or free allocated memory. +Here is an assembly snippet that can be used for allocating memory:: + + function allocate(length) -> pos { + pos := mload(0x40) + mstore(0x40, add(pos, length)) + } The first 64 bytes of memory can be used as "scratch space" for short-term allocation. The 32 bytes after the free memory pointer (i.e. starting at ``0x60``) is meant to be zero permanently and is used as the initial value for empty dynamic memory arrays. +This means that the allocatable memory starts at ``0x80``, which is the initial value +of the free memory pointer. Elements in memory arrays in Solidity always occupy multiples of 32 bytes (yes, this is even true for ``byte[]``, but not for ``bytes`` and ``string``). Multi-dimensional memory @@ -751,7 +697,7 @@ We consider the runtime bytecode of the following Solidity program:: The following assembly will be generated:: { - mstore(0x40, 0x60) // store the "free memory pointer" + mstore(0x40, 0x80) // store the "free memory pointer" // function dispatcher switch div(calldataload(0), exp(2, 226)) case 0xb3de648b { |