diff options
Diffstat (limited to 'docs/miscellaneous.rst')
-rw-r--r-- | docs/miscellaneous.rst | 164 |
1 files changed, 9 insertions, 155 deletions
diff --git a/docs/miscellaneous.rst b/docs/miscellaneous.rst index e78c4807..6d6c25ac 100644 --- a/docs/miscellaneous.rst +++ b/docs/miscellaneous.rst @@ -84,10 +84,8 @@ Layout of Call Data ******************* When a Solidity contract is deployed and when it is called from an -account, the input data is assumed to be in the format in `the ABI -specification -<https://github.com/ethereum/wiki/wiki/Ethereum-Contract-ABI>`_. The -ABI specification requires arguments to be padded to multiples of 32 +account, the input data is assumed to be in the format in :ref:`the ABI +specification <ABI>`. The ABI specification requires arguments to be padded to multiples of 32 bytes. The internal function calls use a different convention. @@ -145,13 +143,13 @@ Different types have different rules for cleaning up invalid values: Internals - The Optimizer ************************* -The Solidity optimizer operates on assembly, so it can be and also is used by other languages. It splits the sequence of instructions into basic blocks at JUMPs and JUMPDESTs. Inside these blocks, the instructions are analysed and every modification to the stack, to memory or storage is recorded as an expression which consists of an instruction and a list of arguments which are essentially pointers to other expressions. The main idea is now to find expressions that are always equal (on every input) and combine them into an expression class. The optimizer first tries to find each new expression in a list of already known expressions. If this does not work, the expression is simplified according to rules like ``constant + constant = sum_of_constants`` or ``X * 1 = X``. Since this is done recursively, we can also apply the latter rule if the second factor is a more complex expression where we know that it will always evaluate to one. Modifications to storage and memory locations have to erase knowledge about storage and memory locations which are not known to be different: If we first write to location x and then to location y and both are input variables, the second could overwrite the first, so we actually do not know what is stored at x after we wrote to y. On the other hand, if a simplification of the expression x - y evaluates to a non-zero constant, we know that we can keep our knowledge about what is stored at x. +The Solidity optimizer operates on assembly, so it can be and also is used by other languages. It splits the sequence of instructions into basic blocks at ``JUMPs`` and ``JUMPDESTs``. Inside these blocks, the instructions are analysed and every modification to the stack, to memory or storage is recorded as an expression which consists of an instruction and a list of arguments which are essentially pointers to other expressions. The main idea is now to find expressions that are always equal (on every input) and combine them into an expression class. The optimizer first tries to find each new expression in a list of already known expressions. If this does not work, the expression is simplified according to rules like ``constant + constant = sum_of_constants`` or ``X * 1 = X``. Since this is done recursively, we can also apply the latter rule if the second factor is a more complex expression where we know that it will always evaluate to one. Modifications to storage and memory locations have to erase knowledge about storage and memory locations which are not known to be different: If we first write to location x and then to location y and both are input variables, the second could overwrite the first, so we actually do not know what is stored at x after we wrote to y. On the other hand, if a simplification of the expression x - y evaluates to a non-zero constant, we know that we can keep our knowledge about what is stored at x. -At the end of this process, we know which expressions have to be on the stack in the end and have a list of modifications to memory and storage. This information is stored together with the basic blocks and is used to link them. Furthermore, knowledge about the stack, storage and memory configuration is forwarded to the next block(s). If we know the targets of all JUMP and JUMPI instructions, we can build a complete control flow graph of the program. If there is only one target we do not know (this can happen as in principle, jump targets can be computed from inputs), we have to erase all knowledge about the input state of a block as it can be the target of the unknown JUMP. If a JUMPI is found whose condition evaluates to a constant, it is transformed to an unconditional jump. +At the end of this process, we know which expressions have to be on the stack in the end and have a list of modifications to memory and storage. This information is stored together with the basic blocks and is used to link them. Furthermore, knowledge about the stack, storage and memory configuration is forwarded to the next block(s). If we know the targets of all ``JUMP`` and ``JUMPI`` instructions, we can build a complete control flow graph of the program. If there is only one target we do not know (this can happen as in principle, jump targets can be computed from inputs), we have to erase all knowledge about the input state of a block as it can be the target of the unknown ``JUMP``. If a ``JUMPI`` is found whose condition evaluates to a constant, it is transformed to an unconditional jump. As the last step, the code in each block is completely re-generated. A dependency graph is created from the expressions on the stack at the end of the block and every operation that is not part of this graph is essentially dropped. Now code is generated that applies the modifications to memory and storage in the order they were made in the original code (dropping modifications which were found not to be needed) and finally, generates all values that are required to be on the stack in the correct place. -These steps are applied to each basic block and the newly generated code is used as replacement if it is smaller. If a basic block is split at a JUMPI and during the analysis, the condition evaluates to a constant, the JUMPI is replaced depending on the value of the constant, and thus code like +These steps are applied to each basic block and the newly generated code is used as replacement if it is smaller. If a basic block is split at a ``JUMPI`` and during the analysis, the condition evaluates to a constant, the ``JUMPI`` is replaced depending on the value of the constant, and thus code like :: @@ -223,156 +221,12 @@ This means the following source mappings represent the same information: ``1:2:1;:9;2::2;;`` -***************** -Contract Metadata -***************** - -The Solidity compiler automatically generates a JSON file, the -contract metadata, that contains information about the current contract. -It can be used to query the compiler version, the sources used, the ABI -and NatSpec documentation in order to more safely interact with the contract -and to verify its source code. - -The compiler appends a Swarm hash of the metadata file to the end of the -bytecode (for details, see below) of each contract, so that you can retrieve -the file in an authenticated way without having to resort to a centralized -data provider. - -Of course, you have to publish the metadata file to Swarm (or some other service) -so that others can access it. The file can be output by using ``solc --metadata`` -and the file will be called ``ContractName_meta.json``. -It will contain Swarm references to the source code, so you have to upload -all source files and the metadata file. - -The metadata file has the following format. The example below is presented in a -human-readable way. Properly formatted metadata should use quotes correctly, -reduce whitespace to a minimum and sort the keys of all objects to arrive at a -unique formatting. -Comments are of course also not permitted and used here only for explanatory purposes. - -.. code-block:: none - - { - // Required: The version of the metadata format - version: "1", - // Required: Source code language, basically selects a "sub-version" - // of the specification - language: "Solidity", - // Required: Details about the compiler, contents are specific - // to the language. - compiler: { - // Required for Solidity: Version of the compiler - version: "0.4.6+commit.2dabbdf0.Emscripten.clang", - // Optional: Hash of the compiler binary which produced this output - keccak256: "0x123..." - }, - // Required: Compilation source files/source units, keys are file names - sources: - { - "myFile.sol": { - // Required: keccak256 hash of the source file - "keccak256": "0x123...", - // Required (unless "content" is used, see below): Sorted URL(s) - // to the source file, protocol is more or less arbitrary, but a - // Swarm URL is recommended - "urls": [ "bzzr://56ab..." ] - }, - "mortal": { - // Required: keccak256 hash of the source file - "keccak256": "0x234...", - // Required (unless "url" is used): literal contents of the source file - "content": "contract mortal is owned { function kill() { if (msg.sender == owner) selfdestruct(owner); } }" - } - }, - // Required: Compiler settings - settings: - { - // Required for Solidity: Sorted list of remappings - remappings: [ ":g/dir" ], - // Optional: Optimizer settings (enabled defaults to false) - optimizer: { - enabled: true, - runs: 500 - }, - // Required for Solidity: File and name of the contract or library this - // metadata is created for. - compilationTarget: { - "myFile.sol": "MyContract" - }, - // Required for Solidity: Addresses for libraries used - libraries: { - "MyLib": "0x123123..." - } - }, - // Required: Generated information about the contract. - output: - { - // Required: ABI definition of the contract - abi: [ ... ], - // Required: NatSpec user documentation of the contract - userdoc: [ ... ], - // Required: NatSpec developer documentation of the contract - devdoc: [ ... ], - } - } - -.. note:: - Note the ABI definition above has no fixed order. It can change with compiler versions. - -.. note:: - Since the bytecode of the resulting contract contains the metadata hash, any change to - the metadata will result in a change of the bytecode. Furthermore, since the metadata - includes a hash of all the sources used, a single whitespace change in any of the source - codes will result in a different metadata, and subsequently a different bytecode. - -Encoding of the Metadata Hash in the Bytecode -============================================= - -Because we might support other ways to retrieve the metadata file in the future, -the mapping ``{"bzzr0": <Swarm hash>}`` is stored -[CBOR](https://tools.ietf.org/html/rfc7049)-encoded. Since the beginning of that -encoding is not easy to find, its length is added in a two-byte big-endian -encoding. The current version of the Solidity compiler thus adds the following -to the end of the deployed bytecode:: - - 0xa1 0x65 'b' 'z' 'z' 'r' '0' 0x58 0x20 <32 bytes swarm hash> 0x00 0x29 - -So in order to retrieve the data, the end of the deployed bytecode can be checked -to match that pattern and use the Swarm hash to retrieve the file. - -Usage for Automatic Interface Generation and NatSpec -==================================================== - -The metadata is used in the following way: A component that wants to interact -with a contract (e.g. Mist) retrieves the code of the contract, from that -the Swarm hash of a file which is then retrieved. -That file is JSON-decoded into a structure like above. - -The component can then use the ABI to automatically generate a rudimentary -user interface for the contract. - -Furthermore, Mist can use the userdoc to display a confirmation message to the user -whenever they interact with the contract. - -Usage for Source Code Verification -================================== - -In order to verify the compilation, sources can be retrieved from Swarm -via the link in the metadata file. -The compiler of the correct version (which is checked to be part of the "official" compilers) -is invoked on that input with the specified settings. The resulting -bytecode is compared to the data of the creation transaction or CREATE opcode data. -This automatically verifies the metadata since its hash is part of the bytecode. -Excess data corresponds to the constructor input data, which should be decoded -according to the interface and presented to the user. - - *************** Tips and Tricks *************** * Use ``delete`` on arrays to delete all its elements. -* Use shorter types for struct elements and sort them such that short types are grouped together. This can lower the gas costs as multiple SSTORE operations might be combined into a single (SSTORE costs 5000 or 20000 gas, so this is what you want to optimise). Use the gas price estimator (with optimiser enabled) to check! +* Use shorter types for struct elements and sort them such that short types are grouped together. This can lower the gas costs as multiple ``SSTORE`` operations might be combined into a single (``SSTORE`` costs 5000 or 20000 gas, so this is what you want to optimise). Use the gas price estimator (with optimiser enabled) to check! * Make your state variables public - the compiler will create :ref:`getters <visibility-and-getters>` for you automatically. * If you end up checking conditions on input or state a lot at the beginning of your functions, try using :ref:`modifiers`. * If your contract has a function called ``send`` but you want to use the built-in send-function, use ``address(contractVariable).send(amount)``. @@ -469,7 +323,7 @@ Global Variables - ``require(bool condition)``: abort execution and revert state changes if condition is ``false`` (use for malformed input or error in external component) - ``revert()``: abort execution and revert state changes - ``keccak256(...) returns (bytes32)``: compute the Ethereum-SHA-3 (Keccak-256) hash of the (tightly packed) arguments -- ``sha3(...) returns (bytes32)``: an alias to `keccak256` +- ``sha3(...) returns (bytes32)``: an alias to ``keccak256`` - ``sha256(...) returns (bytes32)``: compute the SHA-256 hash of the (tightly packed) arguments - ``ripemd160(...) returns (bytes20)``: compute the RIPEMD-160 hash of the (tightly packed) arguments - ``ecrecover(bytes32 hash, uint8 v, bytes32 r, bytes32 s) returns (address)``: recover address associated with the public key from elliptic curve signature, return zero on error @@ -478,7 +332,7 @@ Global Variables - ``this`` (current contract's type): the current contract, explicitly convertible to ``address`` - ``super``: the contract one level higher in the inheritance hierarchy - ``selfdestruct(address recipient)``: destroy the current contract, sending its funds to the given address -- ``suicide(address recipieint)``: an alias to `selfdestruct`` +- ``suicide(address recipieint)``: an alias to ``selfdestruct`` - ``<address>.balance`` (``uint256``): balance of the :ref:`address` in Wei - ``<address>.send(uint256 amount) returns (bool)``: send given amount of Wei to :ref:`address`, returns ``false`` on failure - ``<address>.transfer(uint256 amount)``: send given amount of Wei to :ref:`address`, throws on failure @@ -519,7 +373,7 @@ Reserved Keywords These keywords are reserved in Solidity. They might become part of the syntax in the future: ``abstract``, ``after``, ``case``, ``catch``, ``default``, ``final``, ``in``, ``inline``, ``let``, ``match``, ``null``, -``of``, ``pure``, ``relocatable``, ``static``, ``switch``, ``try``, ``type``, ``typeof``, ``view``. +``of``, ``relocatable``, ``static``, ``switch``, ``try``, ``type``, ``typeof``. Language Grammar ================ |