author: chriseth <chris@ethereum.org> 2018-10-12 21:53:45 +0800
committer: GitHub <noreply@github.com> 2018-10-12 21:53:45 +0800
commit: 94526b2d92e469fc8679be1f5a2b56c4c1ed25be
tree: a85bb55dbb29de2d3e271160af3e5afcc7d9c228
parent: 1d312c8e4073e2e7ce9a23a721013942e1e5c727
parent: 914668c622b60eab4129d0a6b3776c20d8e614bd
Merge pull request #5145 from ethereum/hashLinker
Hash linker
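
The placeholder layout this merge introduces can be sketched as follows. This is only an illustration, not the code from the commit: the helper names are hypothetical, and the keccak256 digest is assumed to be computed elsewhere and passed in as 64 hex characters, since the C++ standard library has no keccak implementation.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: builds the 36-character identifier ('$' + 34 hex
// characters + '$') from an already computed keccak256 digest in hex.
std::string placeholderFromDigest(std::string const& digestHex)
{
    if (digestHex.size() != 64)
        throw std::invalid_argument("expected a 64-character hex digest");
    std::string result = "$" + digestHex.substr(0, 34) + "$";
    assert(result.size() == 36);
    return result;
}

// The form that appears in unlinked hex output: "__" + identifier + "__",
// which is 40 characters and so occupies the space of a 20-byte address.
std::string wrappedPlaceholder(std::string const& digestHex)
{
    return "__" + placeholderFromDigest(digestHex) + "__";
}

// The hint line appended to binary files: "// <placeholder> -> <fq library name>".
std::string placeholderHint(std::string const& digestHex, std::string const& fqLibraryName)
{
    return "// " + placeholderFromDigest(digestHex) + " -> " + fqLibraryName;
}
```

Under these assumptions, a digest starting with `53aea86b7d70b31448b230b20ae141a537` yields the placeholder `__$53aea86b7d70b31448b230b20ae141a537$__`, which is the form shown in the documentation changes below.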
-rw-r--r-- Changelog.md 1
-rw-r--r-- docs/050-breaking-changes.rst 8
-rw-r--r-- docs/using-the-compiler.rst 18
-rw-r--r-- libdevcore/CommonData.cpp 10
-rw-r--r-- libdevcore/CommonIO.cpp 2
-rw-r--r-- libevmasm/LinkerObject.cpp 10
-rw-r--r-- libevmasm/LinkerObject.h 5
-rw-r--r-- solc/CommandLineInterface.cpp 56
-rw-r--r-- solc/CommandLineInterface.h 4
-rwxr-xr-x test/cmdlineTests.sh 18
-rw-r--r-- test/libevmasm/Assembler.cpp 2
11 files changed, 114 insertions, 20 deletions
diff --git a/Changelog.md b/Changelog.md
index 86376017..b817dc22 100644
--- a/Changelog.md
+++ b/Changelog.md
@@ -17,6 +17,7 @@ Breaking Changes:
* Commandline interface: Remove obsolete ``--formal`` option.
* Commandline interface: Rename the ``--julia`` option to ``--yul``.
* Commandline interface: Require ``-`` if standard input is used as source.
+ * Commandline interface: Use hash of library name for link placeholder instead of name itself.
* Compiler interface: Disallow remappings with empty prefix.
* Control Flow Analyzer: Consider mappings as well when checking for uninitialized return values.
* Control Flow Analyzer: Turn warning about returning uninitialized storage pointers into an error.
diff --git a/docs/050-breaking-changes.rst b/docs/050-breaking-changes.rst
index 93f099ca..7b227297 100644
--- a/docs/050-breaking-changes.rst
+++ b/docs/050-breaking-changes.rst
@@ -158,6 +158,14 @@ Command Line and JSON Interfaces
node was replaced by a field called ``kind`` which can have the
value ``"constructor"``, ``"fallback"`` or ``"function"``.
+* In unlinked binary hex files, library address placeholders are now
+ the first 34 hex characters of the keccak256 hash of the fully qualified
+ library name, surrounded by ``$...$``. Previously,
+ just the fully qualified library name was used.
+ This reduces the chances of collisions, especially when long paths are used.
+ Binary files now also contain a list of mappings from these placeholders
+ to the fully qualified names.
+
Constructors
------------
diff --git a/docs/using-the-compiler.rst b/docs/using-the-compiler.rst
index 39520bec..9ba6caa5 100644
--- a/docs/using-the-compiler.rst
+++ b/docs/using-the-compiler.rst
@@ -41,14 +41,26 @@ If there are multiple matches due to remappings, the one with the longest common
For security reasons the compiler has restrictions on which directories it can access. Paths (and their subdirectories) of source files specified on the commandline and paths defined by remappings are allowed for import statements, but everything else is rejected. Additional paths (and their subdirectories) can be allowed via the ``--allow-paths /sample/path,/another/sample/path`` switch.
-If your contracts use :ref:`libraries <libraries>`, you will notice that the bytecode contains substrings of the form ``__LibraryName______``. You can use ``solc`` as a linker meaning that it will insert the library addresses for you at those points:
+If your contracts use :ref:`libraries <libraries>`, you will notice that the bytecode contains substrings of the form ``__$53aea86b7d70b31448b230b20ae141a537$__``. These are placeholders for the actual library addresses.
+The placeholder is a 34 character prefix of the hex encoding of the keccak256 hash of the fully qualified library name.
+The bytecode file will also contain lines of the form ``// <placeholder> -> <fq library name>`` at the end to help
+identify which libraries the placeholders represent. Note that the fully qualified library name
+is the path of its source file and the library name separated by ``:``.
+You can use ``solc`` as a linker meaning that it will insert the library addresses for you at those points:
-Either add ``--libraries "Math:0x12345678901234567890 Heap:0xabcdef0123456"`` to your command to provide an address for each library or store the string in a file (one library per line) and run ``solc`` using ``--libraries fileName``.
+Either add ``--libraries "file.sol:Math:0x1234567890123456789012345678901234567890 file.sol:Heap:0xabCD567890123456789012345678901234567890"`` to your command to provide an address for each library or store the string in a file (one library per line) and run ``solc`` using ``--libraries fileName``.
-If ``solc`` is called with the option ``--link``, all input files are interpreted to be unlinked binaries (hex-encoded) in the ``__LibraryName____``-format given above and are linked in-place (if the input is read from stdin, it is written to stdout). All options except ``--libraries`` are ignored (including ``-o``) in this case.
+If ``solc`` is called with the option ``--link``, all input files are interpreted to be unlinked binaries (hex-encoded) in the ``__$53aea86b7d70b31448b230b20ae141a537$__``-format given above and are linked in-place (if the input is read from stdin, it is written to stdout). All options except ``--libraries`` are ignored (including ``-o``) in this case.
If ``solc`` is called with the option ``--standard-json``, it will expect a JSON input (as explained below) on the standard input, and return a JSON output on the standard output. This is the recommended interface for more complex and especially automated uses.
+.. note::
+ The library placeholder used to be the fully qualified name of the library itself
+ instead of the hash of it. This format is still supported by ``solc --link`` but
+ the compiler will no longer output it. This change was made to reduce
+ the likelihood of a collision between libraries, since only the first 36 characters
+ of the fully qualified library name could be used.
+
.. _evm-version:
.. index:: ! EVM version, compile target
diff --git a/libdevcore/CommonData.cpp b/libdevcore/CommonData.cpp
index 445d11cd..6d7c74d7 100644
--- a/libdevcore/CommonData.cpp
+++ b/libdevcore/CommonData.cpp
@@ -76,18 +76,18 @@ bytes dev::fromHex(std::string const& _s, WhenError _throw)
bool dev::passesAddressChecksum(string const& _str, bool _strict)
{
- string s = _str.substr(0, 2) == "0x" ? _str.substr(2) : _str;
+ string s = _str.substr(0, 2) == "0x" ? _str : "0x" + _str;
- if (s.length() != 40)
+ if (s.length() != 42)
return false;
if (!_strict && (
- _str.find_first_of("abcdef") == string::npos ||
- _str.find_first_of("ABCDEF") == string::npos
+ s.find_first_of("abcdef") == string::npos ||
+ s.find_first_of("ABCDEF") == string::npos
))
return true;
- return _str == dev::getChecksummedAddress(_str);
+ return s == dev::getChecksummedAddress(s);
}
string dev::getChecksummedAddress(string const& _addr)
diff --git a/libdevcore/CommonIO.cpp b/libdevcore/CommonIO.cpp
index 2005d087..1aa3504c 100644
--- a/libdevcore/CommonIO.cpp
+++ b/libdevcore/CommonIO.cpp
@@ -94,7 +94,7 @@ void dev::writeFile(std::string const& _file, bytesConstRef _data, bool _writeDe
{
// create directory if not existent
fs::path p(_file);
- if (!fs::exists(p.parent_path()))
+ if (!p.parent_path().empty() && !fs::exists(p.parent_path()))
{
fs::create_directories(p.parent_path());
try
diff --git a/libevmasm/LinkerObject.cpp b/libevmasm/LinkerObject.cpp
index 1d5efecb..a11f2378 100644
--- a/libevmasm/LinkerObject.cpp
+++ b/libevmasm/LinkerObject.cpp
@@ -21,6 +21,7 @@
#include <libevmasm/LinkerObject.h>
#include <libdevcore/CommonData.h>
+#include <libdevcore/SHA3.h>
using namespace dev;
using namespace dev::eth;
@@ -50,14 +51,19 @@ string LinkerObject::toHex() const
for (auto const& ref: linkReferences)
{
size_t pos = ref.first * 2;
- string const& name = ref.second;
+ string hash = libraryPlaceholder(ref.second);
hex[pos] = hex[pos + 1] = hex[pos + 38] = hex[pos + 39] = '_';
for (size_t i = 0; i < 36; ++i)
- hex[pos + 2 + i] = i < name.size() ? name[i] : '_';
+ hex[pos + 2 + i] = hash.at(i);
}
return hex;
}
+string LinkerObject::libraryPlaceholder(string const& _libraryName)
+{
+ return "$" + keccak256(_libraryName).hex().substr(0, 34) + "$";
+}
+
h160 const*
LinkerObject::matchLibrary(
string const& _linkRefName,
diff --git a/libevmasm/LinkerObject.h b/libevmasm/LinkerObject.h
index 152487b4..92890803 100644
--- a/libevmasm/LinkerObject.h
+++ b/libevmasm/LinkerObject.h
@@ -50,6 +50,11 @@ struct LinkerObject
/// addresses by placeholders.
std::string toHex() const;
+ /// @returns a 36 character string that is used as a placeholder for the library
+ /// address (enclosed by `__` on both sides). The placeholder is the hex representation
+ /// of the first 17 bytes of the keccak-256 hash of @a _libraryName.
+ static std::string libraryPlaceholder(std::string const& _libraryName);
+
private:
static h160 const* matchLibrary(
std::string const& _linkRefName,
diff --git a/solc/CommandLineInterface.cpp b/solc/CommandLineInterface.cpp
index 8fd0d6ef..4052ed13 100644
--- a/solc/CommandLineInterface.cpp
+++ b/solc/CommandLineInterface.cpp
@@ -226,21 +226,21 @@ void CommandLineInterface::handleBinary(string const& _contract)
if (m_args.count(g_argBinary))
{
if (m_args.count(g_argOutputDir))
- createFile(m_compiler->filesystemFriendlyName(_contract) + ".bin", m_compiler->object(_contract).toHex());
+ createFile(m_compiler->filesystemFriendlyName(_contract) + ".bin", objectWithLinkRefsHex(m_compiler->object(_contract)));
else
{
cout << "Binary: " << endl;
- cout << m_compiler->object(_contract).toHex() << endl;
+ cout << objectWithLinkRefsHex(m_compiler->object(_contract)) << endl;
}
}
if (m_args.count(g_argBinaryRuntime))
{
if (m_args.count(g_argOutputDir))
- createFile(m_compiler->filesystemFriendlyName(_contract) + ".bin-runtime", m_compiler->runtimeObject(_contract).toHex());
+ createFile(m_compiler->filesystemFriendlyName(_contract) + ".bin-runtime", objectWithLinkRefsHex(m_compiler->runtimeObject(_contract)));
else
{
cout << "Binary of the runtime part: " << endl;
- cout << m_compiler->runtimeObject(_contract).toHex() << endl;
+ cout << objectWithLinkRefsHex(m_compiler->runtimeObject(_contract)) << endl;
}
}
}
@@ -482,9 +482,23 @@ bool CommandLineInterface::parseLibraryOption(string const& _input)
string addrString(lib.begin() + colon + 1, lib.end());
boost::trim(libName);
boost::trim(addrString);
+ if (addrString.substr(0, 2) == "0x")
+ addrString = addrString.substr(2);
+ if (addrString.empty())
+ {
+ cerr << "Empty address provided for library \"" << libName << "\": " << endl;
+ cerr << "Note that there should not be any whitespace after the colon." << endl;
+ return false;
+ }
+ else if (addrString.length() != 40)
+ {
+ cerr << "Invalid length for address for library \"" << libName << "\": " << addrString.length() << " instead of 40 characters." << endl;
+ return false;
+ }
if (!passesAddressChecksum(addrString, false))
{
- cerr << "Invalid checksum on library address \"" << libName << "\": " << addrString << endl;
+ cerr << "Invalid checksum on address for library \"" << libName << "\": " << addrString << endl;
+ cerr << "The correct checksum is " << dev::getChecksummedAddress(addrString) << endl;
return false;
}
bytes binAddr = fromHex(addrString);
@@ -569,7 +583,7 @@ Allowed options)",
g_argLibraries.c_str(),
po::value<vector<string>>()->value_name("libs"),
"Direct string or file containing library addresses. Syntax: "
- "<libraryName>: <address> [, or whitespace] ...\n"
+ "<libraryName>:<address> [, or whitespace] ...\n"
"Address is interpreted as a hex string optionally prefixed by 0x."
)
(
@@ -1056,8 +1070,12 @@ bool CommandLineInterface::link()
{
string const& name = library.first;
// Library placeholders are 40 hex digits (20 bytes) that start and end with '__'.
- // This leaves 36 characters for the library name, while too short library names are
- // padded on the right with '_' and too long names are truncated.
+ // This leaves 36 characters for the library identifier. The identifier used to
+ // be just the cropped or '_'-padded library name, but this changed to
+ // the cropped hex representation of the hash of the library name.
+ // We support both ways of linking here.
+ librariesReplacements["__" + eth::LinkerObject::libraryPlaceholder(name) + "__"] = library.second;
+
string replacement = "__";
for (size_t i = 0; i < placeholderSize - 4; ++i)
replacement.push_back(i < name.size() ? name[i] : '_');
@@ -1087,6 +1105,11 @@ bool CommandLineInterface::link()
cerr << "Reference \"" << name << "\" in file \"" << src.first << "\" still unresolved." << endl;
it += placeholderSize;
}
+ // Remove hints for resolved libraries.
+ for (auto const& library: m_libraries)
+ boost::algorithm::erase_all(src.second, "\n" + libraryPlaceholderHint(library.first));
+ while (!src.second.empty() && *prev(src.second.end()) == '\n')
+ src.second.resize(src.second.size() - 1);
}
return true;
}
@@ -1100,6 +1123,23 @@ void CommandLineInterface::writeLinkedFiles()
writeFile(src.first, src.second);
}
+string CommandLineInterface::libraryPlaceholderHint(string const& _libraryName)
+{
+ return "// " + eth::LinkerObject::libraryPlaceholder(_libraryName) + " -> " + _libraryName;
+}
+
+string CommandLineInterface::objectWithLinkRefsHex(eth::LinkerObject const& _obj)
+{
+ string out = _obj.toHex();
+ if (!_obj.linkReferences.empty())
+ {
+ out += "\n";
+ for (auto const& linkRef: _obj.linkReferences)
+ out += "\n" + libraryPlaceholderHint(linkRef.second);
+ }
+ return out;
+}
+
bool CommandLineInterface::assemble(
AssemblyStack::Language _language,
AssemblyStack::Machine _targetMachine
diff --git a/solc/CommandLineInterface.h b/solc/CommandLineInterface.h
index 010dce34..aa49383a 100644
--- a/solc/CommandLineInterface.h
+++ b/solc/CommandLineInterface.h
@@ -54,6 +54,10 @@ public:
private:
bool link();
void writeLinkedFiles();
+ /// @returns the ``// <identifier> -> name`` hint for library placeholders.
+ static std::string libraryPlaceholderHint(std::string const& _libraryName);
+ /// @returns the full object with library placeholder hints in hex.
+ static std::string objectWithLinkRefsHex(eth::LinkerObject const& _obj);
bool assemble(AssemblyStack::Language _language, AssemblyStack::Machine _targetMachine);
diff --git a/test/cmdlineTests.sh b/test/cmdlineTests.sh
index 71866bce..20254ef4 100755
--- a/test/cmdlineTests.sh
+++ b/test/cmdlineTests.sh
@@ -233,6 +233,24 @@ echo '' | "$SOLC" - --link --libraries a:0x90f20564390eAe531E810af625A22f51385Cd
printTask "Testing long library names..."
echo '' | "$SOLC" - --link --libraries aveeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeerylonglibraryname:0x90f20564390eAe531E810af625A22f51385Cd222 >/dev/null
+printTask "Testing linking itself..."
+SOLTMPDIR=$(mktemp -d)
+(
+ cd "$SOLTMPDIR"
+ set -e
+ echo 'library L { function f() public pure {} } contract C { function f() public pure { L.f(); } }' > x.sol
+ "$SOLC" --bin -o . x.sol 2>/dev/null
+ # Explanation and placeholder should be there
+ grep -q '//' C.bin && grep -q '__' C.bin
+ # But not in library file.
+ grep -q -v '[/_]' L.bin
+ # Now link
+ "$SOLC" --link --libraries x.sol:L:0x90f20564390eAe531E810af625A22f51385Cd222 C.bin
+ # Now the placeholder and explanation should be gone.
+ grep -q -v '[/_]' C.bin
+)
+rm -rf "$SOLTMPDIR"
+
printTask "Testing overwriting files..."
SOLTMPDIR=$(mktemp -d)
(
diff --git a/test/libevmasm/Assembler.cpp b/test/libevmasm/Assembler.cpp
index bc652f56..1c041596 100644
--- a/test/libevmasm/Assembler.cpp
+++ b/test/libevmasm/Assembler.cpp
@@ -94,7 +94,7 @@ BOOST_AUTO_TEST_CASE(all_assembly_items)
BOOST_CHECK_EQUAL(
_assembly.assemble().toHex(),
- "5b6001600220606773__someLibrary___________________________"
+ "5b6001600220606773__$bf005014d9d0f534b8fcb268bd84c491a2$__"
"6000567f556e75736564206665617475726520666f722070757368696e"
"6720737472696e605f6001605e73000000000000000000000000000000000000000000fe"
"fe010203044266eeaa"
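
As a rough illustration of the linking step described in the `link()` comments above, the sketch below substitutes 40-character placeholders with 40-digit hex addresses and shows how the legacy, name-based placeholder is formed. It is a simplified stand-in, not the implementation in `CommandLineInterface.cpp`: the function names `linkHex` and `legacyPlaceholder` are hypothetical, and the real linker scans for `__` and reports unresolved references instead of doing a plain search and replace.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Legacy placeholder: the library name cropped or padded with '_' to
// 36 characters, enclosed in "__" on both sides (40 characters in total).
std::string legacyPlaceholder(std::string const& name)
{
    std::string out = "__";
    for (std::size_t i = 0; i < 36; ++i)
        out.push_back(i < name.size() ? name[i] : '_');
    return out + "__";
}

// Replaces every occurrence of each placeholder in the hex string with the
// corresponding address. Keys and values must both be 40 characters long,
// so the overall length of the hex string does not change.
std::string linkHex(std::string hex, std::map<std::string, std::string> const& replacements)
{
    for (auto const& entry : replacements)
    {
        std::string const& placeholder = entry.first;
        std::string const& address = entry.second;
        assert(placeholder.size() == 40 && address.size() == 40);
        for (std::size_t pos = hex.find(placeholder);
             pos != std::string::npos;
             pos = hex.find(placeholder, pos + 40))
            hex.replace(pos, 40, address);
    }
    return hex;
}
```

To support both formats, as the commit does, a caller would register two keys per library in the map: the hash-based placeholder and the legacy one, both pointing to the same address.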