Over the past couple of weeks, our primary focus has been getting all of the clients updated for PoC5 compatibility, and it has certainly been a long road. The changes made to the VM include the following:
- New init/code mechanism: when you create a contract, the provided code is executed immediately, and the value returned by that code becomes the contract's code. This lets us have contract initialization code while keeping the same format, [nonce, price, gas, to, value, data], for both transactions and contract creation, which also makes it easier to create new contracts via forwarding contracts.
- Reordered transaction and contract data: the order is now [nonce, price, gas, to, value, data] in transactions and [gas, to, value, datain, datainsz, dataout, dataoutsz] in messages. Note that Serpent keeps the send(to, value, gas), o = msg(to, value, gas, datain, datainsz) and o = msg(to, value, gas, datain, datainsz, dataoutsz) forms.
- Fee adjustments: The transaction creation fee is now 500 gas, and many other fees have been updated.
- CODECOPY and CALLDATACOPY opcodes: CODECOPY takes code_index, mem_index and len as arguments, and copies the code from code_index … code_index+len-1 to memory mem_index … mem_index+len-1. These are very useful in combination with init/code. There is now also CODESIZE.
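The copy semantics described for CODECOPY can be sketched in a few lines of Python. This is only an illustrative model, not client code: the function name `codecopy`, the zero-padding of reads past the end of the code, and the byte-granular memory growth are all assumptions made for the example.

```python
def codecopy(code: bytes, memory: bytearray,
             code_index: int, mem_index: int, length: int) -> None:
    """Copy code[code_index .. code_index+length-1] into
    memory[mem_index .. mem_index+length-1]."""
    end = mem_index + length
    # Assumption: memory grows just enough to hold the copied range.
    if len(memory) < end:
        memory.extend(b"\x00" * (end - len(memory)))
    # Assumption: reading past the end of the code yields zero bytes.
    chunk = code[code_index:code_index + length].ljust(length, b"\x00")
    memory[mem_index:end] = chunk


code = bytes(range(10))          # stand-in "code": bytes 00 01 02 ... 09
memory = bytearray()
codecopy(code, memory, code_index=2, mem_index=0, length=4)
print(memory.hex())              # 02030405
```

CALLDATACOPY can be pictured the same way, with the message data in place of `code`.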
However, the biggest changes have been to the architecture surrounding the protocol. On the GUI side, the C++ and Go clients are evolving rapidly, and we will see more updates on that front very soon. If you follow Ethereum closely, you have probably seen Religious Lotto, a complete implementation of a lottery, plus GUI, written and run inside the C++ client. From now on, the C++ client will shift toward being a more developer-oriented tool, while the Go client will start to focus on being a user-facing application (or rather, a declarative application). On the compiler side, Serpent has undergone a number of substantial improvements.
First, the code. Peek under the hood of the Serpent compiler and you will see all of the functions available, along with their exact translations into EVM code. For example, we have:
['access', 2, 1,
    ['', '', 32, 'MUL', 'ADD', 'MLOAD']],
This means that what access(x,y) actually does under the hood is recursively compile whatever x and y are, and then load the memory at index x + y * 32; hence, x is the pointer to the beginning of the array and y is the index. This code structure has been around since PoC4, but I have now upgraded the meta-language used to describe translations further, so that it includes if, while, and init/code in this structure (before, they were special cases). Now only set and seq remain as special cases, and if I wanted to, I could even remove seq by reimplementing it as a rewrite rule.
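To make the translation rule concrete, here is a minimal Python sketch of how such a rule could be expanded and then evaluated. It is not the Serpent compiler's actual code: the `<0>` and `<1>` placeholders (standing in for the two empty slots in the rule above), the reading of `2, 1` as "two arguments, one output", and the helper names `expand` and `evaluate` are all assumptions for illustration.

```python
# Hypothetical placeholders "<0>" and "<1>" mark where the compiled
# arguments x and y are spliced into the template.
RULE = ["access", 2, 1, ["<0>", "<1>", 32, "MUL", "ADD", "MLOAD"]]


def expand(rule, compiled_args):
    """Splice already-compiled argument code into the rule's template."""
    _name, n_args, _n_outputs, template = rule
    assert len(compiled_args) == n_args
    out = []
    for tok in template:
        if isinstance(tok, str) and tok.startswith("<") and tok.endswith(">"):
            out.extend(compiled_args[int(tok[1:-1])])
        else:
            out.append(tok)
    return out


def evaluate(ops, memory):
    """Tiny postfix evaluator covering only the ops used by the rule."""
    stack = []
    for op in ops:
        if isinstance(op, int):
            stack.append(op)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MLOAD":
            index = stack.pop()
            stack.append(int.from_bytes(memory[index:index + 32], "big"))
        else:
            raise ValueError(f"unsupported op: {op}")
    return stack


# An array [7, 8, 9] of 32-byte words stored at memory pointer 64.
memory = bytearray(64) + b"".join(v.to_bytes(32, "big") for v in [7, 8, 9])
ops = expand(RULE, [[64], [2]])   # access(64, 2)
print(ops)                        # [64, 2, 32, 'MUL', 'ADD', 'MLOAD']
print(evaluate(ops, memory))      # [9]
```

The evaluator loads the word at 64 + 2 * 32 = 128, which is the third element of the array.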
The biggest changes so far have been for PoC5 compatibility. For example, if you run serpent compile_to_assembly 'return(msg.data[0]*2)', you will see:
["$begincode_0.endcode_0", "DUP", "MSIZE", "SWAP", "MSIZE", "$begincode_0", "CALLDATACOPY", "RETURN", "~begincode_0", "#CODE_BEGIN", 2, 0, "CALLDATALOAD", "MUL", "MSIZE", "SWAP", "MSIZE", "MSTORE", 32, "SWAP", "RETURN", "#CODE_END", "~endcode_0"]
The actual code there is just:
[2, 0, "CALLDATALOAD", "MUL", "MSIZE", "SWAP", "MSIZE", "MSTORE", 32, "SWAP", "RETURN"]
If you want to see what is going on here, suppose a message comes in with its first data item being 5. We then have:
2 -> Stack: [2]
0 -> Stack: [2, 0]
CALLDATALOAD -> Stack: [2, 5]
MUL -> Stack: [10]
MSIZE -> Stack: [10, 0]
SWAP -> Stack: [0, 10]
MSIZE -> Stack: [0, 10, 0]
MSTORE -> Stack: [0], Memory: [0, 0, 0 … 10]
32 -> Stack: [0, 32], Memory: [0, 0, 0 … 10]
SWAP -> Stack: [32, 0], Memory: [0, 0, 0 … 10]
RETURN
The last RETURN returns the 32 bytes of memory starting at 0, that is [0, 0, 0 … 10], or the number 10.
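The trace above can be reproduced with a small Python stack machine. This is a sketch that covers only the operations in the listing; the operand orders are inferred from the trace itself (the top of the stack is the rightmost element), and byte-indexed CALLDATALOAD is an assumption that does not matter for index 0.

```python
def run(ops, calldata: bytes) -> bytes:
    """Execute the assembly listing and return the RETURNed bytes."""
    stack, memory = [], bytearray()
    for op in ops:
        if isinstance(op, int):
            stack.append(op)
        elif op == "CALLDATALOAD":
            index = stack.pop()
            word = calldata[index:index + 32].ljust(32, b"\x00")
            stack.append(int.from_bytes(word, "big"))
        elif op == "MUL":
            a, b = stack.pop(), stack.pop()
            stack.append((a * b) % 2**256)
        elif op == "MSIZE":
            stack.append(len(memory))           # current memory size in bytes
        elif op == "SWAP":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "MSTORE":
            index, value = stack.pop(), stack.pop()
            if len(memory) < index + 32:
                memory.extend(b"\x00" * (index + 32 - len(memory)))
            memory[index:index + 32] = value.to_bytes(32, "big")
        elif op == "RETURN":
            index, size = stack.pop(), stack.pop()
            return bytes(memory[index:index + size])
        else:
            raise ValueError(f"unsupported op: {op}")
    return b""


INNER = [2, 0, "CALLDATALOAD", "MUL", "MSIZE", "SWAP",
         "MSIZE", "MSTORE", 32, "SWAP", "RETURN"]

calldata = (5).to_bytes(32, "big")              # first data item is 5
result = run(INNER, calldata)
print(len(result), int.from_bytes(result, "big"))   # 32 10
```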
Now, let's analyze the wrapper code around it.
["$begincode_0.endcode_0", "DUP", "MSIZE", "SWAP", "MSIZE", "$begincode_0", "CALLDATACOPY", "RETURN", "~begincode_0", "#CODE_BEGIN", ….. , "#CODE_END", "~endcode_0"]
I have elided the inner code shown above to make things clearer. The first thing we see are two labels, ~begincode_0 and ~endcode_0, and the guards #CODE_BEGIN and #CODE_END. The labels mark the beginning and end of the inner code, and the guards are there for the later stages of the compiler, which understand that everything between the guards should be compiled as if it were a separate program. Now let's look at the first part of the code. In this case, we have ~begincode_0 at position 10 and ~endcode_0 at position 24 in the final code. $begincode_0 and $endcode_0 refer to these positions, and $begincode_0.endcode_0 refers to the length of the interval between them, 14. Now, remember that during contract initialization, the call data is the code you are feeding in. Hence, we have:
14 -> Stack: [14]
DUP -> Stack: [14, 14]
MSIZE -> Stack: [14, 14, 0]
SWAP -> Stack: [14, 0, 14]
MSIZE -> Stack: [14, 0, 14, 0]
10 -> Stack: [14, 0, 14, 0, 10]
CALLDATACOPY -> Stack: [14, 0], Memory: [ … ]
RETURN
Notice how the first half of the code cleverly sets up the stack so that the inner code is copied into memory indices 0…13, and that chunk of memory is then immediately returned. In the final compiled code, 600e515b525b600a37f26002600035025b525b54602052f2, the inner code sits neatly to the right of the initializer code that simply returns it. In more complex contracts, initializers can also do things like set specific storage slots to values, or even call or create other contracts.
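As a sanity check, the initializer can be run directly over the compiled hex string. The sketch below is a toy interpreter, not a client implementation: the opcode byte values are inferred by lining up the hex string with the assembly listing, the CALLDATACOPY operand order (data index on top, then memory index, then length) is inferred from the trace, and the function name `run_init` is made up for the example.

```python
CODE = bytes.fromhex("600e515b525b600a37f26002600035025b525b54602052f2")

# Opcode bytes inferred from matching the hex against the assembly listing.
PUSH1, DUP, SWAP, MSIZE, CALLDATACOPY, RETURN = 0x60, 0x51, 0x52, 0x5B, 0x37, 0xF2


def run_init(code: bytes, calldata: bytes) -> bytes:
    """Run the initializer; whatever it RETURNs becomes the contract code."""
    stack, memory, pc = [], bytearray(), 0
    while pc < len(code):
        op = code[pc]
        if op == PUSH1:
            stack.append(code[pc + 1])          # one-byte immediate operand
            pc += 2
            continue
        if op == DUP:
            stack.append(stack[-1])
        elif op == MSIZE:
            stack.append(len(memory))
        elif op == SWAP:
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == CALLDATACOPY:
            data_index, mem_index, length = stack.pop(), stack.pop(), stack.pop()
            end = mem_index + length
            if len(memory) < end:
                memory.extend(b"\x00" * (end - len(memory)))
            chunk = calldata[data_index:data_index + length]
            memory[mem_index:end] = chunk.ljust(length, b"\x00")
        elif op == RETURN:
            index, size = stack.pop(), stack.pop()
            return bytes(memory[index:index + size])
        else:
            raise ValueError(f"unsupported opcode: {op:#04x}")
        pc += 1
    return b""


# During initialization, the call data is the submitted code itself.
contract_code = run_init(CODE, calldata=CODE)
print(contract_code.hex())   # 6002600035025b525b54602052f2
```

The returned 14 bytes are exactly the inner code, i.e. bytes 10 through 23 of the compiled string.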
Now, let's introduce Serpent's newest and most fun feature: imports. One common use case in contract land is that you want to give a contract the ability to spawn new contracts. The problem is, how do you put the code for the spawned contracts into the spawner contract? Previously, the only solution was the uncomfortable approach of compiling the newer contracts first, then putting the compiled code into an array. Now, we have a better solution: import.
Put the following in returnten.se:
x = create(tx.gas - 100, 0, import(mul2.se))
return(msg(x, 0, tx.gas - 100, [5], 1))
Now, put the following in mul2.se:
return(msg.data[0]*2)
Now, if you run serpent compile returnten.se and run the contract, you will notice that, voila, it returns ten. The reason is simple. The returnten.se contract creates an instance of the mul2.se contract, and then calls it with the value 5. mul2.se, as the name suggests, is a doubler, so it returns 5*2 = 10. Note that import is not a function in the standard sense; x = import('123.se') will fail, and import only works in a very specific context.
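The create-then-call flow can be pictured with a toy Python model. This is purely conceptual and makes no claim about how any client works: `ToyChain`, `mul2_init` and `returnten` are hypothetical names, contracts are modelled as plain Python callables, and gas and value are ignored entirely.

```python
class ToyChain:
    """Toy model: contract code is just a Python callable."""

    def __init__(self):
        self.contracts = {}

    def create(self, init):
        # The init code runs immediately; its return value becomes the code.
        address = len(self.contracts) + 1
        self.contracts[address] = init()
        return address

    def msg(self, to, data):
        # Call the contract at `to` with the given data array.
        return self.contracts[to](data)


def mul2_init():
    """Plays the role of import(mul2.se): init code returning the real code."""
    def mul2(data):
        return data[0] * 2
    return mul2


def returnten(chain):
    x = chain.create(mul2_init)     # x = create(..., import(mul2.se))
    return chain.msg(x, [5])        # return(msg(x, ..., [5], 1))


print(returnten(ToyChain()))        # 10
```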
Now, let's say you are creating a 1000-line monster contract and want to split it into files. To do this, we use inset. Into outer.se, put:
if msg.data[0] == 1:
    inset(inner.se)
And into inner.se, put:
return(3)
Running serpent compile outer.se gives you a nice piece of compiled code that returns 3 if the msg.data[0] argument equals one. And that's all there is to it.
Upcoming updates for Serpent include:
- An improvement to this mechanism so that the internal code is not loaded twice if you try to use import twice with the same file name
- String literals
- Space and code efficiency improvements for array literals
- A debugging decorator (i.e. a compiling function that tells you which lines of Serpent correspond to which bytes of compiled code)
However, in the short term, I will focus my efforts on bug fixes, a cross-client test suite, and continuing to work on ethereumjs-lib.






















