-
Notifications
You must be signed in to change notification settings - Fork 201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Guide to implementing a new runner + Python runner for PArSEC #241
base: trunk
Are you sure you want to change the base?
Conversation
98f2885
to
995709b
Compare
Will address the linting errors shortly |
230f402
to
24f860e
Compare
Signed-off-by: Michael Maurer <[email protected]>
Adds a tool to get information directly from the shards, for debug and testing purposes. Signed-off-by: Michael Maurer <[email protected]>
Signed-off-by: Michael Maurer <[email protected]>
Signed-off-by: Michael Maurer <[email protected]>
Signed-off-by: Michael Maurer <[email protected]>
Signed-off-by: Michael Maurer <[email protected]>
+ Includes linting fixes Signed-off-by: Michael Maurer <[email protected]>
Signed-off-by: Michael Maurer <[email protected]>
24f860e
to
0ca9524
Compare
@@ -0,0 +1,33 @@ | |||
def pay(pay, balance1, balance2): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type hints won't forcibly typecast or error out but helps for reading the code
def pay(pay, balance1, balance2): | |
def pay(pay: int, balance1: int, balance2: int) -> [int, int]: |
pay = int(pay) | ||
if(pay > balance1): | ||
pay = 0 | ||
balance1 = balance1 - pay |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
balance1 = balance1 - pay | |
balance1 -= pay |
if(pay > balance1): | ||
pay = 0 | ||
balance1 = balance1 - pay | ||
balance2 = balance2 + pay |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
balance2 = balance2 + pay | |
balance2 += pay |
balance1 = int(balance1) | ||
balance2 = int(balance2) | ||
pay = int(pay) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[optional] could compact it by assigning on one line. Just style preference
balance1 = int(balance1) | |
balance2 = int(balance2) | |
pay = int(pay) | |
balance1, balance2, pay = int(balance1), int(balance2), int(pay) |
balance1 = int(balance1) | ||
balance2 = int(balance2) | ||
pay = int(pay) | ||
if(pay > balance1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spacing after the if is preferable for readability
if(pay > balance1): | |
if (pay > balance1): |
def accrueInterest(rate, balance1): | ||
balance1 = int(balance1) | ||
rate = float(rate) | ||
print("Balance 1:", balance1) | ||
print("Rate:", rate*100, "%") | ||
balance1 = int(balance1 * (2.71828**rate)) | ||
print("Balance 1:", balance1) | ||
return [balance1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would write it like this. Could add a "Balance 1 after rate:" print after the conversion has been applied.
def accrueInterest(rate, balance1): | |
balance1 = int(balance1) | |
rate = float(rate) | |
print("Balance 1:", balance1) | |
print("Rate:", rate*100, "%") | |
balance1 = int(balance1 * (2.71828**rate)) | |
print("Balance 1:", balance1) | |
return [balance1] | |
def accrueInterest(rate: float, balance1: int) -> list[int]: | |
rate, balance1 = float(rate), int(balance1) | |
print(f"Balance 1: {balance1}") | |
print(f"Rate: {rate*100}%") | |
balance1 = int(balance1 * (2.71828**rate)) | |
print(f"Balance 1: {balance1}") | |
return [balance1] |
def pay2(pay1, pay2, balance0, balance1, balance2): | ||
balance0 = int(balance0) | ||
balance1 = int(balance1) | ||
balance2 = int(balance2) | ||
pay1 = int(pay1) | ||
pay2 = int(pay2) | ||
if(pay1 + pay2 > balance0 and pay1 > 0 and pay2 > 0): | ||
pay1 = 0 | ||
pay2 = 0 | ||
balance0 = balance0 - pay1 - pay2 | ||
balance1 = balance1 + pay1 | ||
balance2 = balance2 + pay2 | ||
return [balance0, balance1, balance2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def pay2(pay1, pay2, balance0, balance1, balance2): | |
balance0 = int(balance0) | |
balance1 = int(balance1) | |
balance2 = int(balance2) | |
pay1 = int(pay1) | |
pay2 = int(pay2) | |
if(pay1 + pay2 > balance0 and pay1 > 0 and pay2 > 0): | |
pay1 = 0 | |
pay2 = 0 | |
balance0 = balance0 - pay1 - pay2 | |
balance1 = balance1 + pay1 | |
balance2 = balance2 + pay2 | |
return [balance0, balance1, balance2] | |
def pay2(pay1, pay2, balance0, balance1, balance2) -> list[int, int, int]: | |
balance0, balance1, balance2 = int(balance0), int(balance1), int(balance2) | |
if (pay1 + pay2 > balance0) and (min(pay1, pay2) > 0): | |
pay1, pay2 = 0, 0 | |
balance0 -= (pay1 + pay2) | |
balance1 += pay1 | |
balance2 += pay2 | |
return [balance0, balance1, balance2] |
@@ -0,0 +1,115 @@ | |||
# assumes all return types are strings | |||
def parseContract(file, funcName): | |||
import re |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Best to put imports at the top before functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I included the import here instead of above is because python functions which are to be converted to smart contracts need to have their imports defined within the actual function, so this preserves that stylistic behavior. In this particular instance however the import can be defined outside of the function, so I would be happy to make that change.
Further, imports need to be specified in the C++ scope (see runners/py/pyutil.cpp:20
) which is interesting to note, because it allows for the system designer to restrict what python libraries are available for smart contracts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, good context, thanks for explaining the logic
#!/bin/bash | ||
|
||
# A sample script for running the Lua bench test | ||
|
||
IP="localhost" | ||
PORT="8889" | ||
N_WALLETS=2 | ||
|
||
./build/tools/bench/parsec/lua/lua_bench --component_id=0 \ | ||
--ticket_machine0_endpoint=$IP:7777 --ticket_machine_count=1 \ | ||
--shard_count=1 --shard0_count=1 --shard00_endpoint=$IP:5556 \ | ||
--agent_count=1 --agent0_endpoint=$IP:$PORT \ | ||
--loglevel=TRACE scripts/gen_bytecode.lua $N_WALLETS | ||
echo done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs rebasing for #253
@@ -41,7 +41,7 @@ fi | |||
|
|||
if [[ "$OSTYPE" == "linux-gnu"* ]]; then | |||
apt update | |||
apt install -y build-essential wget cmake libgtest-dev libbenchmark-dev lcov git software-properties-common rsync unzip | |||
apt install -y build-essential wget cmake libgtest-dev libbenchmark-dev lcov git software-properties-common rsync unzip python3-dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verified that this works. It installed python 3.10 in my Ubuntu environment, but in Dockerfile
at root dir, it is specified that we use python 3.8 ENV PY_VER=python3.8
(line 30)
@@ -27,6 +27,8 @@ FROM $BASE_IMAGE AS builder | |||
# Copy source | |||
COPY . . | |||
|
|||
ENV PY_VER=python3.8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 3.8 instead of newer such as 3.10+ ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since #252 was merged, we require python 3.10 in-practice anyway (in terms of what's shipped by the image our docker images are based upon). I'm quite comfortable pushing this to 3.10.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, there is no reason that this needs to be 3.8 and cannot be 3.10 -- at the time of writing I was using python 3.8 but can run it just fine with python 3.10
COPY --from=builder /opt/tx-processor/build/tools/bench/parsec/py/py_bench ./build/tools/bench/parsec/py/py_bench | ||
|
||
# Copy wait script | ||
COPY --from=builder /opt/tx-processor/scripts/wait-for-it.sh ./scripts/wait-for-it.sh | ||
|
||
# Copy python scripts | ||
COPY --from=builder /opt/tx-processor/scripts/pythonContractConverter.py ./scripts/pythonContractConverter.py | ||
COPY --from=builder /opt/tx-processor/scripts/paycontract.py ./scripts/paycontract.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reasonable additions, but this is already a strong argument that we need a better maintainability story for runners and how we add them.
(i.e., the scripts
directory is going to rapidly become unwieldy—let's be honest, it already is—should we continue adding runners without bound).
return contract | ||
|
||
|
||
contract = parseContract(file, funcname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"file" and funcname" are not defined. Is this meant to be a comment on how to run it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this file is actually just used to parse python functions such that arbitrary python functions can be formatted for usage as smart contracts. It cannot be run on its own, but is written to be used within pyutil.cpp:formContract
.
In that function, file
and funcname
are specified within the C++ scope and registered within the Python context, which is why they are not defined here.
I do certainly understand the confusion here, and I don't actually think this is the cleanest way to do this contract parsing -- though it is much more straightforward to define string parsing behavior within python as opposed to C++ which is why I did it. This was just written to leverage the already-in-use Python VM to handle some string operations with the ease of Python string operations.
# assumes all return types are strings | ||
def parseContract(file, funcName): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would add type hints then
# assumes all return types are strings | |
def parseContract(file, funcName): | |
# assumes all return types are strings | |
def parseContract(file: str, funcName: str) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fully agree that type-hints are a good practice, and make the commented assumption at least parsed (even if not enforced).
I might additionally suggest removing the comment if the type hints are added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually refers to the return types of functions which are parsed by parseContract
, so this behavior cannot be represented as a type hint (or at least not a standard Pythonic type hint). That said, these type hints are correct and could be added without an issue in this context, if helpful.
data_lines = data_lines[idx:] | ||
|
||
|
||
while(True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for parentheses
while(True): | |
while True: |
fin = -1 | ||
try: | ||
fin = open(file) | ||
except: | ||
print("[ERROR] {fname} cannot be opened".format(fname = file)) | ||
return "" | ||
|
||
data = fin.read() | ||
fin.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The with handles file closing automatically
fin = -1 | |
try: | |
fin = open(file) | |
except: | |
print("[ERROR] {fname} cannot be opened".format(fname = file)) | |
return "" | |
data = fin.read() | |
fin.close() | |
data = '' | |
try: | |
with open(file, 'r') as fi: | |
data = fi.read() | |
except: | |
print(f"[ERROR] {file} cannot be opened") | |
return "" |
if(("def " + funcName + "(") not in data): | ||
print("[ERROR] {fname} does not contain a definition for \"{func}\"".format(fname = file, func = funcName)) | ||
return "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f-strings are useful here
if(("def " + funcName + "(") not in data): | |
print("[ERROR] {fname} does not contain a definition for \"{func}\"".format(fname = file, func = funcName)) | |
return "" | |
if f"def {funcName}(" not in data: | |
print(f"[ERROR] {file} does not contain a definition for \"{funcName}\"") | |
return "" |
Needs rebase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial comments as I skimmed through pre-rebase and testing.
Solid conceptACK; look forward to testing thoroughly and getting it merged!
@@ -27,6 +27,8 @@ FROM $BASE_IMAGE AS builder | |||
# Copy source | |||
COPY . . | |||
|
|||
ENV PY_VER=python3.8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since #252 was merged, we require python 3.10 in-practice anyway (in terms of what's shipped by the image our docker images are based upon). I'm quite comfortable pushing this to 3.10.
COPY --from=builder /opt/tx-processor/build/tools/bench/parsec/py/py_bench ./build/tools/bench/parsec/py/py_bench | ||
|
||
# Copy wait script | ||
COPY --from=builder /opt/tx-processor/scripts/wait-for-it.sh ./scripts/wait-for-it.sh | ||
|
||
# Copy python scripts | ||
COPY --from=builder /opt/tx-processor/scripts/pythonContractConverter.py ./scripts/pythonContractConverter.py | ||
COPY --from=builder /opt/tx-processor/scripts/paycontract.py ./scripts/paycontract.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reasonable additions, but this is already a strong argument that we need a better maintainability story for runners and how we add them.
(i.e., the scripts
directory is going to rapidly become unwieldy—let's be honest, it already is—should we continue adding runners without bound).
#!/bin/bash | ||
|
||
# A sample script for running the Lua bench test | ||
|
||
IP="localhost" | ||
PORT="8889" | ||
N_WALLETS=2 | ||
|
||
./build/tools/bench/parsec/lua/lua_bench --component_id=0 \ | ||
--ticket_machine0_endpoint=$IP:7777 --ticket_machine_count=1 \ | ||
--shard_count=1 --shard0_count=1 --shard00_endpoint=$IP:5556 \ | ||
--agent_count=1 --agent0_endpoint=$IP:$PORT \ | ||
--loglevel=TRACE scripts/gen_bytecode.lua $N_WALLETS | ||
echo done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs rebasing for #253
|
||
#cp ./scripts/pythonContractConverter.py ./build/tools/bench/parsec/py/ | ||
|
||
rr record ./build/tools/bench/parsec/py/py_bench --component_id=0 --ticket_machine0_endpoint=$IP:7777 --ticket_machine_count=1 --shard_count=1 --shard0_count=1 --shard00_endpoint=$IP:5556 --agent_count=1 --agent0_endpoint=$IP:$PORT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an oversight on my end, should not be included here -- good catch
# assumes all return types are strings | ||
def parseContract(file, funcName): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fully agree that type-hints are a good practice, and make the commented assumption at least parsed (even if not enforced).
I might additionally suggest removing the comment if the type hints are added.
@@ -142,6 +143,16 @@ auto main(int argc, char** argv) -> int { | |||
broker, | |||
log, | |||
cfg.value()); | |||
} else if(cfg->m_runner_type == cbdc::parsec::runner_type::py) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(optional) I think this full-block might be a candidate for replacement with switch
/case
(so that the number of supported runners doesn't necessarily change the start-up time).
However, there's a deeper question that may be worth considering here: are runners a good candidate for pulling into a dynamically-linked library enabling the installation/support of new runners without recompilation? This is not a very small engineering effort; but something like dylib might enable us to load the runners at runtime.
This might be a particularly positive way of pulling runners out of the main repo and modularizing them. Food for thought.
cf. #261
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WRT the first comment, I would agree that we ought to make this a switch statement -- for this, I think what'd be best is defining an enum with runner types
WRT the second comment, I think that's actually a very interesting idea -- and i am definitely for pulling runners out of the main repository. This may be a task for a later PR but I think this is good to keep in mind for sure
// get parameters that are user inputs | ||
size_t bufferOffset = 0; | ||
auto it = 0; | ||
// Known that there are 3 parts of the header |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a touch magical. Is this guaranteed, python-version-dependent, or internally-specified?
If internally-specified, is there benefit to pulling these cases out as an enum?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is internally specified, and the input validation is handled just above this code block -- just to check, are you referring to the cases in the switch statement below this block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the actual line I highlighted for this comment was line 194 (the comment asserting that there are three parts of the header). 👍
This is a Python-based runner and a corresponding guide for adding a new runner. The goal of this extension is to make development of new runner environments more straightforward.
Also embedded here are a script for running the Lua bench test, a ticket logging mechanism for reporting the status of tickets seen by a broker, and a get_row utility for reading directly from the shards. These could be split out into a separate PR if helpful.
Note (for reviewers): This introduces a dependency on Python developer tools. I think it's worth considering making this a separate branch or repository as a result.