<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Blockchain Large Language Models | Berkeley RDI</title>
<meta name="description" content="blockchain large language models - txrank">
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.2.0/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-gH2yIJqKdNHPEq0n4Mqa/HGKIhSkIHeL5AyhkYV8i59U5AR6csBvApHHNl/vI1Bx" crossorigin="anonymous">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.1.2/css/all.min.css" integrity="sha512-1sCRPdkRXhBV2PBLUdRb4tMg1w2YPf37qatUFeS7zlBy7jJI8Lf4VHwWfZZfpXtYSLy85pkm9GaYVYMfw5BC1A==" crossorigin="anonymous" referrerpolicy="no-referrer" />
<link href="metaverse.css" rel="stylesheet">
</head>
<body>
<nav class="navbar navbar-expand-lg bg-light">
<div class="container">
<a class="navbar-brand" href="#">Research @ <img alt="RDI Logo" src="img/rdi-sm.png" height="30"></a>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="navbarSupportedContent">
<ul class="navbar-nav ms-auto mb-2 mb-lg-0">
<li class="nav-item">
<a class="nav-link" href="https://rdi.berkeley.edu">RDI Home</a>
</li>
<li class="nav-item">
<a class="nav-link active" href="https://rdi.berkeley.edu/research">Research Home</a>
</li>
<li class="nav-item dropdown">
<a class="nav-link" href="https://rdi.berkeley.edu/zkp">Zero Knowledge Proofs</a>
</li>
<li class="nav-item dropdown">
<a class="nav-link dropdown-toggle" href="#" role="button" data-bs-toggle="dropdown" aria-expanded="false">
Metaverse Research
</a>
<ul class="dropdown-menu">
<li><a class="dropdown-item" href="https://rdi.berkeley.edu/metadata">MetaData</a></li>
<li><a class="dropdown-item" href="https://rdi.berkeley.edu/metaguard">MetaGuard</a></li>
<li><a class="dropdown-item" href="https://rdi.berkeley.edu/metaverse-sok">Privacy SoK</a></li>
<li><a class="dropdown-item" href="https://rdi.berkeley.edu/vr-identification">Identification</a></li>
<li><hr class="dropdown-divider"></li>
<li><a class="dropdown-item" href="https://rdi.berkeley.edu/metaverse">View All</a></li>
</ul>
</li>
<li class="nav-item dropdown">
<a class="nav-link dropdown-toggle" href="#" role="button" data-bs-toggle="dropdown"
aria-expanded="false">
DeFi Research
</a>
<ul class="dropdown-menu">
<li><a class="dropdown-item" href="https://rdi.berkeley.edu/defi-attacks">DeFi Attacks</a></li>
<li><a class="dropdown-item" href="https://rdi.berkeley.edu/blockchain-llm">Blockchain LLM</a>
</li>
<li>
<hr class="dropdown-divider">
</li>
<li><a class="dropdown-item" href="https://rdi.berkeley.edu/defi">View All</a></li>
</ul>
</li>
</ul>
</div>
</div>
</nav>
<div class="bg-light py-5">
<div class="container">
<div class="row align-items-center">
<div class="col-lg-12">
<h1 class="mt-3">Blockchain Large Language Models - BlockGPT</h1>
<p class="text-secondary">2023 | Yu Gai* · Liyi Zhou* · Kaihua Qin · Dawn Song · Arthur Gervais | https://arxiv.org/pdf/2304.12749.pdf</p>
<p class="text-justify">This paper presents a dynamic, real-time approach to detecting anomalous blockchain transactions. The proposed tool, BlockGPT, generates tracing representations of blockchain activity and trains a large language model from scratch to act as a real-time Intrusion Detection System. Unlike traditional methods, BlockGPT is designed to offer an unrestricted search space and does not rely on predefined rules or patterns, enabling it to detect a broader range of anomalies. We demonstrate the effectiveness of BlockGPT through its use as an anomaly detection tool for Ethereum transactions. In our experiments, it identifies abnormal transactions in a dataset of 68M transactions and has a batched throughput of 2284 transactions per second on average. Our results show that BlockGPT ranks 49 out of 124 attacks among the top-3 most abnormal transactions interacting with their victim contracts. This work contributes to the field of blockchain transaction analysis by introducing a custom data encoding compatible with the transformer architecture, a domain-specific tokenization technique, and a tree encoding method specifically crafted for the Ethereum Virtual Machine (EVM) trace representation.</p>
<p class="mb-0"><a class="btn btn-primary btn-lg" href="https://arxiv.org/pdf/2304.12749" target="_blank"><i class="fa fa-file-lines"></i> Read Paper</a></p>
</div>
</div>
</div>
</div>
<div class="container py-5 text-center">
<h1><i>How it works:</i></h1>
<div class="row align-items-center">
<div class="col-lg-6">
<img alt="High Level Overview Diagram" src="img/highleveloverview.png" class="w-100 rounded">
</div>
<div class="col-lg-6">
<p>High-level overview of the BlockGPT defense mechanism, which consists of four major steps. <span class="font-weight-bold">➊</span> BlockGPT is bootstrapped by training the model on a dataset of historical transactions using unsupervised learning. <span class="font-weight-bold">➋</span> Depending on the system and threat model, BlockGPT detects new block states, including both already confirmed and pending transactions. <span class="font-weight-bold">➌</span> BlockGPT ranks transactions based on how abnormal their execution traces are. <span class="font-weight-bold">➍</span> If an abnormal transaction is detected, BlockGPT triggers a defense mechanism such as a front-running emergency pause.</p>
</div>
</div>
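<p>Steps ➋ to ➍ above can be pictured as a small monitoring loop. The sketch below is illustrative only and is not the BlockGPT implementation: <code>ContractMonitor</code>, <code>score_fn</code>, and <code>on_alarm</code> are hypothetical names, the trained model from step ➊ is replaced by an arbitrary scoring function, and the top-k alarm rule is one assumed choice of threshold.</p>

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ContractMonitor:
    """Toy monitor for one contract; every name here is hypothetical."""
    score_fn: Callable[[str], float]   # stand-in for the trained model's abnormality score
    on_alarm: Callable[[str], None]    # stand-in for a defense hook such as an emergency pause
    top_k: int = 3                     # alarm if the new transaction ranks within the top k
    scores: Dict[str, float] = field(default_factory=dict)

    def bootstrap(self, historical_traces: Dict[str, str]) -> None:
        """Score historical transactions so new ones can be ranked against them."""
        for tx_hash, trace in historical_traces.items():
            self.scores[tx_hash] = self.score_fn(trace)

    def observe(self, tx_hash: str, trace: str) -> bool:
        """Score a confirmed or pending transaction, rank it, and alarm if it is in the top k."""
        score = self.score_fn(trace)
        self.scores[tx_hash] = score
        # Rank 1 is the most abnormal; ties share the better rank.
        rank = 1 + sum(1 for other in self.scores.values() if other > score)
        if rank <= self.top_k:
            self.on_alarm(tx_hash)
            return True
        return False
```

<p>In this sketch the monitor is first given historical traces through <code>bootstrap</code>, and each new transaction is then passed to <code>observe</code>, which calls the defense hook only when the transaction ranks among the most abnormal ones seen so far.</p>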
</div>
<div class="bg-dark text-white">
<div class="container">
<div class="row align-items-center mb-5 py-5">
<div class="col-md-12">
<h1>Comparison with Other Intrusion Detection Techniques</h1>
<p>
The table below compares intrusion detection and prevention techniques and highlights what distinguishes each method. Unlike reward-based approaches, our technique employs an unrestricted search space, enabling it to identify unexpected execution patterns instead of focusing solely on profitable vulnerabilities. In contrast to pattern-based techniques (dynamic analysis, fuzzing, symbolic execution, and static analysis), our method does not rely on predefined rules or patterns, which allows it to detect a broader range of anomalies. Furthermore, our technique is capable of real-time analysis, a feature not present in pattern-based symbolic execution or static analysis methods.
</p>
<table>
<thead>
<tr>
<th>Technique</th>
<th>Assumed Prior Knowledge</th>
<th>Search Space Unrestricted by Vulnerability Patterns</th>
<th>Real-Time Capable</th>
<th>Application Agnostic</th>
</tr>
</thead>
<tbody>
<tr class="category">
<td colspan="5">Rank-based: the goal is to find all unexpected execution patterns, implicitly capturing vulnerabilities</td>
</tr>
<tr>
<td>BlockGPT (this paper)</td>
<td>All historical transactions</td>
<td>Unrestricted</td>
<td>Yes (0.16s)</td>
<td>Yes</td>
</tr>
<tr class="category">
<td colspan="5">Reward-based: the goal is to extract financial revenue, implicitly capturing vulnerabilities</td>
</tr>
<tr>
<td>APE</td>
<td>N/A</td>
<td>Only profitable patterns</td>
<td>Yes (0.07s)</td>
<td>Yes</td>
</tr>
<tr>
<td>Naive Imitation</td>
<td>N/A</td>
<td>Only profitable patterns</td>
<td>Yes (0.01s)</td>
<td>Yes</td>
</tr>
<tr>
<td>DeFiPoser</td>
<td>DApp models</td>
<td>Only profitable patterns + Limited by the DApp models</td>
<td>Yes (5.93s)</td>
<td>No</td>
</tr>
<tr class="category">
<td colspan="5">Pattern-based: the goal is to match or classify predefined known vulnerability patterns with rules (including machine learning methods)</td>
</tr>
<tr>
<td>Pattern-based dynamic analysis</td>
<td>Rule</td>
<td>Limited by the rule</td>
<td>Yes</td>
<td>Partially</td>
</tr>
<tr>
<td>Pattern-based fuzzing</td>
<td>Rule + ABI / DApp models</td>
<td>Limited by the rule</td>
<td>Partially</td>
<td>Partially</td>
</tr>
<tr>
<td>Pattern-based symbolic execution</td>
<td>Rule + Source code / Bytecode</td>
<td>Limited by the rule</td>
<td>N/A</td>
<td>Partially</td>
</tr>
<tr>
<td>Pattern-based static analysis</td>
<td>Rule + Source code / Bytecode</td>
<td>Limited by the rule</td>
<td>N/A</td>
<td>Partially</td>
</tr>
<tr class="category">
<td colspan="5">Proof-based: the goal is to prove that a set of smart contracts meet specific security properties</td>
</tr>
<tr>
<td>Formal verification</td>
<td>Formal security properties + Source code / DApp models</td>
<td>Limited by the security properties</td>
<td>N/A</td>
<td>Partially</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="bg-white">
<div class="container">
<div class="row align-items-center mb-5 py-5">
<div class="col-md-12">
<h1>Performance Under Various Alarm Threshold Configurations</h1>
<p>The table below presents the performance of the intrusion detection method under various alarm threshold configurations, grouped by the number of transactions interacting with each vulnerable smart contract. The results indicate that a more permissive alarm threshold (for example ≤10% instead of ≤0.01%, or top-3 instead of top-1) detects a higher percentage of attacks, at the cost of more false positives. Notably, the efficacy of the alarm threshold varies across different dataset sizes, emphasizing the need to select a suitable threshold based on the specific attributes of the smart contract under investigation.</p>
<table>
<thead>
<tr>
<th rowspan="2">Dataset Size</th>
<th colspan="5">Percentage Ranking Alarm Threshold (%)</th>
<th colspan="3">Absolute Ranking Alarm Threshold</th>
</tr>
<tr>
<th>≤0.01%</th>
<th>≤0.1%</th>
<th>≤0.5%</th>
<th>≤1%</th>
<th>≤10%</th>
<th>top-1</th>
<th>top-2</th>
<th>top-3</th>
</tr>
</thead>
<tbody>
<!-- First Row -->
<tr>
<td>0 - 99 txs (32 attacks, 28% of dataset)</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>5 (16%)</td>
<td>7 (22%)</td>
<td>20 (63%)</td>
<td>23 (72%)</td>
</tr>
<!-- Second Row -->
<tr>
<td>Average false positive rate</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>8.18%</td>
<td>0%</td>
<td>14.8%</td>
<td>28.3%</td>
</tr>
<!-- Third Row -->
<tr>
<td>Average number of false positives</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>5.1</td>
<td>0</td>
<td>1</td>
<td>2</td>
</tr>
<!-- Fourth Row -->
<tr>
<td>100 - 999 txs (38 attacks, 33% of dataset)</td>
<td>-</td>
<td>-</td>
<td>8 (21%)</td>
<td>12 (32%)</td>
<td>28 (74%)</td>
<td>7 (18%)</td>
<td>12 (32%)</td>
<td>15 (39%)</td>
</tr>
<!-- Fifth Row -->
<tr>
<td>Average false positive rate</td>
<td>-</td>
<td>-</td>
<td>0.24%</td>
<td>0.71%</td>
<td>9.65%</td>
<td>0%</td>
<td>0.46%</td>
<td>0.81%</td>
</tr>
<!-- Sixth Row -->
<tr>
<td>Average number of false positives</td>
<td>-</td>
<td>-</td>
<td>1.5</td>
<td>3.5</td>
<td>39.4</td>
<td>0</td>
<td>1</td>
<td>2</td>
</tr>
<!-- Seventh Row -->
<tr>
<td>1000 - 9999 txs (17 attacks, 15% of dataset)</td>
<td>-</td>
<td>6 (35%)</td>
<td>9 (53%)</td>
<td>11 (65%)</td>
<td>13 (76%)</td>
<td>4 (24%)</td>
<td>7 (41%)</td>
<td>7 (41%)</td>
</tr>
<!-- Eighth Row -->
<tr>
<td>Average false positive rate</td>
<td>-</td>
<td>0.054%</td>
<td>0.45%</td>
<td>0.95%</td>
<td>9.96%</td>
<td>0%</td>
<td>0.049%</td>
<td>0.098%</td>
</tr>
<!-- Ninth Row -->
<tr>
<td>Average number of false positives</td>
<td>-</td>
<td>1.4</td>
<td>11.5</td>
<td>23.7</td>
<td>324.5</td>
<td>0</td>
<td>1</td>
<td>2</td>
</tr>
<!-- Tenth Row -->
<tr>
<td>10000 + txs (29 attacks, 25% of dataset)</td>
<td>2 (7%)</td>
<td>7 (24%)</td>
<td>16 (55%)</td>
<td>18 (62%)</td>
<td>21 (72%)</td>
<td>2 (7%)</td>
<td>3 (10%)</td>
<td>4 (14%)</td>
</tr>
<!-- Eleventh Row -->
<tr>
<td>Average false positive rate</td>
<td>0.007%</td>
<td>0.097%</td>
<td>0.50%</td>
<td>1%</td>
<td>10%</td>
<td>0%</td>
<td>0.004%</td>
<td>0.008%</td>
</tr>
<!-- Twelfth Row -->
<tr>
<td>Average number of false positives</td>
<td>2.5</td>
<td>120.1</td>
<td>429.9</td>
<td>819.6</td>
<td>7302.1</td>
<td>0</td>
<td>1</td>
<td>2</td>
</tr>
<!-- Thirteenth Row -->
<tr>
<td>Overall</td>
<td>2 (2%)</td>
<td>13 (11%)</td>
<td>33 (28%)</td>
<td>41 (35%)</td>
<td>67 (58%)</td>
<td>20 (17%)</td>
<td>42 (36%)</td>
<td>49 (42%)</td>
</tr>
<!-- Fourteenth Row -->
<tr>
<td>Average false positive rate</td>
<td>0.007%</td>
<td>0.077%</td>
<td>0.42%</td>
<td>0.90%</td>
<td>9.71%</td>
<td>0%</td>
<td>7.19%</td>
<td>13.5%</td>
</tr>
<!-- Fifteenth Row -->
<tr>
<td>Average number of false positives</td>
<td>2.5</td>
<td>65.3</td>
<td>211.9</td>
<td>367.2</td>
<td>2368.5</td>
<td>0</td>
<td>1</td>
<td>2</td>
</tr>
</tbody>
</table>
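<p>The two threshold styles in the table above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the evaluation code behind the table: the function names are hypothetical, the abnormality scores are assumed to be already computed (higher means more abnormal), and a false positive is assumed to be any flagged transaction that is not a known attack.</p>

```python
def rank_by_abnormality(scores):
    """Return transaction ids ordered from most to least abnormal."""
    return sorted(scores, key=scores.get, reverse=True)


def percentage_alarm(scores, threshold_pct):
    """Flag transactions whose rank lies within the top threshold_pct percent."""
    ranked = rank_by_abnormality(scores)
    n = len(ranked)
    # rank / n <= threshold_pct / 100, rearranged to avoid division.
    return [tx for rank, tx in enumerate(ranked, start=1)
            if rank * 100 <= threshold_pct * n]


def absolute_alarm(scores, k):
    """Flag the k most abnormal transactions (top-1, top-2, top-3, ...)."""
    return rank_by_abnormality(scores)[:k]


def false_positive_stats(flagged, attack_txs, total_txs):
    """Count flagged non-attack transactions and their share of all benign ones."""
    false_positives = [tx for tx in flagged if tx not in attack_txs]
    benign_total = total_txs - len(attack_txs)
    rate = len(false_positives) / benign_total if benign_total else 0.0
    return len(false_positives), rate
```

<p>Under these assumptions, a percentage threshold flags nothing when the contract has too few transactions for any rank to fall inside it, which is consistent with the empty cells for small datasets, while an absolute top-k threshold always flags exactly k transactions, so a detected attack leaves k - 1 false positives.</p>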
</div>
</div>
</div>
</div>
<div class="bg-dark text-white text-center py-2">
<div class="container">
<p class="m-0">Copyright ©2022 UC Regents | Email us at <a href="mailto:[email protected]">[email protected]</a>.</p>
</div>
</div>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.2.0/dist/js/bootstrap.bundle.min.js" integrity="sha384-A3rJD856KowSb7dwlZdYEkO39Gagi7vIsF0jrRAoQmDKKtQBHUuLZ9AsSv4jD4Xa" crossorigin="anonymous"></script>
</body>
</html>