Merge pull request #1376 from mito-ds/fifty-evals
Fifty evals
aarondr77 authored Nov 26, 2024
2 parents 2fc867e + d1483e5 commit c1724d0
Showing 23 changed files with 106,625 additions and 101 deletions.
19 changes: 17 additions & 2 deletions evals/README.md
@@ -1,6 +1,7 @@
# AI Evals

## Running the tests
## Setting up evals


1. Create a new virtual environment
```
@@ -17,8 +18,22 @@ source venv/bin/activate
pip install -r requirements.txt
```

4. Run the tests from the `mito` folder:
## Running all tests
From the `mito` folder, run the command:
TODO: Improve the running so that we don't have to be in the `mito` folder.
```
python -m evals.main
```

## Running specific tests
To specify which tests to run, set some of the following flags:

- `--test-name`
- `--prompt-name`
- `--tags`


For example, to run all tests for the `single_shot_prompt` prompt, run:
```
python -m evals.main --prompt-name=single_shot_prompt
```
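
The flags listed in the README could be wired up roughly as follows. This is a minimal sketch, not the actual `evals/main.py`: the `build_parser` and `select_tests` helpers, the dictionary shape of a test case, the example test names and tags, and the comma-separated format for `--tags` are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three filter flags named in the README."""
    parser = argparse.ArgumentParser(prog="evals.main")
    parser.add_argument("--test-name", default=None, help="run only this test")
    parser.add_argument("--prompt-name", default=None, help="run only this prompt")
    # Assumption: tags are passed as one comma-separated string.
    parser.add_argument("--tags", default=None, help="comma-separated tags")
    return parser


def select_tests(tests, test_name=None, tags=None):
    """Keep tests that match the name (if given) and share a tag (if given)."""
    wanted = set(tags.split(",")) if tags else None
    selected = []
    for test in tests:
        if test_name is not None and test["name"] != test_name:
            continue
        if wanted is not None and not wanted & set(test["tags"]):
            continue
        selected.append(test)
    return selected


# Hypothetical test cases, for illustration only.
TESTS = [
    {"name": "filter_rows", "tags": ["df_transformation"]},
    {"name": "define_variable", "tags": ["variable_declaration"]},
]

args = build_parser().parse_args(["--tags=df_transformation"])
print([t["name"] for t in select_tests(TESTS, args.test_name, args.tags)])
```

Leaving every flag unset selects everything, which matches the "Running all tests" behaviour described in the README.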
95,903 changes: 95,903 additions & 0 deletions evals/data/loans.csv

Large diffs are not rendered by default.

50 changes: 50 additions & 0 deletions evals/data/messy_data.csv
@@ -0,0 +1,50 @@
Transaction_ID,Date,Stock ,Transaction_Price,Type_of_Investment,Description of each transaction
1,1/1/23,APPLE-Stock,$185.37 ,Stock,AAPL shares purchased
2,1/2/23,ALPHABET-Stock,$122.58 ,Stock,GOOGL shares purchased
3,1/2/23,MICROSOFT-Stock,$337.02 ,Stock,MSFT shares purchased
4,1/2/23,AMAZON-Stock,$125.00 ,Stock,AMZN shares purchased
5,3/5/23,APPLE-Stock,$187.91 ,Stock,AAPL shares purchased
6,1/1/23,APPLE-Stock,$193.83 ,Stock,AAPL shares purchased
7,1/2/23,ALPHABET-Stock,$194.58 ,Stock,GOOGL shares purchased
8,1/2/23,MICROSOFT-Stock,$195.33 ,Stock,MSFT shares purchased
9,1/2/23,AMAZON-Stock,$196.08 ,Stock,AMZN shares purchased
10,3/5/23,APPLE-Stock,$196.83 ,Stock,AAPL shares purchased
11,1/1/23,APPLE-Stock,$197.58 ,Stock,AAPL shares purchased
12,1/2/23,ALPHABET-Stock,$198.33 ,Stock,GOOGL shares purchased
13,1/2/23,MICROSOFT-Stock,$199.08 ,Stock,MSFT shares purchased
14,1/2/23,AMAZON-Stock,$199.83 ,Stock,AMZN shares purchased
15,3/5/23,APPLE-Stock,$200.58 ,Stock,AAPL shares purchased
16,1/1/23,APPLE-Stock,$201.33 ,Stock,AAPL shares purchased
17,1/2/23,ALPHABET-Stock,$202.08 ,Stock,GOOGL shares purchased
18,1/2/23,MICROSOFT-Stock,$202.83 ,Stock,MSFT shares purchased
19,1/2/23,AMAZON-Stock,$203.58 ,Stock,AMZN shares purchased
20,3/5/23,APPLE-Stock,$204.33 ,Stock,AAPL shares purchased
21,1/1/23,APPLE-Stock,$205.08 ,Stock,AAPL shares purchased
22,1/2/23,ALPHABET-Stock,$205.83 ,Stock,GOOGL shares purchased
23,1/2/23,MICROSOFT-Stock,$206.58 ,Stock,MSFT shares purchased
24,1/2/23,AMAZON-Stock,$207.33 ,Stock,AMZN shares purchased
25,3/5/23,APPLE-Stock,$208.08 ,Stock,AAPL shares purchased
26,1/1/23,APPLE-Stock,$208.83 ,Stock,AAPL shares purchased
27,1/2/23,ALPHABET-Stock,$209.58 ,Stock,GOOGL shares purchased
28,1/2/23,MICROSOFT-Stock,$210.33 ,Stock,MSFT shares purchased
29,1/2/23,AMAZON-Stock,$211.08 ,Stock,AMZN shares purchased
30,3/5/23,APPLE-Stock,$211.83 ,Stock,AAPL shares purchased
31,1/1/23,APPLE-Stock,$212.58 ,Stock,AAPL shares purchased
32,1/2/23,ALPHABET-Stock,$213.33 ,Stock,GOOGL shares purchased
33,1/2/23,MICROSOFT-Stock,$214.08 ,Stock,MSFT shares purchased
34,1/2/23,AMAZON-Stock,$214.83 ,Stock,AMZN shares purchased
35,3/5/23,APPLE-Stock,$215.58 ,Stock,AAPL shares purchased
36,1/1/23,APPLE-Stock,$216.33 ,Stock,AAPL shares purchased
37,1/2/23,ALPHABET-Stock,$217.08 ,Stock,GOOGL shares purchased
38,1/2/23,MICROSOFT-Stock,$217.83 ,Stock,MSFT shares purchased
39,1/2/23,AMAZON-Stock,$218.58 ,Stock,AMZN shares purchased
40,3/5/23,APPLE-Stock,$219.33 ,Stock,AAPL shares purchased
41,1/1/23,APPLE-Stock,$220.08 ,Stock,AAPL shares purchased
42,1/2/23,ALPHABET-Stock,$220.83 ,Stock,GOOGL shares purchased
43,1/2/23,MICROSOFT-Stock,$221.58 ,Stock,MSFT shares purchased
44,1/2/23,AMAZON-Stock,$222.33 ,Stock,AMZN shares purchased
45,3/5/23,APPLE-Stock,$223.08 ,Stock,AAPL shares purchased
46,1/1/23,APPLE-Stock,$223.83 ,Stock,AAPL shares purchased
47,1/2/23,ALPHABET-Stock,$224.58 ,Stock,GOOGL shares purchased
48,1/2/23,MICROSOFT-Stock,$225.33 ,Stock,MSFT shares purchased
49,1/2/23,AMAZON-Stock,$226.08 ,Stock,AMZN shares purchased
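
The commit does not say what the evals do with `messy_data.csv`, but the file has visible quirks: a trailing space in the `Stock ` header, a `$` prefix and trailing space on prices, and a `-Stock` suffix on company names. A minimal cleaning sketch using only the standard library, with two rows copied from the file, might look like this. The `clean_rows` helper, the output field names, and the month/day/year reading of the dates are assumptions.

```python
import csv
import io
from datetime import datetime

# Two rows copied from messy_data.csv, including the original stray spaces.
SAMPLE = (
    "Transaction_ID,Date,Stock ,Transaction_Price,Type_of_Investment,"
    "Description of each transaction\n"
    "1,1/1/23,APPLE-Stock,$185.37 ,Stock,AAPL shares purchased\n"
    "5,3/5/23,APPLE-Stock,$187.91 ,Stock,AAPL shares purchased\n"
)


def clean_rows(text):
    """Strip stray whitespace and convert ids, dates, and prices to typed values."""
    cleaned = []
    for raw in csv.DictReader(io.StringIO(text)):
        # Normalise both header names and cell values.
        row = {key.strip(): value.strip() for key, value in raw.items()}
        cleaned.append({
            "transaction_id": int(row["Transaction_ID"]),
            # Assumption: dates are month/day/two-digit year.
            "date": datetime.strptime(row["Date"], "%m/%d/%y").date(),
            "company": row["Stock"].split("-")[0],
            "price": float(row["Transaction_Price"].lstrip("$")),
        })
    return cleaned


rows = clean_rows(SAMPLE)
print(rows[0])
```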
26 changes: 26 additions & 0 deletions evals/data/monthly_equity/august_balances.csv
@@ -0,0 +1,26 @@
entity_id,ending_capital
001,52000
002,72500
003,26500
004,82500
005,62200
006,87700
007,41800
008,72500
009,77500
010,52000
011,31600
012,92800
013,67300
014,82600
015,36700
016,77500
017,72400
018,92800
019,41800
020,82600
021,57000
022,87700
023,46900
024,97900
025,62200
26 changes: 26 additions & 0 deletions evals/data/monthly_equity/august_fees.csv
@@ -0,0 +1,26 @@
entity_id,fees
001,1020
002,1525
003,525
004,1575
005,1224
006,1745
007,816
008,1525
009,1550
010,1020
011,616
012,1845
013,1335
014,1648
015,714
016,1550
017,1428
018,1845
019,816
020,1648
021,1122
022,1745
023,926
024,1945
025,1224
26 changes: 26 additions & 0 deletions evals/data/monthly_equity/july_balances.csv
@@ -0,0 +1,26 @@
entity_id,ending_capital
001,51000
002,71500
003,25500
004,81500
005,61200
006,86700
007,40800
008,71500
009,76500
010,51000
011,30600
012,91800
013,66300
014,81600
015,35700
016,76500
017,71400
018,91800
019,40800
020,81600
021,56000
022,86700
023,45900
024,96900
025,61200
26 changes: 26 additions & 0 deletions evals/data/monthly_equity/july_fees.csv
@@ -0,0 +1,26 @@
entity_id,fees
001,1000
002,1500
003,500
004,1500
005,1200
006,1700
007,800
008,1500
009,1500
010,1000
011,600
012,1800
013,1300
014,1600
015,700
016,1500
017,1400
018,1800
019,800
020,1600
021,1100
022,1700
023,900
024,1900
025,1200
30 changes: 30 additions & 0 deletions evals/data/simple_recon/transactions_eagle.csv
@@ -0,0 +1,30 @@
Transaction ID,Share Quantity
12975,20
16889,25
57686,24
53403,22
42699,0
91084,45
54337,50
77676,90
15910,50
99261,55
80515,60
92822,105
50167,70
66884,80
75172,60
95943,95
38998,90
26340,60
71307,55
63924,60
61505,105
44536,70
82173,80
58695,60
53707,95
55218,245
31105,212
39614,230
58356,72
30 changes: 30 additions & 0 deletions evals/data/simple_recon/transactions_excel.csv
@@ -0,0 +1,30 @@
Transaction ID,Share Quantity
12975,20
16889,25
57686,24
53403,22
42699,40
91084,45
54337,50
77676,90
15910,50
25660,55
80515,60
92822,105
50167,70
66884,80
75172,60
95943,95
99274,90
26340,50
71307,55
63924,60
61505,105
44536,70
95194,80
58695,60
53707,95
72138,245
31105,212
39614,231
58356,72
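
The two `simple_recon` files share the same columns and differ in a handful of rows, which suggests a reconciliation task, although the commit does not state this. The sketch below shows one way such a comparison could be done with the standard library, using four rows copied from each file. The `load_quantities` and `reconcile` helpers are hypothetical names.

```python
import csv
import io

# Four rows copied from transactions_eagle.csv and transactions_excel.csv.
EAGLE = "Transaction ID,Share Quantity\n12975,20\n42699,0\n99261,55\n39614,230\n"
EXCEL = "Transaction ID,Share Quantity\n12975,20\n42699,40\n25660,55\n39614,231\n"


def load_quantities(text):
    """Map each transaction ID to its share quantity."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["Transaction ID"]: int(row["Share Quantity"]) for row in reader}


def reconcile(eagle, excel):
    """Return IDs found in only one source, plus IDs whose quantities differ."""
    only_eagle = sorted(eagle.keys() - excel.keys())
    only_excel = sorted(excel.keys() - eagle.keys())
    mismatched = {
        tid: (eagle[tid], excel[tid])
        for tid in sorted(eagle.keys() & excel.keys())
        if eagle[tid] != excel[tid]
    }
    return only_eagle, only_excel, mismatched


only_eagle, only_excel, mismatched = reconcile(
    load_quantities(EAGLE), load_quantities(EXCEL)
)
print(only_eagle, only_excel, mismatched)
```

Keeping transaction IDs as strings avoids any accidental numeric reformatting of the join key.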