Benchmark Datasets

KGpip is evaluated against the state of the art on the 121 benchmark datasets shown below:

| ID | Dataset | Rows | Columns | Classes | Numerical | Categorical | Textual | Size (MB) | Task | Source | Papers |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | adult | 48842 | 14 | 2 | 6 | 8 | 0 | 5.7 | binary | AutoML | FLAML, AL |
| 2 | airlines | 539383 | 7 | 2 | 4 | 3 | 0 | 18.3 | binary | AutoML | FLAML |
| 3 | albert | 425240 | 78 | 2 | 78 | 0 | 0 | 155.4 | binary | AutoML | FLAML |
| 4 | Amazon_employee_access | 32769 | 9 | 2 | 9 | 0 | 0 | 1.9 | binary | AutoML | FLAML |
| 5 | APSFailure | 76000 | 170 | 2 | 170 | 0 | 0 | 74.8 | binary | AutoML | FLAML |
| 6 | Australian | 690 | 14 | 2 | 14 | 0 | 0 | 0 | binary | AutoML | FLAML |
| 7 | bank-marketing | 45211 | 16 | 2 | 7 | 9 | 0 | 3.5 | binary | AutoML | FLAML |
| 8 | blood-transfusion-service-center | 748 | 4 | 2 | 4 | 0 | 0 | 0 | binary | AutoML | FLAML |
| 9 | christine | 5418 | 1636 | 2 | 1636 | 0 | 0 | 31.4 | binary | AutoML | FLAML |
| 10 | credit-g | 1000 | 20 | 2 | 7 | 13 | 0 | 0.1 | binary | AutoML | FLAML |
| 11 | guillermo | 20000 | 4296 | 2 | 4296 | 0 | 0 | 424.5 | binary | AutoML | FLAML |
| 12 | higgs | 98050 | 28 | 2 | 28 | 0 | 0 | 43.3 | binary | AutoML | FLAML, VolcanoML |
| 13 | jasmine | 2984 | 144 | 2 | 144 | 0 | 0 | 1.7 | binary | AutoML | FLAML |
| 14 | kc1 | 2109 | 21 | 2 | 21 | 0 | 0 | 0.1 | binary | AutoML | FLAML, VolcanoML |
| 15 | KDDCup09_appetency | 50000 | 230 | 2 | 192 | 38 | 0 | 32.8 | binary | AutoML | FLAML |
| 16 | kr-vs-kp | 3196 | 36 | 2 | 0 | 36 | 0 | 0.5 | binary | AutoML | FLAML |
| 17 | MiniBooNE | 130064 | 50 | 2 | 50 | 0 | 0 | 69.4 | binary | AutoML | FLAML |
| 18 | nomao | 34465 | 118 | 2 | 118 | 0 | 0 | 19.3 | binary | AutoML | FLAML |
| 19 | numerai28.6 | 96320 | 21 | 2 | 21 | 0 | 0 | 34.9 | binary | AutoML | FLAML |
| 20 | phoneme | 5404 | 5 | 2 | 5 | 0 | 0 | 0.3 | binary | AutoML | FLAML, VolcanoML |
| 21 | riccardo | 20000 | 4296 | 2 | 4296 | 0 | 0 | 414 | binary | AutoML | FLAML |
| 22 | sylvine | 5124 | 20 | 2 | 20 | 0 | 0 | 0.4 | binary | AutoML | FLAML |
| 23 | car | 1728 | 6 | 4 | 0 | 6 | 0 | 0.1 | multi-class | AutoML | FLAML |
| 24 | cnae-9 | 1080 | 856 | 9 | 856 | 0 | 0 | 1.8 | multi-class | AutoML | FLAML |
| 25 | connect-4 | 67557 | 42 | 3 | 42 | 0 | 0 | 5.5 | multi-class | AutoML | FLAML |
| 26 | covertype | 581012 | 54 | 7 | 54 | 0 | 0 | 71.7 | multi-class | AutoML | FLAML, AL |
| 27 | dilbert | 10000 | 2000 | 5 | 2000 | 0 | 0 | 176 | multi-class | AutoML | FLAML |
| 28 | dionis | 416188 | 60 | 355 | 60 | 0 | 0 | 110.1 | multi-class | AutoML | FLAML |
| 29 | fabert | 8237 | 800 | 7 | 800 | 0 | 0 | 13 | multi-class | AutoML | FLAML |
| 30 | Fashion-MNIST | 70000 | 784 | 10 | 784 | 0 | 0 | 148 | multi-class | AutoML | FLAML |
| 31 | helena | 65196 | 27 | 100 | 27 | 0 | 0 | 14.6 | multi-class | AutoML | FLAML |
| 32 | jannis | 83733 | 54 | 4 | 54 | 0 | 0 | 36.7 | multi-class | AutoML | FLAML |
| 33 | jungle_chess_2pcs_raw_endgame_complete | 44819 | 6 | 3 | 6 | 0 | 0 | 0.6 | multi-class | AutoML | FLAML |
| 34 | mfeat-factors | 2000 | 216 | 10 | 216 | 0 | 0 | 1.4 | multi-class | AutoML | FLAML |
| 35 | robert | 10000 | 7200 | 10 | 7200 | 0 | 0 | 268.1 | multi-class | AutoML | FLAML |
| 36 | segment | 2310 | 19 | 7 | 19 | 0 | 0 | 0.3 | multi-class | AutoML | FLAML, VolcanoML |
| 37 | shuttle | 58000 | 9 | 7 | 9 | 0 | 0 | 1.5 | multi-class | AutoML | FLAML |
| 38 | vehicle | 846 | 18 | 4 | 18 | 0 | 0 | 0.1 | multi-class | AutoML | FLAML |
| 39 | volkert | 58310 | 180 | 10 | 180 | 0 | 0 | 65.1 | multi-class | AutoML | FLAML |
| 40 | 2dplanes | 40768 | 10 | - | 10 | 0 | 0 | 2.4 | regression | PMLB | FLAML |
| 41 | bng_breastTumor | 116640 | 9 | - | 9 | 0 | 0 | 6 | regression | PMLB | FLAML |
| 42 | bng_echomonths | 17496 | 9 | - | 9 | 0 | 0 | 2.3 | regression | PMLB | FLAML |
| 43 | bng_lowbwt | 31104 | 9 | - | 9 | 0 | 0 | 2.4 | regression | PMLB | FLAML |
| 44 | bng_pbc | 1000000 | 18 | - | 18 | 0 | 0 | 220.8 | regression | PMLB | FLAML |
| 45 | bng_pharynx | 1000000 | 10 | - | 10 | 0 | 0 | 68.6 | regression | PMLB | FLAML |
| 46 | bng_pwLinear | 177147 | 10 | - | 10 | 0 | 0 | 10.6 | regression | PMLB | FLAML |
| 47 | fried | 40768 | 10 | - | 10 | 0 | 0 | 8.1 | regression | PMLB | FLAML |
| 48 | house_16H | 22784 | 16 | - | 16 | 0 | 0 | 5.8 | regression | PMLB | FLAML |
| 49 | house_8L | 22784 | 8 | - | 8 | 0 | 0 | 2.8 | regression | PMLB | FLAML |
| 50 | houses | 20640 | 8 | - | 8 | 0 | 0 | 1.8 | regression | PMLB | FLAML |
| 51 | mv | 40768 | 11 | - | 11 | 0 | 0 | 5.9 | regression | PMLB | FLAML |
| 52 | poker | 1025010 | 10 | - | 10 | 0 | 0 | 23 | regression | PMLB | FLAML |
| 53 | pol | 15000 | 48 | - | 48 | 0 | 0 | 3 | regression | PMLB | FLAML |
| 54 | breast_cancer_wisconsin | 569 | 30 | 2 | 30 | 0 | 0 | 0.1 | binary | PMLB | AL |
| 55 | detecting-insults-in-social-commentary | 3947 | 2 | 2 | 0 | 1 | 1 | 0.8 | binary | Kaggle | AL |
| 56 | fri_c1_1000_25 | 1000 | 25 | 2 | 25 | 0 | 0 | 0.2 | binary | OpenML | AL |
| 57 | Hill_Valley_with_noise | 1212 | 100 | 2 | 100 | 0 | 0 | 0.8 | binary | PMLB | AL |
| 58 | Hill_Valley_without_noise | 1212 | 100 | 2 | 100 | 0 | 0 | 1.5 | binary | PMLB | AL |
| 59 | ionosphere | 351 | 34 | 2 | 34 | 0 | 0 | 0.1 | binary | PMLB | AL |
| 60 | MagicTelescope | 19020 | 11 | 2 | 11 | 0 | 0 | 1.5 | binary | OpenML | AL |
| 61 | OVA_Breast | 1545 | 10936 | 2 | 10936 | 0 | 0 | 103.3 | binary | OpenML | AL |
| 62 | pc4 | 1458 | 37 | 2 | 37 | 0 | 0 | 0.2 | binary | OpenML | AL, VolcanoML |
| 63 | quake | 2178 | 3 | 2 | 3 | 0 | 0 | 0 | binary | OpenML | AL, VolcanoML |
| 64 | sick | 3772 | 29 | 2 | 7 | 22 | 0 | 0.3 | binary | OpenML | AL, VolcanoML |
| 65 | spambase | 4601 | 57 | 2 | 57 | 0 | 0 | 1.1 | binary | PMLB | AL, VolcanoML |
| 66 | titanic | 891 | 11 | 2 | 6 | 4 | 1 | 0.1 | binary | Kaggle | AL |
| 67 | car_evaluation | 1728 | 21 | 4 | 21 | 0 | 0 | 0.1 | multi-class | PMLB | AL |
| 68 | glass | 205 | 9 | 5 | 9 | 0 | 0 | 0 | multi-class | PMLB | AL |
| 69 | kropt | 28056 | 6 | 18 | 3 | 3 | 0 | 0.5 | multi-class | OpenML | AL, VolcanoML |
| 70 | mnist_784 | 70000 | 784 | 10 | 784 | 0 | 0 | 122 | multi-class | OpenML | AL, VolcanoML |
| 71 | sentiment-analysis-on-movie-reviews | 156060 | 3 | 5 | 2 | 0 | 1 | 8.1 | multi-class | Kaggle | AL |
| 72 | splice | 3190 | 61 | 3 | 0 | 61 | 0 | 0.4 | multi-class | OpenML | AL |
| 73 | spooky-author-identification | 19579 | 2 | 3 | 0 | 1 | 1 | 3.1 | multi-class | Kaggle | AL |
| 74 | wine_quality_red | 1599 | 11 | 6 | 11 | 0 | 0 | 0.1 | multi-class | PMLB | AL |
| 75 | wine_quality_white | 4898 | 11 | 7 | 11 | 0 | 0 | 0.3 | multi-class | PMLB | AL |
| 76 | housing-prices | 1460 | 80 | - | 37 | 43 | 0 | 0.4 | regression | Kaggle | AL |
| 77 | mercedes-benz-greener-manufacturing | 4209 | 377 | - | 369 | 8 | 0 | 3.1 | regression | Kaggle | AL |
| 78 | ailerons | 13750 | 40 | 2 | 40 | 0 | 0 | 2.2 | binary | OpenML | VolcanoML |
| 79 | analcatdata_supreme | 4052 | 7 | 2 | 7 | 0 | 0 | 0.1 | binary | OpenML | VolcanoML |
| 80 | bank32nh_833 | 8192 | 32 | 2 | 32 | 0 | 0 | 2.1 | binary | OpenML | VolcanoML |
| 81 | cpu_act_761 | 8192 | 21 | 2 | 21 | 0 | 0 | 0.7 | binary | OpenML | VolcanoML |
| 82 | cpu_small_735 | 8192 | 12 | 2 | 12 | 0 | 0 | 0.4 | binary | OpenML | VolcanoML |
| 83 | delta_ailerons | 7129 | 5 | 2 | 5 | 0 | 0 | 0.3 | binary | OpenML | VolcanoML |
| 84 | delta_elevators | 9517 | 6 | 2 | 6 | 0 | 0 | 0.3 | binary | OpenML | VolcanoML |
| 85 | eeg-eye-state | 14980 | 14 | 2 | 14 | 0 | 0 | 1.6 | binary | OpenML | VolcanoML |
| 86 | electricity | 45312 | 8 | 2 | 8 | 0 | 0 | 2.9 | binary | OpenML | VolcanoML |
| 87 | jm1 | 10885 | 21 | 2 | 21 | 0 | 0 | 0.8 | binary | OpenML | VolcanoML |
| 88 | kin8nm_807 | 8192 | 8 | 2 | 8 | 0 | 0 | 0.6 | binary | OpenML | VolcanoML |
| 89 | mammography | 11183 | 6 | 2 | 6 | 0 | 0 | 0.8 | binary | OpenML | VolcanoML |
| 90 | mc1 | 9466 | 38 | 2 | 38 | 0 | 0 | 1 | binary | OpenML | VolcanoML |
| 91 | ozone-level-8hr | 2534 | 72 | 2 | 72 | 0 | 0 | 0.9 | binary | OpenML | VolcanoML |
| 92 | page-blocks | 5473 | 10 | 2 | 10 | 0 | 0 | 0.2 | binary | OpenML | VolcanoML |
| 93 | pollen_871 | 3848 | 5 | 2 | 5 | 0 | 0 | 0.1 | binary | OpenML | VolcanoML |
| 94 | puma32H_752 | 8192 | 32 | 2 | 32 | 0 | 0 | 2.3 | binary | OpenML | VolcanoML |
| 95 | puma8NH_816 | 8192 | 8 | 2 | 8 | 0 | 0 | 0.6 | binary | OpenML | VolcanoML |
| 96 | space_ga_737 | 3107 | 6 | 2 | 6 | 0 | 0 | 0.2 | binary | OpenML | VolcanoML |
| 97 | waveform-5000 | 5000 | 40 | 2 | 40 | 0 | 0 | 1 | binary | OpenML | VolcanoML |
| 98 | wind_847 | 6574 | 14 | 2 | 14 | 0 | 0 | 0.4 | binary | OpenML | VolcanoML |
| 99 | abalone | 4177 | 8 | 28 | 7 | 1 | 0 | 0.2 | multi-class | OpenML | VolcanoML |
| 100 | optdigits | 5620 | 64 | 10 | 64 | 0 | 0 | 0.8 | multi-class | OpenML | VolcanoML |
| 101 | pendigits | 10992 | 16 | 10 | 16 | 0 | 0 | 0.7 | multi-class | OpenML | VolcanoML |
| 102 | satimage | 6430 | 36 | 6 | 36 | 0 | 0 | 2.1 | multi-class | OpenML | VolcanoML |
| 103 | bank32nh_558 | 8192 | 32 | - | 32 | 0 | 0 | 2.4 | regression | OpenML | VolcanoML |
| 104 | bank8FM | 8192 | 8 | - | 8 | 0 | 0 | 0.6 | regression | OpenML | VolcanoML |
| 105 | cpu_act_573 | 8192 | 21 | - | 21 | 0 | 0 | 1 | regression | OpenML | VolcanoML |
| 106 | cpu_small_227 | 8192 | 12 | - | 12 | 0 | 0 | 0.6 | regression | OpenML | VolcanoML |
| 107 | debutanizer | 2394 | 7 | - | 7 | 0 | 0 | 0.2 | regression | OpenML | VolcanoML |
| 108 | kin8nm_189 | 8192 | 8 | - | 8 | 0 | 0 | 1.1 | regression | OpenML | VolcanoML |
| 109 | Moneyball | 1232 | 14 | - | 12 | 2 | 0 | 0.1 | regression | OpenML | VolcanoML |
| 110 | pollen_529 | 3848 | 5 | - | 5 | 0 | 0 | 0.2 | regression | OpenML | VolcanoML |
| 111 | puma32H_308 | 8192 | 32 | - | 32 | 0 | 0 | 2.7 | regression | OpenML | VolcanoML |
| 112 | puma8NH_225 | 8192 | 8 | - | 8 | 0 | 0 | 0.7 | regression | OpenML | VolcanoML |
| 113 | rainfall_bangladesh | 16755 | 3 | - | 1 | 2 | 0 | 0.4 | regression | OpenML | VolcanoML |
| 114 | socmob | 1156 | 5 | - | 1 | 4 | 0 | 0.1 | regression | OpenML | VolcanoML |
| 115 | space_ga_507 | 3107 | 6 | - | 6 | 0 | 0 | 0.5 | regression | OpenML | VolcanoML |
| 116 | stock | 950 | 9 | - | 9 | 0 | 0 | 0.1 | regression | OpenML | VolcanoML |
| 117 | sulfur | 10081 | 6 | - | 6 | 0 | 0 | 0.6 | regression | OpenML | VolcanoML |
| 118 | us_crime | 1994 | 127 | - | 126 | 1 | 0 | 1.1 | regression | OpenML | VolcanoML |
| 119 | weather_izmir | 1461 | 9 | - | 9 | 0 | 0 | 0.1 | regression | OpenML | VolcanoML |
| 120 | wind_503 | 6574 | 14 | - | 14 | 0 | 0 | 0.5 | regression | OpenML | VolcanoML |
| 121 | witmer_census_1980 | 50 | 5 | - | 4 | 1 | 0 | 0 | regression | OpenML | VolcanoML |