
1603AnsselAvg

Petr Baudis edited this page Mar 16, 2016 · 4 revisions

Answer Sentence Selection Avg Parameters

Avg is the super-baseline primitive model. We do not fit this model terribly well, because it is usually combined with the #overlaps features; we carry these features not in the output layer but as boolean flags in the input layer, which works great for sequence models but tends not to carry over well here.

A generalization of this model is the Deep Averaging Network (DAN).
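The basic idea can be sketched in a few lines of numpy; this is a minimal illustration only, not the project's Keras implementation, and `avg_embed`/`dot_score` are hypothetical helper names:

```python
import numpy as np

def avg_embed(tokens, emb, dim=300):
    """Average the word embeddings of a sentence (zeros for OOV words).
    `emb` is assumed to map token -> np.ndarray of size `dim`; the real
    model additionally appends the boolean overlap flags to each input."""
    vecs = [emb.get(t, np.zeros(dim)) for t in tokens]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def dot_score(s0, s1, proj=None):
    """Score a (question, answer) pair by dot product, optionally after
    a learned linear projection `proj` (a dim x pdim matrix)."""
    if proj is not None:
        s0, s1 = s0 @ proj, s1 @ proj
    return float(np.dot(s0, s1))
```

The MLP scorer variant replaces `dot_score` with a small feed-forward network on the concatenated (or summed) pair of sentence vectors.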

wang

Basic configuration:

{"Ddim": "1", "balance_class": "False", "batch_size": "160", "deep": "0", "e_add_flags": "True", "embdim": "300", "epoch_fract": "0.25", "inp_e_dropout": "0.3333333333333333", "inp_w_dropout": "0", "l2reg": "1e-05", "loss": "ranknet", "mlpsum": "sum", "nb_epoch": "16", "nnact": "relu", "nninit": "glorot_uniform", "pact": "tanh", "pdim": "1", "project": "True", "ptscorer": "dot_ptscorer"}
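The `ranknet` loss named in the config is a pairwise ranking objective; a minimal standalone sketch, assuming the usual logistic formulation (the project's `ranknet` is a Keras objective, not this function):

```python
import math

def ranknet_loss(score_pos, score_neg):
    """RankNet pairwise loss: log(1 + exp(-(s+ - s-))).
    Small when the positive answer outscores the negative one,
    large when the ranking is inverted."""
    return math.log1p(math.exp(-(score_pos - score_neg)))
```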

Baseline (dot) - val MRR [0.738617, 0.757253, 0.761796]. The model has high capacity, train MRR [0.894115, 0.883907].

No dropout (dot) - val MRR [0.758046, 0.752949].

MLP - val MRR [0.783238, 0.801978, 0.815309]

prelu, MLP - val MRR [0.763924, 0.800371, 0.769908]

inp w dropout 1/3, inp e dropout 0, dot - val MRR [0.685435, 0.637266, 0.686166] (train MRR ~0.78).

inp w dropout 1/3, inp e dropout 0, MLP - val MRR [0.811568, 0.816171, 0.787179]

inp w dropout 1/3, inp e dropout 0, prelu, MLP - val MRR [0.808087, 0.800500, 0.786611]

wdrop=1/3, edrop=0, deep=2, nnrelu, ptan (the DAN setup), dot - val MRR [0.700993, 0.683716].

wdrop=1/3, edrop=0, deep=2, nnrelu, prelu, dot - val MRR [0.577521, 0.550104, 0.572985]. relu+dot-product is apparently broken.

wdrop=1/3, edrop=0, deep=2, nnrelu, prelu, MLP - val MRR [0.836593, 0.825581, 0.837040]. Interestingly, train MRR is very similar rather than higher! But it still overfits in later epochs as usual.

wdrop=1/3, edrop=0, deep=2, nntan, ptan, dot - val MRR [0.721977, 0.762339, 0.733813].

preprojection, dot - val MRR [0.746822, 0.715433, 0.736194]

preprojection, wact=tanh, dot - val MRR [0.752625, 0.731026, 0.732255]

Summary:

  • avg - MLP scorer
  • DAN - wdrop=1/3, edrop=0, deep=2, nnrelu, prelu, MLP
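All figures above are MRR over per-question candidate lists. As a reminder of what is being measured, here is a minimal sketch (a hypothetical helper, not the evaluation code used for these runs; it assumes questions with no correct answer are skipped):

```python
def mrr(scores, labels):
    """Mean Reciprocal Rank. `scores[i]` and `labels[i]` hold the model
    scores and 0/1 gold labels for the candidate answers of question i;
    RR is 1/rank of the highest-ranked correct answer."""
    rrs = []
    for s, y in zip(scores, labels):
        if not any(y):
            continue  # no gold answer for this question
        order = sorted(range(len(s)), key=lambda j: -s[j])
        rank = next(i for i, j in enumerate(order, 1) if y[j])
        rrs.append(1.0 / rank)
    return sum(rrs) / len(rrs)
```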

curatedv2

avg avg--54b99ec379e7b1f7:

data/anssel/yodaqa/curatedv2-training.csv: Final 16×MRR 0.422881 ±0.024685 ([0.33452901764116261, 0.3931410565443143, 0.39909597738367225, 0.41231405841116214, 0.41325463716361405, 0.37186078145533225, 0.49982874161212121, 0.40756042218502497, 0.50647069386334109, 0.4286900284877278, 0.4065685474688075, 0.46102497491920252, 0.50187616440225535, 0.39167557800282671, 0.41094591234568728, 0.42725600076695414])
data/anssel/yodaqa/curatedv2-val.csv: Final 16×MRR 0.402618 ±0.006664 ([0.39892797729489105, 0.39870362592004654, 0.39778075514098149, 0.39299765475695297, 0.38903619387488453, 0.38921935281809206, 0.4080020064960857, 0.39626816362720391, 0.41329760835019591, 0.38189135573032057, 0.41169816467081433, 0.42104578186220643, 0.42851404642057173, 0.41765364131709998, 0.39474961669144093, 0.40209950498651531])
data/anssel/yodaqa/curatedv2-test.csv: Final 16×MAP 0.229694 ±0.001715 ([0.2264, 0.2295, 0.2314, 0.2365, 0.2259, 0.2295, 0.2354, 0.2283, 0.2312, 0.2253, 0.2288, 0.2265, 0.2315, 0.2333, 0.2284, 0.2272])
data/anssel/yodaqa/curatedv2-test.csv: Final 16×MRR 0.329356 ±0.003511 ([0.32751450177533797, 0.32544806960862077, 0.3227488111356045, 0.33792882789461465, 0.31620752116343509, 0.32284366465325665, 0.34098541021259626, 0.32819800850929975, 0.33764663358773195, 0.33326103339966201, 0.33122273471712604, 0.33175874303833197, 0.33508045403718906, 0.3316988897583889, 0.32150742875045329, 0.32563937301654222])
train-val MRR Pearsonr: 0.537236
val-test MRR Pearsonr: 0.472787
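The train-val and val-test Pearson r values correlate the 16 per-run MRRs across dataset splits; a stdlib-only sketch of that computation (hypothetical helper name):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length
    sequences, e.g. the 16 per-run val MRRs vs. test MRRs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A low correlation (as here, ~0.34-0.54) means a run's val MRR is a weak predictor of its test MRR, so model selection on val is noisy.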

DAN avg--319d7509639edea2:

data/anssel/yodaqa/curatedv2-training.csv: Final 16×MRR 0.437119 ±0.014494 ([0.41440829726374867, 0.47767381760225341, 0.45129114965782507, 0.43991166513516494, 0.44649672818291436, 0.43286962810613139, 0.38737242357049562, 0.42204667584559719, 0.48557986660966396, 0.40046855605965692, 0.40705333859455761, 0.41339660824460284, 0.43792812858022906, 0.45557329861070023, 0.45711286863862882, 0.46471975197187332])
data/anssel/yodaqa/curatedv2-val.csv: Final 16×MRR 0.430754 ±0.014477 ([0.45245667374645371, 0.4639741248772899, 0.46365711236771934, 0.44075580661389147, 0.4439072609658532, 0.43155958065270422, 0.41093083533336333, 0.44938808452069945, 0.40563127952970307, 0.41007257502458949, 0.3637029860006421, 0.42355841729680954, 0.42917624703210616, 0.47148932795297482, 0.40126146906342813, 0.43053608760449374])
data/anssel/yodaqa/curatedv2-test.csv: Final 16×MAP 0.233000 ±0.002657 ([0.2295, 0.2325, 0.2371, 0.2324, 0.225, 0.2314, 0.2331, 0.236, 0.2325, 0.246, 0.2305, 0.2322, 0.2334, 0.2403, 0.2254, 0.2307])
data/anssel/yodaqa/curatedv2-test.csv: Final 16×MRR 0.354075 ±0.010307 ([0.35363555742125763, 0.3396429663997963, 0.36026135213613869, 0.35083953300707993, 0.35611984075021852, 0.3510943064981229, 0.36526956976776498, 0.37629284706474192, 0.33996420292056162, 0.40580235668628906, 0.32100332586565239, 0.35168847431774969, 0.35185053919782927, 0.36997754760383028, 0.32581503602934547, 0.34594879170567922])
train-val MRR Pearsonr: 0.343015
val-test MRR Pearsonr: 0.345480

ubuntu

avg avg-6a53ffe05128cded (230s/epoch)

Val MRR: 0.619541
Val 2-R@1: 0.796728  
Val 10-R@1: 0.460685  10-R@2: 0.608231  10-R@5: 0.844990

DAN avg-67b4c8a23223a557 (230s/epoch)

Val MRR: 0.610210
Val 2-R@1: 0.782822
Val 10-R@1: 0.457924  10-R@2: 0.588446  10-R@5: 0.822955

No-dropout DAN avg--6921e0cc14cfef98

Val MRR: 0.615150
Val 2-R@1: 0.784509
Val 10-R@1: 0.463497  10-R@2: 0.595399  10-R@5: 0.825971
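The n-R@k figures measure whether the single correct response among n candidates is ranked in the top k; averaged over examples this yields e.g. Val 10-R@1. A minimal per-example sketch (hypothetical helper, not the project's eval code):

```python
def n_recall_at_k(scores, correct_idx, k):
    """1.0 if the candidate at `correct_idx` ranks among the top k
    by score (out of len(scores) candidates), else 0.0."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    return 1.0 if correct_idx in order[:k] else 0.0
```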