JMEE #4

Open · shiqing1234 opened this issue Sep 10, 2019 · 9 comments
Labels: question (Further information is requested)

shiqing1234 commented Sep 10, 2019

Thank you very much for your code. I want to ask: did you feed your output data to the JMEE model for event extraction, and did you get the same F1 as in the JMEE paper? I ask because the dev/test/train sentence counts in the JMEE paper differ from yours:

> This data split includes 40 newswire articles (881 sentences) for the test set, 30 other documents (1087 sentences) for the development set and 529 remaining documents (21,090 sentences) for the training set

I am looking forward to your reply. Thank you very much!


kkkyan commented Sep 14, 2019

I found the same question.
It seems that JMEE used extra sentences without event labels.

bowbowbow added the question label on Oct 2, 2019

bowbowbow commented Oct 2, 2019

@shiqing1234
I also think the number of sentences in the dataset described in the JMEE paper is strange.

The preprocessing results of this repository clearly differ from those of JMEE. But the results of this code are much closer to previous studies such as JOINTEVENTENTITY (Yang and Mitchell, 2016) and JRNN (Nguyen et al., 2016), as shown in the table below.

| Model | Split | Documents | Sentences |
| --- | --- | --- | --- |
| JOINTEVENTENTITY (Yang and Mitchell, 2016) | Train | 529 | 14837 |
| | Dev | 40 | 863 |
| | Test | 30 | 672 |
| JRNN (Nguyen et al., 2016) | Train | 529 | 14849 |
| | Dev | 40 | 836 |
| | Test | 30 | 672 |
| JMEE (Liu et al., 2018) | Train | 529 | 21090 |
| | Dev | 40 | 1087 |
| | Test | 30 | 881 |
| This repository | Train | 529 | 14724 |
| | Dev | 40 | 875 |
| | Test | 30 | 713 |
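For what it's worth, the sentence counts in the "This repository" row can be re-derived from the output files; a minimal sketch, assuming each split is one JSON array of per-sentence objects (as in this repo's sample output):

```python
# Re-count the sentences in each split produced by this repository.
import json

for split in ('train', 'dev', 'test'):
    with open(f'ace-05-splits/{split}.json') as f:
        sentences = json.load(f)  # one JSON array of sentence objects
    print(f'{split}: {len(sentences)} sentences')
```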

The difference seems to have occurred because there are no agreed-upon rules for splitting sentences within the sgm format files. For example, the data contains this sentence:

> On page 256 you say, "As the years went by" -- this is when you were in the Senate -- "less and less information was new, fewer and fewer arguments were fresh, and the repetitiveness of the old arguments became tiresome."

I didn't split the sentence where -- appears, but JMEE might split the sentence there, resulting in more sentences. However, it is not clear, because JMEE did not release its preprocessing code.
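To make the ambiguity concrete, here is a small sketch; NLTK's Punkt tokenizer is just my stand-in for illustration, since the actual splitters used by either project are unknown:

```python
# Two plausible sentence splitters disagree on the example above.
from nltk.tokenize import sent_tokenize  # assumes: pip install nltk + punkt data

text = ('On page 256 you say, "As the years went by" -- this is when you were '
        'in the Senate -- "less and less information was new, fewer and fewer '
        'arguments were fresh, and the repetitiveness of the old arguments '
        'became tiresome."')

# A splitter keyed to sentence-final punctuation keeps this as one sentence.
print(len(sent_tokenize(text)))  # 1

# A splitter that also breaks on " -- " yields three fragments instead,
# which is one way two pipelines can report different sentence counts.
print(len(text.split(' -- ')))   # 3
```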

@kkkyan
This preprocessing code also included sentences without event triggers in the data.
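A minimal sketch of what filtering those out would look like; the "golden-event-mentions" key is assumed from this repository's output format, and JMEE's own loader appears to have a keep_events option for the same purpose:

```python
# Keep only sentences that contain at least one golden event trigger.
import json

with open('ace-05-splits/train.json') as f:
    sentences = json.load(f)

with_events = [s for s in sentences if s.get('golden-event-mentions')]
print(f'{len(with_events)} of {len(sentences)} sentences have a trigger')
```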


bowbowbow commented Oct 2, 2019

@shiqing1234 @kkkyan
I couldn't achieve the same F1 score as the JMEE paper when I fed the preprocessing results of this code to the JMEE model.

I ran JMEE with this command and got disappointing results.

python -m enet.run.ee.runner --train "ace-05-splits/train.json"  --test "ace-05-splits/test.json" --dev "ace-05-splits/dev.json" --earlystop 10 --restart 10 --optimizer "adadelta" --lr 1 --webd "./ace-05-splits/glove.6B.300d.txt" --batch 8 --epochs 99999 --device "cuda:0" --out "models/enet-081" --hps "{'wemb_dim': 300, 'wemb_ft': True, 'wemb_dp': 0.5, 'pemb_dim': 50, 'pemb_dp': 0.5, 'eemb_dim': 50, 'eemb_dp': 0.5, 'psemb_dim': 50, 'psemb_dp': 0.5, 'lstm_dim': 220, 'lstm_layers': 1, 'lstm_dp': 0, 'gcn_et': 3, 'gcn_use_bn': True, 'gcn_layers': 3, 'gcn_dp': 0.5, 'sa_dim': 300, 'use_highway': True, 'loss_alpha': 5}"

The following results were printed on the console:

Epoch 40  dev loss:  3.0913915507072076 
dev ed p:  0.48264984227129337  dev ed r:  0.6375  dev ed f1:  0.5493716337522442 
dev ae p:  0.2878411910669975  dev ae r:  0.1281767955801105  dev ae f1:  0.17737003058103976
Epoch 40  test loss:  2.784576788090766 
test ed p:  0.3360323886639676  test ed r:  0.590047393364929  test ed f1:  0.4282029234737747 
test ae p:  0.20881226053639848  test ae r:  0.12219730941704036  test ae f1:  0.15417256011315417

Epoch 80  dev loss:  3.8771536317780955 
dev ed p:  0.5329949238578681  dev ed r:  0.65625  dev ed f1:  0.5882352941176472 
dev ae p:  0.24006908462867013  dev ae r:  0.15359116022099448  dev ae f1:  0.18733153638814018
Epoch 80  test loss:  3.8047063166558157 
test ed p:  0.3799705449189985  test ed r:  0.6113744075829384  test ed f1:  0.46866485013623976 
test ae p:  0.22857142857142856  test ae r:  0.18834080717488788  test ae f1:  0.20651505838967424

Epoch 120  dev loss:  4.38567394134314 
dev ed p:  0.572992700729927  dev ed r:  0.6541666666666667  dev ed f1:  0.6108949416342413 
dev ae p:  0.23627287853577372  dev ae r:  0.1569060773480663  dev ae f1:  0.18857901726427623
Epoch 120  test loss:  4.248081724495084 
test ed p:  0.40793650793650793  test ed r:  0.6090047393364929  test ed f1:  0.48859315589353614 
test ae p:  0.2297476759628154  test ae r:  0.19394618834080718  test ae f1:  0.21033434650455926

Epoch 160  dev loss:  4.3482774938757345 
dev ed p:  0.574585635359116  dev ed r:  0.65  dev ed f1:  0.6099706744868035 
dev ae p:  0.23304347826086957  dev ae r:  0.14806629834254142  dev ae f1:  0.18108108108108106
Epoch 160  test loss:  4.217268991275621 
test ed p:  0.41423948220064727  test ed r:  0.6066350710900474  test ed f1:  0.49230769230769234 
test ae p:  0.23285714285714285  test ae r:  0.1827354260089686  test ae f1:  0.20477386934673367

Epoch 199  dev loss:  4.394452537438701 
dev ed p:  0.5831775700934579  dev ed r:  0.65  dev ed f1:  0.6147783251231527 
dev ae p:  0.23861566484517305  dev ae r:  0.14475138121546963  dev ae f1:  0.1801925722145805
Epoch 199  test loss:  4.19947991046335 
test ed p:  0.4169381107491857  test ed r:  0.6066350710900474  test ed f1:  0.4942084942084942 
test ae p:  0.2422907488986784  test ae r:  0.18497757847533633  test ae f1:  0.20979020979020976
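For anyone scanning these logs: each printed f1 is the standard harmonic mean of the printed p and r, which you can verify directly, e.g. against the Epoch 40 dev ed line:

```python
# Recompute a logged f1 from its logged precision and recall.
def f1(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

# Epoch 40, dev ed
print(f1(0.48264984227129337, 0.6375))  # ~0.5493716337522442, matching the log
```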


kkkyan commented Oct 3, 2019

@bowbowbow
I think you must have found that there are some bugs in the JMEE code, according to issues #4 and #5 in the JMEE repository.
I tried to fix those issues but still can't reproduce the paper's result.
I used Adam with a 1e-3 learning rate and got my best result:

| | p | r | f1 |
| --- | --- | --- | --- |
| dev-ed | 0.528 | 0.758 | 0.6217 |
| dev-ae | 0.4587 | 0.3314 | 0.3788 |
| test-ed | 0.4569 | 0.7087 | 0.5557 |
| test-ae | 0.4132 | 0.2141 | 0.282 |

I am trying some new hyperparameters for the model; the original parameters are not believable.
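For reference, this presumably amounts to rerunning the same command as above with the optimizer flags swapped, something like the line below; whether "adam" is an accepted value for --optimizer is my assumption, not confirmed in this thread:

python -m enet.run.ee.runner --train "ace-05-splits/train.json" --test "ace-05-splits/test.json" --dev "ace-05-splits/dev.json" --earlystop 10 --restart 10 --optimizer "adam" --lr 0.001 --webd "./ace-05-splits/glove.6B.300d.txt" --batch 8 --epochs 99999 --device "cuda:0" --out "models/enet-081" --hps "{'wemb_dim': 300, 'wemb_ft': True, 'wemb_dp': 0.5, 'pemb_dim': 50, 'pemb_dp': 0.5, 'eemb_dim': 50, 'eemb_dp': 0.5, 'psemb_dim': 50, 'psemb_dp': 0.5, 'lstm_dim': 220, 'lstm_layers': 1, 'lstm_dp': 0, 'gcn_et': 3, 'gcn_use_bn': True, 'gcn_layers': 3, 'gcn_dp': 0.5, 'sa_dim': 300, 'use_highway': True, 'loss_alpha': 5}"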

In addition, in issue #4 under JMEE, someone reports reaching

epoch:106|loss: 9.72400|ed_p: 0.75342|ed_r: 0.74728|ed_f1: 0.75034|ae_p: 0.37625|ae_r: 0.31840|ae_f1: 0.34492|lr:0.2621440000

but I have tried my best and still get a low F1 on event detection (ed).

If you get a better result, please let me know.

ZhenHuaZhou68 commented

@shiqing1234 @kkkyan
> (This comment quotes, in Chinese, @bowbowbow's comment above: the same command and console output.)
Have you changed the format of train.json, test.json, dev.json?


ll0ruc commented Oct 21, 2019

> (This comment quotes @bowbowbow's comment above: the same command and console output.)

I got the same result as you: ed f1 = 0.54, ae f1 = 0.30. I am trying to improve the result. Have you gotten anything better? @kkkyan


kkkyan commented Oct 24, 2019

@ScuLilei2014
No. It's impossible to reproduce the SOTA result in the paper.


CaoJonas commented Dec 6, 2019

@kkkyan Hello, I have the TAC KBP 2017 dataset, but I want to use ace2005-preprocessing to preprocess it. Could you provide an A.apf.xml file and an A.sgm file for me?


xiaomn commented Jan 12, 2021

> (This comment quotes @bowbowbow's comment above: the same command and console output.)

@kkkyan @shiqing1234 Thank you for your contribution. I got the preprocessing output and downloaded the glove.6B.300d.txt file from the web.
But when I execute

python -m enet.run.ee.runner --train "ace-05-splits/train.json" --test "ace-05-splits/test.json" --dev "ace-05-splits/dev.json" --earlystop 10 --restart 10 --optimizer "adadelta" --lr 1 --webd "./ace-05-splits/glove.6B.300d.txt" --batch 8 --epochs 99999 --device "cuda:0" --out "models/enet-081" --hps "{'wemb_dim': 300, 'wemb_ft': True, 'wemb_dp': 0.5, 'pemb_dim': 50, 'pemb_dp': 0.5, 'eemb_dim': 50, 'eemb_dp': 0.5, 'psemb_dim': 50, 'psemb_dp': 0.5, 'lstm_dim': 220, 'lstm_layers': 1, 'lstm_dp': 0, 'gcn_et': 3, 'gcn_use_bn': True, 'gcn_layers': 3, 'gcn_dp': 0.5, 'sa_dim': 300, 'use_highway': True, 'loss_alpha': 5}"

it raises an error:

Traceback (most recent call last):
  File "/home/xiaomengnan/.conda/envs/torch-1.3_py3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/xiaomengnan/.conda/envs/torch-1.3_py3.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/xiaomengnan/nlp/EventExtraction/EMNLP2018-JMEE/enet/run/ee/runner.py", line 237, in <module>
    EERunner().run()
  File "/home/xiaomengnan/nlp/EventExtraction/EMNLP2018-JMEE/enet/run/ee/runner.py", line 95, in run
    keep_events=1)
  File "/home/xiaomengnan/nlp/EventExtraction/EMNLP2018-JMEE/enet/corpus/Data.py", line 190, in __init__
    super(ACE2005Dataset, self).__init__(path, fields, **kwargs)
  File "/home/xiaomengnan/nlp/EventExtraction/EMNLP2018-JMEE/enet/corpus/Corpus.py", line 20, in __init__
    examples = self.parse_example(path, fields)
  File "/home/xiaomengnan/nlp/EventExtraction/EMNLP2018-JMEE/enet/corpus/Data.py", line 201, in parse_example
    jl = json.loads(line, encoding="utf-8")
  File "/home/xiaomengnan/.conda/envs/torch-1.3_py3.7/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/xiaomengnan/.conda/envs/torch-1.3_py3.7/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/xiaomengnan/.conda/envs/torch-1.3_py3.7/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)

I tried replacing train.json with sample.json; the error is the same. Have you encountered this before?
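A guess at the cause (not confirmed in this thread): parse_example in Data.py calls json.loads(line) once per line, i.e. it expects one JSON object per line, while this repository writes each split as a single pretty-printed JSON array; json.loads on a bare "[" raises exactly "Expecting value: line 1 column 2 (char 1)". A minimal conversion sketch (output filenames are my own):

```python
# Convert the repo's JSON-array splits into the line-delimited form that
# JMEE's per-line parser appears to expect.
import json

for split in ('train', 'dev', 'test'):
    with open(f'ace-05-splits/{split}.json') as f:
        data = json.load(f)                       # whole file: one JSON array
    with open(f'ace-05-splits/{split}_lines.json', 'w') as f:
        for sentence in data:
            f.write(json.dumps(sentence) + '\n')  # one JSON object per line
```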
