Skip to content

Scheme Evaluation and Mapping for Structural Text Representation

License

Notifications You must be signed in to change notification settings

huji-nlp/semstr

Repository files navigation

Scheme Evaluation and Mapping for Structural Text Representation

Collection of utilities for conversion and evaluation of semantic and syntactic text representation schemes.

Requirements

  • Python 3.6

Install

Create a Python virtual environment:

virtualenv --python=/usr/bin/python3 venv
. venv/bin/activate              # on bash
source venv/bin/activate.csh     # on csh

Install the latest release:

pip install semstr

Alternatively, install the latest code from GitHub (may be unstable):

git clone https://github.com/danielhers/semstr
cd semstr
pip install .

Convert

To convert an SDP file to CoNLL-U, for example, run:

$ python semstr/convert.py test_files/20001001.sdp -f conllu
Converting: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 76.49file/s, file=20001001.sdp]

In this example, multiple heads are preserved in the deps column:

$ cat 20001001.conllu
# format = sdp
# sent_id = 20001001
# text = Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .
1	Pierre	Pierre	NNP	NNP	_	0	root	0:root	_
2	Vinken	_generic_proper_ne_	NNP	NNP	_	1	compound	1:compound|6:ARG1|9:ARG1	_
3	,	_	,	,	_	1	orphan	1:orphan	_
4	61	_generic_card_ne_	CD	CD	_	1	orphan	1:orphan	_
5	years	year	NNS	NNS	_	4	ARG1	4:ARG1	_
6	old	old	JJ	JJ	_	5	measure	5:measure	_
7	,	_	,	,	_	1	orphan	1:orphan	_
8	will	will	MD	MD	_	1	orphan	1:orphan	_
9	join	join	VB	VB	_	1	orphan	1:orphan	_
10	the	the	DT	DT	_	1	orphan	1:orphan	_
11	board	board	NN	NN	_	9	ARG2	9:ARG2|10:BV	_
12	as	as	IN	IN	_	1	orphan	1:orphan	_
13	a	a	DT	DT	_	1	orphan	1:orphan	_
14	nonexecutive	_generic_jj_	JJ	JJ	_	1	orphan	1:orphan	_
15	director	director	NN	NN	_	12	ARG2	12:ARG2|13:BV|14:ARG1	_
16	Nov.	Nov.	NNP	NNP	_	1	orphan	1:orphan	_
17	29	_generic_dom_card_ne_	CD	CD	_	16	of	16:of	_
18	.	_	.	.	_	1	orphan	1:orphan	_

For any other source and target formats, just replace test_files/20001001.sdp and conllu. Supported formats are: json,conll,conllu,sdp,export,amr,txt.

Author

License

This package is licensed under the GPLv3 or later license (see LICENSE.txt).

            [ ~ Dependencies scanned by PyUp.io ~ ]

Build Status (Travis CI) Build Status (AppVeyor) Build Status (Docs) PyPI version

About

Scheme Evaluation and Mapping for Structural Text Representation

Resources

License

Stars

Watchers

Forks

Packages

No packages published