Merge pull request #119 from HIT-SCIR/develop
Update SRL API and models to support LTP 3.4.0
liu946 committed Dec 10, 2017
2 parents d931476 + 67ccdee commit cd04600
Showing 2,587 changed files with 390,223 additions and 109,474 deletions.
9 changes: 9 additions & 0 deletions .gitignore
@@ -16,9 +16,18 @@ build
include/
lib/
bin/
dist
!patch/include
pyltp.egg-info

*.swp
doc/_build
doc/_static
doc/_templates
!doc/Makefile

###############
# data #
###############
ltp_data

8 changes: 4 additions & 4 deletions .travis.yml
@@ -11,10 +11,6 @@ matrix:
sudo: required
python: 2.7
env: TOXENV=py27
- os: linux
sudo: required
python: 3.2
env: TOXENV=py32
- os: linux
sudo: required
python: 3.3
@@ -27,6 +23,10 @@ matrix:
sudo: required
python: 3.5
env: TOXENV=py35
- os: linux
sudo: required
python: 3.6
env: TOXENV=py36
- os: osx
language: generic
env: TOXENV=py2
Expand Down
4 changes: 2 additions & 2 deletions .travis/install.sh
@@ -26,6 +26,6 @@ export PYLTPVER=$(${PY} setup.py --version)
$PY setup.py build
$PY setup.py sdist
cd dist/
tar zxvf pyltp-$PYLTPVER.tar.gz > /dev/null
tar zxvf pyltp-$PYLTPVER.tar.gz
cd pyltp-$PYLTPVER
$PY setup.py build >& /dev/null
$PY setup.py build
7 changes: 4 additions & 3 deletions MANIFEST.in
100644 → 100755
@@ -9,8 +9,9 @@ recursive-include ltp/src/segmentor *.cpp *.h *.hpp
recursive-include ltp/src/postagger *.cpp *.h *.hpp
recursive-include ltp/src/ner *.cpp *.h *.hpp
recursive-include ltp/src/parser.n *.cpp *.h *.hpp
recursive-include ltp/src/srl *.cpp *.h *.hpp *.hh
recursive-include ltp/thirdparty/boost *.h *.hpp *.cpp
recursive-include ltp/thirdparty/eigen-3.2.4 *
recursive-include ltp/src/srl *.cpp *.h
recursive-include ltp/thirdparty/boost *.h *.hpp *.cpp *.ipp
recursive-include ltp/thirdparty/eigen *
recursive-include ltp/thirdparty/dynet *
recursive-include ltp/thirdparty/maxent *.h *.cpp
recursive-include patch *.h *.hpp *.cpp
14 changes: 7 additions & 7 deletions README.md
100644 → 100755
@@ -1,4 +1,4 @@
# pyltp
# pyltp

[![PyPI Status](https://badge.fury.io/py/pyltp.svg)](https://badge.fury.io/py/pyltp)
[![Readthedocs](https://readthedocs.org/projects/pyltp/badge/?version=latest)](http://pyltp.readthedocs.io/)
@@ -37,23 +37,23 @@ segmentor.release()
$ pip install pyltp
```
Or install from source

```
$ git clone https://github.com/HIT-SCIR/pyltp
$ git submodule init
$ git submodule update
$ python setup.py install
$ python setup.py install # On macOS, if a version problem occurs, use MACOSX_DEPLOYMENT_TARGET=10.7 python setup.py install
```

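* Step 2: download the model files

[Baidu Cloud](http://pan.baidu.com/share/link?shareid=1988562907&uk=2738088569), current model version 3.3.1
[Qiniu Cloud](http://ltp.ai/download.html), current model version 3.4.0

## Version correspondence

* pyltp version: 0.1.9
* LTP version: 3.3.2
* Model version: 3.3.1
* pyltp version: 0.2.0
* LTP version: 3.4.0
* Model version: 3.4.0

## Authors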

Expand Down
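The version pairing in this README hunk can be captured as a small lookup. The sketch below is only an illustration: `COMPATIBLE_VERSIONS` and `expected_model_version` are hypothetical names, not part of pyltp, and the table holds just the two rows visible in this diff.

```python
# Hypothetical helper, not part of pyltp: maps a pyltp release to the LTP
# and model versions listed in the README before and after this commit.
COMPATIBLE_VERSIONS = {
    "0.1.9": {"ltp": "3.3.2", "model": "3.3.1"},  # README before this commit
    "0.2.0": {"ltp": "3.4.0", "model": "3.4.0"},  # README after this commit
}


def expected_model_version(pyltp_version):
    """Return the model version paired with a pyltp release, or None if unknown."""
    entry = COMPATIBLE_VERSIONS.get(pyltp_version)
    return entry["model"] if entry else None
```

A caller could compare the returned value with the version of the downloaded model archive before attempting to load it, since the docs warn that mismatched versions fail to load.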
13 changes: 11 additions & 2 deletions appveyor.yml
@@ -1,6 +1,15 @@
before_build:
- git submodule init
- git submodule update


image:
- Visual Studio 2015
- Visual Studio 2017

environment:
matrix:
- PY: C:\Python36-x64
- PY: C:\Python35-x64

build_script:
- python setup.py build
- "%PY%\\python.exe setup.py build"
8 changes: 2 additions & 6 deletions doc/api.rst
@@ -11,13 +11,10 @@ pyltp is a Python wrapper for `LTP <https://github.com/HIT-SCIR/ltp>`_, providing
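Please download the complete LTP model files first

* Download link - `Baidu Cloud <http://pan.baidu.com/share/link?shareid=1988562907&uk=2738088569>`_
* Current model version - 3.3.1
* Current model version - 3.4.0

Make sure the downloaded model version matches the current version of pyltp; otherwise the program will fail to load the models correctly.

If you obtained the source code from GitHub, the bundled :file:`ltp_data` directory contains model files for testing only and cannot produce correct analysis results.


Mind the encoding
-----------------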

@@ -262,9 +259,8 @@ Between the B, I, E, S position tags and the entity type tags, a hyphen :code:`-`

words = ['元芳', '你', '怎么', '看']
postags = ['nh', 'r', 'r', 'v']
netags = ['S-Nh', 'O', 'O', 'O']
# arcs is the result of dependency parsing
roles = labeller.label(words, postags, netags, arcs) # semantic role labelling
roles = labeller.label(words, postags, arcs) # semantic role labelling

# print the results
for role in roles:
Expand Down
1 change: 1 addition & 0 deletions doc/changelog.rst
@@ -0,0 +1 @@
* 2017-12-05: upgraded for compatibility with LTP 3.4.0
6 changes: 3 additions & 3 deletions doc/conf.py
@@ -46,17 +46,17 @@

# General information about the project.
project = u'pyltp'
copyright = u'2016, HIT-SCIR'
copyright = u'2017, HIT-SCIR'
author = u'HIT-SCIR'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = u'0.1.9'
version = u'0.2.0'
# The full version, including alpha/beta/rc tags.
release = u'0.1.9'
release = u'0.2.0'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
Expand Down
7 changes: 5 additions & 2 deletions doc/install.rst
@@ -1,6 +1,8 @@
Installing pyltp
================

* Note: because the new version adds third-party dependencies such as dynet, Python 2 on Windows is no longer supported.

Installing with pip
-------------------

@@ -10,8 +12,9 @@

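Next, download the LTP model files.

* Download link - `Baidu Cloud <http://pan.baidu.com/share/link?shareid=1988562907&uk=2738088569>`_
* Current model version - 3.3.1
* Download link - `Model download http://ltp.ai/download.html`_
* Current model version - 3.4.0
* Note that on Windows the 3.4.0 model for the semantic role labelling module must be downloaded separately; see the instructions at the download link for details.

Make sure the downloaded model version matches the current version of pyltp; otherwise the program will fail to load the models correctly.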

Expand Down
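Since the models are no longer bundled with the repository and must be downloaded separately, a quick check of the model directory can catch a missing file before any `load` call. The sketch below is a hedged illustration: `missing_models` and `EXPECTED_MODELS` are hypothetical names, and the file list is taken from the paths loaded in `example/example.py` in this commit, so it may not match every platform or release.

```python
import os

# File names as loaded in example/example.py after this commit; treat the
# list as an assumption, since names may differ per platform or release.
EXPECTED_MODELS = ["cws.model", "pos.model", "parser.model", "ner.model", "pisrl.model"]


def missing_models(model_dir, expected=EXPECTED_MODELS):
    """Return the expected model file names that are absent from model_dir."""
    return [
        name for name in expected
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

An empty list means every expected file is present; it says nothing about whether the files match the installed pyltp version.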
18 changes: 9 additions & 9 deletions example/example.py
100644 → 100755
@@ -6,7 +6,7 @@
sys.path = [os.path.join(ROOTDIR, "lib")] + sys.path

# Set your own model path
MODELDIR=os.path.join(ROOTDIR, "ltp_data")
MODELDIR=os.path.join(ROOTDIR, "./ltp_data")

from pyltp import SentenceSplitter, Segmentor, Postagger, Parser, NamedEntityRecognizer, SementicRoleLabeller

@@ -17,33 +17,33 @@
segmentor = Segmentor()
segmentor.load(os.path.join(MODELDIR, "cws.model"))
words = segmentor.segment(sentence)
print "\t".join(words)
print("\t".join(words))

postagger = Postagger()
postagger.load(os.path.join(MODELDIR, "pos.model"))
postags = postagger.postag(words)
# list-of-string parameter is supported in 0.1.5
# postags = postagger.postag(["中国","进出口","银行","与","中国银行","加强","合作"])
print "\t".join(postags)
print("\t".join(postags))

parser = Parser()
parser.load(os.path.join(MODELDIR, "parser.model"))
arcs = parser.parse(words, postags)

print "\t".join("%d:%s" % (arc.head, arc.relation) for arc in arcs)
print("\t".join("%d:%s" % (arc.head, arc.relation) for arc in arcs))

recognizer = NamedEntityRecognizer()
recognizer.load(os.path.join(MODELDIR, "ner.model"))
netags = recognizer.recognize(words, postags)
print "\t".join(netags)
print("\t".join(netags))

labeller = SementicRoleLabeller()
labeller.load(os.path.join(MODELDIR, "srl/"))
roles = labeller.label(words, postags, netags, arcs)
labeller.load(os.path.join(MODELDIR, "pisrl.model"))
roles = labeller.label(words, postags, arcs)

for role in roles:
print role.index, "".join(
["%s:(%d,%d)" % (arg.name, arg.range.start, arg.range.end) for arg in role.arguments])
print(role.index, "".join(
["%s:(%d,%d)" % (arg.name, arg.range.start, arg.range.end) for arg in role.arguments]))

segmentor.release()
postagger.release()
Expand Down
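The API change visible in this hunk is that `label` no longer takes `netags`, and that the labeller loads a single `pisrl.model` file instead of the `srl/` directory. The print loop at the end of the example can be factored into a helper. The sketch below uses stand-in `namedtuple` types (hypothetical, mirroring only the attributes the example reads: `index`, `arguments`, `name`, `range.start`, `range.end`) so that it runs without pyltp or any model files.

```python
from collections import namedtuple

# Stand-ins for the objects returned by labeller.label(); hypothetical types
# that mirror only the attributes example.py reads.
Range = namedtuple("Range", ["start", "end"])
Argument = namedtuple("Argument", ["name", "range"])
Role = namedtuple("Role", ["index", "arguments"])


def format_role(role):
    """Format one role like example.py: predicate index, then name:(start,end) pairs."""
    args = "".join(
        "%s:(%d,%d)" % (arg.name, arg.range.start, arg.range.end)
        for arg in role.arguments
    )
    return "%d %s" % (role.index, args)


# Made-up sample data, not real analyzer output.
sample = Role(index=3, arguments=[Argument("A0", Range(0, 0)), Argument("A1", Range(1, 2))])
print(format_role(sample))
```

With pyltp 0.2.0 installed and the models loaded, the same helper could presumably be applied to each element of `labeller.label(words, postags, arcs)`.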
2 changes: 1 addition & 1 deletion ltp
Submodule ltp updated 3589 files
Binary file removed ltp_data/cws.model
Binary file not shown.
Binary file removed ltp_data/ner.model
Binary file not shown.
Binary file removed ltp_data/parser.model
Binary file not shown.
Binary file removed ltp_data/pos.model
Binary file not shown.
116 changes: 0 additions & 116 deletions ltp_data/srl/Chinese.xml

This file was deleted.

8 changes: 0 additions & 8 deletions ltp_data/srl/prg.model

This file was deleted.

20 changes: 0 additions & 20 deletions ltp_data/srl/srl.cfg

This file was deleted.

2 changes: 0 additions & 2 deletions ltp_data/srl/srl.model

This file was deleted.

Empty file added patch/__init__.py
Empty file.