Remove parsel dependency #15

kmike · 2018-11-13T21:39:36Z

This PR is on top of #14.

* _html_to_text is promoted to a public html_text.etree_to_text
* html_text.cleaner object is exposed
* parsel is imported only when needed
* create_root_node implementation is copy-pasted to parse_html, to remove the dependency
* parsel is removed from install_requires
* README is updated

Motivation: make it possible for parsel to depend on html-text for scrapy/parsel#127.
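
As a rough illustration of the "parsel is imported only when needed" item, a deferred import could look like the sketch below. `load_optional` and `uses_optional_backend` are hypothetical names invented for this example and are not part of html_text; the actual change may simply place the import statement inside the parsel-specific function.

```python
import importlib


def load_optional(module_name):
    """Import an optional dependency at call time instead of module import time.

    Hypothetical helper for illustration; not part of html_text.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a hint that names the missing package.
        raise ImportError(
            "%s is required for this feature but is not installed" % module_name
        ) from exc


def uses_optional_backend(module_name="parsel"):
    # The optional package is resolved only when this function runs, so
    # importing the surrounding module never requires it.
    backend = load_optional(module_name)
    return backend.__name__
```

With this shape, code paths that never call the parsel-specific function work in an environment where parsel is absent.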

codecov-io · 2018-11-13T21:41:05Z

Codecov Report

Merging #15 into master will increase coverage by 0.11%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master      #15      +/-   ##
==========================================
+ Coverage   97.82%   97.93%   +0.11%     
==========================================
  Files           2        2              
  Lines          92       97       +5     
  Branches       17       18       +1     
==========================================
+ Hits           90       95       +5     
  Misses          2        2

| Impacted Files | Coverage Δ | |
|---|---|---|
| html_text/__init__.py | `100% <100%> (ø)` | ⬆️ |
| html_text/html_text.py | `97.89% <100%> (+0.11%)` | ⬆️ |

lopuhin

Hey @kmike, I like the idea of removing the parsel dependency, and I like the code changes 👍
My main concern is that we don't check that the package works without parsel; I left a more detailed comment inline.
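
One way to exercise both situations from a single test suite is to skip parsel-specific tests when parsel cannot be imported. This is a minimal sketch using only the standard library `unittest`; the test names and bodies are placeholders chosen for illustration, and the PR itself handles this through separate test environments rather than through this exact code.

```python
import importlib.util
import unittest

# True only when parsel can be imported in the current environment.
HAS_PARSEL = importlib.util.find_spec("parsel") is not None


class CoreTests(unittest.TestCase):
    def test_runs_everywhere(self):
        # Placeholder for tests that must pass with no parsel installed.
        self.assertEqual(" a  b ".split(), ["a", "b"])


class ParselSpecificTests(unittest.TestCase):
    @unittest.skipUnless(HAS_PARSEL, "parsel is not installed")
    def test_selector(self):
        import parsel  # imported only inside the guarded test
        sel = parsel.Selector(text="<p>hello</p>")
        self.assertEqual(sel.css("p::text").get(), "hello")


def run_sketch_tests():
    # Run both test cases and return the unittest result object.
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(CoreTests),
        loader.loadTestsFromTestCase(ParselSpecificTests),
    ])
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Running the same suite in one environment without parsel and one with it covers both cases: the guarded test is reported as skipped in the first and executed in the second.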

lopuhin

Looks great, thanks! 👍

lopuhin reviewed Nov 14, 2018

html_text/html_text.py (resolved)

lopuhin reviewed Nov 14, 2018

tox.ini (outdated, resolved)

lopuhin reviewed Nov 14, 2018

kmike added 2 commits November 17, 2018 15:26

TST add missing test case

2bfcf2c

this is to cover all branches in parse_html function

kmike force-pushed the remove-parsel-dependency branch from 603a7d9 to 2bfcf2c Compare November 17, 2018 10:27

kmike mentioned this pull request Nov 17, 2018

[WIP] text extraction in Selector and SelectorList scrapy/parsel#127

Open

12 tasks

kmike added 6 commits November 17, 2018 15:50

Merge branch 'master' into remove-parsel-dependency

0b6c8ef

DOC make it more explicit selector_to_text is parsel-specific

ef979dc

TST run tests without parsel by default, add environments with parsel

1ac7349

Merge branch 'master' into remove-parsel-dependency

6879cf8

TST enable parsel-specific tests on Travis

028d2e1

disable coverage "project" check

d8fa17a

lopuhin approved these changes Nov 19, 2018

lopuhin merged commit 80289f1 into master Nov 19, 2018

lopuhin deleted the remove-parsel-dependency branch November 19, 2018 08:29

lopuhin mentioned this pull request Dec 10, 2018

remove "sudo: false" now that travis no longer supports it TeamHG-Memex/soft404#17

Merged
