Sample Zika Extraction (from old wiki) #3984
Closed
chenlica started this conversation in archived-wiki
From the page https://github.com/apache/texera/wiki/Sample-Zika-Extraction (may be dangling)
====
For all the operators, leave limit and offset empty
create KeywordSource with properties:
keyword: zika
data source: promed
matching type: conjunction (default)
attribute: content
create Projection
attributes: _id, webpage, content
connect KeywordSource with Projection
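As a rough illustration of what these first two operators do, the sketch below filters records by keyword and then keeps only the projected attributes. It is plain Python, not Texera code: the helper names `tokenize`, `keyword_source`, and `projection`, the sample records, and the tokenization rule (lowercased alphanumeric tokens, all of which must be present for a conjunction match) are assumptions made for the example.

```python
import re


def tokenize(text):
    """Split text into lowercase alphanumeric tokens (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_source(records, keyword, attribute):
    """Keep records whose attribute contains every token of the keyword."""
    wanted = tokenize(keyword)
    return [r for r in records if wanted <= tokenize(r[attribute])]


def projection(records, attributes):
    """Keep only the listed attributes of each record."""
    return [{name: r[name] for name in attributes} for r in records]


# Hypothetical records; the real promed data source is not reproduced here.
records = [
    {"_id": "1", "webpage": "http://example.org/a",
     "content": "Zika virus case confirmed in a woman.", "extra": "x"},
    {"_id": "2", "webpage": "http://example.org/b",
     "content": "Dengue outbreak update.", "extra": "y"},
]

matched = keyword_source(records, "zika", "content")
projected = projection(matched, ["_id", "webpage", "content"])
print(projected)
```

Only the first record mentions "zika", so only it survives, and the `extra` field is dropped by the projection.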
create Regex_Person
regex:
(A|a|(an)|(An)) .{1,40} ((woman)|(man))
attribute: content
connect Projection with Regex_Person
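To see what the Regex_Person pattern picks up, it can be tried outside the workflow. The sketch below uses Python's `re` module for convenience; the regex engine inside the workflow may behave differently in edge cases, and both sample sentences are made up for the example.

```python
import re

# Pattern copied from the Regex_Person step.
PERSON_PATTERN = re.compile(r"(A|a|(an)|(An)) .{1,40} ((woman)|(man))")

# Hypothetical sentence, not taken from the promed data source.
sample = "A 32-year-old pregnant woman tested positive for Zika virus."
match = PERSON_PATTERN.search(sample)
person_span = (match.start(), match.end(), match.group(0))
print(person_span)

# The pattern has no word boundary, so the trailing "a" of "Zika" can also
# start a match.
stray = PERSON_PATTERN.search("Zika infected man").group(0)
print(stray)
```

The first search returns the span covering "A 32-year-old pregnant woman". The second shows a side effect worth knowing about: "a infected man" is matched because the leading article is not anchored to a word start.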
create NLP_Location
type: location
attribute: content
connect Projection with NLP_Location
create Regex_Date
regex:
(((0?[1-9])|(1[0-2]))(\s|-|\.|/)((0?[1-9])|([12][0-9])|(3[01]))(\s|-|\.|/)([0-9]{4}|[0-9]{2}))|((0?[1-9])|([12][0-9])|(3[01])) ((jan(uary)?)|(feb(ruary)?)|(mar(ch)?)|(apr(il)?)|(may)|(june?)|(july?)|(aug(ust)?)|(sep(tember)?)|(oct(ober)?)|(nov(ember)?)|(dec(ember)?))
attribute: content
connect Projection with Regex_Date
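The Regex_Date pattern has two branches: a numeric month, day, year form, and a day followed by a month name. A quick way to check both is to run the pattern on a sample string. The sketch below uses Python's `re` module and makes two assumptions: the separator dot is written as `\.` so that it matches a literal period, and matching is case-insensitive so that "March" is accepted even though the pattern spells month names in lowercase. The sample sentence is made up.

```python
import re

# Pattern from the Regex_Date step, split across lines for readability.
# The separator dot is escaped (\.) so it matches only a literal period.
DATE_PATTERN = re.compile(
    r"(((0?[1-9])|(1[0-2]))(\s|-|\.|/)((0?[1-9])|([12][0-9])|(3[01]))"
    r"(\s|-|\.|/)([0-9]{4}|[0-9]{2}))"
    r"|((0?[1-9])|([12][0-9])|(3[01])) "
    r"((jan(uary)?)|(feb(ruary)?)|(mar(ch)?)|(apr(il)?)|(may)|(june?)"
    r"|(july?)|(aug(ust)?)|(sep(tember)?)|(oct(ober)?)|(nov(ember)?)"
    r"|(dec(ember)?))",
    re.IGNORECASE,  # assumption: month names should match in any case
)

# Hypothetical sentence, not taken from the promed data source.
sample = "Symptoms began on 02/14/2016 and were reported on 3 March."
dates = [m.group(0) for m in DATE_PATTERN.finditer(sample)]
print(dates)
```

The numeric branch finds "02/14/2016" and the month-name branch finds "3 March". Note that the second branch has no year component, so a string such as "3 March 2016" would be matched only up to the month.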
create Join1
Join attribute: content
id attribute: _id (default)
PredicateType: CharacterDistance (default)
distance: 100
connect Regex_Person and NLP_Location with Join1
create Join2
(same properties as Join1)
connect Join1 and Regex_Date with Join2
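The two joins combine matches that occur close together in the same document. The sketch below shows one plausible reading of a character-distance join, offered only as an illustration: two spans are joined when they belong to the same `_id` and both their start offsets and their end offsets differ by at most `distance` characters, and the joined span covers both inputs. The `Span` class, the function names, and the sample offsets are hypothetical; the actual predicate used by the operator may be defined differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Span:
    tuple_id: str  # value of the _id attribute
    start: int     # offset in `content` where the match begins
    end: int       # offset just past the end of the match
    value: str     # matched text


def within_distance(a: Span, b: Span, distance: int) -> bool:
    """Assumed predicate: starts and ends are each at most `distance` apart."""
    return (abs(a.start - b.start) <= distance
            and abs(a.end - b.end) <= distance)


def character_distance_join(left: List[Span], right: List[Span],
                            distance: int) -> List[Span]:
    """Pair up nearby spans from the same tuple and merge their extents."""
    joined = []
    for a in left:
        for b in right:
            if a.tuple_id == b.tuple_id and within_distance(a, b, distance):
                joined.append(Span(a.tuple_id,
                                   min(a.start, b.start),
                                   max(a.end, b.end),
                                   f"{a.value} | {b.value}"))
    return joined


# Hypothetical spans standing in for the outputs of the three extractors.
persons = [Span("doc1", 0, 28, "A 32-year-old pregnant woman")]
locations = [Span("doc1", 60, 66, "Brazil"),
             Span("doc1", 400, 405, "Miami"),
             Span("doc2", 10, 16, "Panama")]
dates = [Span("doc1", 80, 90, "02/14/2016")]

join1 = character_distance_join(persons, locations, distance=100)
join2 = character_distance_join(join1, dates, distance=100)
print(join2)
```

With these inputs, Join1 keeps only the person and "Brazil" pair (the other locations are too far away or in another document), and Join2 extends that span to include the date.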
create TupleStreamSink (view results)
connect Join2 with TupleStreamSink
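The connections listed in the steps above form a small directed graph. The sketch below writes them down as plain data and checks that they give a valid execution order. The list-of-pairs representation is an assumption made for the example and is not the plan format used by the tool itself; `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each pair is (upstream operator, downstream operator), as listed in the steps.
links = [
    ("KeywordSource", "Projection"),
    ("Projection", "Regex_Person"),
    ("Projection", "NLP_Location"),
    ("Projection", "Regex_Date"),
    ("Regex_Person", "Join1"),
    ("NLP_Location", "Join1"),
    ("Join1", "Join2"),
    ("Regex_Date", "Join2"),
    ("Join2", "TupleStreamSink"),
]

# Map each operator to the set of operators it reads from.
predecessors = {}
for source, target in links:
    predecessors.setdefault(source, set())
    predecessors.setdefault(target, set()).add(source)

# static_order() yields every operator after all of its inputs.
order = list(TopologicalSorter(predecessors).static_order())
print(order)
```

The order always starts with KeywordSource and Projection and ends with Join2 and TupleStreamSink; the three extractors in the middle may appear in any relative order.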
Here's a screenshot of the query plan: