TypeError: 'NoneType' object is not iterable in command get-fuzzy-augmented-matches #98

Open
binh-vu opened this issue Nov 12, 2021 · 0 comments


binh-vu commented Nov 12, 2021

I ran the command get-fuzzy-augmented-matches on this canonical file, canonical.csv, and got the error: TypeError: 'NoneType' object is not iterable.

Command:

tl --log-file log.txt --url http://ckg07:9200 --index wikidatadwd-augmented-09 \
    clean -c label -o label_clean canonical.csv \
    / get-fuzzy-augmented-matches -c label_clean \
       --auxiliary-fields graph_embedding_complex,class_count,property_count,context \
       --auxiliary-folder aux_files

Error:

entered except
Command: get-fuzzy-augmented-matches
Error Message: Traceback (most recent call last):
  File "/data/binhvu/table-linker/tl/cli/get-fuzzy-augmented-matches.py", line 74, in run
    odf = em.get_matches(column=kwargs['column'],
  File "/data/binhvu/table-linker/tl/candidate_generation/get_fuzzy_augmented_matches.py", line 44, in get_matches
    return self.utility.create_candidates_df(df,
  File "/data/binhvu/table-linker/tl/candidate_generation/utility.py", line 32, in create_candidates_df
    for _candidates_format, candidates_aux_dict in executor.map(
  File "/data/binhvu/anaconda3/lib/python3.8/concurrent/futures/_base.py", line 611, in result_iterator
    yield fs.pop().result()
  File "/data/binhvu/anaconda3/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/data/binhvu/anaconda3/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/data/binhvu/anaconda3/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/data/binhvu/table-linker/tl/candidate_generation/utility.py", line 90, in create_candidates
    candidate_dict, candidate_aux_dict = self.es.search_term_candidates(
  File "/data/binhvu/table-linker/tl/candidate_generation/es_search.py", line 378, in search_term_candidates
    hits = self.create_fuzzy_augmented_union(fuzzy_augmented_hits, fuzzy_augmented_keyword_lower_hits)
  File "/data/binhvu/table-linker/tl/candidate_generation/es_search.py", line 318, in create_fuzzy_augmented_union
    for item in fuzzy_augmented_keyword_lower_hits:
TypeError: 'NoneType' object is not iterable

Before the exception is thrown, the ES query itself fails with a 500 response:

Query ES error with response 500!
{'error': {'root_cause': [{'type': 'too_complex_to_determinize_exception', 'reason': 'too_complex_to_determinize_exception: Determinizing automaton with 27934 states and 55578 transitions would result in more than 10000 states.'}], 'type': 'search_phase_execution_exception', 'reason': 'all shards failed', 'phase': 'query', 'grouped': True, 'failed_shards': [{'shard': 0, 'index': 'wikidatadwd-augmented-09', 'node': '_1cSOPZbS42KxMr93lgE6Q', 'reason': {'type': 'fuzzy_terms_exception', 'reason': "fuzzy_terms_exception: Term too complex: prithviraj sukumaran directorial debut vivek oberoi 's malayalam debut film highest-grossing malayalam film.crossed ₹50 crore mark in 4 days, ₹100 crore mark in 8 days and ₹150 crores in 21 days. first malayalam film to gross over ₹50 crores in overseas box office.", 'caused_by': {'type': 'too_complex_to_determinize_exception', 'reason': 'too_complex_to_determinize_exception: Determinizing automaton with 27934 states and 55578 transitions would result in more than 10000 states.'}}}], 'caused_by': {'type': 'too_complex_to_determinize_exception', 'reason': 'too_complex_to_determinize_exception: Determinizing automaton with 27934 states and 55578 transitions would result in more than 10000 states.'}}, 'status': 500}
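
The traceback is consistent with the failed query yielding None instead of a list of hits, which the union step then tries to iterate. A minimal sketch of a guard, assuming hits are dicts keyed by `_id`; `union_hits` is a hypothetical stand-in, not the actual `create_fuzzy_augmented_union` in es_search.py, whose body is not shown here:

```python
from typing import List, Optional


def union_hits(primary: Optional[List[dict]],
               secondary: Optional[List[dict]]) -> List[dict]:
    """Merge two hit lists by `_id`, treating None as an empty list.

    Hypothetical helper: the `_id` key and the None-on-failure
    behaviour are assumptions made for illustration.
    """
    merged = {}
    for hits in (primary, secondary):
        # `hits or []` turns a None result from a failed query into
        # an empty iterable instead of raising TypeError.
        for hit in hits or []:
            # Keep the first occurrence of each document id.
            merged.setdefault(hit["_id"], hit)
    return list(merged.values())
```

With this guard a failed second query leaves the first query's hits intact, though it also hides the failure, so it would still need to be logged or reported.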

Because the commands are chained, the next command, get-exact-matches, does not execute because its input is empty, which obscures the real error from users.
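
One way to surface the real error would be to raise as soon as ES returns a non-200 response, so the pipeline stops with the root cause instead of a later TypeError. A rough sketch, assuming the status code and parsed JSON body are available at the call site; `SearchFailed` and `hits_or_raise` are hypothetical names, not part of table-linker:

```python
class SearchFailed(RuntimeError):
    """Hypothetical exception for a failed Elasticsearch query."""


def hits_or_raise(status: int, body: dict) -> list:
    """Return the hits of a search response, or raise with the root cause.

    Assumes `body` is the parsed JSON response, shaped like the
    error payload shown above when the query fails.
    """
    if status != 200:
        causes = body.get("error", {}).get("root_cause", [])
        reason = causes[0].get("reason", "unknown") if causes else "unknown"
        raise SearchFailed(f"ES query failed with status {status}: {reason}")
    # A successful search response nests results under hits.hits.
    return body.get("hits", {}).get("hits", [])
```

For the response above, this would report the too_complex_to_determinize_exception reason directly rather than leaving it buried in the log.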
