Move the following features into the static document feature set since they can all be precomputed and don't depend on the query (note that url slash count and url length are already there, but they need to be removed from doc_entry and friends):
// The number of times the <title> tag appears in the document
int tag_title_count = 0;
// The number of times the <heading> tag appears in the document
int tag_heading_count = 0; // Indri heading field includes tags h1-h4
// The number of inlinks in the document
int tag_inlink_count = 0;
// The number of times the <applet> tag appears in the document
int tag_applet_count = 0;
// The number of times the <object> tag appears in the document
int tag_object_count = 0;
// The number of times the <embed> tag appears in the document
int tag_embed_count = 0;
// Number of slashes in URL
int url_slash_count = 0;
// URL length
size_t url_length = 0;
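As a rough sketch of what grouping these fields could look like, the snippet below collects them in one query-independent struct and fills the two URL fields once per document. The names `static_doc_features` and `set_url_features` are hypothetical and are not taken from the codebase. Counting every `/` in the URL, including those in the scheme, is also an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical container for features that depend only on the document,
// so they can be computed once at index time rather than per query.
struct static_doc_features {
    int tag_title_count = 0;
    int tag_heading_count = 0;
    int tag_inlink_count = 0;
    int tag_applet_count = 0;
    int tag_object_count = 0;
    int tag_embed_count = 0;
    int url_slash_count = 0;
    std::size_t url_length = 0;
};

// Hypothetical helper: fills the URL-derived fields from the raw URL string.
// Assumption: every '/' is counted, including the two in "http://".
inline void set_url_features(static_doc_features &f, const std::string &url) {
    f.url_slash_count =
        static_cast<int>(std::count(url.begin(), url.end(), '/'));
    f.url_length = url.size();
}
```

With the fields stored this way, per-query structures such as doc_entry would not need their own copies of the URL statistics.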
To add to this: according to the documentation, f_url_slash_count is output via the config.ini, but f_url_length was omitted. So anyone using a config copied from the documentation will see this behaviour.
Related to #4 where the url features are static features that are
already computed in `generate_static_doc_features`.
This is a breaking change for previously created forward indexes that
include the `UrlStats` information. Currently there is no internal
versioning for indexes that are created. See #23.