New ind_11
Inconsistent contract objects across the crisis
#8
Comments
I am working on a pipeline that builds a dictionary from a training set. Preliminary work is here: Some main issues must be solved in a certain order:
On .3, consider this:
The last 2 points are, IMO, solved by adopting an FA/PCA approach for the composite indicator, which cleans the multivariate structure of multicollinearity. I also want to know more about RoBERTa and what exactly it (she?) could do; it is pre-trained, right? Because if we need to train it, it would incur the same issue as above.
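To make the FA/PCA point concrete, here is a minimal sketch of a PCA-based composite indicator. It is not the code of this project: the function names, the toy data, and the choice of power iteration for the first component are all assumptions made for illustration. The idea is that two nearly collinear indicators end up sharing one loading direction instead of being counted twice.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit sample variance.
    Assumes no column is constant (its standard deviation would be zero)."""
    n = len(rows)
    out_cols = []
    for col in zip(*rows):
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        out_cols.append([(x - mean) / sd for x in col])
    return [list(r) for r in zip(*out_cols)]

def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, k = len(z), len(z[0])
    return [[sum(row[i] * row[j] for row in z) / (n - 1) for j in range(k)]
            for i in range(k)]

def first_component(matrix, iterations=500):
    """Dominant eigenvector via power iteration (unit length)."""
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def composite_indicator(rows):
    """Return (loadings, scores) for the first principal component."""
    z = standardize(rows)
    loadings = first_component(correlation_matrix(z))
    scores = [sum(l * x for l, x in zip(loadings, row)) for row in z]
    return loadings, scores

# Hypothetical data: columns 1 and 2 are almost collinear, column 3 is not.
rows = [
    [1.0, 2.1, 5.0],
    [2.0, 3.9, 3.0],
    [3.0, 6.2, 4.0],
    [4.0, 7.8, 1.0],
    [5.0, 10.1, 2.0],
]
loadings, scores = composite_indicator(rows)
```

In a real pipeline a library routine (for example a PCA implementation from a statistics package) would replace the hand-written power iteration; the sketch only shows why collinear indicators collapse onto a shared component.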
text mining brainstorm
There are a number of techniques that may do the job, but each of them has one or more of the following:
As a consequence, we (@giuliogcantone and I) tried to figure out what we can do. These are some of the proposals:
cpv
exact description and compare against the objects over a number of similarity measures (Levenshtein, Jaccard, etc.) ✅
We are currently investigating the third solution, but we are really open to discussing the other two.
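A minimal sketch of what comparing a description against the objects could look like, using only the two measures named above. Everything here is an assumption for illustration: the function names, the whitespace tokenization for Jaccard, the equal-weight average of the two scores, and the example strings are hypothetical and not taken from the actual data.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1], where 1 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity on lowercase whitespace-separated tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def best_match(description: str, candidates: list) -> tuple:
    """Return (candidate, score) with the highest average of the two measures."""
    scored = [
        (c, (levenshtein_similarity(description.lower(), c.lower())
             + jaccard(description, c)) / 2)
        for c in candidates
    ]
    return max(scored, key=lambda pair: pair[1])

# Hypothetical contract object and reference descriptions.
match, score = best_match("surgical masks", ["surgical mask", "office chairs"])
```

Character-level and token-level measures disagree on purpose here: Levenshtein tolerates typos and plural forms, while Jaccard ignores word order, so inspecting both before fixing any weighting is probably worthwhile.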