Conclusions

The data is very noisy. Manual labeling would definitely tune the machine. To keep the objectivity, maybe some expert or student of science politics/politics would create an interesting dataset.
To accomplish this would be better to have better research not only with the quotes available on the web but from books and academic paper, newspaper and blog.
The labeling by authors can be good but it has a big limitation: It is always possible that the mind of the person, usually more coherent than most of the politician's mind, can float. I feel that every time we try to have a rigid classification about something not so rigid, like the human mind, is forced and no natural. A biased author not always will have biased content.
The machine evaluates at the same way a close (acceptable) error with a far category. A method for ordinal classification could fit better our problem.
To define such a complex topic it requires more data. The machine works mainly combining words. More combination we use to feed the machine, more likely the opinion mining can be better.
Since I use the last 60 years, the machine not overfit to the last period time, actually probably underfit. At the same time, new and specific topics (like Brexit) are meaningless.
There is a big homologation in the political lexicon. Especially in the 3 central ideologies.
There are words like "right" which are very difficult to interpret, due to many meaning that the word could have. Metaphor and humor are a big problem too.
There is a patter that needs to be investigated to get better result.

Future implementations

Better Dataset, different Dataset: Quotes without noise, investigate the same method with a Twitter database, and newspaper/blog database.
Use topics as part of the features
Locate the Database in the space and in time: Use English was important to collect more data but to improve, create a model for each country (space) and for each topic could bring a massive evolution in the project. Exactly we should locate the database in a shorter period of time.
Create a specific metric evaluation that could be compatible with this ordinal classification problem. Bespoke confusion matrix and classification report.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Conclusions.md

Conclusions.md

Conclusions

Future implementations

Files

Conclusions.md

Latest commit

History

Conclusions.md

File metadata and controls

Conclusions

Future implementations