Releases: davidmogar/cucco

Cucco v2.2.1

27 Jun 21:00

Oopsie!

The previous release was not as well tested as it should have been. Sorry 😢

This version fixes an encoding problem that appeared while processing a file with the cucco CLI.

Cucco v2.2.0

27 Jun 20:43

Cucco is back with a minor release.

This version fixes issue #42, adding the possibility to normalize a single file using the cucco CLI. The library itself has undergone a small change too: the normalization function 'remove_extra_whitespaces' has been renamed to 'remove_extra_white_spaces'. This means that any existing code or config file using this function will break with the latest version. Sorry for that 😞
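If you keep your normalizations in a list, a small helper can take care of the rename. This is only a sketch: migrate_normalizations and RENAMED are hypothetical names, not part of cucco, and the list format (plain names or (name, kwargs) pairs) is assumed to be the one shown in the Normalizr v0.1.6 notes.

```python
# Hypothetical helper, not part of cucco: updates the renamed function
# in a list of normalizations written for versions before 2.2.0.
RENAMED = {'remove_extra_whitespaces': 'remove_extra_white_spaces'}


def migrate_normalizations(normalizations):
    """Return a copy of the list with renamed functions updated."""
    migrated = []
    for item in normalizations:
        if isinstance(item, str):
            migrated.append(RENAMED.get(item, item))
        else:
            # (name, kwargs) pairs keep their arguments untouched.
            name, kwargs = item
            migrated.append((RENAMED.get(name, name), kwargs))
    return migrated


old = [
    'remove_extra_whitespaces',
    ('replace_punctuation', {'replacement': ' '}),
    'remove_stop_words',
    'remove_extra_whitespaces',
]
new = migrate_normalizations(old)
print(new)
```

Config files would need the same rename done by hand or with a similar script.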

Happy normalization 🐔

Cucco v2.1.0

09 Jun 22:19

Moving forward!

This minor release is a necessary step towards using the library in an API.

The remove_stop_words function now lets you specify the language to use. Also, the little cucco is not as lazy as before and will always load the stop words file for the language indicated in the Configuration class. If lazy_load is not used, all of them will be loaded.
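To picture the loading behavior described above, here is a minimal sketch. It is not cucco's code: the StopWords class, its constructor and the in-memory word lists are all made up for illustration, and only the remove_stop_words name and the lazy_load option come from the library.

```python
class StopWords:
    """Hypothetical stand-in for the stop words handling, not cucco's code."""

    def __init__(self, available, language='en', lazy_load=False):
        # `available` maps a language code to its stop words; in the real
        # library this data would come from one file per language.
        self._available = available
        self._loaded = {}
        self.language = language
        # The configured language is always loaded; without lazy_load,
        # every available language is loaded up front.
        languages = [language] if lazy_load else list(available)
        for code in languages:
            self._load(code)

    def _load(self, code):
        self._loaded[code] = set(self._available[code])

    def remove_stop_words(self, text, language=None):
        code = language or self.language
        if code not in self._loaded:
            self._load(code)  # Loaded on first use when lazy.
        words = self._loaded[code]
        return ' '.join(w for w in text.split() if w.lower() not in words)


data = {'en': ['the', 'a'], 'es': ['el', 'la']}
stop_words = StopWords(data, language='en', lazy_load=True)
loaded_before = sorted(stop_words._loaded)
result = stop_words.remove_stop_words('el perro ladra', language='es')
loaded_after = sorted(stop_words._loaded)
print(loaded_before, result, loaded_after)
```

With lazy_load=True only the configured language is in memory at first, and any other language is read the first time it is requested.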

Enjoy! 🐔

Cucco v2.0.0

04 Jun 14:29

Yay! Cucco has reached version 2 and it comes with some nice goodies.

But first, a few words from our sponsors. I'll give the floor to Mike, CEO of Feathers & CO:

Thank you all for this invitation.

To open this event I would like to talk about...

Shut your beak and tell me what's new

Ok ok, so here is the list of new features for cucco:

  • New CLI: If you just want to use cucco from the command line, today is your day. This CLI can normalize short texts, a given file or even any file that changes inside a watched directory.
  • Config management: A new class to handle all the config has been added to cucco. This class allows loading the normalizations to apply from a YAML file.
  • Logging: Not a big deal but now it's easier to see what is happening.

Lots of new things... it sounds complicated... any docs?

Not yet, but I'm working on this. All the docs for cucco will be available at cucco.io cucco-soon. In the meantime...

...

Run Mike! Run for your life!

Cucco v1.1.0

15 May 11:07

This version improves the stop words removal functionality, adding support for 50 languages and a simplified format for the stop words files (one word per line, without comments).
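Reading a file in the simplified format takes only a few lines. The snippet below is a sketch, not the library's loader: load_stop_words is a hypothetical name and the sample file is created on the fly just to show the layout.

```python
import os
import tempfile


def load_stop_words(path):
    """Read a stop words file with one word per line into a set."""
    with open(path, encoding='utf-8') as handle:
        return {line.strip() for line in handle if line.strip()}


# Write a small sample file to show the expected layout.
with tempfile.NamedTemporaryFile(
        'w', suffix='.txt', delete=False, encoding='utf-8') as sample:
    sample.write('the\nof\nand\n')

stop_words = load_stop_words(sample.name)
os.remove(sample.name)
print(sorted(stop_words))
```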

Cucco v1.0.0

01 Mar 20:58

Almost two years after the first release of the Python text normalizer, version 1.0.0 is released.

What is new?

  • New name! Say hi to cucco ✋
  • New normalization functions.
  • More stability thanks to great test coverage.
  • Code refactored to make it more readable and easier to extend.

Special thanks to @feinsteinben, who helped to extend the library and, more importantly, helped me find the motivation to keep improving it.

Normalizr v0.1.9

31 Aug 12:35

This version adds a logger so that unwanted messages can be silenced while using the library.

To change the log level for Normalizr, the logger_level parameter can be used like this:

import logging

from normalizr import Normalizr

normalizr = Normalizr(logger_level=logging.DEBUG)

Normalizr v0.1.8

15 May 10:32

This release brings the following changes:

  • Fix for the remove_accent_marks function, which now normalizes the text back to NFKC after stripping accents. Also, NFKD format is now forced and the format attribute has gone away.
  • Stop words file lazy loading option. By default the stop words file is loaded on instance creation, but this behavior can now be changed using the lazy_load option.
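The accent marks fix boils down to three steps: decompose to NFKD, drop the combining marks, recompose to NFKC. The function below is a sketch of that idea built on the standard unicodedata module; it borrows the library's function name but is not its actual code.

```python
import unicodedata


def remove_accent_marks(text):
    """Strip accents via NFKD decomposition, then recompose to NFKC."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize('NFKD', text)
    # Drop the combining marks and keep everything else.
    stripped = ''.join(c for c in decomposed if not unicodedata.combining(c))
    # Recompose whatever is left so the result is back in NFKC form.
    return unicodedata.normalize('NFKC', stripped)


result = remove_accent_marks('Crème brûlée')
print(result)
```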

Normalizr v0.1.6

09 May 13:48

Version 0.1.6 brings an improved way to execute a group of normalizations using a list. The following code shows the same normalizations applied to the text "Who let the dog out?", first using function calls and then using list invocation:

from normalizr import Normalizr

normalizr = Normalizr(language='en')

# Without normalize function
text = 'Who    let   the dog out?'
text = normalizr.remove_extra_whitespaces(text)
text = normalizr.replace_punctuation(text, replacement=' ')
text = normalizr.remove_stop_words(text)
text = normalizr.remove_extra_whitespaces(text)

print(text)

# With normalize function
normalizations = [
    'remove_extra_whitespaces',
    ('replace_punctuation', {'replacement': ' '}),
    'remove_stop_words',
    'remove_extra_whitespaces'
]

print(normalizr.normalize('Who    let   the dog out?', normalizations))
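For the curious, list invocation can be pictured as a loop that looks up each method by name. The toy class below is a sketch under that assumption, not the Normalizr implementation: MiniNormalizer and its regex-based methods are hypothetical, and stop words removal is left out to keep it short.

```python
import re


class MiniNormalizer:
    """Hypothetical toy class, not the Normalizr implementation."""

    def remove_extra_whitespaces(self, text):
        return re.sub(r'\s+', ' ', text).strip()

    def replace_punctuation(self, text, replacement=''):
        return re.sub(r'[^\w\s]', replacement, text)

    def normalize(self, text, normalizations):
        for item in normalizations:
            # A bare string is a method name; a tuple adds keyword arguments.
            name, kwargs = (item, {}) if isinstance(item, str) else item
            text = getattr(self, name)(text, **kwargs)
        return text


normalizer = MiniNormalizer()
result = normalizer.normalize(
    'Who    let   the dog out?',
    ['remove_extra_whitespaces',
     ('replace_punctuation', {'replacement': ' '}),
     'remove_extra_whitespaces'],
)
print(result)
```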

Normalizr v0.1.5

09 May 09:33

This new version comes with URL and emoji replacement. It also changes the behavior of some functions to replace instead of remove (the function names have changed accordingly).