Gradle port for language detection library implemented in plain Java (aliases: language identification, language guessing)

yandooo/language-detect

language-detect

Gradle port for language detection library implemented in plain Java (aliases: language identification, language guessing)

Abstract

  • Generate language profiles from Wikipedia abstract xml
  • Detect language of a text using naive Bayesian filter
  • Over 99% precision for 53 languages
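
The first two points above can be illustrated with a toy example. The following is a minimal sketch, not this library's implementation: it builds character trigram profiles from sample text and scores new text with a naive Bayes classifier, assuming a uniform prior over languages and add-one smoothing. The class `NaiveBayesLangSketch` and its methods `train`, `detect`, and `ngrams` are hypothetical names introduced only for illustration, and the tiny training sentences stand in for real Wikipedia-derived profiles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustrative only: a toy naive Bayes language guesser over character trigrams.
// None of these names belong to the language-detect API.
class NaiveBayesLangSketch {
    private static final int N = 3; // n-gram length (assumed for this sketch)

    // language code -> (n-gram -> count): the "profile" of each language
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();
    // language code -> total number of n-grams seen in training
    private final Map<String, Integer> totals = new HashMap<>();
    // every n-gram seen in any language, used for add-one smoothing
    private final Set<String> vocabulary = new HashSet<>();

    // Add the n-grams of a sample text to the profile of the given language.
    void train(String lang, String text) {
        Map<String, Integer> profile = counts.computeIfAbsent(lang, k -> new HashMap<>());
        for (String gram : ngrams(text)) {
            profile.merge(gram, 1, Integer::sum);
            totals.merge(lang, 1, Integer::sum);
            vocabulary.add(gram);
        }
    }

    // Return the language with the highest log-likelihood, or null if undecidable.
    String detect(String text) {
        List<String> grams = ngrams(text);
        if (grams.isEmpty() || counts.isEmpty()) {
            return null;
        }
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        int v = vocabulary.size() + 1; // +1 leaves room for unseen n-grams
        for (Map.Entry<String, Map<String, Integer>> e : counts.entrySet()) {
            int total = totals.getOrDefault(e.getKey(), 0);
            double score = 0.0; // uniform prior, so the prior term is omitted
            for (String gram : grams) {
                int c = e.getValue().getOrDefault(gram, 0);
                score += Math.log((c + 1.0) / (total + v)); // add-one smoothing
            }
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }

    // Split lower-cased, space-padded text into overlapping character n-grams.
    static List<String> ngrams(String text) {
        String s = " " + text.toLowerCase(Locale.ROOT).trim() + " ";
        List<String> out = new ArrayList<>();
        for (int i = 0; i + N <= s.length(); i++) {
            out.add(s.substring(i, i + N));
        }
        return out;
    }

    public static void main(String[] args) {
        NaiveBayesLangSketch nb = new NaiveBayesLangSketch();
        nb.train("en", "the quick brown fox jumps over the lazy dog and the cat");
        nb.train("de", "der schnelle braune fuchs springt über den faulen hund und die katze");
        System.out.println(nb.detect("the dog and the fox"));
    }
}
```

Summing log-probabilities instead of multiplying raw probabilities avoids floating-point underflow on longer texts. A production detector would train on far larger corpora and typically mixes several n-gram lengths.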

News

  • 21/12/2015
    • Move to the Gradle build system
    • Fix issues with some tests
    • Add support for JDK 7
    • Update DetectorFactory to support classpath lookup
    • Add upload to Bintray
  • 03/03/2014
    • Distribute a new package with short-text profiles (47 languages)
      • Build from the latest code
      • Remove Apache Nutch's plugin (for API deprecation)
  • 01/12/2012
    • Migrate the language-detection repository from Subversion to Git
      • for Maven support ....

Requires

Add the repositories:

repositories {
    mavenCentral()
    maven { url  "http://dl.bintray.com/oembedler/maven" }
}

Dependency:

dependencies {
  compile 'com.cybozu.labs:language-detect:INSERT_LATEST_VERSION_HERE'
}

How to use the latest build with Maven:

<repository>
    <snapshots>
        <enabled>false</enabled>
    </snapshots>
    <id>bintray-oembedler-maven</id>
    <name>bintray</name>
    <url>http://dl.bintray.com/oembedler/maven</url>
</repository>

Dependency:

<dependency>
    <groupId>com.cybozu.labs</groupId>
    <artifactId>language-detect</artifactId>
    <version>INSERT_LATEST_VERSION_HERE</version>
</dependency>

Usage

import java.util.ArrayList;
import com.cybozu.labs.langdetect.Detector;
import com.cybozu.labs.langdetect.DetectorFactory;
import com.cybozu.labs.langdetect.LangDetectException;
import com.cybozu.labs.langdetect.Language;

class LangDetectSample {
    // Load the language profiles once, before creating any detector.
    public void init(String profileDirectory) throws LangDetectException {
        DetectorFactory.loadProfile(profileDirectory);
    }

    // Return the code of the single most probable language for the text.
    public String detect(String text) throws LangDetectException {
        Detector detector = DetectorFactory.create();
        detector.append(text);
        return detector.detect();
    }

    // Return the candidate languages together with their probabilities.
    public ArrayList<Language> detectLangs(String text) throws LangDetectException {
        Detector detector = DetectorFactory.create();
        detector.append(text);
        return detector.getProbabilities();
    }
}
