fasttext.wasm.js

A WebAssembly version of fastText, a library for efficient text classification and representation learning, with support for Node.js and browser environments.

A WebAssembly build of fastText (now archived), bundled with the compressed lid.176.ftz model (~900 KB) and a TypeScript wrapper. The project aims to be cross-platform, zero-dependency, and usable out of the box.

Features

  • Written in TypeScript
  • Supports Node, Worker, Browser, and Browser extension runtimes
  • Built-in language identification with normalized default results, covering 176 languages
  • Significantly faster and more accurate than languagedetect and franc, and outperforms eld and cld

Usage

In Node.js, import from the package root. This entry point uses the JS binding that gives the best performance.

import { getLIDModel } from 'fasttext.wasm.js'

const lidModel = await getLIDModel()
await lidModel.load()
const result = await lidModel.identify('Hello, world!')
console.log(result.alpha2) // 'en'
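
When identification is needed in several places, it can help to create and load the model only once. The sketch below is one possible way to do that. `LoadableModel` and `createLazyIdentifier` are hypothetical names introduced for illustration and are not part of the package; the interface only mirrors the `load()` and `identify()` calls shown above.

```typescript
// Assumed minimal shape of a model: it mirrors only the load() and
// identify() calls from the usage snippet. R is whatever identify() resolves to.
interface LoadableModel<R> {
  load(): Promise<unknown>
  identify(text: string): Promise<R>
}

// Hypothetical helper: creates and loads the model on first use, then
// reuses the same loaded instance for every later call.
function createLazyIdentifier<R>(
  getModel: () => Promise<LoadableModel<R>>,
): (text: string) => Promise<R> {
  let ready: Promise<LoadableModel<R>> | undefined

  return async (text: string): Promise<R> => {
    if (ready === undefined) {
      // Cache the promise itself so concurrent callers share one load.
      ready = getModel().then(async (model) => {
        await model.load()
        return model
      })
    }
    const model = await ready
    return model.identify(text)
  }
}
```

With the package, this might be wired up as `createLazyIdentifier(async () => getLIDModel())`, assuming the returned model satisfies the interface above. Note that a failed load stays cached in this sketch, so a retry policy would need to be added separately.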

In other environments, import from the common entry point instead:

import { getLIDModel } from 'fasttext.wasm.js/common'

const lidModel = await getLIDModel()
// Default paths:
// {
//   wasmPath: '<globalThis.location.origin>/fastText/fastText.common.wasm',
//   modelPath: '<globalThis.location.origin>/fastText/models/lid.176.ftz',
// }
await lidModel.load()
const result = await lidModel.identify('Hello, world!')
console.log(result.alpha2) // 'en'

Remember to download /fastText/fastText.common.wasm and /fastText/models/lid.176.ftz and place them in the public root directory. You can override the default paths if necessary.
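
The default paths listed above are the page origin plus a fixed relative path. As a rough sketch, the same pair of URLs can be built for a different base directory as shown below. `AssetPaths`, `buildAssetPaths`, and the `baseDir` parameter are hypothetical names, and the sketch deliberately stops short of passing the result to the library, since the call that accepts the override is not shown here.

```typescript
interface AssetPaths {
  wasmPath: string
  modelPath: string
}

// Hypothetical helper: resolves the two asset URLs against an origin.
// With the default baseDir it reproduces the default paths listed above.
function buildAssetPaths(origin: string, baseDir: string = '/fastText/'): AssetPaths {
  // Ensure a trailing slash so relative resolution keeps the last path segment.
  const dir = baseDir.endsWith('/') ? baseDir : `${baseDir}/`
  const base = new URL(dir, origin)
  return {
    wasmPath: new URL('fastText.common.wasm', base).href,
    modelPath: new URL('models/lid.176.ftz', base).href,
  }
}
```

In a browser, `origin` would typically be `globalThis.location.origin`, matching the defaults.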

Benchmark

Accuracy results on the test split of the papluca/language-identification dataset, measured in the Node.js runtime:

| Name | Error Rate | Accuracy | Total |
| --- | --- | --- | --- |
| fastText | 0.02 | 0.98 | 10000 |
| cld | 0.04 | 0.96 | 10000 |
| eld | 0.06 | 0.94 | 10000 |
| languageDetect | 0.24 | 0.76 | 10000 |
| franc | 0.27 | 0.73 | 10000 |
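
The two metric columns are complementary: accuracy is the share of samples whose predicted language matches the label, and the error rate is 1 minus accuracy. A minimal sketch of that computation follows. `Sample`, `Score`, and `scorePredictions` are hypothetical names, and this is not the repository's benchmark code.

```typescript
interface Sample {
  expected: string
  predicted: string
}

interface Score {
  errorRate: number
  accuracy: number
  total: number
}

// Hypothetical scorer: counts exact label matches over all samples.
function scorePredictions(samples: Sample[]): Score {
  const total = samples.length
  // Return zeros for an empty input to avoid dividing by zero.
  if (total === 0) return { errorRate: 0, accuracy: 0, total: 0 }
  const correct = samples.filter((s) => s.expected === s.predicted).length
  const accuracy = correct / total
  return { errorRate: 1 - accuracy, accuracy, total }
}
```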

How to?

codesandbox/fasttext.wasm.js

  • Run the Bench Test task for the accuracy test
  • Run the Bench task for the benchmark test

or

  • Clone the repo
  • pnpm i
  • pnpm run build
  • cd bench
  • pnpm run test for the accuracy test
  • pnpm run bench for the benchmark test

Build & Publish

Requirements

Note: add source ./emsdk_env.sh to your shell profile to load the emsdk environment automatically. Setting export EMSDK_QUIET=1 suppresses the messages it prints on load.

  • npm run build
  • npx changeset
  • npx changeset version
  • git commit
  • npx changeset publish
  • git push --follow-tags

changeset prerelease doc

License

MIT License © 2023 Yuns