Return a list of illegal byte spans within a UTF-8 encoded buffer or array.

quicbit-js/qb-utf8-illegal-bytes

qb-utf8-illegal-bytes


Compact and very fast function that returns the locations of illegal UTF-8 encoded bytes in an array of bytes (an array or typed array holding integer values 0-255).

Complies with the 100% test coverage and minimum dependency requirements of qb-standard.

Install

npm install qb-utf8-illegal-bytes

API Change 1.x -> 2.x

The 2.0 API now takes the standard parameters 'src', 'off' and 'lim' defined in the glossary, replacing the previous opt.beg and opt.enc options. This convention will be used for array-like inputs throughout the quicbit libraries.

illegal_bytes(src, off, lim)

Return the locations of illegal UTF-8 encoded bytes in src, an array or array-like object of bytes (integers 0-255). Checking starts at off (defaults to 0) and ends before the optional lim (defaults to src.length). The result is an array of [off, lim] pairs in which, again, off is inclusive and lim is exclusive.
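
To make the half-open [off, lim] convention concrete, here is a minimal sketch that expands such pairs into the individual byte offsets they cover. The helper name range_offsets is hypothetical and not part of this library, and the ranges are hard-coded in the shape described above rather than produced by a call to illegal_bytes.

```javascript
// Hypothetical helper (not part of qb-utf8-illegal-bytes): expand half-open
// [off, lim] pairs into the individual offsets they cover.
function range_offsets (ranges) {
  var ret = []
  ranges.forEach(function (r) {
    // off (r[0]) is inclusive, lim (r[1]) is exclusive
    for (var i = r[0]; i < r[1]; i++) { ret.push(i) }
  })
  return ret
}

// ranges written in the shape that illegal_bytes() returns
console.log( range_offsets( [ [2,4], [6,7] ] ) )    // [ 2, 3, 6 ]
```

Because lim is exclusive, lim - off gives the length of each illegal span directly.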

Example

var illegal_bytes = require( 'qb-utf8-illegal-bytes' )

var src = [0x61, 0x62, 0xF0, 0x83, 0x63, 0x64, 0xC2]        // alternatively, this can be a node Buffer or Uint8Array
console.log( illegal_bytes( src ) )

> [ [2,4], [6,7] ]

...which shows that bytes at offsets 2, 3 and 6 are illegal

console.log( illegal_bytes( src, 3 ) );  // start at offset 3 ( byte 0x83 )

> [ [3,4], [6,7] ]

...which shows that bytes at offsets 3 and 6 are illegal
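
One way the returned ranges might be used is to mask the illegal bytes before decoding. The sketch below assumes the ranges shown in the first example and hard-codes them so that it runs without the package installed. The helper mask_ranges and the choice of '?' (0x3F) as the fill byte are illustrative assumptions, not part of this library.

```javascript
// Hypothetical helper (not part of qb-utf8-illegal-bytes): return a copy of
// src with every byte inside the given [off, lim] ranges overwritten by fill.
function mask_ranges (src, ranges, fill) {
  fill = fill === undefined ? 0x3F : fill        // default to '?'
  var ret = Array.prototype.slice.call(src)      // copies arrays, Buffers and typed arrays
  ranges.forEach(function (r) {
    for (var i = r[0]; i < r[1]; i++) { ret[i] = fill }
  })
  return ret
}

var src = [0x61, 0x62, 0xF0, 0x83, 0x63, 0x64, 0xC2]
var ranges = [ [2,4], [6,7] ]      // the result shown above for illegal_bytes( src )
var clean = mask_ranges( src, ranges )
console.log( Buffer.from( clean ).toString() )    // ab??cd?
```

The copy keeps the original src untouched, so the same ranges can still be used to report the offending offsets.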
