Modifying encoding tables
Alexander Shtuchkin edited this page Dec 9, 2019 · 2 revisions
Sometimes you need to adjust existing encoding tables. Here's a small overview of how to do that (thanks @btsimonh in #226 for writing it down).
In order to add a DBCS table based on another, you need to do a few things:
- 1 Call iconv.getCodec() so that iconv.encodings exists (the encoding definitions are loaded lazily).
- 2 Create a table (or just the extra parts you want to add to an existing table).
- 3 Create a new encoding definition (like the ones in dbcs-data.js). Note how I based it on an existing table (cp950) without directly requiring the relevant table file; requiring it was difficult because of paths.
- 4 Add the new definition directly to iconv.encodings.
- 5 Use your sparkly new table :).
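For step 2, it helps to know how a table row is read. The sketch below is not iconv-lite code. It assumes the row format used by the example on this page: a hex start code, then strings whose characters are assigned to consecutive codes, then an optional number that continues the sequence from the last character that many more times. `expandRow` is a hypothetical helper, and it only handles characters that fit in one UTF-16 code unit.

```javascript
// Sketch: expand one DBCS table row into explicit [code, charCode] pairs.
// Assumption: row = [hexStart, ...parts], as described in the lead-in.
function expandRow(row) {
    let code = parseInt(row[0], 16); // first element: starting code, in hex
    let last = null;                 // last character code assigned so far
    const pairs = [];
    for (const part of row.slice(1)) {
        if (typeof part === 'string') {
            // A string assigns each of its characters to the next code.
            for (let i = 0; i < part.length; i++) {
                last = part.charCodeAt(i);
                pairs.push([code++, last]);
            }
        } else if (typeof part === 'number') {
            // A number continues the sequence from the previous character.
            for (let i = 0; i < part; i++) {
                last += 1;
                pairs.push([code++, last]);
            }
        }
    }
    return pairs;
}

const hex = (n) => n.toString(16);
const pairs = expandRow(["fa40", "\ue000", 62]);
console.log(pairs.length);                     // 63
console.log(pairs[0].map(hex));                // [ 'fa40', 'e000' ]
console.log(pairs[pairs.length - 1].map(hex)); // [ 'fa7e', 'e03e' ]
```

Under that reading, a row such as ["fa40","\ue000", 62] covers 63 codes, which is why the next row in the example starts at "\ue03f".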
Example code snippet:
var iconv = require('iconv-lite');

// Extra rows mapping Big5 codes 0xfa40-0xfefe to the Unicode Private Use Area.
// (Named privateUse because `private` is a reserved word in strict mode.)
var privateUse = [
    ["fa40","\ue000", 62],
    ["faa1","\ue03f", 93],
    ["fb40","\ue09d", 62],
    ["fba1","\ue0dc", 93],
    ["fc40","\ue13a", 62],
    ["fca1","\ue179", 93],
    ["fd40","\ue1d7", 62],
    ["fda1","\ue216", 93],
    ["fe40","\ue274", 62],
    ["fea1","\ue2b3", 93],
];

try {
    // Called only for its side effect: it populates iconv.encodings.
    // Without an encoding name it throws; passing any known name
    // (e.g. 'cp950') avoids the exception.
    iconv.getCodec();
} catch(e) {
    // ignore
    console.log('ignored:', e);
}

// New encoding definition: the cp950 table plus the private-use rows.
var big5pua = {
    type: '_dbcs',
    table: function() {
        var tab = iconv.encodings['cp950'].table();
        return tab.concat(privateUse);
    },
    encodeSkipVals: [0xa2cc, 0xa2ce],
};
iconv.encodings['big5pua'] = big5pua;

// Test the two duplicate characters (0xa2cc, 0xa2ce) and several PUA
// characters, including the first one (0xfa40).
const buf = Buffer.from('fa4020fa7efaa120fafefb4020fb7efba120fbfe20fefe20a2cca451a2cea4ca', 'hex');
const str = iconv.decode(buf, 'big5pua');
const buf2 = iconv.encode(str, 'big5pua');
console.log('src:', buf);
console.log('string:[' + str + ']');
// Show the decoded string as UTF-16BE bytes so the code points are readable in hex.
var be = Buffer.from(str, 'utf16le').swap16();
console.log('string in utf16be:', be);
console.log('back to big5:', buf2);
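The ten private-use rows in the example follow a regular pattern, so they can also be generated instead of typed by hand. The following is a minimal sketch, assuming lead bytes 0xfa to 0xfe, trail byte ranges 0x40-0x7e and 0xa1-0xfe, and code points assigned consecutively from U+E000. `buildPuaRows` is a hypothetical helper, not part of iconv-lite.

```javascript
// Sketch: build [hexStart, firstChar, extraCount] rows for a block of lead bytes.
// Assumption: each lead byte has two trail-byte ranges, and code points are
// handed out consecutively across all of them.
function buildPuaRows(firstLead, lastLead, firstCodePoint) {
    const trailRanges = [[0x40, 0x7e], [0xa1, 0xfe]];
    const toHex = (n) => n.toString(16).padStart(2, '0');
    const rows = [];
    let cp = firstCodePoint;
    for (let lead = firstLead; lead <= lastLead; lead++) {
        for (const [lo, hi] of trailRanges) {
            const size = hi - lo + 1; // number of codes in this range
            // The first character is given explicitly; the number covers the rest.
            rows.push([toHex(lead) + toHex(lo), String.fromCharCode(cp), size - 1]);
            cp += size;
        }
    }
    return rows;
}

const rows = buildPuaRows(0xfa, 0xfe, 0xe000);
console.log(rows.length); // 10
console.log(rows[1][0], rows[1][1].charCodeAt(0).toString(16), rows[1][2]); // faa1 e03f 93
```

With these arguments the result matches the hand-written rows, which is a quick way to check a table for off-by-one mistakes before adding it to an encoding definition.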