Need some help figuring out how to deserialize CJSON #72
Comments
Hello! We can try to help you understand how CJSON works. You can find a general description of CJSON here: https://github.com/Restream/reindexer/tree/master/cpp_src/core/cjson (readme.md). This is what a CJSON tag looks like: https://github.com/Restream/reindexer/blob/master/cpp_src/core/cjson/ctag.h What you need to know is that there are two types of CJSON: tuple and 'transportable' CJSON. The first one is like a scheme for an item: it contains a brief description of all the fields (name tag, type number and field number). If a field is indexed, the tuple contains only this information (the encoded field index lets us fetch the field's real value quickly from the real Item). If the field is not indexed, its field number in the tag is -1 and its value is encoded right after the tag in CJSON. So the values of non-indexed fields are stored in the CJSON itself. The second type is 'transportable' CJSON: we need it to transfer query results from one client to another (i.e. over a network connection or CGO serialization). This type of CJSON encodes each field's value (not just a reference to it by field index), so it consumes more memory. Hope this helps. It's hard to answer your specific question (not enough information), but you definitely don't need base64 to work with CJSON. We'll be happy to help you with this; you can contact me on Telegram at @slow_cheetah. Have a good day! Best wishes, Reindexer team.
This is how it is implemented in Golang: https://github.com/Restream/reindexer/tree/master/cjson
ctag and carraytag always have the same size. If tag.field is -1, the field's value is encoded right after the tag; otherwise the next tag (the tag of the next field) follows. The CJSON structure is also recursive (same as JSON). It's that simple; it just looks scary. So you first read a ctag, then in some cases you read the field's value (or just go to the next tag), and you do this recursively until TAG_END is read. That's all. Here is the briefest example of what has been described above:

    void skipCjsonTag(ctag tag, Serializer &rdser) {
        // Field() < 0 means the value is embedded in the CJSON right after the tag.
        const bool embeddedField = (tag.Field() < 0);
        switch (tag.Type()) {
            case TAG_ARRAY: {
                if (embeddedField) {
                    carraytag atag = rdser.GetUInt32();
                    for (int i = 0; i < atag.Count(); i++) {
                        // Object elements carry their own tag; other elements reuse the array's element tag.
                        ctag t = atag.Tag() != TAG_OBJECT ? atag.Tag() : rdser.GetVarUint();
                        skipCjsonTag(t, rdser);
                    }
                } else {
                    // Indexed array: skip the single varuint stored after the tag.
                    rdser.GetVarUint();
                }
            } break;
            case TAG_OBJECT:
                // Skip nested fields one by one until the closing TAG_END.
                for (ctag otag = rdser.GetVarUint(); otag.Type() != TAG_END; otag = rdser.GetVarUint()) {
                    skipCjsonTag(otag, rdser);
                }
                break;
            default:
                // Scalar: a value is present only when the field is embedded.
                if (embeddedField) rdser.GetRawVariant(KeyValueType(tag.Type()));
        }
    }

This is an actual piece of code used in Reindexer when some tag (plus its value) needs to be skipped. You don't need TagsMatcher and PayloadType here. It's just simple recursive code.
You want to parse this binary CJSON format as if it were a text string, which makes no sense. For example:

    reindexer::Item item = rx.NewItem(nsName);
    err = item.FromJSON(jsonString);
    if (err.ok()) {
        auto cjson = item.GetCJSON();  // here is your CJSON
    }

That's how you can play with it: feed it all kinds of JSON and look at the CJSON you get back.

    int Type() const { return tag_ & ((1 << typeBits) - 1); }
    int Name() const { return (tag_ >> typeBits) & ((1 << nameBits) - 1); }
    int Field() const { return (tag_ >> (typeBits + nameBits)) - 1; }

And the result is just a single integer field. Good luck decoding it as a string object; this is definitely not an area where I can help you. All you need is to get CJSON as a byte array in C# and start doing what
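To make the bit layout behind Type(), Name() and Field() concrete, here is a minimal self-contained sketch of packing and unpacking such a tag. The `SketchTag` type and its `Make` helper are hypothetical, and the widths `kTypeBits = 3` and `kNameBits = 12` are assumptions for illustration only; check ctag.h for the constants your Reindexer version actually uses.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit widths; the real constants live in ctag.h and may differ.
constexpr int kTypeBits = 3;
constexpr int kNameBits = 12;

// Hypothetical stand-in for ctag: one unsigned integer holding three packed fields.
struct SketchTag {
    uint32_t raw;

    // Packs type, name tag and field index. The field is stored as field + 1,
    // so "no index" (-1) becomes 0 in the upper bits.
    static SketchTag Make(int type, int name, int field) {
        assert(type >= 0 && type < (1 << kTypeBits));
        assert(name >= 0 && name < (1 << kNameBits));
        assert(field >= -1);
        return SketchTag{static_cast<uint32_t>(type) |
                         (static_cast<uint32_t>(name) << kTypeBits) |
                         (static_cast<uint32_t>(field + 1) << (kTypeBits + kNameBits))};
    }

    int Type() const { return static_cast<int>(raw & ((1u << kTypeBits) - 1)); }
    int Name() const { return static_cast<int>((raw >> kTypeBits) & ((1u << kNameBits) - 1)); }
    int Field() const { return static_cast<int>(raw >> (kTypeBits + kNameBits)) - 1; }
};
```

Under these assumed widths, a tag with type 2, name 5 and field -1 packs to the single integer 42, which is why treating the buffer as text does not get you anywhere: each tag is just a number to be split with shifts and masks.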
If you let us know what the goal of your secret mission is, we can probably give you better advice.
Ok, clear. As for

    // 64-bit integers
    template <typename T, typename std::enable_if<sizeof(T) == 8 && std::is_integral<T>::value>::type * = nullptr>
    void PutVarUint(T v) {
        grow(10);
        len_ += uint64_pack(v, buf_ + len_);
    }
    // integers of up to 32 bits
    template <typename T, typename std::enable_if<sizeof(T) <= 4 && std::is_integral<T>::value>::type * = nullptr>
    void PutVarUint(T v) {
        grow(10);
        len_ += uint32_pack(v, buf_ + len_);
    }
    // enums: the value must fit into a single byte
    template <typename T, typename std::enable_if<std::is_enum<T>::value>::type * = nullptr>
    void PutVarUint(T v) {
        assert(v >= 0 && v < 128);
        if (len_ + 1 >= cap_) grow(1);
        buf_[len_++] = v;
    }

I'm not sure how familiar you are with C++ template magic and SFINAE, but
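For readers porting this to another language, the following is a minimal sketch of what the PutVarUint overloads boil down to, assuming that uint32_pack and uint64_pack follow the protobuf-style base-128 varint scheme (their bodies are not shown in this thread, so treat that as an assumption). The names `PackVarUint` and `UnpackVarUint` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Writes v as a base-128 varint: 7 payload bits per byte, least significant
// group first, high bit set on every byte except the last one.
// Returns the number of bytes written (at most 10 for a 64-bit value).
inline size_t PackVarUint(uint64_t v, uint8_t *out) {
    assert(out != nullptr);
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = static_cast<uint8_t>((v & 0x7F) | 0x80);
        v >>= 7;
    }
    out[n++] = static_cast<uint8_t>(v);
    return n;
}

// Reads a base-128 varint from buf. Stores the value in *v and returns the
// number of bytes consumed, or 0 if the input is truncated or too long.
inline size_t UnpackVarUint(const uint8_t *buf, size_t len, uint64_t *v) {
    assert(v != nullptr);
    uint64_t result = 0;
    for (size_t i = 0; i < len && i < 10; ++i) {
        result |= static_cast<uint64_t>(buf[i] & 0x7F) << (7 * i);
        if ((buf[i] & 0x80) == 0) {
            *v = result;
            return i + 1;
        }
    }
    return 0;
}
```

With this scheme the value 300 becomes the two bytes 0xAC 0x02, and anything below 128 fits in one byte, which matches the assert in the enum overload above.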
As for encoding CJSON non-indexed fields (values + tags), take a look at

    CJsonBuilder &CJsonBuilder::Put(int tagName, int64_t arg) {
        if (type_ == ObjType::TypeArray) {
            itemType_ = TAG_VARINT;
        } else {
            putTag(tagName, TAG_VARINT);
        }
        ser_->PutVarint(arg);
        ++count_;
        return *this;
    }
    inline void CJsonBuilder::putTag(int tagName, int tagType) { ser_->PutVarUint(static_cast<int>(ctag(tagType, tagName))); }

It can help you understand how the bytes are encoded.
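Put() writes the value itself with PutVarint, the signed counterpart of PutVarUint. The sketch below assumes that PutVarint uses the protobuf-style ZigZag mapping before the varint step; that is an assumption on my side, not something shown in this thread, so verify it against the serializer sources. `ZigZagEncode` and `ZigZagDecode` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Maps signed to unsigned so that small magnitudes get small codes:
// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
inline uint64_t ZigZagEncode(int64_t n) {
    const uint64_t signMask = (n < 0) ? ~uint64_t{0} : uint64_t{0};
    return (static_cast<uint64_t>(n) << 1) ^ signMask;
}

// Inverse mapping: the lowest bit carries the sign, the rest the magnitude.
inline int64_t ZigZagDecode(uint64_t z) {
    return static_cast<int64_t>(z >> 1) ^ -static_cast<int64_t>(z & 1);
}
```

If the assumption holds, a decoder reads a varuint first and then applies the inverse mapping to recover the signed value.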
As for a C++ IDE for Windows, you might try CLion: it works perfectly well with CMake projects, and you can use it for free for the first 30 days (which might be enough to accomplish your task).
I'll try to explain CJSON in the easiest possible way here. We start it with

    ser_->PutVarUint(static_cast<int>(ctag(TAG_OBJECT, tagName, -1)));

Then we go to the next field

    ser_->PutVarUint(static_cast<int>(ctag(TAG_STRING, kNameTagName, kNameField)));

Right here we do not serialize the name's value: it is in the index, and the CJSON is just a tuple. In case of sending these results to some friend abroad instead of

    ser_->PutVString(teddyNameValueString);

or simply like this:

    ser_->PutVarUint(static_cast<int>(ctag(TAG_STRING, kNameTagName, -1)));
    ser_->PutVString(teddyNameValueString);

In case of the last field

    ser_->PutVarUint(static_cast<int>(ctag(TAG_VARINT, kRatingTagName, -1)));
    ser_->PutVarint(teddyRatingValue);

And the final action is an indicator that we've finished this Item:

    ser_->PutVarUint(static_cast<int>(ctag(TAG_END)));

I hope I answered all of your questions and that you now understand how to distinguish a field in a CJSON buffer.
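The walkthrough above can be tried end to end with a small standalone sketch that writes the 'transportable' form (every value embedded, field = -1) and reads it back. Everything here is hypothetical scaffolding rather than Reindexer code: the tag type codes, the bit widths, the ZigZag step in `PutVarint` and the length-prefixed layout of `PutVString` are assumptions that should be checked against ctag.h and the serializer sources. Input validation uses assert only, which is enough for a sketch but not for real input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed tag type codes and bit widths; check ctag.h for the real values.
enum : int { TAG_VARINT = 0, TAG_STRING = 2, TAG_OBJECT = 6, TAG_END = 7 };
constexpr int kTypeBits = 3, kNameBits = 12;
using Buf = std::vector<uint8_t>;

// --- writer side ---
inline void PutVarUint(Buf &b, uint64_t v) {
    while (v >= 0x80) { b.push_back(static_cast<uint8_t>((v & 0x7F) | 0x80)); v >>= 7; }
    b.push_back(static_cast<uint8_t>(v));
}
inline void PutTag(Buf &b, int type, int name, int field) {
    PutVarUint(b, uint32_t(type) | (uint32_t(name) << kTypeBits) |
                      (uint32_t(field + 1) << (kTypeBits + kNameBits)));
}
inline void PutVarint(Buf &b, int64_t n) {  // ZigZag first, then varint (assumed)
    PutVarUint(b, (static_cast<uint64_t>(n) << 1) ^ (n < 0 ? ~uint64_t{0} : uint64_t{0}));
}
inline void PutVString(Buf &b, const std::string &s) {  // length prefix, then raw bytes (assumed)
    PutVarUint(b, s.size());
    b.insert(b.end(), s.begin(), s.end());
}

// --- reader side ---
struct Reader {
    const Buf &buf;
    size_t pos;
    explicit Reader(const Buf &b) : buf(b), pos(0) {}
    uint64_t GetVarUint() {
        uint64_t v = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            assert(pos < buf.size());
            const uint8_t byte = buf[pos++];
            v |= uint64_t(byte & 0x7F) << shift;
            if (!(byte & 0x80)) break;
        }
        return v;
    }
    int64_t GetVarint() {
        const uint64_t z = GetVarUint();
        return int64_t(z >> 1) ^ -int64_t(z & 1);
    }
    std::string GetVString() {
        const size_t n = static_cast<size_t>(GetVarUint());
        assert(pos + n <= buf.size());
        std::string s(buf.begin() + pos, buf.begin() + pos + n);
        pos += n;
        return s;
    }
};

struct Teddy { std::string name; int64_t rating; };

// Writes: OBJECT tag, name (string), rating (varint), END.
inline Buf EncodeTeddy(const Teddy &t, int nameTag, int ratingTag) {
    Buf b;
    PutTag(b, TAG_OBJECT, 0, -1);
    PutTag(b, TAG_STRING, nameTag, -1);
    PutVString(b, t.name);
    PutTag(b, TAG_VARINT, ratingTag, -1);
    PutVarint(b, t.rating);
    PutTag(b, TAG_END, 0, -1);
    return b;
}

// Reads tags until TAG_END; handles only the two embedded fields of this sketch.
inline Teddy DecodeTeddy(const Buf &b, int nameTag, int ratingTag) {
    Reader r(b);
    Teddy t{std::string(), 0};
    const uint32_t root = static_cast<uint32_t>(r.GetVarUint());
    assert(int(root & ((1u << kTypeBits) - 1)) == TAG_OBJECT);
    for (;;) {
        const uint32_t tag = static_cast<uint32_t>(r.GetVarUint());
        const int type = int(tag & ((1u << kTypeBits) - 1));
        if (type == TAG_END) break;
        const int name = int((tag >> kTypeBits) & ((1u << kNameBits) - 1));
        const int field = int(tag >> (kTypeBits + kNameBits)) - 1;
        assert(field < 0);  // tuple-style (indexed) tags carry no value here
        if (type == TAG_STRING && name == nameTag) t.name = r.GetVString();
        else if (type == TAG_VARINT && name == ratingTag) t.rating = r.GetVarint();
        else assert(false && "tag not handled by this sketch");
    }
    return t;
}
```

For example, `DecodeTeddy(EncodeTeddy({"Teddy", 5}, 1, 2), 1, 2)` round-trips the item, with 1 and 2 standing in for the name tags that TagsMatcher would normally assign. A tuple-style buffer would instead carry a non-negative field number in the tag and no value after it, so a real decoder has to look the value up in the item by that index.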
To make it work in CLion, open the folder reindexer/cpp_src: it will find CMakeLists.txt and prepare the project. To understand how to decode something, you first need to understand how to encode it. At least, this is how we do it in Reindexer, and I gave you all the hints for that. MongoDB has BSON (Binary JSON); we have CJSON (Binary JSON), which is our own implementation, though there are probably analogs. Sorry, I won't explain every bit of B41F to you; the information I've provided is more than enough to understand it. There are Decoder examples in both Golang and C++, so you don't need to reinvent the wheel to make an analog in C#; just rewrite it properly. Good luck with that!