Type guessing in MapDecoder? #16

Open
gtauriello opened this issue Jul 18, 2018 · 6 comments
Labels
wait for usecase Waiting for a real usecase scenario for the proposed enhancement

Comments

@gtauriello
Collaborator

I was thinking of ways to help generic parsing of custom properties (e.g. imagine being able to visualize whatever anyone adds as custom atom or residue properties in an MMTF file in PyMOL ;-)). This is complementary to #13 and related to possible extensions of the MMTF spec, where we know what per-atom/per-residue/... properties are (see rcsb/mmtf#32).

My idea is to extend the MapDecoder class with a guessType function which returns an enum for all possible types. Currently that would be something like:

enum DecodableType {
  NOT_SUPPORTED,
  NO_KEY,
  FLOAT,
  INT32,
  CHAR,
  STRING,
  FLOAT_VECTOR,
  INT8_VECTOR,
  INT16_VECTOR,
  INT32_VECTOR,
  STRING_VECTOR,
  CHAR_VECTOR
};

The first two cover the cases where the type is not supported or the key is not found. A user could then use that output and decide to call decode with an appropriately typed object. I think it should be possible to implement something like this with rather minimal changes.
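To make the proposed lookup concrete, here is a minimal sketch of how a guessType call could report the two failure cases and the decodable types. It does not use msgpack or the real mmtf::MapDecoder: ToyMapDecoder, Value, and keys() are hypothetical stand-ins, with a std::variant playing the role of the stored msgpack object (requires C++17).

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Mirrors the enum proposed above.
enum DecodableType {
    NOT_SUPPORTED, NO_KEY, FLOAT, INT32, CHAR, STRING,
    FLOAT_VECTOR, INT8_VECTOR, INT16_VECTOR, INT32_VECTOR,
    STRING_VECTOR, CHAR_VECTOR
};

// Hypothetical stand-in for a stored msgpack object. std::monostate
// represents anything the decoder cannot handle.
using Value = std::variant<std::monostate, float, std::int32_t, char,
                           std::string, std::vector<float>,
                           std::vector<std::int8_t>, std::vector<std::int16_t>,
                           std::vector<std::int32_t>,
                           std::vector<std::string>, std::vector<char>>;

// Hypothetical toy decoder, NOT the mmtf-cpp MapDecoder.
class ToyMapDecoder {
public:
    explicit ToyMapDecoder(std::map<std::string, Value> data)
        : data_map_(std::move(data)) {}

    // Report which type the value stored under key could be decoded to.
    DecodableType guessType(const std::string& key) const {
        const auto it = data_map_.find(key);
        if (it == data_map_.end()) return NO_KEY;
        // One entry per alternative of Value, in declaration order.
        static const DecodableType kByIndex[] = {
            NOT_SUPPORTED, FLOAT, INT32, CHAR, STRING, FLOAT_VECTOR,
            INT8_VECTOR, INT16_VECTOR, INT32_VECTOR, STRING_VECTOR,
            CHAR_VECTOR};
        return kByIndex[it->second.index()];
    }

    // List all keys so that a caller can iterate over unknown properties.
    std::vector<std::string> keys() const {
        std::vector<std::string> result;
        for (const auto& entry : data_map_) result.push_back(entry.first);
        return result;
    }

private:
    std::map<std::string, Value> data_map_;
};
```

A caller would loop over keys(), switch on guessType(key), and only then decode into a matching container. A real implementation would have to inspect the msgpack object type and, for binary fields, the MMTF codec header instead of a variant index.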

What do you guys think?

@gtauriello
Collaborator Author

One more thing that came to my mind. This change would also require some extra convenience functionality in MapDecoder to iterate over the keys in the map (or just a getter function for the private string-key/msgpack-object map).

@speleo3
Collaborator

speleo3 commented Jul 18, 2018

@gtauriello - do you have some example code showing how you would use this?

@gtauriello
Collaborator Author

Ok I cooked up something quickly to read and normalize any vector of values for per-residue/per-atom/... quantities (e.g. to use as color maps).

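// forward declaration (the definition follows below)
template<typename T>
std::vector<float>
getNormalizedValues(mmtf::MapDecoder& md, msgpack::object* data);

// input obj is assumed to be a msgpack map (e.g. groupProperties field as in rcsb/mmtf#32)
// output is key/vector pairs for coloring
std::map<std::string, std::vector<float>>
getNormalizedVectorMap(const msgpack::object& obj) {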
    // each vector has values in [0,1] for coloring
    std::map<std::string, std::vector<float>> color_maps;
    // parse all keys
    mmtf::MapDecoder md(obj);
    for (auto& key_data : md.getDataMap()) {
        const std::string& key = key_data.first;
        msgpack::object* data = key_data.second;
        switch (md.guessType(key)) {
        case mmtf::MapDecoder::FLOAT_VECTOR: {
            color_maps[key] = getNormalizedValues<float>(md, data);
            break;
        }
        case mmtf::MapDecoder::INT8_VECTOR: {
            color_maps[key] = getNormalizedValues<int8_t>(md, data);
            break;
        }
        case mmtf::MapDecoder::INT16_VECTOR: {
            color_maps[key] = getNormalizedValues<int16_t>(md, data);
            break;
        }
        case mmtf::MapDecoder::INT32_VECTOR: {
            color_maps[key] = getNormalizedValues<int32_t>(md, data);
            break;
        }
        default: {
            // silently skip rest or write message...
            break;
        }
        }
    }
    return color_maps;
}

// Templatized function mapping numeric values to [0,1]
template<typename T>
std::vector<float>
getNormalizedValues(mmtf::MapDecoder& md, msgpack::object* data) {
    std::vector<float> normalized_values;
    std::vector<T> values;
    md.decode(data, values);
    // somehow normalize values into normalized_values
    return normalized_values;
}

// As described above
DecodableType MapDecoder::guessType(const std::string& key);
// Access to internal data_map_
const std::map<std::string, msgpack::object*>& MapDecoder::getDataMap();
// Convenience function to decode msgpack::object* directly
template<typename T>
void MapDecoder::decode(msgpack::object* obj, T& target);
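The "somehow normalize" step in getNormalizedValues is left open above. One common choice is min-max scaling, sketched below as a standalone helper. The name minMaxNormalize and the handling of empty or constant input (all zeros) are assumptions made for illustration, not part of the proposal.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Map numeric values linearly onto [0,1] (min-max normalization).
// Assumption: an empty input yields an empty output and a constant
// input maps to all zeros, since there is no range to scale by.
template<typename T>
std::vector<float> minMaxNormalize(const std::vector<T>& values) {
    std::vector<float> normalized(values.size(), 0.0f);
    if (values.empty()) return normalized;
    const auto min_max = std::minmax_element(values.begin(), values.end());
    const float lowest = static_cast<float>(*min_max.first);
    const float range = static_cast<float>(*min_max.second) - lowest;
    if (range == 0.0f) return normalized;
    for (std::size_t i = 0; i < values.size(); ++i) {
        normalized[i] = (static_cast<float>(values[i]) - lowest) / range;
    }
    return normalized;
}
```

Because the helper is templated on the element type, the same code would serve the FLOAT_VECTOR, INT8_VECTOR, INT16_VECTOR, and INT32_VECTOR cases of the switch.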

@danpf
Collaborator

danpf commented Jul 20, 2018

The only problem in your example is that it's difficult to make a 1:1 map of atomProperties in C++ because it could be a list of strings or floats. I like the idea, I'm just not exactly sure how to go about it. Maybe return a map from each key to the enum of the mmtf::MapDecoder::TYPE, and just let the user handle that?

@gtauriello
Collaborator Author

@danpf The idea is that in something like C++ you will have to convert the input data into a shared format anyway (actually that's true for any programming language if you want common functionality from generic input). Of course my example above could easily be expanded to also color based on input with a DecodableType of STRING_VECTOR if you wanted to add a legend or so. The guessType approach can help with anything that we already know how to decode (incl. binary encodings). For the rest (i.e. NOT_SUPPORTED above), the user has access to the msgpack::object, so you can always go for the msgpack type directly and do something custom...
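As an illustration of the STRING_VECTOR case mentioned here, the sketch below converts string labels into the same shared format (floats in [0,1]) and keeps a legend. CategoryColoring and colorByCategory are hypothetical names, and spreading the sorted distinct labels evenly over [0,1] is just one possible convention.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical result type: one color value per input entry plus a legend.
struct CategoryColoring {
    std::vector<float> values;            // value in [0,1] for each label
    std::map<std::string, float> legend;  // distinct label -> assigned value
};

// Assign each distinct label an evenly spaced value in [0,1].
CategoryColoring colorByCategory(const std::vector<std::string>& labels) {
    CategoryColoring result;
    // Collect the distinct labels (std::map keeps them sorted).
    for (const std::string& label : labels) result.legend[label] = 0.0f;
    // Spread the distinct labels evenly; a single label maps to 0.
    const std::size_t count = result.legend.size();
    std::size_t rank = 0;
    for (auto& entry : result.legend) {
        entry.second = (count > 1)
            ? static_cast<float>(rank) / static_cast<float>(count - 1)
            : 0.0f;
        ++rank;
    }
    // Look up the assigned value for every input entry.
    result.values.reserve(labels.size());
    for (const std::string& label : labels) {
        result.values.push_back(result.legend[label]);
    }
    return result;
}
```

The values vector can go into the same color_maps container as the numeric properties, while the legend tells the viewer which label each value stands for.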

@speleo3
Collaborator

speleo3 commented Jul 22, 2018

I suggest we wait to add such functionality until we have a use case. Without a use case (some application which wants to use this API) we might engineer in the wrong direction.

@gtauriello gtauriello added the wait for usecase Waiting for a real usecase scenario for the proposed enhancement label Jul 23, 2018
3 participants