[DATATYPE] Refactor data-types #2562
The superduper-framework only provides the most basic datatypes:
Plan 1: For different databackends, we can configure datatype conversion relationships, such as:
Thus, datatype conversions will occur in the following areas:
Example: JSON

Insert data:

input_data = {"data": {"a": "b"}}
input_schema = Schema({"data": "dict"})
schema = _convert_schema(input_schema)
## schema = Schema({"data": "json"})
encode_data = schema.encode(input_data)
## encode_data = {"data": '{"a": "b"}'}

Query data:

input_data = {"data": '{"a": "b"}'}
input_schema = Schema({"data": "dict"})
schema = _convert_schema(input_schema)
## schema = Schema({"data": "json"})
decode_data = schema.decode(input_data)
## decode_data = {"data": {"a": "b"}}

Plan 2: We can also define this through configuration files, within the preset datatype definitions.

datatypes:
  vector: ibis.datatype.sql_datatype
  # or vector: postgresql.datatype.sqldatatype
  dict: json

class Vector:
    def __post_init__(self):
        datatype_config = CFG.xxxx
        obj = cls(...)
        if self.__class__.__name__ in datatype_config:
            cls_real_datatype = ...  # import the real_datatype class
            self.real_datatype = cls_real_datatype(...)
        else:
            self.real_datatype = None

    def encode_data(...):
        # Delegate to the configured datatype when one is set.
        return (self.real_datatype or self).encode_data()

    def decode_data(...):
        # Delegate to the configured datatype when one is set.
        return (self.real_datatype or self).decode_data()
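The Plan 1 round trip (a `dict` field that becomes `json` on a given databackend) can be sketched in a few lines. This is only an illustration under stated assumptions: the `Schema` class here is a toy stand-in rather than superduper's own `Schema`, and `CONVERSIONS`, `CODECS`, `convert_schema`, and the backend names `"sql"` and `"mongodb"` are hypothetical names introduced for the example.

```python
import json

# Hypothetical per-databackend conversion table (not superduper's real config):
# framework-level type name -> storage-level type name.
CONVERSIONS = {
    "sql": {"dict": "json"},
    "mongodb": {},  # assumed to store dicts natively, so nothing is converted
}

# Hypothetical (encoder, decoder) pair for each storage-level type.
CODECS = {
    "json": (json.dumps, json.loads),
}


class Schema:
    """Toy stand-in for the Schema used in the example above."""

    def __init__(self, fields):
        self.fields = dict(fields)

    def encode(self, record):
        return {k: self._apply(k, v, 0) for k, v in record.items()}

    def decode(self, record):
        return {k: self._apply(k, v, 1) for k, v in record.items()}

    def _apply(self, key, value, direction):
        # direction 0 selects the encoder, 1 selects the decoder.
        codec = CODECS.get(self.fields.get(key))
        return codec[direction](value) if codec else value


def convert_schema(schema, backend):
    """Rewrite field types according to the backend's conversion table."""
    mapping = CONVERSIONS.get(backend, {})
    return Schema({k: mapping.get(t, t) for k, t in schema.fields.items()})


input_schema = Schema({"data": "dict"})
schema = convert_schema(input_schema, "sql")
encoded = schema.encode({"data": {"a": "b"}})
decoded = schema.decode(encoded)
```

With the `"sql"` table the `data` field is stored as a JSON string and read back as a dict. With the `"mongodb"` table nothing is converted, so the record passes through unchanged. Keeping the table outside the datatype classes is what distinguishes this from Plan 2, where each datatype looks up its own replacement.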
Regarding If the data is bytes, convert it to base64 and add the prefix Then, during decode_data: If the input data is a str with the
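A minimal sketch of the bytes handling described in this comment, assuming a marker string of `"b64:"`. The comment as captured here does not show the actual prefix, so `PREFIX` is a hypothetical placeholder, and the function names simply mirror `encode_data` and `decode_data` from the discussion.

```python
import base64

# Hypothetical marker; the real prefix is not shown in the comment above.
PREFIX = "b64:"


def encode_data(value):
    """Turn bytes into a prefixed base64 string; leave other values alone."""
    if isinstance(value, bytes):
        return PREFIX + base64.b64encode(value).decode("ascii")
    return value


def decode_data(value):
    """Reverse encode_data for strings that carry the prefix."""
    if isinstance(value, str) and value.startswith(PREFIX):
        return base64.b64decode(value[len(PREFIX):])
    return value
```

One caveat of any prefix scheme like this: an ordinary string that happens to start with the marker would be treated as encoded bytes on the way out, so the marker should be chosen to be unlikely in real data.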
Issue is that not all output types are supported by every database, and to support certain operations, the data needs to be in the correct format.
@kartik4949 @jieguangzhou to provide input.
- _fields based on annotations File, Blob, etc. #2649
- Listener #2499
- CFG.bytes_encoding to databackend #2628
- DataType without all parameters and use it for "switch" Vector type #2629
- pickle, dill etc. #2634
- _artifact_schema inline with _schema #2635
- lazy in encodables #2636
- bytes_encoding databackend dependent and only for inline data #2641