Add support for reading YXDB files #14602
Merged
13 commits
- 62f725e Initial Alteryx YXDB reader integration. (jdunkerley)
- a339ff5 Revert to simpler exception based error handling. (jdunkerley)
- 6026b98 Formatting. (jdunkerley)
- f7a60c1 Use 0.1.3 to get last 2 fixes. (jdunkerley)
- 0bbc7cf Legal Review. (jdunkerley)
- 794e258 Format (jdunkerley)
- 93b7d3f Format. (jdunkerley)
- d548aa4 Docs (jdunkerley)
- 76d47a2 Add spatial to geojson functions. (jdunkerley)
- b37ba78 Docs (jdunkerley)
- f2c545b Docs (jdunkerley)
- 4521b7e PR comments. (jdunkerley)
- d3003c2 Legal Review (jdunkerley)
22 changes: 22 additions & 0 deletions
...ibution/lib/Standard/Table/0.0.0-dev/THIRD-PARTY/uk.co.jdunkerley.yxdb-java-0.1.4/LICENSE
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| MIT License | ||
| | ||
| Copyright (c) 2022 tlarsendataguy | ||
| Copyright (c) 2025 jdunkerley | ||
| | ||
| Permission is hereby granted, free of charge, to any person obtaining a copy | ||
| of this software and associated documentation files (the "Software"), to deal | ||
| in the Software without restriction, including without limitation the rights | ||
| to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
| copies of the Software, and to permit persons to whom the Software is | ||
| furnished to do so, subject to the following conditions: | ||
| | ||
| The above copyright notice and this permission notice shall be included in all | ||
| copies or substantial portions of the Software. | ||
| | ||
| THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
| IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
| FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
| AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
| LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
| OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
| SOFTWARE. |
13 changes: 13 additions & 0 deletions
distribution/lib/Standard/Table/0.0.0-dev/docs/api/Alteryx_Format.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| ## Enso Signatures 1.0 | ||
| ## module Standard.Table.Alteryx_Format | ||
| - type Alteryx_Format | ||
| - Alteryx_Format | ||
| - for_file_write file:Standard.Base.Any.Any -> Standard.Base.Any.Any | ||
| - for_read file:Standard.Base.System.File_Format_Metadata.File_Format_Metadata -> Standard.Base.Any.Any | ||
| - get_dropdown_options -> Standard.Base.Any.Any | ||
| - get_name_patterns -> (Standard.Base.Data.Vector.Vector Standard.Base.System.File_Format.File_Name_Pattern) | ||
| - read self file:Standard.Base.Any.Any on_problems:Standard.Base.Errors.Problem_Behavior.Problem_Behavior -> Standard.Base.Any.Any | ||
| - read_stream self stream:Standard.Base.System.Input_Stream.Input_Stream metadata:Standard.Base.System.File_Format_Metadata.File_Format_Metadata= -> Standard.Base.Any.Any | ||
| - resolve constructor:Standard.Base.Any.Any -> Standard.Base.Any.Any | ||
| - spatial_column_to_geojson table:(Standard.Table.Table.Table&Standard.Table.In_Memory_Table.In_Memory_Table)= column:(Standard.Base.Data.Text.Text|Standard.Base.Data.Numbers.Integer)= -> Standard.Base.Any.Any | ||
| - spatial_to_geojson spatial_object:(Standard.Base.Data.Array.Array|Standard.Base.Data.Vector.Vector)= -> Standard.Base.Data.Text.Text |
159 changes: 159 additions & 0 deletions
distribution/lib/Standard/Table/0.0.0-dev/src/Alteryx_Format.enso
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,159 @@ | ||
| from Standard.Base import all | ||
| import Standard.Base.Enso_Cloud.Data_Link_Helpers | ||
| import Standard.Base.Errors.Common.Type_Error | ||
| import Standard.Base.Errors.Illegal_Argument.Illegal_Argument | ||
| import Standard.Base.Errors.Illegal_State.Illegal_State | ||
| import Standard.Base.Errors.File_Error.File_Error | ||
| import Standard.Base.Runtime.Context | ||
| import Standard.Base.System.File.Generic.Writable_File.Writable_File | ||
| import Standard.Base.System.File_Format.File_Name_Pattern | ||
| import Standard.Base.System.File_Format_Metadata.File_Format_Metadata | ||
| import Standard.Base.System.Input_Stream.Input_Stream | ||
| from Standard.Base.Metadata.Choice import Option | ||
| from Standard.Base.Metadata.Widget import Text_Input | ||
| | ||
| import project.Internal.Java_Problems | ||
| import project.Internal.Telemetry | ||
| import project.Internal.Widget_Helpers | ||
| import project.Table.Table | ||
| import project.In_Memory_Table.In_Memory_Table | ||
| from project.In_Memory_Table import from_java_table | ||
| | ||
| polyglot java import java.io.FileNotFoundException | ||
| polyglot java import java.lang.IllegalArgumentException | ||
| polyglot java import java.lang.IllegalStateException | ||
| polyglot java import org.enso.table.read.AlteryxYXDBReader | ||
| | ||
| ## A file format for reading Alteryx YXDB files. | ||
| type Alteryx_Format | ||
| Alteryx_Format | ||
| | ||
| ## --- | ||
| icon: Spatial | ||
| --- | ||
| Converts an Alteryx Spatial Object to a GeoJSON representation. | ||
| | ||
| ## Arguments | ||
| - spatial_object: The spatial object to convert, represented as a byte array. | ||
| | ||
| ## Returns | ||
| A GeoJSON representation of the spatial object. | ||
| spatial_to_geojson (spatial_object : Array | Vector = Missing_Argument.throw "spatial_object") -> Text = | ||
| AlteryxYXDBReader.spatialObjectToGeoJSON spatial_object | ||
| | ||
| ## --- | ||
| icon: Spatial | ||
| --- | ||
| Converts a column of Alteryx Spatial Objects to a GeoJSON representation. | ||
| | ||
| ## Arguments | ||
| - table: The table containing the spatial objects. | ||
| - column: The name of the column containing the spatial objects. | ||
| | ||
| ## Returns | ||
| An updated table with the specified column converted to GeoJSON. | ||
| @column _column_widget | ||
| spatial_column_to_geojson (table : Table & In_Memory_Table = Missing_Argument.throw "table") (column : Text | Integer = Missing_Argument.throw "column") = | ||
| input_column = table.at column | ||
| converted_column = input_column.map v-> Alteryx_Format.spatial_to_geojson v | ||
| table.set converted_column input_column.name | ||
| | ||
| ## --- | ||
| private: true | ||
| --- | ||
| Resolve an unresolved constructor to the actual type. | ||
| resolve : Function -> Alteryx_Format | Nothing | ||
| resolve constructor = | ||
| Panic.catch Type_Error (constructor:Alteryx_Format) _->Nothing | ||
| | ||
| ## --- | ||
| private: true | ||
| --- | ||
| If the File_Format supports reading from the file, return a configured | ||
| instance. | ||
| for_read : File_Format_Metadata -> Alteryx_Format | Nothing | ||
| for_read file:File_Format_Metadata = | ||
| case file.guess_extension of | ||
| ".yxdb" -> Alteryx_Format.Alteryx_Format | ||
| _ -> Nothing | ||
| | ||
| ## --- | ||
| private: true | ||
| --- | ||
| If this File_Format should be used for writing to that file, return a | ||
| configured instance. | ||
| for_file_write : Writable_File -> Alteryx_Format | Nothing | ||
| for_file_write file = | ||
| _ = file | ||
| Nothing | ||
| | ||
| ## --- | ||
| private: true | ||
| --- | ||
| get_dropdown_options : Vector Option | ||
| get_dropdown_options = [Option "Alteryx Database" "..Alteryx_Format"] | ||
| | ||
| ## --- | ||
| private: true | ||
| --- | ||
| get_name_patterns -> Vector File_Name_Pattern = | ||
| [File_Name_Pattern.Value "Alteryx Database" ["*.yxdb"]] | ||
| | ||
| ## --- | ||
| private: true | ||
| --- | ||
| Implements the `File.read` for this `File_Format` | ||
| read : File -> Problem_Behavior -> Any | ||
| read self file on_problems:Problem_Behavior = | ||
| result = _read_file file on_problems | ||
| result.if_not_error <| | ||
| Telemetry.log "File_Format.read" "Read file: format={}, output={}, row_count={}, column_count={}" ["Alteryx_Format", "Table", result.row_count, result.column_count] | ||
| result | ||
| | ||
| ## --- | ||
| private: true | ||
| --- | ||
| Implements decoding the format from a stream. | ||
| read_stream : Input_Stream -> File_Format_Metadata -> Any | ||
| read_stream self stream:Input_Stream (metadata : File_Format_Metadata = File_Format_Metadata.no_information) = | ||
| _ = metadata | ||
| | ||
| ## Currently stream must be materialised (but no actual reason...) | ||
| tmp_file = File.create_temporary_file "alteryx_database_read" ".yxdb" | ||
| | ||
| ## Write stream to temporary file | ||
| write_result = Context.Output.with_enabled <| | ||
| inner = Panic.catch Any (stream.write_to_file tmp_file) caught_panic-> Error.throw caught_panic.payload | ||
| if inner.is_error then tmp_file.delete | ||
| inner | ||
| Error.return_if_error write_result | ||
| | ||
| result = Panic.recover Any <| | ||
| self.read tmp_file ..Report_Warning | ||
| | ||
| ## Clean up temporary file | ||
| Context.Output.with_enabled <| | ||
| Panic.catch Any (tmp_file.delete) _->Nothing | ||
| | ||
| result | ||
| | ||
| private _read_file path on_problems:Problem_Behavior = | ||
| Data_Link_Helpers.as_file path file-> | ||
| Java_Problems.with_problem_aggregator on_problems java_problem_aggregator-> | ||
| java_table = Panic.catch Any handler=handle_java_exceptions <| | ||
| AlteryxYXDBReader.read file.path java_problem_aggregator | ||
| from_java_table java_table | ||
| | ||
| private handle_java_exceptions caught_panic = | ||
| error = case caught_panic.payload of | ||
| ex : IllegalArgumentException -> Illegal_Argument.Error ex.getMessage ex | ||
| ex : IllegalStateException -> Illegal_State.Error ex.getMessage ex | ||
| ex : FileNotFoundException -> File_Error.Not_Found ex.getMessage | ||
| _ -> Illegal_State.Error "Unexpected error reading Alteryx YXDB file." caught_panic.payload | ||
| Error.throw error | ||
| | ||
| private _column_widget arg cache=Nothing = | ||
| _ = arg | ||
| table = cache.if_not_nothing <| cache "table" | ||
| if table.is_nothing then Text_Input display=..Always else | ||
| Widget_Helpers.make_column_name_selector table display=..Always | ||
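
The `read_stream` method above writes the incoming stream to a temporary `.yxdb` file, reads that file, and then deletes it whether or not the read succeeded. The same pattern can be sketched in Java as follows. This is a minimal illustration, not code from the PR: the class `TempFileSpool` and its method `readViaTempFile` are hypothetical names, and the sketch relies only on `java.nio.file.Files`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.Function;

/** Hypothetical helper (not part of the PR): spool a stream to a temp file, read it, clean up. */
final class TempFileSpool {
  private TempFileSpool() {}

  /**
   * Copies the stream into a temporary file, applies the reader to that file,
   * and deletes the file afterwards even if the reader throws.
   */
  static <T> T readViaTempFile(InputStream stream, Function<Path, T> reader) {
    Path tmp;
    try {
      // Same prefix and suffix as the Enso code uses for its temporary file.
      tmp = Files.createTempFile("alteryx_database_read", ".yxdb");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    try {
      // createTempFile already created the file, so the copy must overwrite it.
      Files.copy(stream, tmp, StandardCopyOption.REPLACE_EXISTING);
      return reader.apply(tmp);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      try {
        Files.deleteIfExists(tmp);
      } catch (IOException ignored) {
        // Best-effort cleanup: a failed delete should not mask the read result.
      }
    }
  }
}
```

A caller would pass the stream plus a function that reads from a path, for example `TempFileSpool.<Long>readViaTempFile(in, p -> p.toFile().length())`. The `finally` block plays the role of the final `Panic.catch Any (tmp_file.delete) _->Nothing` step in the Enso code.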
116 changes: 116 additions & 0 deletions
std-bits/table/src/main/java/org/enso/table/read/AlteryxYXDBReader.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,116 @@ | ||
| package org.enso.table.read; | ||
| | ||
| import java.io.FileNotFoundException; | ||
| import java.io.IOException; | ||
| import java.nio.file.Files; | ||
| import java.nio.file.Path; | ||
| import java.time.format.DateTimeParseException; | ||
| import java.util.Arrays; | ||
| import java.util.stream.IntStream; | ||
| import org.enso.table.data.column.builder.Builder; | ||
| import org.enso.table.data.column.storage.type.AnyObjectType; | ||
| import org.enso.table.data.column.storage.type.BigDecimalType; | ||
| import org.enso.table.data.column.storage.type.BooleanType; | ||
| import org.enso.table.data.column.storage.type.DateTimeType; | ||
| import org.enso.table.data.column.storage.type.DateType; | ||
| import org.enso.table.data.column.storage.type.FloatType; | ||
| import org.enso.table.data.column.storage.type.IntegerType; | ||
| import org.enso.table.data.column.storage.type.StorageType; | ||
| import org.enso.table.data.column.storage.type.TextType; | ||
| import org.enso.table.data.column.storage.type.TimeOfDayType; | ||
| import org.enso.table.data.table.Column; | ||
| import org.enso.table.data.table.Table; | ||
| import org.enso.table.problems.ProblemAggregator; | ||
| import uk.co.jdunkerley.yxdb.Spatial; | ||
| import uk.co.jdunkerley.yxdb.YxdbField; | ||
| import uk.co.jdunkerley.yxdb.YxdbReader; | ||
| import uk.co.jdunkerley.yxdb.YxdbType; | ||
| | ||
| public final class AlteryxYXDBReader { | ||
| /** | ||
| * Reads an Alteryx YXDB file and returns its contents as a Table. | ||
| * | ||
| * @param path the path to the YXDB file. | ||
| * @return a Table containing the data from the YXDB file. | ||
| */ | ||
| public static Table read(String path, ProblemAggregator problemAggregator) | ||
| throws FileNotFoundException, IllegalArgumentException, IllegalStateException { | ||
| // Test that the path exists | ||
| if (!Files.exists(Path.of(path))) { | ||
| throw new FileNotFoundException(path); | ||
| } | ||
| | ||
| try (var yxdbReader = new YxdbReader(path)) { | ||
| var recordCount = yxdbReader.numRecords(); | ||
| var fields = yxdbReader.fields(); | ||
| var storageTypes = | ||
| Arrays.stream(fields).map(AlteryxYXDBReader::mapYXDBField).toArray(StorageType[]::new); | ||
| var storages = | ||
| Arrays.stream(storageTypes) | ||
| .map(st -> st.makeBuilder(recordCount, problemAggregator)) | ||
| .toArray(Builder[]::new); | ||
| | ||
| while (yxdbReader.next()) { | ||
| try { | ||
| for (int i = 0; i < storages.length; i++) { | ||
| var yxdbValue = | ||
| storageTypes[i] instanceof AnyObjectType | ||
| ? yxdbReader.readBlob(i) | ||
| : yxdbReader.read(i); | ||
| storages[i].append(yxdbValue); | ||
| } | ||
| } catch (IndexOutOfBoundsException _) { | ||
| throw new IllegalArgumentException( | ||
| "The YXDB file appears to be corrupted on row " + storages[0].getCurrentSize()); | ||
| } catch (DateTimeParseException _) { | ||
| throw new IllegalArgumentException( | ||
| "The YXDB file contains invalid date/time data on row " | ||
| + storages[0].getCurrentSize()); | ||
| } | ||
| } | ||
| | ||
| var columns = | ||
| IntStream.range(0, storages.length) | ||
| .mapToObj(i -> new Column(fields[i].name(), storages[i].seal())) | ||
| .toArray(Column[]::new); | ||
| return new Table(columns); | ||
| } catch (IllegalArgumentException exc) { | ||
| throw exc; | ||
| } catch (IOException exc) { | ||
| var message = exc.getMessage(); | ||
| throw new IllegalArgumentException(exc.getMessage(), exc); | ||
| } catch (Exception exc) { | ||
| throw new IllegalStateException("An unexpected error occurred: " + exc.getMessage(), exc); | ||
| } | ||
| } | ||
| | ||
| /** | ||
| * Converts a spatial object in byte array format to its GeoJSON representation. | ||
| * | ||
| * @param spatialObj the spatial object as a byte array. | ||
| * @return the GeoJSON representation of the spatial object. | ||
| */ | ||
| public static String spatialObjectToGeoJSON(byte[] spatialObj) { | ||
| return spatialObj == null ? null : Spatial.toGeoJson(spatialObj); | ||
| } | ||
| | ||
| private static StorageType<?> mapYXDBField(YxdbField field) { | ||
| return switch (field.yxdbType()) { | ||
| case YxdbType.BOOLEAN -> BooleanType.INSTANCE; | ||
| case YxdbType.BYTE -> IntegerType.INT_8; | ||
| case YxdbType.INT16 -> IntegerType.INT_16; | ||
| case YxdbType.INT32 -> IntegerType.INT_32; | ||
| case YxdbType.INT64 -> IntegerType.INT_64; | ||
| case YxdbType.FLOAT, YxdbType.DOUBLE -> FloatType.FLOAT_64; | ||
| case YxdbType.DECIMAL -> BigDecimalType.INSTANCE; | ||
| case YxdbType.STRING, YxdbType.WSTRING -> TextType.variableLengthWithLimit(field.size()); | ||
| case YxdbType.V_STRING, YxdbType.V_WSTRING -> TextType.VARIABLE_LENGTH; | ||
| case YxdbType.DATE -> DateType.INSTANCE; | ||
| case YxdbType.TIME -> TimeOfDayType.INSTANCE; | ||
| case YxdbType.DATETIME -> DateTimeType.INSTANCE; | ||
| case YxdbType.BLOB, YxdbType.SPATIAL_OBJ -> AnyObjectType.INSTANCE; | ||
| default -> | ||
| throw new IllegalStateException("Unsupported YXDB field type: " + field.yxdbType()); | ||
| }; | ||
| } | ||
| } | ||
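
The `mapYXDBField` switch above pairs each YXDB field type with a storage type. The sketch below restates that mapping as a standalone lookup so it can be read and run in isolation. `FieldKind` and the returned text labels are hypothetical stand-ins for `YxdbType` and the `StorageType` instances, chosen so the example compiles without the `yxdb-java` or Enso table libraries.

```java
/** Illustrative only: FieldKind and the string labels are stand-ins, not the PR's real types. */
final class YxdbTypeMappingSketch {
  private YxdbTypeMappingSketch() {}

  /** Mirrors the YxdbType constants referenced in the diff. */
  enum FieldKind {
    BOOLEAN, BYTE, INT16, INT32, INT64, FLOAT, DOUBLE, DECIMAL,
    STRING, WSTRING, V_STRING, V_WSTRING, DATE, TIME, DATETIME, BLOB, SPATIAL_OBJ
  }

  /** Returns a label for the storage type that the diff assigns to the given field kind. */
  static String storageLabel(FieldKind kind, int size) {
    return switch (kind) {
      case BOOLEAN -> "Boolean";
      case BYTE -> "Integer (8-bit)";
      case INT16 -> "Integer (16-bit)";
      case INT32 -> "Integer (32-bit)";
      case INT64 -> "Integer (64-bit)";
      // Both floating-point widths share one 64-bit float storage.
      case FLOAT, DOUBLE -> "Float (64-bit)";
      case DECIMAL -> "Decimal";
      // Fixed-size strings keep the declared field size as a length limit.
      case STRING, WSTRING -> "Text (limit " + size + ")";
      case V_STRING, V_WSTRING -> "Text (variable length)";
      case DATE -> "Date";
      case TIME -> "Time of day";
      case DATETIME -> "Date-time";
      // Blobs and spatial objects are kept as opaque objects (raw bytes).
      case BLOB, SPATIAL_OBJ -> "Any object";
    };
  }
}
```

For example, `YxdbTypeMappingSketch.storageLabel(FieldKind.STRING, 10)` yields a text label with a limit of 10. In the PR itself, columns mapped to `AnyObjectType` are the ones read with `readBlob` rather than `read`, which is why blob and spatial fields share a branch.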
Binary file not shown.
Binary file not shown.
Binary file not shown.