This repository was archived by the owner on Jul 25, 2024. It is now read-only.

Planned Upgrade: Optional flag to output SQL Compliant Column Names #29

@nickaustinlee

Prior versions of the Labelbox Connector for Databricks tried to preserve column names as they were expressed in the JSON output of Labelbox. For instance, "Labeled Data" was output as a column named "Labeled Data".

Downstream workflows sometimes need to access these columns in ways that make a space in the name impractical. Spaces also have to be removed before the table can be saved as a Delta Lake table. For now, developers can work around both issues by renaming the columns themselves.

To make this easier for downstream developers without breaking existing code that may reference column names with spaces, we are exploring the addition of a flag, "SQL_friendly_columns", which will output dataframes with the following characteristics:

  • All spaces in column names will be replaced with underscores.
  • The dot notation that we currently use to express nesting will be replaced with underscores.
  • Character case will be preserved to match the case used in the Labelbox JSON.

Examples:

"Labeled Data" --> "Labeled_Data"
"Label.objects.title" --> "Label_objects_title"
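
The renaming rules above can be sketched in a few lines of Python. This is only an illustration of the proposed behavior, not the connector's implementation: the helper names `to_sql_friendly` and `sql_friendly_columns` are hypothetical, and the collision check is an added assumption, since two different names (for example "a b" and "a_b") could map to the same result.

```python
def to_sql_friendly(name: str) -> str:
    """Replace spaces and nesting dots with underscores, preserving case."""
    return name.replace(" ", "_").replace(".", "_")


def sql_friendly_columns(columns):
    """Rename every column; fail loudly if two names collapse into one.

    The collision check is an assumption of this sketch, not part of the proposal.
    """
    renamed = [to_sql_friendly(c) for c in columns]
    if len(set(renamed)) != len(renamed):
        raise ValueError("column names collide after renaming")
    return renamed


print(sql_friendly_columns(["Labeled Data", "Label.objects.title"]))
# ['Labeled_Data', 'Label_objects_title']
```

Until such a flag exists, a similar mapping could be applied to a Spark DataFrame `df` with `df.toDF(*sql_friendly_columns(df.columns))`.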
