Define a default policy for single input/target file scenario #32

Open
cappelletto opened this issue Apr 18, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@cappelletto
Owner

It is possible to use the same CSV file as both the target (labels) file and the input (latents) file. However, the dataLoader object will duplicate the entries during the join operation. We can default to forcing join_left when a single input file is provided (or when the same file name is given for both, which is equivalent).

This might require either:

  • Providing a CLI option to define a single input/target file at invocation time
  • Detecting name duplication at runtime and enabling the join_left option
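
A minimal sketch of how both options could fit together, assuming a Python command-line entry point. The flag names `--input` and `--target` and the helper `should_force_join_left` are hypothetical and are not taken from the project's current CLI; only the idea of enabling join_left comes from this issue.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flags, used here only for illustration.
    parser = argparse.ArgumentParser(description="single input/target file sketch")
    parser.add_argument("--input", required=True,
                        help="CSV with latents (and metadata)")
    parser.add_argument("--target", default=None,
                        help="CSV with labels; omit to reuse the input file")
    return parser


def should_force_join_left(input_csv: str, target_csv: Optional[str] = None) -> bool:
    """Return True when input and target refer to the same file."""
    if target_csv is None:
        # Option 1: only one file was given at invocation time.
        return True
    # Option 2: detect name duplication at runtime. resolve() normalizes
    # relative segments, so "./data.csv" and "data.csv" compare equal.
    return Path(input_csv).resolve() == Path(target_csv).resolve()


def parse_policy(argv: Sequence[str]) -> bool:
    """Parse the arguments and report whether join_left should be enabled."""
    args = build_parser().parse_args(argv)
    return should_force_join_left(args.input, args.target)
```

With this shape, omitting `--target` and passing the same path twice lead to the same decision, so the loader only needs to read one boolean.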

We will always assume that the input CSV (latents) contains the relevant metadata fields that we want to propagate to the exported dataframe.

@cappelletto cappelletto added the enhancement New feature or request label Jun 18, 2023