ValueError: DataFrame.dtypes for data must be int, float or bool. #41

Closed
ChuliangXiao opened this issue Sep 5, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@ChuliangXiao

I tried to use LGBMClassifier:

from snowflake.ml.modeling.lightgbm import LGBMClassifier

clf = LGBMClassifier(
    input_cols=INPUT_COLUMNS,
    label_cols=LABEL_COLUMN,
    output_cols=OUTPUT_COLUMN,
    **params
)
clf.fit(train_df)

result = clf.predict(test_df)

result.groupBy(OUTPUT_COLUMN).count().show()

and got the following error:

raise ValueError("DataFrame.dtypes for data must be int, float or bool.\n"
ValueError: DataFrame.dtypes for data must be int, float or bool.
Did not expect the data types in the following fields: ... some fields...

even though all of those fields are DoubleType() or LongType(). I still got the same error after casting them with .cast(FloatType()).
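
One way to narrow this down is to check the Python type of the values that actually reach the model, rather than the declared column types. The sketch below is a minimal, standard-library-only illustration. It assumes the rows have been pulled locally as dictionaries. The helper names `find_non_numeric_columns` and `coerce_decimals` and the sample data are hypothetical, and the idea that `Decimal` values are involved is an assumption, not something confirmed in this thread.

```python
from decimal import Decimal

# Types assumed acceptable, mirroring the "int, float or bool" wording of the error.
NUMERIC_TYPES = (int, float, bool)


def find_non_numeric_columns(rows):
    """Return {column: sorted type names} for columns holding other value types."""
    offenders = {}
    for row in rows:
        for name, value in row.items():
            # None is skipped here; missing values are a separate concern.
            if value is None or isinstance(value, NUMERIC_TYPES):
                continue
            offenders.setdefault(name, set()).add(type(value).__name__)
    return {name: sorted(types) for name, types in offenders.items()}


def coerce_decimals(rows):
    """Return a copy of rows with Decimal values converted to float."""
    return [
        {k: float(v) if isinstance(v, Decimal) else v for k, v in row.items()}
        for row in rows
    ]


# Hypothetical sample data, not taken from the issue.
sample = [
    {"AGE": 31, "BALANCE": Decimal("10.50"), "ACTIVE": True},
    {"AGE": 45, "BALANCE": Decimal("3.25"), "ACTIVE": False},
]

print(find_non_numeric_columns(sample))                   # columns to inspect
print(find_non_numeric_columns(coerce_decimals(sample)))  # after coercion
```

If a check like this reports any column, the declared type and the runtime value type differ for that column, which would be useful detail to include in a reproduction.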

@sfc-gh-wzhao added the bug label on Sep 6, 2023
@sfc-gh-xjiang

Hi @ChuliangXiao, this error may come from the LightGBM model itself. Would you mind looking at solutions like this or this? If they don't work for you, could you provide more details about your problem, such as the schema of your table? That would make it easier for me to reproduce the error.

@ChuliangXiao
Author

I had the same issue with XGBoost.
It works fine when I manually create a stored procedure in which the Snowflake DF is converted to a Pandas DF with .toPandas().

@sfc-gh-xjiang removed their assignment on Sep 14, 2023
@sfc-gh-xjiang

Hey @ChuliangXiao, how is everything going? I want to follow up on this issue and see whether you can provide a dataset or table schema so that I can reproduce it. If the problem is solved, I can close this issue.

@sfc-gh-wzhao
Collaborator

Hi @ChuliangXiao, we are closing this stale issue. If you have any updates, please re-open it. Thank you for your support.
