Unable to make predictions using catboost model with either classification or regression #1681

Open
tbergot opened this issue Mar 5, 2025 · 0 comments


tbergot commented Mar 5, 2025

Expected

We can make predictions using the catboost model with PGML

Problem

We get an error when running predictions against a model trained with catboost, for both the 'classification' and 'regression' tasks:

SQL Error [XX000]: ERROR: ValueError: cannot reshape array of size 4 into shape (0)
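
For readers unfamiliar with the message: it has the form of a NumPy-style reshape failure, where four values (the same count as the four features passed to pgml.predict in the reproduction) are asked to fit a target shape that holds zero elements. The sketch below only illustrates that size check in plain Python. `check_reshape` is a hypothetical helper written for this illustration, not part of PostgresML, CatBoost, or NumPy, and it does not show where in the stack the zero-sized shape is computed.

```python
from math import prod


def check_reshape(size, shape):
    """Hypothetical helper: mimic the size check behind a reshape.

    Returns True when `size` elements exactly fill `shape`,
    otherwise raises ValueError with a message in the same form
    as the one reported in this issue.
    """
    if prod(shape) != size:
        dims = ", ".join(str(d) for d in shape)
        raise ValueError(
            f"cannot reshape array of size {size} into shape ({dims})"
        )
    return True


# Four feature values arranged as one row of four columns: sizes match.
ok = check_reshape(4, (1, 4))

# Four feature values against a target shape with zero elements: sizes differ.
try:
    check_reshape(4, (0,))
except ValueError as err:
    message = str(err)
```

Under this reading, the four input values are fine and the unexpected part is the target shape of zero, which is consistent with the note below that changing the number of features passed to predict does not help.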

Additional notes

  • We were able to make it work using xgboost (classification and regression); it only breaks with catboost
  • We also tried passing a different number of arguments (features) to the predict method, but it still fails

Reproduction steps

  1. Run the "quick start with docker" command (without the GPU option, as we don't need GPUs):
docker run \
    -it \
    -v postgresml_data:/var/lib/postgresql \
    -p 5433:5432 \
    -p 8000:8000 \
    ghcr.io/postgresml/postgresml:2.10.0 \
    sudo -u postgresml psql -d postgresml
  2. In another terminal, run the following SQL script using psql:
CREATE EXTENSION pgml cascade;
-- Simple test data
CREATE TABLE pgml.training_data (
    id SERIAL PRIMARY KEY,
    feature1 REAL,
    feature2 REAL,
    feature3 REAL,
    target INTEGER
);
INSERT INTO pgml.training_data (feature1, feature2, feature3, target)
VALUES
    (1.2, 3.4, 5.6, 0),
    (2.3, 4.5, 6.7, 1),
    (3.4, 5.6, 7.8, 0),
    (4.5, 6.7, 8.9, 1),
    (5.6, 7.8, 9.0, 0);
-- Train and use an xgboost model
SELECT * FROM pgml.train(
    project_name => 'classification_xgboost',
	task => 'classification',
    relation_name => 'pgml.training_data',
    y_column_name => 'target',
    algorithm => 'xgboost'
);
select pgml.predict(
    project_name => 'classification_xgboost',
    features => ARRAY[10::integer, 0.1::real, 0.3::real, 0.1::real]
) AS prediction;
-- Works well
-- Now with catboost
SELECT * FROM pgml.train(
    project_name => 'classification_catboost',
	task => 'classification',
    relation_name => 'pgml.training_data',
    y_column_name => 'target',
    algorithm => 'catboost'
);
select pgml.predict(
    project_name => 'classification_catboost',
    features => ARRAY[10::integer, 0.1::real, 0.3::real, 0.1::real]
) AS prediction;
-- Throws error:
-- SQL Error [XX000]: ERROR: ValueError: cannot reshape array of size 4 into shape (0)

Any help regarding this issue would be appreciated. Thank you very much.
