@RekGRpth
Collaborator

@RekGRpth RekGRpth commented Sep 25, 2025

Enable build as PG extension for GP/GG

Commit c2ac230 added Madlib building as an extension for Postgres. This patch
adds Madlib building as an extension for Greenplum/Greengage.

Ticket: ADBDEV-8377

@bandetto
Collaborator

bandetto commented Sep 28, 2025

Build fails for Greengage without #1, so I suggest rewording the description.

After doing:

$ cd madlib
$ mkdir build && cd build
$ cmake -G Ninja ..
$ ninja
$ sudo ninja install
$ ninja extension
$ sudo ninja extension-install

CREATE EXTENSION/DROP EXTENSION madlib doesn't create the madlib schema and its functions for me. madpack -p greenplum install works as expected, without creating the extension. Please provide the expected result or the steps to build and install the extension from source.
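
For anyone reproducing this, one way to see whether `CREATE EXTENSION madlib` registered anything is to query the catalog. The sketch below is not part of the patch: the helper name `madlib_extension_check_sql` is hypothetical, and it only prints a query against the standard `pg_extension` and `pg_namespace` catalogs, assuming the extension is named `madlib` as in the comment above.

```shell
# Hypothetical helper (not from the patch): prints a catalog query that
# reports whether an extension named "madlib" is registered, which version
# it has, and which schema it was installed into.
madlib_extension_check_sql() {
  cat <<'SQL'
SELECT e.extname, e.extversion, n.nspname AS schema
FROM pg_extension e
JOIN pg_namespace n ON n.oid = e.extnamespace
WHERE e.extname = 'madlib';
SQL
}
```

Piping the output into `psql` (for example `madlib_extension_check_sql | psql -d postgres`, where the database name is an assumption) should return one row if the extension is registered and no rows otherwise.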

@RekGRpth
Collaborator Author

Build fails for Greengage without #1, so I suggest rewording the description.

According to the task, it makes no sense to install an extension that is not for Greengage, so this patch depends on patch #1.

@RekGRpth
Collaborator Author

After doing:

$ cd madlib
$ mkdir build && cd build
$ cmake -G Ninja ..
$ ninja
$ sudo ninja install
$ ninja extension
$ sudo ninja extension-install

According to the documentation, madlib as an extension should be installed like this:

./configure
make extension-install

Also, according to the same documentation, several Python packages must be installed first.

Collaborator

@red1452 red1452 left a comment

I built madlib as an extension with the following commands:

./configure
make extension-install

But it requires applying the patch. Without the patch there are compilation errors:

In file included from /home/evgeniy/gpdb/madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/dbconnector.hpp:275:
/home/evgeniy/gpdb/madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/Allocator_impl.hpp: In instantiation of ‘madlib::dbconnector::postgres::MutableArrayHandle<T> madlib::dbconnector::postgres::Allocator::allocateArray(std::size_t) const [with T = double; std::size_t = long unsigned int]’:
/home/evgeniy/gpdb/madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/EigenIntegration_impl.hpp:311:46:   required from ‘ArrayType* madlib::dbconnector::postgres::VectorToNativeArray(const Eigen::MatrixBase<Derived>&) [with Derived = Eigen::Matrix<double, -1, 1>; ArrayType = ArrayType]’
/home/evgeniy/gpdb/madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/TypeTraits_impl.hpp:650:5:   required from here
/home/evgeniy/gpdb/madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/Allocator_impl.hpp:78:50: error: ‘std::array<long unsigned int, 1> numElements’ has incomplete type
   78 |         std::array<std::size_t, BOOST_PP_INC(n)> numElements = {{ \
      |                                                  ^~~~~~~~~~~
/usr/include/boost/preprocessor/repetition/limits/repeat_256.hpp:18:36: note: in expansion of macro ‘MADLIB_ALLOCATE_ARRAY_DEF’
   18 | # define BOOST_PP_REPEAT_1_1(m, d) m(2, 0, d)
      |                                    ^
/home/evgeniy/gpdb/madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/Allocator_impl.hpp: In instantiation of ‘madlib::dbconnector::postgres::MutableArrayHandle<T> madlib::dbconnector::postgres::Allocator::allocateArray(std::size_t) const [with T = int; std::size_t = long unsigned int]’:
/home/evgeniy/gpdb/madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/EigenIntegration_impl.hpp:311:46:   required from ‘ArrayType* madlib::dbconnector::postgres::VectorToNativeArray(const Eigen::MatrixBase<Derived>&) [with Derived = Eigen::Matrix<int, -1, 1>; ArrayType = ArrayType]’
/home/evgeniy/gpdb/madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/TypeTraits_impl.hpp:663:5:   required from here
/home/evgeniy/gpdb/madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/Allocator_impl.hpp:78:50: error: ‘std::array<long unsigned int, 1> numElements’ has incomplete type
   78 |         std::array<std::size_t, BOOST_PP_INC(n)> numElements = {{ \
      |                                                  ^~~~~~~~~~~
/usr/include/boost/preprocessor/repetition/limits/repeat_256.hpp:18:36: note: in expansion of macro ‘MADLIB_ALLOCATE_ARRAY_DEF’
   18 | # define BOOST_PP_REPEAT_1_1(m, d) m(2, 0, d)
      |                                    ^

Madlib creates all objects in the "public" schema:

                                                                                                         List of functions
 Schema |        Name        | Result data type  |                                                                            Argument data types                                                                            | Type 
--------+--------------------+-------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------
 public | pca_sparse_project | character varying |                                                                                                                                                                           | func
 public | pca_sparse_project | void              | source_table text, pc_table text, out_table text, row_id text, col_id text, val_id text, row_dim integer, col_dim integer                                                 | func
 public | pca_sparse_project | void              | source_table text, pc_table text, out_table text, row_id text, col_id text, val_id text, row_dim integer, col_dim integer, residual_table text                            | func
 public | pca_sparse_project | void              | source_table text, pc_table text, out_table text, row_id text, col_id text, val_id text, row_dim integer, col_dim integer, residual_table text, result_summary_table text | func
 public | pca_sparse_project | character varying | usage_string text                                                                                                                                                         | func
(5 rows)

postgres=# \d+
                                   List of relations
 Schema |          Name           |   Type   |  Owner  | Storage |  Size  | Description 
--------+-------------------------+----------+---------+---------+--------+-------------
 public | migrationhistory        | table    | evgeniy | heap    | 32 kB  | 
 public | migrationhistory_id_seq | sequence | evgeniy |         | 128 kB | 
(2 rows)

Is this correct? Should we add the changes from the patch?

@RekGRpth
Collaborator Author

Should we add the changes from the patch?

I updated the branch and this is no longer needed.

@RekGRpth
Collaborator Author

Madlib creates all objects in the "public" schema:

No, Madlib is installed into the default schema in the search path, but you can always set the schema explicitly when creating the extension.
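
As a minimal sketch of what setting the schema explicitly could look like: `CREATE EXTENSION ... WITH SCHEMA` is standard PostgreSQL syntax, but the helper name `create_extension_sql` is hypothetical, and the sketch assumes the extension can be placed in the chosen schema, as stated in the comment above.

```shell
# Hypothetical helper (not from the patch): prints the SQL needed to install
# the madlib extension into a chosen schema instead of the first schema in
# search_path. $1 is the target schema; it is assumed to be a simple,
# unquoted identifier such as "madlib".
create_extension_sql() {
  schema="$1"
  printf 'CREATE SCHEMA IF NOT EXISTS %s;\n' "$schema"
  printf 'CREATE EXTENSION madlib WITH SCHEMA %s;\n' "$schema"
}
```

For example, `create_extension_sql madlib | psql -d postgres` (the database name is an assumption) would create the `madlib` schema if needed and then create the extension's objects there instead of in `public`.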

@RekGRpth
Collaborator Author

I built madlib as an extension with the following commands:

on greengage, I hope?

@RekGRpth
Collaborator Author

I built madlib as an extension with the following commands:

The system Boost needs to be removed; Madlib will provide its own.

@red1452
Collaborator

red1452 commented Sep 30, 2025

Should we add the changes from the patch?

I updated the branch and this is no longer needed.

I built madlib with the changes from #1, but it also requires the changes from the patch:

commit 4eceaad9f8bbaac373d182f5aed2a5c97c46e150 (HEAD -> ADBDEV-8377)
Merge: 82b0e092 daa72a1d
Author: Evgeniy Ratkov <[email protected]>
Date:   Tue Sep 30 13:30:45 2025 +0300

    Merge remote-tracking branch 'origin/madlib2-master' into ADBDEV-8377

commit daa72a1da1ebc73c5d3578e06bbcdec45c3f82d2 (origin/madlib2-master)
Author: Georgy Shelkovy <[email protected]>
Date:   Tue Sep 30 14:42:23 2025 +0500

    Make Madlib build Greengage compatible (#1)
    
    Madlib 2 already supported Greenplum 6 and Greenplum 7. Add support for
    Greengage 6 and Greengage 7:
    
    1) Fix regular expressions in src/madpack/utilities.py,
    src/ports/greenplum/cmake/FindGreenplum.cmake, and
    src/ports/postgres/modules/utilities/utilities.py_in, and add additional
    condition in src/madpack/madpack.py.
    
    2) Madlib 2 only supports Python 3, remove specific environment variables from
    src/madpack/madpack.py, and fix files src/madpack/sort-module.py,
    src/ports/postgres/CMakeLists.txt, and src/CMakeLists.txt.
    
    3) Fix an indent error in the src/ports/postgres/madpack/SQLCommon.m4_in file.
    
    Ticket: ADBDEV-8376

commit 82b0e09290b5fd766fdedd5678f5e8859f5ebade (origin/ADBDEV-8377)
Author: Georgy Shelkovy <[email protected]>
Date:   Thu Sep 25 09:04:01 2025 +0500

    Enable build as PG extension for GP/GG

@red1452
Collaborator

red1452 commented Sep 30, 2025

I built madlib as an extension with the following commands:

on greengage, I hope?

of course

@RekGRpth
Collaborator Author

but it also requires the changes from the patch

No, I compiled it successfully without additional patches, as described in the documentation.

@red1452
Collaborator

red1452 commented Sep 30, 2025

I built madlib as an extension with the following commands:

The system Boost needs to be removed; Madlib will provide its own.

Madlib does not download the required libraries; there is just a compilation error:

In file included from /home/evgeniy/gpdb/madlib/src/ports/greenplum/dbconnector/dbconnector.hpp:38,
                 from /home/evgeniy/gpdb/madlib/src/modules/assoc_rules/assoc_rules.cpp:11:
/home/evgeniy/gpdb/madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/dbconnector.hpp:79:10: fatal error: boost/mpl/if.hpp: No such file or directory
   79 | #include <boost/mpl/if.hpp>
      |          ^~~~~~~~~~~~~~~~~~
compilation terminated.

@RekGRpth
Collaborator Author

compilation error:

Can you show a full log?

@red1452
Collaborator

red1452 commented Sep 30, 2025

Can you show a full log?

log.txt

@RekGRpth
Collaborator Author

Can you show a full log?

log.txt

Found Boost: /usr/local/lib/cmake/Boost-1.82.0/BoostConfig.cmake (found version "1.82.0")

Madlib will provide its own boost_1_61_0.tar.gz!
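
A quick way to check a captured configure log for this situation is to look for the `Found Boost:` line quoted above. The following is only a sketch: the helper name `found_system_boost` is hypothetical, and it assumes the log contains CMake's message in the same form as in log.txt.

```shell
# Hypothetical helper (not from the patch): prints the Boost location that
# CMake reported in a captured log, or nothing if no "Found Boost:" line
# is present. $1 is the path to the log file.
found_system_boost() {
  # On the first line containing "Found Boost: ", keep only the path that
  # follows the marker, print it, and stop reading the file.
  sed -n '/Found Boost: /{s/.*Found Boost: \([^ ]*\).*/\1/p;q;}' "$1"
}
```

Running `found_system_boost log.txt` on the log from this thread would print the path to the system `BoostConfig.cmake`; an empty result means CMake did not report a system Boost in that log.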

@dkovalev1
Collaborator

It would be great to put the steps for building the extension, loading it, verifying that it was loaded, and running the tests into a description file. README.md has some information, but it's not related to Greengage or a local demo cluster.

@RekGRpth
Collaborator Author

RekGRpth commented Oct 1, 2025

It would be great to put the steps for building the extension, loading it, verifying that it was loaded, and running the tests into a description file. README.md has some information, but it's not related to Greengage or a local demo cluster.

I suggest fixing the README in a separate task, because this task is only about building as a PG extension for GP/GG.

@dkovalev1
Collaborator

I suggest fixing the README in a separate task, because this task is only about building as a PG extension for GP/GG.

I think that completed work should include instructions on how to use it; otherwise we and potential users will waste a lot of time in the future discovering how to do basic things.

@RekGRpth
Collaborator Author

RekGRpth commented Oct 1, 2025

I suggest fixing the README in a separate task, because this task is only about building as a PG extension for GP/GG.

I think that completed work should include instructions on how to use it; otherwise we and potential users will waste a lot of time in the future discovering how to do basic things.

The current README files don't contain instructions for installing Madlib as an extension, even for Postgres, although this feature has been available for quite some time. They're also quite outdated and don't even contain instructions for a simple, non-extension installation of Madlib. So I've created a separate task for this.

@dkovalev1
Collaborator

okay

@RekGRpth RekGRpth merged commit 9be630e into madlib2-master Oct 1, 2025