Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to make new DBsource available? #426

Open
AllamSudhakara opened this issue Nov 18, 2021 · 10 comments
Open

How to make new DBsource available? #426

AllamSudhakara opened this issue Nov 18, 2021 · 10 comments

Comments

@AllamSudhakara
Copy link

Hi, I am trying to connect to SQL Server and Presto. I did the following changes in my local. I am unable to connect to SQL Server.

  1. I changed enum and included SqlServer and Presto.
  2. Changed backend pom file for getting the drivers from Maven
  3. I had to add these two enum values to new_source.html as they are not included automatically

Is there any other configuration that I need to do for in order to achieve successful connection?
Thanks in advance.

@thorsten-papenbrock
Copy link
Member

Hi Allam,

it has actually been quite a while since we last tested and worked on the database connectivity. So before you proceed, you should test if the other connections to, for instance, a MySQL database still work. At the moment, we do not have the capacity to actively maintain the project (unfortunately) so some dependencies to still evolving external systems, such as the databases, might not work any more.

If the database connective in general still works, then your described changes should be fine. After all, Metanome simply uses Java's sql DriverManager to connect to a database via a provided URL:

this.dbConnection = DriverManager.getConnection(this.dbUrl, this.userName, this.password);

So maybe the URL is not provided in the expected form?
Also, do you get a specific error when you build Metanome?
Or does an error show up in the Metanome console output or in the Frontend when you make an connection attempt?

Best,
Thorsten

@AllamSudhakara
Copy link
Author

Hi Thorsten, Thanks for the reply. Interestingly no error message is showing up on screen but some error message comes up on the console. I will debug to see where that error message is being thrown. DefaultDatabaseConnection method error messages are not the ones that came up on console. It could be some version conflict. I will try MySQL today to see if my build actually works.

@AllamSudhakara
Copy link
Author

Hi Thorsten, Error on the console is "de.metanome.backend.resources.ParameterResource.retrieveParameters(java.lang.String), is not resolvable to a concrete type."

@thorsten-papenbrock
Copy link
Member

Mmmmh, that looks like a different issue. The parameter loading is part of the algorithm management and usually not related to the data input sources of the algorithms.
Metanome asks the algorithms for the parameters that they require to be executed. That call seems to fail in your setup. In the past, loading failures have often been caused by incompatible Metanome vs. Algorithms versions. In the most basic case, the algorithms implemented an older version of the Metanome interface. But they might also have build with different Java versions.
To debug this, you should build both Metanome and the algorithm(s) that you want to plug into your setup on the same host. After building Metanome, simply checkout the algorithm repo as well and run the build process also on that repo. Then use these algorithm jars for your testing.
If it still does not work, check when this error occurs. If it has something to do with the database connection, then the error should show up when the database input is registered or a profiling run is actually started.
To rule out the case that your issue is related to some very different problem, first try to run on a simple CSV file. Then, run on MySQL and, finally, test your own connection.

@AllamSudhakara
Copy link
Author

Thanks again. I already checked with CSV and it works. I used the input files that came along with the solution. I downloaded 1.2 release and the corresponding algorithm for testing. I also checked with mysql and the error is same. It is not an error it is warning in console. The full message is "WARNING: Return type, java.util.ArrayList<de.metanome.algorithm_integration.configuration.ConfigurationRequirement>, of method, public java.util.ArrayList> de.metanome.backend.resources.ParameterResource.retrieveParameters(java.lang.String), is not resolvable to a concrete type."
When the connection to MySQL is done, I expected Table Inputs (Choose 1) to be populated but it is not getting populated. I will try building one algorithm but it seems problem could be mainly in the DB connection. The warning message most probably is towards selecting an algorithm if not already selected.
Could you please check 1.2 release if it can connect to jdbc:mysql://localhost:3306/sakila? I installed mysql this morning on windows 10.

@AllamSudhakara
Copy link
Author

Hi Thorsten, If you could make out time for including few more DB connections such as SQLServer, Presto, Teradata, etc.,. and make small improvements for pwd hashing etc., it would help lot many users trying to profile bigdata. Hopefully, you could help harden this framework for big enterprises data governance.

@thorsten-papenbrock
Copy link
Member

"When the connection to MySQL is done, I expected Table Inputs (Choose 1) to be populated but it is not getting populated."
That is a good idea, but not something Metanome actually automatically does. It never got out of the research prototype stage and misses a ton of features that would actually make it a comfortable tool to use. This is one of this features. So to make it work, you need to specify the table(s) that you want to profile in the database manually. This is simply because there is no unified way to query the available tables and we did not take the effort to program all the different ways of doing retrieving these tables. I your case, this is good, because you don't need to implement that for SQLServer, Presto, Teradata, etc either. Once you created a connection, add another datasource, select table and the connection that you just created and then enter a table name. Then use that table as an input for some algorithm. This is the happy path that should work for all database systems. If that throws an error, we need to check again. We have database connections and database tables as inputs, because some algorithms can work on database connections directly and find the tables themselves. Other algorithms do not know databases at all, so that Metanome reads them and passes the records to the algorithm.

"If you could make out time for including few more DB connections"
Freeing up time for projects isn't really easier for researcher than for engineers in industry, unfortunately. We need to acquire the necessary funding before we can continue. But I am in contact with Felix and if the interest by funding agencies is high enough, you will see some development in the future ;-)

@AllamSudhakara
Copy link
Author

"So to make it work, you need to specify the table(s) that you want to profile in the database manually". Did you mean to specify table names manually from UI? I did not see an option of how to do it. If you meant a file to be edited, please provide details on which file needs to be edited. I will test against MySQL. Thanks for your continued support on this issue.

@thorsten-papenbrock
Copy link
Member

Sorry for the late response. You can add a table in the following way:

  1. Start the Metanome tool and open the UI in your webbrowser: http://localhost:8080/#/new
  2. Click the "+" sign in the data source panel.
  3. Click "TABLE INPUT".
  4. Select the database connection that your created earlier.
  5. Click the "Table Name" input field and enter the table name as it is listed in the database.
  6. (Optional) Enter a describing "Comment" for you table.
  7. Klick "Save"
  8. Select the table that you just created as an input for the next execution. You find it in the data source panel when you scroll to the bottom of the list.
    Hope this works for you :-)

@PoisonousBromineChan
Copy link

Hello, I am trying this method, but still encountered some errors. Could you please help me solve my problem #436 ? Thanks in advance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants