Db2 z/OS: UnicodeDecodeError Exception thrown in conn_errormsg() #876
Comments
Does the same symptom appear in other functions, e.g ibm_db_conn_warn(), ibm_db_stmt_warn(), ibm_db_conn_error(), ibm_db_stmt_error(), and some other places? |
@g-haas When you specified |
It is possible this is related to issue #852 where we are still trying to understand why the specification of I can confirm that in a z/OS USS shell at least for function
Sample output from that code:
The " SQLCODE=-199" part at the end of the value of Here are the bytes printed for
And here is that same string decoded with IBM1047 with
|
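To illustrate the decoding behavior described in this comment, here is a minimal Python sketch. It does not use the bytes from the report; it only encodes the " SQLCODE=-199" fragment mentioned above. It assumes Python's built-in cp037 codec as a stand-in for IBM-1047, because stock CPython does not ship a cp1047 codec. The two code pages agree on the letters, digits, space, "=" and "-" used here, but differ for a few characters such as square brackets. The helper name show_decoding is hypothetical.

```python
# Stand-in for IBM-1047 (assumption: cp037 matches it for the characters used here).
EBCDIC_CODEC = "cp037"


def show_decoding(raw: bytes) -> dict:
    """Try to decode raw as ASCII and as EBCDIC and report what happens."""
    results = {}
    for codec in ("ascii", EBCDIC_CODEC):
        try:
            results[codec] = raw.decode(codec)
        except UnicodeDecodeError as exc:
            # This is the kind of error reported for conn_errormsg().
            results[codec] = f"UnicodeDecodeError: {exc.reason}"
    return results


# Simulate an EBCDIC-encoded fragment of an error message.
raw = " SQLCODE=-199".encode(EBCDIC_CODEC)
print(raw.hex(" "))
for codec, outcome in show_decoding(raw).items():
    print(f"{codec}: {outcome!r}")
```

Most EBCDIC letters and digits sit above 0x7F, so a strict ASCII decode is rejected as soon as it reaches the first letter, while the EBCDIC decode recovers the text.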
@g-haas Could you please check tagging of your ODBC ini file? Please make sure tagging of your odbc.ini file is "binary" or "mixed":
If file is tagged text (chtag -t -c IBM1047 $DSNAOINI) the Please update us about the test result after tag correction. Thanks. |
Python uses UTF-8 internally so ODBC must use ASCII or UNICODE, so there is no need to add encode/decode or patch ibm_db.c. We've found several customers with incorrect encoding/tagging of the odbc.ini file, which prevents ODBC from properly reading the file thus failing to honor your CURRENTAPPENSC=ASCII setting. Can you post the encoding of your odbc.ini file? Also, the -805 suggests the ODBC plans and packages are not bound. |
@jthyssenrocket I have worked with the customer, and the -805 is intended to reproduce the error. This issue is not about fixing the -805 error; it's about the encoding issue. Actually, I would expect that even if CURRENTAPPENSC is not honoured, the code should be able to print an error message without the hassle of several encode().decode() call chains. |
I am working with ODBC z/OS development, and the conclusion right now is that it is not possible for the python driver to check if current application encoding scheme is set to ASCII or UNICODE before starting to issue queries. There is no ODBC/CLI API to check this on z/OS. Python is using unicode internally and does not expect to get error messages or data from ODBC back in EBCDIC. There are no encoding issues if the ODBC ini file is tagged as required and has CURRENTAPPENSC=ASCII or UNICODE. |
@jthyssenrocket So you cannot do a read from Python, with unicode as the codeset parameter, to check whether odbc.ini is correctly tagged or throws an error on reading, and evaluate the CURRENTAPPENSC parameter in there?
And catch a possible exception and throw it to the user before doing an actual call to Db2? |
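A check along these lines could live in the application rather than in ibm_db. The sketch below is only an illustration under several assumptions: it inspects the file's bytes and does not read the z/OS file tag (that still needs chtag or ls -T), it uses cp037 as a stand-in for IBM-1047, and the function names read_currentappensc and preflight are hypothetical. It accepts only the values ASCII and UNICODE, as stated earlier in this thread.

```python
from pathlib import Path

# Stand-in for IBM-1047 (assumption; stock CPython has no cp1047 codec).
EBCDIC_CODEC = "cp037"


def read_currentappensc(path):
    """Return (codec, value) for the first CURRENTAPPENSC entry; value is None if absent."""
    raw = Path(path).read_bytes()
    try:
        codec, text = "ascii", raw.decode("ascii")
    except UnicodeDecodeError:
        # Not plain ASCII: assume the file content is EBCDIC.
        codec, text = EBCDIC_CODEC, raw.decode(EBCDIC_CODEC)
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().upper() == "CURRENTAPPENSC":
            return codec, value.strip().upper()
    return codec, None


def preflight(path):
    """Raise before any Db2 call if CURRENTAPPENSC is missing or unexpected."""
    codec, value = read_currentappensc(path)
    if value not in ("ASCII", "UNICODE"):
        raise RuntimeError(
            f"{path}: CURRENTAPPENSC={value!r} (content read as {codec}); "
            "expected ASCII or UNICODE"
        )
    return value
```

An application could call preflight(os.environ["DSNAOINI"]) before ibm_db.connect(). A passing check does not prove that ODBC itself can read the file, because that depends on the tagging.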
ibm_db is not reading any files. We issue ODBC/CLI calls to the C API implemented by ODBC for z/OS, so we need a C function we can call that returns the current application encoding scheme. Edit: we're working on getting all relevant documentation updated to highlight the encoding requirement for the ODBC ini file (Db2 for z/OS manual + python ibm_db docs + node.js ibm_db docs). |
@jthyssenrocket Maybe ibm_db should read odbc.ini to avoid all sorts of obscure errors that are caused by incorrectly tagged odbc.ini? |
It is outside the scope of ibm_db to read configuration files for the underlying drivers. We don't read db2dsdriver.cfg files on Windows either. It is not a unique requirement for ibm_db. Any (unicode) ODBC client that relies on an odbc.ini file in USS has this requirement. It doesn't matter if it is a python, node.js, COBOL, or PL/I program, etc. |
We originally used We just observed that, when setting
If you look closely, it's a different one than in the original description (
Sure:
Given the above observation regarding the slightly different
Of course, you're right about the meaning of the |
|
We use the ODBC/CLI "W" (wide) APIs, e.g., SQLConnectW. The "W" APIs are supposed to always return UTF-16 content, independent of CURRENTAPPENSC. See https://www.ibm.com/docs/en/db2-for-zos/13?topic=data-db2-odbc-unicode-support. Would it be possible to collect APPLTRACE=1 and DIAGTRACE=1 output and upload it here? There might be some invalid characters in the message returned by Db2. |
We have documented steps to install ibm_db on z/OS here in details. Please check it. Thanks. |
There is a deeper issue: we're using the SQLGetDiagRec API to retrieve error messages (not the SQLGetDiagRecW UTF-16 API), so messages are returned in varying codepages. This is likely not an issue on distributed platforms, but on z/OS it appears we're getting a mix of EBCDIC, ASCII, and UNICODE error messages back, while our code assumes the message is ASCII and uses StringOBJ_FromASCII() to convert it to a Python string. It seems it would be more robust to use SQLGetDiagRecW, which always returns UTF-16 error messages. The recommendation from ODBC for z/OS development is also to use SQLGetDiagRecW. We are calling SQLGetDiagRec in lots of places, so this will be a larger change, though. |
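The difference between the two retrieval paths can be sketched in Python without touching the driver. This is only an illustration, not the ibm_db implementation: the helper names are hypothetical, the wide buffer is assumed to hold UTF-16 code units in the platform's native byte order (big-endian on z/OS), and cp037 again stands in for an EBCDIC code page.

```python
import sys

# Assumption: wide-character diagnostic buffers use native-endian UTF-16.
NATIVE_UTF16 = "utf-16-be" if sys.byteorder == "big" else "utf-16-le"


def decode_wide_diag(buffer: bytes, length_in_chars: int) -> str:
    """Decode the first length_in_chars UTF-16 code units of a wide buffer."""
    return buffer[: 2 * length_in_chars].decode(NATIVE_UTF16)


def decode_narrow_diag_as_ascii(buffer: bytes) -> str:
    """Mimic the ASCII assumption; raises UnicodeDecodeError for EBCDIC input."""
    return buffer.decode("ascii")


message = "SQLCODE=-805"

# Wide path: the encoding is fixed, so decoding needs no guesswork.
wide = message.encode(NATIVE_UTF16) + b"\x00\x00" * 4  # trailing NUL padding
print(decode_wide_diag(wide, len(message)))

# Narrow path: the same text in EBCDIC breaks the ASCII assumption.
try:
    decode_narrow_diag_as_ascii(message.encode("cp037"))
except UnicodeDecodeError as exc:
    print("narrow path failed:", exc.reason)
```

With a fixed wire encoding, the caller never has to guess the codepage, which is the robustness argument made above.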
Opened Jira bug https://jira.rocketsoftware.com/browse/DBC-14843 to fix this issue in ibm_db driver. Thanks. |
@g-haas, Can you please try connecting with an ibm_db connection string (providing wrong details for the hostname, username, or password) and share the logs here? |
@g-haas Can you please share the update ? |
@g-haas, Could you please share the issue you are facing here with all the required information? Thanks |
Problem description
Apparently, when trying to connect to a database server using ibm_db.connect(), ibm_db assumes error strings returned by the database server to be ASCII-encoded. This, however, may not be the case for Db2 on z/OS, as z/OS traditionally uses EBCDIC encoding. As a result, one cannot retrieve a detailed error description using ibm_db.conn_errormsg().
Here’s what we did. We deliberately specified an invalid plan name in the ODBC config file and called ibm_db.connect(). Obviously, the connection failed. We then wanted to get some information about the error. Since, in our opinion, it was too early for calling ibm_db.stmt_errormsg() at this point, we tried ibm_db.conn_errormsg(). This call, however, resulted in an exception being thrown and the Python runtime complaining about a UnicodeDecodeError. This also happened if we did not even try to print() the error message: the call to the ibm_db.conn_errormsg() method by itself triggered the error.
Having a look at ibm_db.c, we noticed that ibm_db_conn_errormsg calls StringOBJ_FromASCII(). ibm_db_stmt_errormsg (backing ibm_db.stmt_errormsg(), which works fine), in contrast, calls StringOBJ_FromStr(). Hence, we patched ibm_db_conn_errormsg so that it calls StringOBJ_FromStr(), too, built our custom version of the ibm_db package, and reran our test. Of course, the connection still fails. However, now the call to ibm_db.conn_errormsg() succeeds and error details can be retrieved. After appropriate codepage translations, the text can even be printed.
So, why do the two _errormsg() methods treat the error messages they receive from the server differently? Is this on purpose?
Environment
uname --> OS/390
uname -m --> 8561
db2-connection-test.py
This is the small example we tested with:
odbc.ini
Result (unpatched ibm_db)
Here's the output produced with the stock ibm_db package:
Result (patched ibm_db)
After applying the modifications mentioned above, the following output is yielded: