backport libthrift update to fix CVE-2019-0205 & CVE-2020-13949 #53429
Conversation
@LaurentGoderre, thanks for creating this backport PR. I think the backport decision depends on the CVE severity and the compatibility risk. As the author of the original patch, I'm neutral on it, but if you really want to do such a backport, please include #50022 too. PS: I backported it to our internal Spark 3 branch, and it has been running well for over 6 months in our production cluster and many customer clusters, so I'm confident about the patch itself. The risk comes from the libthrift API change, which might surprise Spark downstream projects or users who use libthrift directly. cc'ing more PMC members to make a decision - @dongjoon-hyun @LuciferYang @wangyum @sarutak
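To make the compatibility risk concrete, here is a hedged illustration (not code from this PR): starting with libthrift 0.14, several transport constructors such as `TSocket(String, int)` declare the checked `TTransportException`, so downstream code that compiled against 0.13 no longer compiles until it handles or propagates that exception.

```java
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransportException;

public class ThriftApiChangeExample {
    public static TSocket open(String host, int port) {
        // Against libthrift 0.13 this compiled without a try/catch;
        // since 0.14 the TSocket constructor declares TTransportException,
        // so callers must now handle or propagate the checked exception.
        try {
            return new TSocket(host, port);
        } catch (TTransportException e) {
            throw new RuntimeException("Failed to create Thrift socket", e);
        }
    }
}
```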
Commit message of the cherry-picked patch (apache#50022):

…essage.size`

What changes were proposed in this pull request?

Partly port HIVE-26633 for the Spark HMS client: respect `hive.thrift.client.max.message.size` if present and the value is positive.

> Thrift client configuration for max message size. 0 or -1 will use the default defined in the Thrift library. The upper limit is 2147483648 bytes (or 2gb).

Note: it's a Hive configuration; I follow the convention of not documenting it on the Spark side.

Why are the changes needed?

1. THRIFT-5237 (0.14.0) changes the max thrift message size from 2GiB to 100MiB
2. HIVE-25098 (4.0.0) upgrades Thrift from 0.13.0 to 0.14.1
3. HIVE-25996 (2.3.10) backports HIVE-25098 to branch-2.3
4. HIVE-26633 (4.0.0) introduces `hive.thrift.client.max.message.size`
5. SPARK-47018 (4.0.0) upgrades Hive from 2.3.9 to 2.3.10

Thus, Spark's HMS client does not respect `hive.thrift.client.max.message.size` and has a fixed max thrift message size of 100MiB, so users may hit the "MaxMessageSize reached" exception when accessing Hive tables with a large number of partitions. See the discussion in apache#46468 (comment).

Does this PR introduce any user-facing change?

No, it tackles a regression introduced by an unreleased change, namely SPARK-47018. The added code only takes effect when the user configures `hive.thrift.client.max.message.size` explicitly.

How was this patch tested?

This must be tested manually, as the current Spark UT does not cover remote HMS cases. I constructed a test case in a testing Hadoop cluster with a remote HMS.

Firstly, create a table with a large number of partitions.

```
$ spark-sql --num-executors=6 --executor-cores=4 --executor-memory=1g \
    --conf spark.hive.exec.dynamic.partition.mode=nonstrict \
    --conf spark.hive.exec.max.dynamic.partitions=1000000

spark-sql (default)> CREATE TABLE p PARTITIONED BY (year, month, day)
STORED AS PARQUET AS
SELECT /*+ REPARTITION(200) */ * FROM (
  (SELECT CAST(id AS STRING) AS year FROM range(2000, 2100)) JOIN
  (SELECT CAST(id AS STRING) AS month FROM range(1, 13)) JOIN
  (SELECT CAST(id AS STRING) AS day FROM range(1, 31)) JOIN
  (SELECT 'this is some data' AS data)
);
```

Then tune `hive.thrift.client.max.message.size` and run a query that triggers a `getPartitions` thrift call. For example, when it is set to `1kb`, the query throws `TTransportException: MaxMessageSize reached`, and the exception disappears after boosting the value.

```
$ spark-sql --conf spark.hive.thrift.client.max.message.size=1kb

spark-sql (default)> SHOW PARTITIONS p;
...
2025-02-20 15:18:49 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect (1 of 1) after 1s. listPartitionNames
org.apache.thrift.transport.TTransportException: MaxMessageSize reached
	at org.apache.thrift.transport.TEndpointTransport.checkReadBytesAvailable(TEndpointTransport.java:81) ~[libthrift-0.16.0.jar:0.16.0]
	at org.apache.thrift.protocol.TProtocol.checkReadBytesAvailable(TProtocol.java:67) ~[libthrift-0.16.0.jar:0.16.0]
	at org.apache.thrift.protocol.TBinaryProtocol.readListBegin(TBinaryProtocol.java:297) ~[libthrift-0.16.0.jar:0.16.0]
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_names_result$get_partition_names_resultStandardScheme.read(ThriftHiveMetastore.java) ~[hive-metastore-2.3.10.jar:2.3.10]
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_names_result$get_partition_names_resultStandardScheme.read(ThriftHiveMetastore.java) ~[hive-metastore-2.3.10.jar:2.3.10]
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_names_result.read(ThriftHiveMetastore.java) ~[hive-metastore-2.3.10.jar:2.3.10]
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:88) ~[libthrift-0.16.0.jar:0.16.0]
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition_names(ThriftHiveMetastore.java:2458) ~[hive-metastore-2.3.10.jar:2.3.10]
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition_names(ThriftHiveMetastore.java:2443) ~[hive-metastore-2.3.10.jar:2.3.10]
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionNames(HiveMetaStoreClient.java:1487) ~[hive-metastore-2.3.10.jar:2.3.10]
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[?:?]
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
	at java.base/java.lang.reflect.Method.invoke(Method.java:569) ~[?:?]
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173) ~[hive-metastore-2.3.10.jar:2.3.10]
	at jdk.proxy2/jdk.proxy2.$Proxy54.listPartitionNames(Unknown Source) ~[?:?]
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[?:?]
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
	at java.base/java.lang.reflect.Method.invoke(Method.java:569) ~[?:?]
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2349) ~[hive-metastore-2.3.10.jar:2.3.10]
	at jdk.proxy2/jdk.proxy2.$Proxy54.listPartitionNames(Unknown Source) ~[?:?]
	at org.apache.hadoop.hive.ql.metadata.Hive.getPartitionNames(Hive.java:2461) ~[hive-exec-2.3.10-core.jar:2.3.10]
	at org.apache.spark.sql.hive.client.Shim_v2_0.getPartitionNames(HiveShim.scala:976) ~[spark-hive_2.13-4.1.0-SNAPSHOT.jar:4.1.0-SNAPSHOT]
	...
```

Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50022 from pan3793/SPARK-49489.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
(cherry picked from commit 2ea5621)
Signed-off-by: yangjie01 <[email protected]>
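To illustrate the mechanism this kind of patch relies on, here is a hedged sketch (the class and method names are illustrative, not the actual SPARK-49489 code): libthrift 0.14+ enforces its per-connection message cap through `TConfiguration`, so a client that wants to honor `hive.thrift.client.max.message.size` has to pass the configured value when it builds the transport.

```java
import org.apache.thrift.TConfiguration;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransportException;

public class HmsTransportExample {
    // Builds a metastore socket honoring a user-configured max message size.
    // A positive value overrides libthrift's 100 MiB default (THRIFT-5237);
    // 0 or -1 falls back to the library default, mirroring the Hive semantics
    // quoted above.
    public static TSocket openSocket(String host, int port, long maxMessageSizeBytes)
            throws TTransportException {
        int maxMessageSize = maxMessageSizeBytes > 0
                ? (int) Math.min(maxMessageSizeBytes, Integer.MAX_VALUE)
                : TConfiguration.DEFAULT_MAX_MESSAGE_SIZE;
        TConfiguration conf = new TConfiguration(
                maxMessageSize,
                TConfiguration.DEFAULT_MAX_FRAME_SIZE,
                TConfiguration.DEFAULT_RECURSION_DEPTH);
        return new TSocket(conf, host, port);
    }
}
```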
@pan3793 done! The CVEs being fixed are both high severity, so it would be worth it.
Based on the provided links, would simply upgrading libthrift to version 0.14.0 resolve the issue? Or is that also a version that breaks libthrift's API compatibility? I hope we can first try the approach of only upgrading libthrift.
@LuciferYang I tried that at first, but it was not compatible with the version of Hive.
What changes were proposed in this pull request?
Backport the libthrift bump from Spark 4.0 to branch-3.5. This change is a subset of #46468.
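As a quick way to confirm the bump took effect at runtime, one might use a check like the following (a hedged sketch; `LibthriftVersionCheck` is an illustrative name, not part of this PR):

```java
import org.apache.thrift.TConfiguration;

public class LibthriftVersionCheck {
    public static void main(String[] args) {
        // TConfiguration only exists since libthrift 0.14.0, so merely loading
        // it proves the jar is at least 0.14.0, which postdates the fixes for
        // both CVEs addressed by this PR.
        Package pkg = TConfiguration.class.getPackage();
        // Implementation-Version comes from the jar manifest and may be null
        // if the manifest does not carry it.
        System.out.println("libthrift version: " + pkg.getImplementationVersion());
    }
}
```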
Why are the changes needed?
To fix two high-severity CVEs in libthrift (CVE-2019-0205 and CVE-2020-13949).
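As I understand it, the denial-of-service vector behind CVE-2020-13949 is a peer claiming a huge container size in a tiny message, which triggers a giant allocation; libthrift 0.14+ mitigates this with the MaxMessageSize check. The toy below is a self-contained illustration of that idea, not libthrift code; it mirrors the spirit of `TEndpointTransport.checkReadBytesAvailable`, which is where the "MaxMessageSize reached" error in the stack trace above comes from.

```java
public class MaxMessageGuardDemo {
    // libthrift's default cap since 0.14.0 (THRIFT-5237) is 100 MiB.
    static final long MAX_MESSAGE_SIZE = 100L * 1024 * 1024;

    // Reject a peer's size claim before allocating anything for it,
    // so a tiny malicious message can no longer trigger a huge allocation.
    static void checkReadBytesAvailable(long claimedBytes) {
        if (claimedBytes > MAX_MESSAGE_SIZE) {
            throw new IllegalStateException("MaxMessageSize reached");
        }
    }

    public static void main(String[] args) {
        checkReadBytesAvailable(1024);            // fine
        checkReadBytesAvailable(2_147_483_648L);  // throws before any allocation
    }
}
```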
Does this PR introduce any user-facing change?
No.
Was this patch authored or co-authored using generative AI tooling?
No.