42 changes: 37 additions & 5 deletions docs/content/docs/connectors/table/formats/protobuf.md
@@ -151,10 +151,8 @@ Format Options
<td>
If this value is set to true, the format will read empty values as the default values defined in the proto file.
If the value is set to false, the format will generate null values if the data element does not exist in the binary protobuf message.
If proto syntax is proto3, users need to set this to true when using protobuf versions lower than 3.15 as older versions do not support
checking for field presence which can cause runtime compilation issues. Additionally, primitive types will be set to default values
instead of null as field presence cannot be checked for them. Please be aware that setting this to true will cause the deserialization
performance to be much slower depending on schema complexity and message size.
With Flink's current protobuf version (4.32.1), field presence is properly supported for proto3, allowing null handling for non-primitive types.
Please be aware that setting this to true will cause the deserialization performance to be much slower depending on schema complexity and message size.
</td>
</tr>
<tr>
@@ -291,4 +289,38 @@ OneOf field
During serialization, nothing guarantees that at most one Flink field of a given one-of group holds a valid value.
Fields are set in the order of the Flink schema, so within a one-of group a field that comes later in the schema overrides an earlier one.
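
The override behavior can be illustrated with a small toy model. This is only a sketch, not Flink's serializer: the class `OneOfOverrideSketch`, the method `resolveOneOf`, and the field names `a` and `b` are hypothetical, and it assumes that null fields are skipped rather than written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of one-of serialization order; not Flink's actual serializer. */
final class OneOfOverrideSketch {

    /**
     * Walks the fields of one one-of group in schema order and keeps the last
     * non-null one, mimicking how successive setter calls on a one-of group
     * replace each other. Returns null if no field carries a value.
     */
    static Map.Entry<String, Object> resolveOneOf(LinkedHashMap<String, Object> fieldsInSchemaOrder) {
        Map.Entry<String, Object> winner = null;
        for (Map.Entry<String, Object> field : fieldsInSchemaOrder.entrySet()) {
            if (field.getValue() != null) {
                winner = field; // a later field overrides an earlier one
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Object> row = new LinkedHashMap<>();
        row.put("a", 1);      // first field of the hypothetical one-of group
        row.put("b", "text"); // second field of the same group
        System.out.println(resolveOneOf(row).getKey()); // prints "b"
    }
}
```

If only one field of the group should be written, the other fields of that group can be left null in the Flink row.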

You can refer to [Language Guide (proto2)](https://developers.google.com/protocol-buffers/docs/proto) or [Language Guide (proto3)](https://developers.google.com/protocol-buffers/docs/proto3) for more information about Protobuf types.
Supported Protobuf Versions
------------

Flink uses protobuf-java 4.32.1 (corresponding to Protocol Buffers version 32), which includes support for:

- **Proto2 and Proto3 syntax**: Traditional `syntax = "proto2"` and `syntax = "proto3"` definitions
- **Protobuf Editions**: The new `edition = "2023"` and `edition = "2024"` syntax introduced in Protocol Buffers v27+
- **Improved proto3 field presence detection**: Presence of proto3 `optional` fields can be checked directly, which protobuf versions older than 3.15 did not support
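
How field presence interacts with the `protobuf.read-default-values` option described under Format Options can be sketched as a toy model. This is not Flink's deserializer: the class `ReadDefaultValuesSketch`, the method `readField`, and its parameters are hypothetical, and a `null` first argument stands for "field absent from the binary message".

```java
/** Toy model of null vs. default handling; not Flink's actual deserializer. */
final class ReadDefaultValuesSketch {

    /**
     * @param decoded           value found in the message, or null if the field is absent
     * @param protoDefault      default value defined for the field in the proto file
     * @param tracksPresence    whether presence can be checked for this field
     * @param readDefaultValues the value of the read-default-values option
     */
    static <T> T readField(T decoded, T protoDefault, boolean tracksPresence, boolean readDefaultValues) {
        if (decoded != null) {
            return decoded; // the field was set, so its value is used as-is
        }
        if (!tracksPresence || readDefaultValues) {
            return protoDefault; // absence cannot be observed, or defaults were requested
        }
        return null; // absence is observable and defaults were not requested
    }

    public static void main(String[] args) {
        System.out.println(readField(null, "", true, false)); // prints "null"
        System.out.println(readField(null, 0L, false, false)); // prints "0"
    }
}
```

Under these assumptions, only fields that track presence can surface as null, which matches the note that primitive fields without presence are read as their default values.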

### Using Protobuf Editions

Protobuf Editions provide a unified syntax that combines proto2 and proto3 functionality. If you're using Editions in your `.proto` files, Flink fully supports them:

```
edition = "2023";
package com.example;
option java_package = "com.example";
option java_multiple_files = true;

message SimpleTest {
  int64 uid = 1;
  string name = 2 [features.field_presence = EXPLICIT];
  // ... rest of your message definition
}
```

Editions allow fine-grained control over feature behavior at the file, message, or field level, while maintaining backward compatibility with proto2 and proto3. For more information, see the [Protobuf Editions documentation](https://protobuf.dev/editions/overview/).
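
The scoping of feature settings can be pictured as "the innermost setting wins". The following is a minimal sketch of that idea only, not protobuf's resolution code: the class `FeatureScopeSketch` and the method `resolve` are hypothetical, and which scopes a given feature actually accepts depends on the feature.

```java
import java.util.List;
import java.util.Optional;

/** Toy model of Editions feature scoping; not protobuf's actual resolution code. */
final class FeatureScopeSketch {

    enum FieldPresence { EXPLICIT, IMPLICIT, LEGACY_REQUIRED }

    /**
     * Starts from the edition default and applies the overrides from the
     * outermost scope (file) to the innermost scope (field); the innermost
     * override that is present determines the effective value.
     */
    static <T> T resolve(T editionDefault, List<Optional<T>> overridesOuterToInner) {
        T effective = editionDefault;
        for (Optional<T> override : overridesOuterToInner) {
            if (override.isPresent()) {
                effective = override.get();
            }
        }
        return effective;
    }

    public static void main(String[] args) {
        Optional<FieldPresence> fileLevel = Optional.of(FieldPresence.IMPLICIT);
        Optional<FieldPresence> fieldLevel = Optional.empty(); // no field-level override
        FieldPresence effective =
                resolve(FieldPresence.EXPLICIT, List.of(fileLevel, fieldLevel));
        System.out.println(effective); // prints "IMPLICIT"
    }
}
```

In the `SimpleTest` example above, the field-level `features.field_presence = EXPLICIT` would play the role of the innermost override for `name`.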

Additional Resources
----------------
For more information about Protocol Buffers, refer to:
- [Language Guide (proto2)](https://developers.google.com/protocol-buffers/docs/proto)
- [Language Guide (proto3)](https://developers.google.com/protocol-buffers/docs/proto3)
- [Language Guide (Editions)](https://protobuf.dev/programming-guides/editions/) - for the new Editions syntax
- [Protobuf Editions Overview](https://protobuf.dev/editions/overview/) - understand the motivation and benefits of Editions
25 changes: 24 additions & 1 deletion docs/content/release-notes/flink-2.1.md
@@ -182,4 +182,27 @@ Bump flink-shaded version to 20.0 to support Smile format.
##### [FLINK-37760](https://issues.apache.org/jira/browse/FLINK-37760)

Bump parquet version to 1.15.3 to resolve parquet-avro module
vulnerability found in [CVE-2025-30065](https://nvd.nist.gov/vuln/detail/CVE-2025-30065).
vulnerability found in [CVE-2025-30065](https://nvd.nist.gov/vuln/detail/CVE-2025-30065).

#### Upgrade Protocol Buffers to 4.32.1
Review comment from the PR author: Should this be moved to a new flink-2.2.md release notes file?

##### [FLINK-38547](https://issues.apache.org/jira/browse/FLINK-38547)

Flink now uses protobuf-java 4.32.1 (corresponding to Protocol Buffers version 32), upgrading from
protobuf-java 3.21.7 (Protocol Buffers version 21). This major upgrade enables:

- **Protobuf Editions Support**: Full support for the new `edition = "2023"` and `edition = "2024"`
syntax introduced in Protocol Buffers v27+. Editions provide a unified approach that combines
proto2 and proto3 functionality with fine-grained feature control.
- **Improved Proto3 Field Presence**: Better handling of optional fields in proto3 without the
limitations of older protobuf versions, eliminating the need to set `protobuf.read-default-values`
to `true` for field presence checking.
- **Enhanced Performance**: Leverages performance improvements and bug fixes from 11 Protocol
Buffers releases (versions 22-32).
- **Modern Protobuf Features**: Access to newer protobuf capabilities including Edition 2024
features and improved runtime behavior.

Existing proto2 and proto3 `.proto` files continue to work without changes. For
users interested in adopting Protobuf Editions, see the updated
[Protobuf format documentation](https://nightlies.apache.org/flink/flink-docs-release-2.1/docs/connectors/table/formats/protobuf/)
for examples and guidance.
@@ -26,7 +26,6 @@
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.api.WriteSupport;
import org.apache.parquet.io.OutputFile;
import org.apache.parquet.proto.ProtoWriteSupport;

/** Convenience builder for creating {@link ParquetWriterFactory} instances for Protobuf classes. */
public class ParquetProtoWriters {
@@ -62,7 +61,8 @@ protected ParquetProtoWriterBuilder<T> self() {

        @Override
        protected WriteSupport<T> getWriteSupport(Configuration conf) {
            return new ProtoWriteSupport<>(clazz);
            // Use patched implementation compatible with protobuf 4.x
            return new PatchedProtoWriteSupport<>(clazz);
        }
    }
