[SUPPORT] hudi-common 0.14.0 jar in mavenCentral appears to have corrupt generated avro classes #11602
Comments
#11378 appears to be caused by this same issue.
Here is a reproducer script:
#!/usr/bin/env bash
MAVEN="https://repo1.maven.org/maven2"
ARTIFACTS="\
org/apache/avro/avro/1.11.3/avro-1.11.3.jar \
com/fasterxml/jackson/core/jackson-core/2.17.1/jackson-core-2.17.1.jar \
com/fasterxml/jackson/core/jackson-databind/2.17.1/jackson-databind-2.17.1.jar \
com/fasterxml/jackson/core/jackson-annotations/2.17.1/jackson-annotations-2.17.1.jar \
org/slf4j/slf4j-api/2.0.9/slf4j-api-2.0.9.jar \
org/apache/hudi/hudi-common/0.14.0/hudi-common-0.14.0.jar \
"
CLASSPATH=""
for artifact in $ARTIFACTS; do
curl -O "${MAVEN}/${artifact}"
jar=$(basename "$artifact")
CLASSPATH="${CLASSPATH}:${jar}"
done
echo "$CLASSPATH"
echo 'org.apache.avro.Schema schema = new org.apache.avro.Schema.Parser().parse("{\"type\":\"record\",\"name\":\"HoodieCleanPartitionMetadata\",\"namespace\":\"org.apache.hudi.avro.model\",\"fields\":[{\"name\":\"partitionPath\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"}},{\"name\":\"policy\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"}},{\"name\":\"deletePathPatterns\",\"type\":{\"type\":\"array\",\"items\":{\"type\":\"string\",\"avro.java.string\":\"String\"}}},{\"name\":\"successDeleteFiles\",\"type\":{\"type\":\"array\",\"items\":{\"type\":\"string\",\"avro.java.string\":\"String\"}}},{\"name\":\"failedDeleteFiles\",\"type\":{\"type\":\"array\",\"items\":{\"type\":\"string\",\"avro.java.string\":\"String\"}}},{\"name\":\"isPartitionDeleted\",\"type\":[\"null\",\"boolean\"],\"default\":null}]}"); System.out.println("Class for schema: " + org.apache.avro.specific.SpecificData.get().getClass(schema));' |\
jshell --class-path "${CLASSPATH}"
Because the build profiles vary with the Spark and Flink versions, and the Avro version changes with them over time, we don't expect the hudi-common jars in the Maven repo to be usable with every Spark/Flink version; that mismatch causes compatibility issues. We expect people to use only the Hudi bundle jars, such as hudi-spark3.5-bundle, hudi-utilities-slim-bundle, hudi-flink1.18-bundle, etc.
@xushiyan understood. I am not an XTable developer. However, it seems pretty clear that the issue is with corrupted classes, not the Spark version. I have asked the XTable devs in the linked ticket to comment here. I'm not sure what I can do to move this forward.
@lucasmo we should be able to fix this in 0.16.0 (tracked in https://issues.apache.org/jira/browse/HUDI-8028). In the meantime, if you want the hudi-common jar to work, you can build the project with the Spark 3.4 or 3.5 profile, which will produce a hudi-common jar that includes an Avro dependency compatible with your Spark version (assuming you're using Spark 3.4 or 3.5).
Describe the problem you faced
When diagnosing a problem with XTable (see apache/incubator-xtable#466), I noticed that avro classes were unable to even be instantiated for schema in a very simple test case when using hudi-common-0.14.0 as a dependency.
However, this issue does not exist when using hudi-spark3.4-bundle_2.12-0.14.0 as a dependency, which contains the same avro autogenerated classes. A good specific example is org/apache/hudi/avro/model/HoodieCleanPartitionMetadata.class.
When compiling hudi locally (tag release-0.14.0, mvn clean package -DskipTests -Dspark3.4, java 1.8), both generated jar files have the correct implementations of avro autogenerated classes.
To Reproduce
Steps to reproduce the behavior:
org/apache/hudi/avro/model/HoodieCleanPartitionMetadata.class in all four of the jars
OR
run the following in Java 11, replacing $PATH_TO_A_HOODIE_AVRO_MODELS_JAR with a path to one of the four jar files
Then, copy and paste this into the shell:
On the MavenCentral hudi-common-0.14.0 jar, you should get:
Expected behavior
The above code snippet prints
Environment Description
Everything else n/a, but I reproduced the issue on both macOS and Ubuntu 22.04.
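The first reproduction step compares org/apache/hudi/avro/model/HoodieCleanPartitionMetadata.class across the jars. One way to do that without a decompiler is to hash the entry's bytes in each jar. The following is a minimal sketch using only the JDK; `JarEntryDigest` and `sha256OfEntry` are hypothetical names introduced here, not part of Hudi or XTable. A differing digest only shows that the class bytes differ between jars (for example, because they were generated or compiled differently); it does not by itself say which copy is the broken one.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Hypothetical helper (not part of Hudi): hashes one entry of a jar so copies can be compared. */
final class JarEntryDigest {

    /** Returns the SHA-256 hex digest of entryName inside jarPath, or null if the entry is absent. */
    static String sha256OfEntry(String jarPath, String entryName)
            throws IOException, NoSuchAlgorithmException {
        try (ZipFile zip = new ZipFile(jarPath)) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                return null;
            }
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            // Stream the (decompressed) entry contents through the digest.
            try (InputStream in = zip.getInputStream(entry)) {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    digest.update(buffer, 0, read);
                }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        }
    }

    /** Prints one "digest  jar" line per jar given on the command line. */
    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: JarEntryDigest <entry-name> <jar> [<jar> ...]");
            return;
        }
        for (int i = 1; i < args.length; i++) {
            String hash = sha256OfEntry(args[i], args[0]);
            System.out.println((hash == null ? "<entry not found>" : hash) + "  " + args[i]);
        }
    }
}
```

On Java 11 or later this can be run as a single source file, for example `java JarEntryDigest.java org/apache/hudi/avro/model/HoodieCleanPartitionMetadata.class hudi-common-0.14.0.jar hudi-spark3.4-bundle_2.12-0.14.0.jar`, assuming both jars are in the current directory.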