
[BUG] Unable to Write to OpenSearch 2.16.0 using Spark 3.5 #521

Open
amitgenius opened this issue Sep 26, 2024 · 2 comments
Labels
bug Something isn't working

Comments


amitgenius commented Sep 26, 2024

What is the bug?

A Spark 3.5 streaming job reads data from Kafka, but writing to OpenSearch fails with the error below. I checked the _cluster/health response through the same endpoint and it works fine:
curl -kgu username:'somepassword' http://xxx-opensearch:9200/_cluster/health?pretty
{
"cluster_name" : "opensearch2x-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 2,
"discovered_master" : true,
"discovered_cluster_manager" : true,
"active_primary_shards" : 21,
"active_shards" : 42,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Here are my Spark configurations:

opensearch.nodes = xxx-opensearch
opensearch.port = 9200
opensearch.nodes.wan.only = true
opensearch.batch.size.bytes = 10mb
opensearch.index.auto.create = true
opensearch.batch.size.entries = 100
opensearch.net.http.auth.pass = somepassword
opensearch.net.http.auth.user = someusername
opensearch.batch.write.refresh = false
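
These are set on the SparkConf before the job starts; a minimal sketch (the class name, app name, host, and credentials are placeholders, not the actual job):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class OpenSearchConfSketch {
    public static void main(String[] args) {
        // Mirror the settings listed above on the Spark configuration;
        // the opensearch-hadoop connector picks them up from here.
        SparkConf conf = new SparkConf()
                .setAppName("kafka-to-opensearch")
                .set("opensearch.nodes", "xxx-opensearch")
                .set("opensearch.port", "9200")
                .set("opensearch.nodes.wan.only", "true")
                .set("opensearch.batch.size.bytes", "10mb")
                .set("opensearch.index.auto.create", "true")
                .set("opensearch.batch.size.entries", "100")
                .set("opensearch.net.http.auth.user", "someusername")
                .set("opensearch.net.http.auth.pass", "somepassword")
                .set("opensearch.batch.write.refresh", "false");
        JavaSparkContext jsc = new JavaSparkContext(conf);
    }
}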

"Exception in storing in elasticorg.opensearch.hadoop.OpenSearchHadoopIllegalArgumentException: Cannot detect OpenSearch version - typically this happens if the network/OpenSearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'opensearch.nodes.wan.only'

Spark Code:
JavaOpenSearchSpark.saveJsonToOpenSearch(map, "{kafka_topic}" + "-" + date);
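
The call above runs inside the Kafka stream handler; roughly like this (a sketch with the stream, topic, and date variables simplified, assuming the documents are already JSON strings):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.opensearch.spark.rdd.api.java.JavaOpenSearchSpark;

public class WriteSketch {
    // Write each micro-batch of JSON documents to a date-suffixed index.
    static void writeStream(JavaDStream<String> jsonStream, String kafkaTopic, String date) {
        jsonStream.foreachRDD((JavaRDD<String> rdd) ->
                JavaOpenSearchSpark.saveJsonToOpenSearch(rdd, kafkaTopic + "-" + date));
    }
}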

Versions:
OpenSearch: 2.16.0.0
Connector: opensearch-hadoop-1.2.0.jar
Spark: 3.5

How can one reproduce the bug?


What is the expected behavior?

The job should write to OpenSearch successfully.

What is your host/environment?

Linux; OpenSearch is deployed on Kubernetes.



@amitgenius amitgenius added bug Something isn't working untriaged labels Sep 26, 2024
@Xtansia Xtansia removed the untriaged label Oct 3, 2024
Xtansia (Collaborator) commented Oct 3, 2024

In theory there should be a more detailed exception about the specific request that failed, following the initial "cannot detect version" exception. Can you please check the logs?

In general that particular error (assuming you're not using Amazon OpenSearch Serverless) is usually caused by some variety of connection or authentication issue.
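
If I remember right (worth double-checking against the connector source), version detection queries the cluster root endpoint, so it's also worth confirming that the same credentials can reach it from wherever the Spark executors run:

curl -ku someusername:'somepassword' http://xxx-opensearch:9200/

The response should include a version block with "distribution" : "opensearch" and "number" : "2.16.0"; anything else (a proxy error page, a 401, a timeout) would point at the connection/auth layer.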

amitgenius (Author) commented

The same configuration works perfectly with OpenSearch 2.11.1.0 and all lower versions. The issue only appears when I upgrade OpenSearch to 2.16.0.
