
[BUG] Unable to Write to OpenSearch 2.16.0 using Spark 3.5 #521

Open
amitgenius opened this issue Sep 26, 2024 · 2 comments
Labels
bug Something isn't working

Comments


amitgenius commented Sep 26, 2024

What is the bug?

A Spark 3.5 streaming job reads data from Kafka, but writing to OpenSearch fails with the error below. I checked the _cluster/health response through the same endpoint and it works fine:
curl -kgu username:'somepassword' http://xxx-opensearch:9200/_cluster/health?pretty
{
"cluster_name" : "opensearch2x-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 2,
"discovered_master" : true,
"discovered_cluster_manager" : true,
"active_primary_shards" : 21,
"active_shards" : 42,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Here are my Spark configurations:

opensearch.nodes = xxx-opensearch
opensearch.port = 9200
opensearch.nodes.wan.only = true
opensearch.batch.size.bytes = 10mb
opensearch.index.auto.create = true
opensearch.batch.size.entries = 100
opensearch.net.http.auth.pass = somepassword
opensearch.net.http.auth.user = someusername
opensearch.batch.write.refresh = false
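
These are set on the SparkConf before the job starts; a minimal sketch (the class name, app name, host, and credentials are placeholders, not the actual job):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class OpenSearchConfSketch {
    public static void main(String[] args) {
        // Mirror the settings listed above on the Spark configuration;
        // the opensearch-hadoop connector picks them up from here.
        SparkConf conf = new SparkConf()
                .setAppName("kafka-to-opensearch")
                .set("opensearch.nodes", "xxx-opensearch")
                .set("opensearch.port", "9200")
                .set("opensearch.nodes.wan.only", "true")
                .set("opensearch.batch.size.bytes", "10mb")
                .set("opensearch.index.auto.create", "true")
                .set("opensearch.batch.size.entries", "100")
                .set("opensearch.net.http.auth.user", "someusername")
                .set("opensearch.net.http.auth.pass", "somepassword")
                .set("opensearch.batch.write.refresh", "false");
        JavaSparkContext jsc = new JavaSparkContext(conf);
    }
}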

"Exception in storing in elasticorg.opensearch.hadoop.OpenSearchHadoopIllegalArgumentException: Cannot detect OpenSearch version - typically this happens if the network/OpenSearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'opensearch.nodes.wan.only'

Spark Code:
JavaOpenSearchSpark.saveJsonToOpenSearch(map, "{kafka_topic}" + "-" + date);
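
The call above runs inside the Kafka stream handler; roughly like this (a sketch with the stream, topic, and date variables simplified, assuming the documents are already JSON strings):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.opensearch.spark.rdd.api.java.JavaOpenSearchSpark;

public class WriteSketch {
    // Write each micro-batch of JSON documents to a date-suffixed index.
    static void writeStream(JavaDStream<String> jsonStream, String kafkaTopic, String date) {
        jsonStream.foreachRDD((JavaRDD<String> rdd) ->
                JavaOpenSearchSpark.saveJsonToOpenSearch(rdd, kafkaTopic + "-" + date));
    }
}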

Versions:
OpenSearch: 2.16.0.0
Connector: opensearch-hadoop-1.2.0.jar
Spark: 3.5

How can one reproduce the bug?


What is the expected behavior?

The job should write to OpenSearch successfully.

What is your host/environment?

Linux; OpenSearch is deployed on Kubernetes.



@amitgenius amitgenius added bug Something isn't working untriaged labels Sep 26, 2024
@Xtansia Xtansia removed the untriaged label Oct 3, 2024
Xtansia (Collaborator) commented Oct 3, 2024

In theory there should be a more detailed exception about the specific request that failed, following the initial "cannot detect version" exception. Can you please check the logs?

In general that particular error (assuming you're not using Amazon OpenSearch Serverless) is usually caused by some variety of connection or authentication issue.
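
If I remember right (worth double-checking against the connector source), version detection queries the cluster root endpoint, so it's also worth confirming that the same credentials can reach it from wherever the Spark executors run:

curl -ku someusername:'somepassword' http://xxx-opensearch:9200/

The response should include a version block with "distribution" : "opensearch" and "number" : "2.16.0"; anything else (a proxy error page, a 401, a timeout) would point at the connection/auth layer.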

amitgenius (Author) commented

The same configuration works perfectly with OpenSearch 2.11.1.0 and all lower versions. The issue only appears when I upgrade OpenSearch to 2.16.0.
