
Errors encountered in bulk import API execution in cosmoDb spark #418

Open
anggelo17 opened this issue Oct 31, 2020 · 0 comments

I'm trying to do a bulk import in a Spark job using the Cosmos DB bulk import config:

    val writeConfigMap = Map(
      CosmosDBConfig.Endpoint -> config.getString("cosmosdb.endpoint"),
      CosmosDBConfig.Masterkey -> sys.props.get("masterkey").getOrElse("No env"),
      CosmosDBConfig.Database -> config.getString("cosmosdb.database"),
      CosmosDBConfig.Collection -> config.getString("cosmosdb.collection"),
      CosmosDBConfig.Upsert -> config.getString("cosmosdb.upsert"),
      CosmosDBConfig.PreferredRegionsList -> config.getString("cosmosdb.preferredregion"),
      CosmosDBConfig.BulkImport -> "true"
    )
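For context, this is a minimal sketch of how I pass that map to the writer. The database, collection, and endpoint values below are placeholders standing in for my real config; the format name is the azure-cosmosdb-spark connector's:

    // Sketch only: assumes the azure-cosmosdb-spark connector is on the classpath
    // and an active SparkSession. Endpoint/database/collection are placeholders.
    import com.microsoft.azure.cosmosdb.spark.config.CosmosDBConfig
    import org.apache.spark.sql.{DataFrame, SaveMode}

    val writeConfigMap: Map[String, String] = Map(
      CosmosDBConfig.Endpoint   -> "https://myaccount.documents.azure.com:443/",
      CosmosDBConfig.Masterkey  -> sys.props.getOrElse("masterkey", "No env"),
      CosmosDBConfig.Database   -> "mydb",
      CosmosDBConfig.Collection -> "mycoll",
      CosmosDBConfig.Upsert     -> "true",
      CosmosDBConfig.BulkImport -> "true"
    )

    // Write the DataFrame through the connector's data source.
    def writeToCosmos(df: DataFrame): Unit =
      df.write
        .mode(SaveMode.Append)
        .format("com.microsoft.azure.cosmosdb.spark")
        .options(writeConfigMap)
        .save()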

When writing one of my documents that is larger than usual (2.9 MB), I get the following exception (my collection has a partition key defined):

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 8.0 failed 4 times, most recent failure: Lost task 0.3 in stage 8.0 (TID 246, 10.10.42.5, executor 1): java.lang.Exception: Errors encountered in bulk import API execution. PartitionKeyDefinition: {"paths":["/key/businessUnit/id"],"kind":"Hash"}, Number of failures corresponding to exception of type: com.microsoft.azure.documentdb.DocumentClientException = 1. The failed import docs are:

Thanks in advance for the help.
