
Errors encountered in bulk import API execution in cosmoDb spark #418

Open
anggelo17 opened this issue Oct 31, 2020 · 0 comments

I'm trying to do a bulk import in a Spark job using the Cosmos DB bulk import config:

    val writeConfigMap = Map(
      CosmosDBConfig.Endpoint -> config.getString("cosmosdb.endpoint"),
      CosmosDBConfig.Masterkey -> sys.props.get("masterkey").getOrElse("No env"),
      CosmosDBConfig.Database -> config.getString("cosmosdb.database"),
      CosmosDBConfig.Collection -> config.getString("cosmosdb.collection"),
      CosmosDBConfig.Upsert -> config.getString("cosmosdb.upsert"),
      CosmosDBConfig.PreferredRegionsList -> config.getString("cosmosdb.preferredregion"),
      CosmosDBConfig.BulkImport -> "true"
    )
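For context, this is a minimal sketch of how I pass that map to the writer. The database, collection, and endpoint values below are placeholders standing in for my real config; the format name is the azure-cosmosdb-spark connector's:

    // Sketch only: assumes the azure-cosmosdb-spark connector is on the classpath
    // and an active SparkSession. Endpoint/database/collection are placeholders.
    import com.microsoft.azure.cosmosdb.spark.config.CosmosDBConfig
    import org.apache.spark.sql.{DataFrame, SaveMode}

    val writeConfigMap: Map[String, String] = Map(
      CosmosDBConfig.Endpoint   -> "https://myaccount.documents.azure.com:443/",
      CosmosDBConfig.Masterkey  -> sys.props.getOrElse("masterkey", "No env"),
      CosmosDBConfig.Database   -> "mydb",
      CosmosDBConfig.Collection -> "mycoll",
      CosmosDBConfig.Upsert     -> "true",
      CosmosDBConfig.BulkImport -> "true"
    )

    // Write the DataFrame through the connector's data source.
    def writeToCosmos(df: DataFrame): Unit =
      df.write
        .mode(SaveMode.Append)
        .format("com.microsoft.azure.cosmosdb.spark")
        .options(writeConfigMap)
        .save()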

When writing one of my documents that is larger than usual (2.9 MB), I get the following exception (my collection has a partition key defined):

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 8.0 failed 4 times, most recent failure: Lost task 0.3 in stage 8.0 (TID 246, 10.10.42.5, executor 1): java.lang.Exception: Errors encountered in bulk import API execution. PartitionKeyDefinition: {"paths":["/key/businessUnit/id"],"kind":"Hash"}, Number of failures corresponding to exception of type: com.microsoft.azure.documentdb.DocumentClientException = 1. The failed import docs are:

Thanks in advance for the help.
