Camunda 7 External Task Retry Not Working – Root Cause & Fix (Production Guide)
External Tasks are widely used in Camunda 7 to decouple business logic from the process engine.
However, a very common production issue teams face is:
❌ External Task retries are not working as expected.
In this blog, we’ll cover why retries fail, real root causes, and how to fix them safely in production.
1️⃣ How External Task Retry Works in Camunda 7
In Camunda 7, retries are controlled by:
- retries count
- lockDuration
- lockExpirationTime
- Failure handling logic in the worker
A retry happens only when the worker explicitly reports failure.
2️⃣ Most Common Reasons Retries Don’t Work
❌ 1. handleFailure() Not Called Correctly
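Wrong implementation (very common): the handler catches the exception and only logs it, so nothing is reported to the engine.
Correct implementation: the catch block calls handleFailure() with an error message, the remaining retries, and a retry timeout.
👉 If handleFailure() is not called, the engine never learns about the failure. The task simply stays locked until the lock expires, and if the worker goes on to call complete(), Camunda treats the task as successful.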
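The difference can be sketched as follows. This is a minimal illustration that uses hypothetical stand-in types (`Task`, `TaskService`, `RecordingService`) instead of the real client library, so that it compiles on its own. The stand-in `handleFailure` mirrors the parameter order of `ExternalTaskService.handleFailure(task, errorMessage, errorDetails, retries, retryTimeout)` in the Java external task client; `callBackend`, the starting value of 3 retries, and the 60-second timeout are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the client's ExternalTask (only what the sketch needs).
class Task {
    final String id;
    final Integer retries; // null until a failure has been reported once

    Task(String id, Integer retries) {
        this.id = id;
        this.retries = retries;
    }
}

// Hypothetical stand-in mirroring the shape of the client's ExternalTaskService.
interface TaskService {
    void complete(Task task);
    void handleFailure(Task task, String message, String details, int retries, long retryTimeoutMs);
}

// Records every call so the behaviour of each handler can be inspected.
class RecordingService implements TaskService {
    final List<String> calls = new ArrayList<>();

    public void complete(Task task) {
        calls.add("complete");
    }

    public void handleFailure(Task task, String message, String details, int retries, long retryTimeoutMs) {
        calls.add("handleFailure retries=" + retries + " timeout=" + retryTimeoutMs);
    }
}

class Handlers {
    // Hypothetical business call that always fails with a technical error.
    static void callBackend() {
        throw new IllegalStateException("backend unavailable");
    }

    // Wrong: the exception is swallowed, so the engine is never told about the failure.
    static void wrong(Task task, TaskService service) {
        try {
            callBackend();
            service.complete(task);
        } catch (RuntimeException e) {
            System.err.println("failed: " + e.getMessage());
        }
    }

    // Correct: report the failure with the remaining retries and a wait time.
    static void correct(Task task, TaskService service) {
        try {
            callBackend();
            service.complete(task);
        } catch (RuntimeException e) {
            // First reported failure starts from 3 (assumed); later ones decrement.
            int remaining = task.retries == null ? 3 : task.retries - 1;
            service.handleFailure(task, e.getMessage(), e.toString(), Math.max(remaining, 0), 60_000L);
        }
    }
}
```

With the wrong handler the recording service sees no call at all, which is why the task appears stuck rather than retried. With the correct handler it sees exactly one `handleFailure` call carrying the decremented retry count.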
❌ 2. Retries Count Is Already Zero
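If retries = 0, the engine raises an incident and stops offering the task to workers, so Camunda will never retry it on its own.
Check in:
- Cockpit → External Tasks → Retries column
Fix: set retries back to a positive value through the API, or manually reset retries from Cockpit.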
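As a rough sketch of the API route, the snippet below builds the REST request that sets the retries of one external task (`PUT /external-task/{id}/retries` with a JSON body). It only constructs the request and does not send it. The base URL and task id are placeholders you would replace, and the class name `ResetRetries` is made up for this example. Inside the engine, the equivalent Java call is `ExternalTaskService.setRetries(externalTaskId, retries)`.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class ResetRetries {
    // Builds PUT {baseUrl}/external-task/{id}/retries with a body such as {"retries": 3}.
    static HttpRequest build(String baseUrl, String externalTaskId, int retries) {
        if (retries <= 0) {
            // Zero would leave the task unfetchable, which defeats the purpose of the reset.
            throw new IllegalArgumentException("retries must be positive");
        }
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/external-task/" + externalTaskId + "/retries"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString("{\"retries\": " + retries + "}"))
                .build();
    }
}
```

To actually send it, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding())` against your own engine, for example with a base URL like `http://localhost:8080/engine-rest` (an assumed default, adjust to your deployment).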
❌ 3. Lock Duration Too Short
If lockDuration is too small:
- Task unlocks before worker finishes
- Another worker may pick it up
- Retry logic behaves unpredictably
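Fix: raise lockDuration on the worker's topic subscription so that it covers the slowest expected execution.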
Rule of thumb:
👉 lockDuration > max execution time
❌ 4. Worker Crashes Before Reporting Failure
If the worker:
- Crashes
- Gets killed
- Loses network
Then:
- handleFailure() is never called
- Retry is not scheduled
What happens instead?
- Task becomes available again only after lock expires
- Retry count is unchanged
👉 This is expected behavior, not a bug.
3️⃣ Retry Timeout Misunderstanding
Many developers think retries are immediate.
But a retryTimeout of 60000 ms, passed as the last argument to handleFailure(), means:
- Camunda waits 1 minute
- Then the task becomes fetchable again
If retryTimeout is large, it looks like retries are not working.
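The timing rules from the last two sections can be made concrete with a small model. This is a simplified, hypothetical sketch of the engine-side state of a single task (the class `TaskState` and its methods are invented for illustration, not engine code). It assumes a task is fetchable when it still has retries and no unexpired lock, and that reporting a failure keeps the task locked until the retry timeout has passed.

```java
import java.time.Duration;
import java.time.Instant;

// Simplified, hypothetical model of the engine-side state of one external task.
class TaskState {
    Integer retries;        // null until a failure has been reported
    Instant lockExpiration; // null while the task is unlocked

    // Fetchable when retries are not exhausted and no lock is still running.
    boolean fetchableAt(Instant now) {
        boolean hasRetries = retries == null || retries > 0;
        boolean unlocked = lockExpiration == null || !lockExpiration.isAfter(now);
        return hasRetries && unlocked;
    }

    // A worker fetches the task and holds it for lockDuration.
    void fetchAndLock(Instant now, Duration lockDuration) {
        lockExpiration = now.plus(lockDuration);
    }

    // The worker reports a failure: retries are updated and the task
    // stays unavailable until the retry timeout has elapsed.
    void handleFailure(Instant now, int remainingRetries, Duration retryTimeout) {
        retries = remainingRetries;
        lockExpiration = now.plus(retryTimeout);
    }
}
```

In this model a worker that crashes never calls `handleFailure`, so `retries` stays untouched and the task only returns once the original `lockDuration` has run out. A worker that reports a failure makes the task return after `retryTimeout` instead, which with a large timeout can look like a retry that never happens.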
4️⃣ BPMN Error vs External Task Retry (Wrong Choice)
❌ Reporting a technical failure as a BPMN Error (handleBpmnError() instead of handleFailure()):
This:
- Ends retry logic
- Moves process forward
✅ Use retries for:
- Network errors
- DB timeouts
- Temporary failures
Use BPMN Error only for business exceptions.
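One way to keep the two paths apart is to decide on the exception type in a single place. The sketch below is only an illustration: `BusinessRuleException`, `FailureClassifier`, and the `Action` enum are hypothetical names, and the rule "anything that is not a business exception gets retried" is an assumption you may want to refine. In a real worker, `BPMN_ERROR` would lead to `handleBpmnError(task, errorCode)` and `RETRY` to `handleFailure(...)`.

```java
import java.io.IOException;

// Hypothetical exception type for outcomes the process model is expected to handle.
class BusinessRuleException extends RuntimeException {
    final String errorCode; // the BPMN error code the process model catches

    BusinessRuleException(String errorCode, String message) {
        super(message);
        this.errorCode = errorCode;
    }
}

class FailureClassifier {
    enum Action { BPMN_ERROR, RETRY }

    // Business exceptions become BPMN errors; everything else is treated
    // as a technical failure and goes through the retry mechanism.
    static Action classify(Throwable t) {
        if (t instanceof BusinessRuleException) {
            return Action.BPMN_ERROR;
        }
        return Action.RETRY;
    }
}
```

For example, `FailureClassifier.classify(new IOException("connection reset"))` yields `RETRY`, while a `BusinessRuleException` such as a rejected credit check yields `BPMN_ERROR`.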
5️⃣ External Task Topic Subscription Issues
Check:
- Topic name matches BPMN exactly
- Worker is actually subscribed
- Worker is polling continuously
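Enable debug logging for the external task client in the worker to confirm that it is actually fetching and locking tasks.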
6️⃣ How to Debug Retry Issues (Checklist)
✅ Check retries value in Cockpit
✅ Check worker logs
✅ Verify handleFailure() is called
✅ Confirm lockDuration
✅ Validate retry timeout
✅ Ensure worker is alive
7️⃣ Production-Safe Retry Strategy (Recommended)
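Start from a fixed maximum, decrement retries on every reported failure, and wait between attempts. This ensures:
- Controlled retry count
- No infinite loops
- Safe recovery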
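A strategy along these lines might be sketched as a small policy object that computes the two values a worker passes to handleFailure(): the remaining retries and the retry timeout. The class `RetryPolicy`, its parameters, and the doubling backoff with a cap are all assumptions for illustration, not something prescribed by Camunda; a fixed timeout works with the same structure.

```java
// Hypothetical policy object: computes the retries and retryTimeout arguments
// that a worker would pass to handleFailure().
class RetryPolicy {
    final int maxRetries;
    final long baseTimeoutMs;
    final long maxTimeoutMs;

    RetryPolicy(int maxRetries, long baseTimeoutMs, long maxTimeoutMs) {
        this.maxRetries = maxRetries;
        this.baseTimeoutMs = baseTimeoutMs;
        this.maxTimeoutMs = maxTimeoutMs;
    }

    // current is the task's retries value: null before the first reported failure.
    // The result never drops below zero, so the count cannot loop forever.
    int nextRetries(Integer current) {
        int next = current == null ? maxRetries : current - 1;
        return Math.max(next, 0);
    }

    // Doubles the wait with every failure already reported, capped at maxTimeoutMs.
    long nextTimeoutMs(Integer current) {
        int failuresSoFar = maxRetries - nextRetries(current);
        long timeout = baseTimeoutMs << Math.min(failuresSoFar, 20); // bounded shift avoids overflow
        return Math.min(timeout, maxTimeoutMs);
    }
}
```

With `new RetryPolicy(3, 10_000L, 60_000L)`, the reported retries go 3, 2, 1, 0 across successive failures, and the waits grow from 10 seconds up to the 60-second cap. Once the value reaches 0 the engine stops offering the task, so someone has to look at it before it runs again.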
🔑 Key Takeaway
External Task retries do NOT fail silently — they fail due to implementation or configuration issues.
Most problems are caused by:
- Missing handleFailure()
- Zero retries
- Short lock duration
- Wrong exception strategy
💼 Professional Support Available
If you are:
- Facing Camunda 7 External Task retry issues
- Debugging production failures
- Designing robust retry strategies
I provide paid consulting, production debugging, and Camunda training.
📧 Contact : ishikhanirankari@gmail.com | info@realtechnologiesindia.com
🌐 Website : IT Trainings | Digital metal podium