This repository was archived by the owner on Feb 3, 2021. It is now read-only.
This may just be a question of finding the right feature in the UI somewhere.
When I get an error from Spark, like this:
Task 79613 in stage 92.0 failed 4 times, most recent failure: Lost task 79613.3 in stage 92.0 (TID 457119, 10.128.80.91, executor 421): ExecutorLostFailure (executor 421 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
I'd like to SSH into the instance where that error occurred and check the executor logs to figure out why it died. That means accessing that node in the Azure Batch portal and adding my SSH key so that I can log in. Nodes are listed by the tvm-... node names in the portal, and when you have 60 instances in your cluster, it's not fun to go through clicking "properties" to find the one with the IP you're looking for.
The aztk spark cluster get ... command lists only external IP addresses and node names, not the internal IPs that appear in Spark errors, so I can't use that to find the instance either.
Is there some place I can track down an instance from its internal IP address?
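To make the request concrete, here is a rough sketch of the lookup I have in mind. It is not something aztk provides as far as I know. It assumes each node object has an `id` and an `ip_address` attribute holding the internal IP, which I believe is what the azure-batch Python SDK returns from `BatchServiceClient.compute_node.list(pool_id)`. The function name `find_node_by_internal_ip` and the node ids below are made up for illustration.

```python
from types import SimpleNamespace


def find_node_by_internal_ip(nodes, internal_ip):
    """Return the id of the first node whose ip_address matches, or None."""
    for node in nodes:
        # Assumption: ip_address is the internal (VNet) address of the node.
        if getattr(node, "ip_address", None) == internal_ip:
            return node.id
    return None


# Stand-in data; real nodes would presumably come from the Batch SDK,
# e.g. batch_client.compute_node.list(pool_id) (not called here).
fake_nodes = [
    SimpleNamespace(id="tvm-example-1", ip_address="10.128.80.90"),
    SimpleNamespace(id="tvm-example-2", ip_address="10.128.80.91"),
]

print(find_node_by_internal_ip(fake_nodes, "10.128.80.91"))
```

With something like this, the IP from the Spark error could be mapped straight to the tvm-... node name instead of clicking through every node's properties.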