This repository was archived by the owner on Feb 3, 2021. It is now read-only.

IP <-> Node name lookup #528

@jaley


This may just be a question of finding the right feature in the UI somewhere.

When I get an error from Spark, like this:

Task 79613 in stage 92.0 failed 4 times, most recent failure: Lost task 79613.3 in stage 92.0 (TID 457119, 10.128.80.91, executor 421): ExecutorLostFailure (executor 421 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

I'd like to SSH into the instance where that error occurred and check the executor logs to figure out why it died. That means accessing that node in the Azure Batch portal and adding my SSH key so that I can log in. Nodes are listed by the tvm-... node names in the portal, and when you have 60 instances in your cluster, it's not fun to go through clicking "properties" to find the one with the IP you're looking for.

The aztk spark cluster get ... command lists only external IP addresses and node names, not the internal IPs that Spark reports in its errors, so I can't use that to find the instance either.

Is there some place I can track down an instance from its internal IP address?
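
For illustration, the lookup being asked for could be sketched roughly as below. This is only a sketch under stated assumptions, not an aztk feature: the helper names `failed_host_ip` and `index_nodes_by_ip` are hypothetical, and the node objects are assumed to expose `id` and `ip_address` attributes, as the compute node objects returned by the Azure Batch Python SDK appear to do. The node names and addresses in the example are made up.

```python
import re
from types import SimpleNamespace

# Matches the "(TID <n>, <ip>, executor <n>)" fragment of a Spark task failure.
_TASK_HOST = re.compile(r"\(TID \d+, (\d{1,3}(?:\.\d{1,3}){3}), executor \d+\)")


def failed_host_ip(spark_error):
    """Return the internal IP named in a Spark task failure, or None."""
    match = _TASK_HOST.search(spark_error)
    return match.group(1) if match else None


def index_nodes_by_ip(nodes):
    """Map internal IP -> node name for objects with .id and .ip_address."""
    return {node.ip_address: node.id for node in nodes if node.ip_address}


# Hypothetical stand-ins for the node objects a Batch client would return.
nodes = [
    SimpleNamespace(id="tvm-example-0", ip_address="10.128.80.90"),
    SimpleNamespace(id="tvm-example-1", ip_address="10.128.80.91"),
]

error = (
    "Lost task 79613.3 in stage 92.0 (TID 457119, 10.128.80.91, executor 421): "
    "ExecutorLostFailure (executor 421 exited caused by one of the running tasks)"
)

ip = failed_host_ip(error)
print(ip, "->", index_nodes_by_ip(nodes).get(ip))
```

In practice the `nodes` list would come from whatever API lists the pool's compute nodes; the stand-in objects above only keep the sketch runnable without cloud credentials.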
