Conversation
Why are these changes needed? ----------------------------- Provides support for IPv6 only and dual-stack IPv4/IPv6. Inspired by this discussion https://discuss.ray.io/t/lack-of-ipv6-support/2323 IPv6 is much more common than it used to be; in some cases the only available stack is IPv6. Commonly machines come with dual-stack connectivity enabled (IPv4 and IPv6 available at the same time). I've already reviewed the source code of [kuberay][kuberay]. I've determined due to its use of native Go net methods kuberay already supports IPv6. Design ------ - Centralize networking code to make it easier to support IPv6 alongside IPv4. This is simpler and clarifies where in your network code you need to support this when using low level python sockets. - Favor IPv4 since that is what this project is typically used to; code will fall back to IPv6 if IPv4 is not available. - It uses dual-stack IPv4 and IPv6 when it is available; for example when listening on a port across all interfaces. Shell code ---------- I notice the following scripts may or may not work with IPv6. They should probably updated using [bash parameter expansion][bash-pe]. * `doc/source/cluster/doc_code/slurm-basic.sh` * `doc/source/cluster/doc_code/slurm-template.sh` If I update them; it will be in a separate PR. For example, the following bash code properly parses both IPv4 and IPv6 `ip:port` into two parts. # IPv6 loopback port 8080 ip_and_port=::1:8080 ip="${ip_and_port%:*}" port="${ip_and_port##*:}" # IPv4 loopback port 8080 ip_and_port=127.0.0.1:8080 ip="${ip_and_port%:*}" port="${ip_and_port##*:}" Related issue number -------------------- I don't know if it will close all of these issues but I did an issues search for `ipv6`. * closes ray-project#6967 * ray-project#43416 * ray-project#41966 * ray-project#43910 Checks ------ Initially, this pull request is in-draft. This is to enable reviewing sooner and manual testing to occur. Automated tests to follow. - [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [x] This PR is not tested :( [bash-pe]: https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html [kuberay]: https://github.com/ray-project/kuberay Signed-off-by: Sam Gleske <sam.mxracer@gmail.com>
* Fixed a few function calls missing net import. * Missed ip parse * Fix net.py syntax * Fix tls_utils.py syntax Signed-off-by: Sam Gleske <sam.mxracer@gmail.com>
Misuse of dualstack. It should only be used when listening for a port on all interfaces. flake8 fixes Signed-off-by: Sam Gleske <sam.mxracer@gmail.com>
There was a problem hiding this comment.
Another review pass fixing issues I learned studying how Python sockets support IPv6.
I thought dualstack works when binding with a host but it does not (e.g. localhost dualstack only binds to IPv6 ::1 but not 127.0.0.1).
Binding dualstack only works when binding a port to all interfaces i.e. sock.bind(('', port)).
samrocketman
left a comment
There was a problem hiding this comment.
I appear to be misusing @staticmethod.
|
I guess my issues with staticmethod are best described here https://stackoverflow.com/questions/1859959/static-methods-how-to-call-a-method-from-another-method |
|
After messing around with |
b3adfab to
2e841fb
Compare
- Bugfix python/ray/util/client/server/proxier.py not obtaining local_ip address from localhost. - Remove use of staticmethod; the way I used it isn't compatible below Python 3.10 - Supprt parsing IPv6 URL format `proto://[ip]:port/` in net.py - Fixed API linting issues in net.py - Fix pylint issues Signed-off-by: Sam Gleske <sam.mxracer@gmail.com>
Signed-off-by: Sam Gleske <875669+samrocketman@users.noreply.github.com>
- More pylint changes. - Exception handling added for existing tests. Signed-off-by: Sam Gleske <sam.mxracer@gmail.com>
Some methods pass bytes instead of strings when parsing IP and port. I added extra type safety instead of assuming. Fixed some pylint issues. Trying to use net class in pxi. Signed-off-by: Sam Gleske <sam.mxracer@gmail.com>
Thanks for taking a look. I will likely continue chasing all of the broader changes in this pull request and continue to iterate until I succeed with an IPv6 supported build.
That's okay I appreciate your review.
This PR is raised. I think, in general, maintainer support is a hard requirement. Since I initially opened this pull request there have been a few merged PRs which added socket code which would have broken IPv6 if it was hypothetically available. Because of this, there needs to be supporting tests and an knowlege-sharing initiative for maintainers to actively preserve IPv6 support with new contributions. I am happy to produce some sort of written guide or whatever written form makes sense for maintainers. I mentioned this in an earlier comment but I am reiterating it here. I need maintainers to advise on how they want this done. At some point, I'll come back to this and dig in again (I'm away on vacation at the moment and this change start to finish has been spare time with exception for updating the PR once). |
|
@jjyao can you clarify what adding that label means? |
|
This pull request has been automatically marked as stale because it has not had You can always ask for help on our discussion forum or Ray's public slack channel. If you'd like to keep this open, just leave any comment, and the stale label will be removed. |
|
Not stale; maintainers please label so stale bot does not mark as stale. See also my previous questions |
|
This pull request has been automatically marked as stale because it has not had You can always ask for help on our discussion forum or Ray's public slack channel. If you'd like to keep this open, just leave any comment, and the stale label will be removed. |
|
Not stale; maintainers please label so stale bot does not mark as stale. See also my previous questions |
|
This pull request has been automatically marked as stale because it has not had You can always ask for help on our discussion forum or Ray's public slack channel. If you'd like to keep this open, just leave any comment, and the stale label will be removed. |
|
Not stale; maintainers please label so stale bot does not mark as stale. See also my previous questions |
|
To be clear: I do not plan to resolve conflicts or work on this further until some maintainers come out of the wood work to talk with me about these changes. I've requested multiple times in this pull request for a maintainer to chime in and there appears to be silence. Because of this, I do not find it is worth my time to keep resolving conflicts with no maintainer conversation; my time is not free. You're welcome to reach out to me in this pull request or you may advise another method that's preferred. |
|
@samrocketman thanks a ton for starting this work, and sorry about neglecting this! I think we should do a quick call (or email / Slack) discussion to figure out the path forward. Some folks at ByteDance recently mentioned they have an implementation of this feature as well and are interested in contributing it. Could you email me at robertnishihara [at] gmail.com or message me on the Ray Slack to have a higher bandwidth conversation? FYI @richardliaw @jjyao |
|
@robertnishihara emailed |
|
Update: today at 2pm EDT (18:00 UTC-0) there was an in-person meeting between maintainers and I where we went over the current pull request and general information on how to support IPv6. Much of what I've already shared in this pull request was shared within the meeting. |
|
You need this diff as well or something similar |
|
@psaab mind opening a PR to my fork so that you get attributed for that diff? |
|
This pull request has been automatically marked as stale because it has not had You can always ask for help on our discussion forum or Ray's public slack channel. If you'd like to keep this open, just leave any comment, and the stale label will be removed. |
|
@samrocketman FYI @Yicheng-Lu-llll has been doing some work on this in #55129, #55281, #55282. |
|
Closing this in lieu of the other PRs from @Yicheng-Lu-llll |
Why are these changes needed?
Provides support for IPv6 only and dual-stack IPv4/IPv6.
Inspired by this discussion https://discuss.ray.io/t/lack-of-ipv6-support/2323
IPv6 is much more common than it used to be; in some cases the only available stack is IPv6. Commonly machines come with dual-stack connectivity enabled (IPv4 and IPv6 available at the same time).
I've already reviewed the source code of kuberay. I've determined due to its use of native Go net methods kuberay already supports IPv6.
Design
python/ray/_private/net.py. Centralize networking code to make it easier to support IPv6 alongside IPv4. This is simpler and clarifies where in your network code you need to support this when using low level python sockets.Shell code
I notice the following scripts may or may not work with IPv6. They should probably updated using bash parameter expansion.
doc/source/cluster/doc_code/slurm-basic.shdoc/source/cluster/doc_code/slurm-template.shIf I update them; it will be in a separate PR.
For example, the following bash code properly parses both IPv4 and IPv6
ip:portinto two parts.Related issue number
I don't know if it will close all of these issues but I did an issues search for
ipv6.Duplicated by
Checks
Initially, this pull request is in-draft. This is to enable reviewing sooner and manual testing to occur. Automated tests to follow.
git commit -s) in this PR.scripts/format.shto lint the changes in this PR.doc/source/tune/api/under the corresponding.rstfile.