[#117021] [Ruby master Feature#20318] Pattern matching `case ... in` support for triple-dot arguments — "bradgessler (Brad Gessler) via ruby-core" <ruby-core@...>

Issue #20318 has been reported by bradgessler (Brad Gessler).

11 messages 2024/03/01

[#117027] [Ruby master Bug#20319] Singleton class is being frozen lazily in some cases — "andrykonchin (Andrew Konchin) via ruby-core" <ruby-core@...>

Issue #20319 has been reported by andrykonchin (Andrew Konchin).

8 messages 2024/03/01

[#117036] [Ruby master Bug#20321] `require': cannot load such file — "Justman10000 (Justin Nogossek) via ruby-core" <ruby-core@...>

Issue #20321 has been reported by Justman10000 (Justin Nogossek).

14 messages 2024/03/01

[#117067] [Ruby master Feature#20326] Add an `undefined` for use as a default argument. — "shan (Shannon Skipper) via ruby-core" <ruby-core@...>

Issue #20326 has been reported by shan (Shannon Skipper).

7 messages 2024/03/06

[#117115] [Ruby master Feature#20331] Should parser warn hash duplication and when clause? — "yui-knk (Kaneko Yuichiro) via ruby-core" <ruby-core@...>

Issue #20331 has been reported by yui-knk (Kaneko Yuichiro).

11 messages 2024/03/12

[#117147] [Ruby master Feature#20335] `Thread.each_caller_location` should accept the same arguments as `caller` and `caller_locations` — "byroot (Jean Boussier) via ruby-core" <ruby-core@...>

Issue #20335 has been reported by byroot (Jean Boussier).

13 messages 2024/03/14

[#117157] [Ruby master Misc#20336] DevMeeting-2024-04-17 — "mame (Yusuke Endoh) via ruby-core" <ruby-core@...>

Issue #20336 has been reported by mame (Yusuke Endoh).

15 messages 2024/03/14

[#117212] [Ruby master Feature#20345] Add `--target-rbconfig` option to mkmf — "katei (Yuta Saito) via ruby-core" <ruby-core@...>

Issue #20345 has been reported by katei (Yuta Saito).

9 messages 2024/03/18

[#117240] [Ruby master Feature#20350] Return chilled string from Symbol#to_s — "Dan0042 (Daniel DeLorme) via ruby-core" <ruby-core@...>

Issue #20350 has been reported by Dan0042 (Daniel DeLorme).

10 messages 2024/03/19

[#117288] [Ruby master Misc#20387] Meta-ticket for ASAN support — "kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core" <ruby-core@...>

Issue #20387 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).

10 messages 2024/03/22

[#117321] [Ruby master Bug#20393] `after_fork_ruby` clears all pending interrupts for both parent and child process. — "ioquatix (Samuel Williams) via ruby-core" <ruby-core@...>

Issue #20393 has been reported by ioquatix (Samuel Williams).

6 messages 2024/03/26

[#117324] [Ruby master Feature#20394] Add an offset parameter to `String#to_i` — "byroot (Jean Boussier) via ruby-core" <ruby-core@...>

Issue #20394 has been reported by byroot (Jean Boussier).

16 messages 2024/03/26

[#117341] [Ruby master Feature#20396] ObjectSpace.dump_all(string_value: false): skip dumping the String contents — "byroot (Jean Boussier) via ruby-core" <ruby-core@...>

Issue #20396 has been reported by byroot (Jean Boussier).

8 messages 2024/03/27

[#117390] [Ruby master Feature#20404] `2pi` — "mame (Yusuke Endoh) via ruby-core" <ruby-core@...>

Issue #20404 has been reported by mame (Yusuke Endoh).

9 messages 2024/03/31

[ruby-core:117177] [Ruby master Bug#20301] `Set#add?` does two hash look-ups

From: "Eregon (Benoit Daloze) via ruby-core" <ruby-core@...>
Date: 2024-03-14 15:04:39 UTC
List: ruby-core #117177
Issue #20301 has been updated by Eregon (Benoit Daloze).


That implementation using `size` is not thread-safe, even on CRuby AFAIK.
For example, if T2 calls `add?` with a new element while T1 calls `add?` with an existing element.
If T1 is just before `m = size` when T2 executes `add(o)`, then both threads return "element added" but T1 did not add an element (incorrect result).

The original code with two lookups does not have that race condition.
However it can have the race condition that two threads adding the same new element both return "element added".
`Hash#exchange_value` would fix that.

----------------------------------------
Bug #20301: `Set#add?` does two hash look-ups
https://bugs.ruby-lang.org/issues/20301#change-107267

* Author: AMomchilov (Alexander Momchilov)
* Status: Open
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
A common usage of `Set`s is to keep track of seen objects, and do something different whenever an object is seen for the first time, e.g.:

```ruby
SEEN_VALUES = Set.new
	
def receive_value(value)
	if SEEN_VALUES.add?(value)
		puts "Saw #{value} for the first time."
	else
		puts "Already seen #{value}, ignoring."
	end
end

receive_value(1) # Saw 1 for the first time.
receive_value(2) # Saw 2 for the first time.
receive_value(3) # Saw 3 for the first time.
receive_value(1) # Already seen 1, ignoring.
```

Readers might reasonably assume that `add?` is only looking up into the set a single time, but it's actually doing two separate look-ups! ([source](https://github.com/ruby/ruby/blob/c976cb5/lib/set.rb#L517-L525))

```rb
class Set
  def add?(o
    # 1. `include?(o)` looks up into `@hash`
    # 2. if the value isn't there, `add(o)` does a second look-up into `@hash`
    add(o) unless include?(o)
  end
end
```

This gets especially expensive if the values are large hash/arrays/objects, whose `#hash` is expensive to compute.

We can optimize this if it was possible to set a value in hash, *and* retrieve the value that was already there, in a single go. I propose adding `Hash#exchange_value` to do exactly that. If that existed, we can re-implement `#add?` as:

```rb
class Set
  def add?(o)
    # Only requires a single look-up into `@hash`!
    self unless @hash.exchange_value(o, true)
  end
```

Here's a proof-of-concept implementation: https://github.com/ruby/ruby/pull/10093

# Theory

How much of a benefit this has depends on 2 factors:

1. How much `#hash` is called, which depends on how many **new** objects are added to the set.
    * If every object is new, then `#hash` used to be called twice on every `#add?`.
        * This is where this improvement makes the biggest (2x!) change.
    * If every object has already been seen, then `#hash` was never being called twice before anyway, so there would be no improvement.
        * It's important to not regress in this case, because many use cases of sets don't deal with many distinct objects, but just need to do quick checks against an existing set.
    * Every other case lies somewhere in between those two, depending on the % of objects which are new.
2. How slow `#hash` is to compute for the key
    * If the hash is slow to compute, this change will make a bigger improvement
    * If the hash value is fast to compute, then it won't matter as much. Even if we called it half as much, it's a minority of the total time, so it won't have much net impact.

# Benchmark summary

|                           | All objects are new | All objects are preexisting |
|---------------------------|-------:|------:|
| objects with slow `#hash` | 100.0% | ~0.0% |
| objects with fast `#hash` |  24.5% |  4.6% |

As we see, this change makes a huge improvement the cases where it helps, and crucially, doesn't slow down the cases where it can't.

For the complete benchmark source code and results, see the PR: https://github.com/ruby/ruby/pull/10093



-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

In This Thread