ruby.git - The Ruby Programming Language

Age	Commit message	Author
10 days	Replace ROBJECT_EMBED by ROBJECT_HEAP	Jean Boussier
	The embed layout is far more common than the heap one, especially since VWA. I think inverting the flag makes for more readable code.
11 days	Fix ObjectSpace.count_objects to allocate all symbols it uses eagerly	Benoit Daloze
	* To not count them as program allocations.
	* Similar to https://github.com/ruby/ruby/pull/13906
11 days	Fix deadlock when malloc in Ractor lock	Peter Zhu
	If we malloc when the current Ractor is locked, we can deadlock because GC requires VM lock and Ractor barrier. If another Ractor is waiting on this Ractor lock, then it will deadlock because the other Ractor will never join the barrier.
	For example, this script deadlocks:
	    r = Ractor.new do
	      loop do
	        Ractor::Port.new
	      end
	    end
	    100000.times do |i|
	      r.send(nil)
	      puts i
	    end
	On debug builds, it fails with this assertion error:
	    vm_sync.c:75: Assertion Failed: vm_lock_enter:cr->sync.locked_by != rb_ractor_self(cr)
	On non-debug builds, we can see that it deadlocks in the debugger:
	Main Ractor:
	    frame #3: 0x000000010021fdc4 miniruby`rb_native_mutex_lock(lock=<unavailable>) at thread_pthread.c:115:14
	    frame #4: 0x0000000100193eb8 miniruby`ractor_send0 [inlined] ractor_lock(r=<unavailable>, file=<unavailable>, line=1180) at ractor.c:73:5
	    frame #5: 0x0000000100193eb0 miniruby`ractor_send0 [inlined] ractor_send_basket(ec=<unavailable>, rp=0x0000000131092840, b=0x000000011c63de80, raise_on_error=true) at ractor_sync.c:1180:5
	    frame #6: 0x0000000100193eac miniruby`ractor_send0(ec=<unavailable>, rp=0x0000000131092840, obj=4, move=<unavailable>, raise_on_error=true) at ractor_sync.c:1211:5
	Second Ractor:
	    frame #2: 0x00000001002208d0 miniruby`rb_ractor_sched_barrier_start [inlined] rb_native_cond_wait(cond=<unavailable>, mutex=<unavailable>) at thread_pthread.c:221:13
	    frame #3: 0x00000001002208cc miniruby`rb_ractor_sched_barrier_start(vm=0x000000013180d600, cr=0x0000000131093460) at thread_pthread.c:1438:13
	    frame #4: 0x000000010028a328 miniruby`rb_vm_barrier at vm_sync.c:262:13 [artificial]
	    frame #5: 0x00000001000dfa6c miniruby`gc_start [inlined] rb_gc_vm_barrier at gc.c:179:5
	    frame #6: 0x00000001000dfa68 miniruby`gc_start [inlined] gc_enter(objspace=0x000000013180fc00, event=gc_enter_event_start, lock_lev=<unavailable>) at default.c:6636:9
	    frame #7: 0x00000001000dfa48 miniruby`gc_start(objspace=0x000000013180fc00, reason=<unavailable>) at default.c:6361:5
	    frame #8: 0x00000001000e3fd8 miniruby`objspace_malloc_increase_body [inlined] garbage_collect(objspace=0x000000013180fc00, reason=512) at default.c:6341:15
	    frame #9: 0x00000001000e3fa4 miniruby`objspace_malloc_increase_body [inlined] garbage_collect_with_gvl(objspace=0x000000013180fc00, reason=512) at default.c:6741:16
	    frame #10: 0x00000001000e3f88 miniruby`objspace_malloc_increase_body(objspace=0x000000013180fc00, mem=<unavailable>, new_size=<unavailable>, old_size=<unavailable>, type=<unavailable>) at default.c:8007:13
	    frame #11: 0x00000001000e3c44 miniruby`rb_gc_impl_malloc [inlined] objspace_malloc_fixup(objspace=0x000000013180fc00, mem=0x000000011c700000, size=12582912) at default.c:8085:5
	    frame #12: 0x00000001000e3c30 miniruby`rb_gc_impl_malloc(objspace_ptr=0x000000013180fc00, size=12582912) at default.c:8182:12
	    frame #13: 0x00000001000d4584 miniruby`ruby_xmalloc [inlined] ruby_xmalloc_body(size=<unavailable>) at gc.c:5128:12
	    frame #14: 0x00000001000d4568 miniruby`ruby_xmalloc(size=<unavailable>) at gc.c:5118:34
	    frame #15: 0x00000001001eb184 miniruby`rb_st_init_existing_table_with_size(tab=0x000000011c2b4b40, type=<unavailable>, size=<unavailable>) at st.c:559:39
	    frame #16: 0x00000001001ebc74 miniruby`rebuild_table_if_necessary [inlined] rb_st_init_table_with_size(type=0x00000001004f4a78, size=524287) at st.c:585:5
	    frame #17: 0x00000001001ebc5c miniruby`rebuild_table_if_necessary [inlined] rebuild_table(tab=0x000000013108e2f0) at st.c:753:19
	    frame #18: 0x00000001001ebbfc miniruby`rebuild_table_if_necessary(tab=0x000000013108e2f0) at st.c:1125:9
	    frame #19: 0x00000001001eba08 miniruby`rb_st_insert(tab=0x000000013108e2f0, key=262144, value=4767566624) at st.c:1143:5
	    frame #20: 0x0000000100194b84 miniruby`ractor_port_initialzie [inlined] ractor_add_port(r=0x0000000131093460, id=262144) at ractor_sync.c:399:9
	    frame #21: 0x0000000100194b58 miniruby`ractor_port_initialzie [inlined] ractor_port_init(rpv=4750065560, r=0x0000000131093460) at ractor_sync.c:87:5
	    frame #22: 0x0000000100194b34 miniruby`ractor_port_initialzie(self=4750065560) at ractor_sync.c:103:12
11 days	Get rid of rb_obj_set_shape_id	Jean Boussier
	Now that the shape_id has been unified across all types, this helper function doesn't do much over `RBASIC_SET_SHAPE_ID`. It still checks whether the write is needed, but that doesn't seem useful in the places where it's used.
2025-08-21	Remove dead rb_obj_is_main_ractor	Peter Zhu

2025-08-18	Output array shared root flag in rb_raw_obj_info_buitin_type	Peter Zhu

2025-08-18	Move flags for arrays out of if statements in rb_raw_obj_info_buitin_type	Peter Zhu

2025-08-18	Remove impossible case in rb_raw_obj_info_buitin_type for array	Peter Zhu
	Since we handle embedded arrays in the if statement above, we don't need to handle it here.
2025-08-15	Don't free Ractors in GC shutdown	John Hawthorn
	rb_gc_shutdown_call_finalizer_p returns false for threads and fibers, so it should probably do the same for all Ractors (not just the main one). This hopefully mitigates a bug where, at exit, rb_ractor_terminate_all gets all Ractors to stop before continuing with the shutdown process. However, when vm->ractor.cnt reaches 1, the native threads may still be running code at the end of co_start, which reads/locks on th->ractor->threads.sched, so the Ractor is not safe to free. A better solution might be to ensure that all native threads end up stopped or otherwise parked before this part of the shutdown; however, that would be a bit more involved.
2025-08-13	imemo_fields: store owner object in RBasic.klass	Jean Boussier
	It is much more convenient than storing the klass, especially when dealing with `object_id`, as it allows updating the id2ref table without having to dereference the owner, which may be garbage at that point.
2025-08-12	RTypedData: keep direct reference to IMEMO/fields	Jean Boussier
	Similar to f3206cc79bec2fd852e81ec56de59f0a67ab32b7 but for TypedData. It's quite common for TypedData objects to have a mix of references in their struct and some ivars. Since we do happen to have 8B free in the RTypedData struct, we can use it to keep a direct reference to the IMEMO/fields, which saves having to synchronize the VM and look up the `gen_fields_tbl` on every ivar access. For old school Data classes, however, we don't have free space, but this API is soft-deprecated and no longer very common.
2025-08-11	Fix return value of setting in GC.config	Peter Zhu
	gc_config_set returned rb_gc_impl_config_get, but gc_config_get also added the implementation key to the return value. This caused the return value of GC.config to differ depending on whether the optional hash argument is provided or not.
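The discrepancy described above can be observed from Ruby by comparing the keys returned by the two call forms. This is a minimal sketch, not part of the commit: the helper name `gc_config_keys` is hypothetical, and `GC.config` is assumed to exist only on newer Rubies (3.4 and later), so the call is guarded.

```ruby
# Hypothetical helper: collects the keys returned by both call forms of
# GC.config. Returns nil on Rubies that do not provide GC.config.
def gc_config_keys
  return nil unless GC.respond_to?(:config)

  without_arg = GC.config       # read-only call, no hash argument
  with_arg    = GC.config({})   # "set" call with nothing to change
  { without_arg: without_arg.keys.sort, with_arg: with_arg.keys.sort }
end

keys = gc_config_keys
p keys
```

On a build without the fix, the two key lists would be expected to differ by the implementation key; with it, they should match.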
2025-08-08	Fix id2ref table build when GC in progress	John Hawthorn
	Previously, if GC was in progress when we're initially building the id2ref table, it could see the empty table and then crash when trying to remove ids from it. This commit fixes the bug by only publishing the table after GC is done.
	Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2025-08-08	object_id_to_ref: complete incremental GC before iterating	Jean Boussier
	Otherwise dealing with garbage objects is tricky.
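For context, the id2ref table is what backs the id-to-object lookup reachable from Ruby through `ObjectSpace._id2ref`. The sketch below only illustrates that round trip. The helper `lookup_by_id` is hypothetical, and newer Rubies may emit a deprecation warning for `_id2ref`.

```ruby
# Hypothetical helper: resolve an object id back to its object, returning
# nil when the id does not refer to a live object.
def lookup_by_id(id)
  ObjectSpace._id2ref(id)
rescue RangeError
  nil
end

obj   = Object.new
id    = obj.object_id      # assigning an id is what makes the object findable by id
found = lookup_by_id(id)   # served by the id-to-object lookup

puts found.equal?(obj)
```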
2025-08-07	symbol.c: use `rb_gc_mark_and_move` over `rb_gc_location`	Jean Boussier
	The `p->field = rb_gc_location(p->field)` pattern isn't ideal because it means all references are rewritten on compaction, regardless of whether the referenced object has moved. This isn't good for caches nor for Copy-on-Write. `rb_gc_mark_and_move` avoids needless writes, and most of the time allows a single function to be used for both marking and updating references.
2025-08-06	Struct: keep direct reference to IMEMO/fields when space allows	Jean Boussier
	It's not rare for structs to have additional ivars, so they are one of the most common types, if not the most common, in the `gen_fields_tbl`. This can cause Ractor contention, and even in single-Ractor mode it means doing a hash lookup to access the ivars and extra GC work. Instead, unless the struct is perfectly right sized, we can store a reference to the associated IMEMO/fields object right after the last struct member.
	```
	compare-ruby: ruby 3.5.0dev (2025-08-06T12:50:36Z struct-ivar-fields-2 9a30d141a1) +PRISM [arm64-darwin24]
	built-ruby: ruby 3.5.0dev (2025-08-06T12:57:59Z struct-ivar-fields-2 2ff3ec237f) +PRISM [arm64-darwin24]
	warming up.....
	|                      |compare-ruby|built-ruby|
	|:---------------------|-----------:|---------:|
	|member_reader         |    590.317k|  579.246k|
	|                      |       1.02x|         -|
	|member_writer         |    543.963k|  527.104k|
	|                      |       1.03x|         -|
	|member_reader_method  |    213.540k|  213.004k|
	|                      |       1.00x|         -|
	|member_writer_method  |    192.657k|  191.491k|
	|                      |       1.01x|         -|
	|ivar_reader           |    403.993k|  569.915k|
	|                      |           -|     1.41x|
	```
	Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
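The kind of object this change targets is a struct that carries an instance variable in addition to its members. The following is an illustrative sketch only. `Point` and `label` are hypothetical names, and this is not the benchmark script used for the numbers above.

```ruby
# A struct with two members plus one extra instance variable.
Point = Struct.new(:x, :y) do
  attr_accessor :label   # stored as an ivar, not as a struct member
end

point = Point.new(1, 2)
point.label = "origin-ish"

# Member access and ivar access go through different storage.
puts point.x
puts point.label
p point.instance_variables
```

Reading `label` here corresponds to the `ivar_reader` style of access, while `x` and `y` are ordinary member reads.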
2025-08-01	Fix rb_shape_transition_object_id transition to TOO_COMPLEX	Jean Boussier
	If `get_next_shape_internal` fails to return a shape, we must transition to a complex shape. `shape_transition_object_id` mistakenly didn't.
	Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2025-08-01	Make `RClass.cc_table` a managed object	Jean Boussier
	For now this doesn't change anything, but now that the table is managed by GC, it opens the door to using RCU in multi-ractor mode, and hence to allowing unsynchronized reads.
2025-08-01	Use `rb_gc_mark_weak` for `cc->klass`.	Jean Boussier
	One of the biggest remaining contention points is `RClass.cc_table`. The logical solution would be to turn it into a managed object, so we can use an RCU strategy, given it's read heavy. However, that's not currently possible because the table can't be freed before the owning class, given the class free function MUST go over all the CC entries to invalidate them. However, if the `CC->klass` reference is weak marked, then the GC will take care of setting the reference to `Qundef`.
2025-07-30	Don't check the symbol's fstr at shutdown	Peter Zhu
	During Ruby's shutdown, we no longer need to check the fstr of the symbol because we don't use the fstr anymore for freeing the symbol. This can also fix the following ASAN error:
	    ==2721247==ERROR: AddressSanitizer: use-after-poison on address 0x75fa90a627b8 at pc 0x64a7b06fb4bc bp 0x7ffdf95ba9b0 sp 0x7ffdf95ba9a8
	    READ of size 8 at 0x75fa90a627b8 thread T0
	    #0 0x64a7b06fb4bb in RB_BUILTIN_TYPE include/ruby/internal/value_type.h:191:30
	    #1 0x64a7b06fb4bb in rb_gc_shutdown_call_finalizer_p gc.c:357:18
	    #2 0x64a7b06fb4bb in rb_gc_impl_shutdown_call_finalizer gc/default/default.c:3045:21
	    #3 0x64a7b06fb4bb in rb_objspace_call_finalizer gc.c:1739:5
	    #4 0x64a7b06ca1b2 in rb_ec_finalize eval.c:165:5
	    #5 0x64a7b06ca1b2 in rb_ec_cleanup eval.c:256:5
	    #6 0x64a7b06c98a3 in ruby_cleanup eval.c:179:12
2025-07-21	Remove dsymbol_fstr_hash	Peter Zhu
	We don't need to delay the freeing of the fstr for the symbol if we store the hash of the fstr in the dynamic symbol and we use compare-by-identity for removing the dynamic symbol from the sym_set.
2025-07-21	Convert global symbol table to concurrent set	Peter Zhu

2025-07-16	Add a comment to count_objects to prevent future regression	Yusuke Endoh

2025-07-16	Prevent ObjectSpace.count_objects from allocating extra arrays	Yusuke Endoh
	`ObjectSpace.count_objects` could cause an unintended array allocation. It returns a hash like `{ :T_ARRAY => 100, :T_STRING => 100, ... }`, so it creates the key symbol (e.g., `:T_STRING`) for the first time. On rare occasions, this symbol creation internally allocates a new array for symbol management.
	This led to a problematic side effect where calling `count_objects` twice in a row could produce inconsistent results: the first call would trigger the hidden array allocation, and the second call would then report an increased count for `:T_ARRAY`.
	This behavior caused test failures in `test/ruby/test_allocation.rb`, which performs a baseline measurement before an operation and then asserts the exact number of new allocations.
	https://rubyci.s3.amazonaws.com/openbsd-current/ruby-master/log/20250716T053005Z.fail.html.gz
	> 1) Failure:
	> TestAllocation::ProcCall::WithBlock#test_ruby2_keywords [...]:
	> Expected 1 array allocations for "r2k.(1, a: 2, &block)", but 2 arrays allocated.
	This change resolves the issue by pre-interning all key symbols used by `ObjectSpace.count_objects` before its counting. This eliminates the side effect and ensures the stability of allocation-sensitive tests.
	Co-authored-by: Koichi Sasada <ko1@atdot.net>
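The baseline-then-measure pattern that the failing test relies on can be sketched as follows. This is an illustration under assumptions, not the code from `test/ruby/test_allocation.rb`. The variable names are hypothetical, and the delta is printed rather than asserted because a GC run between the two calls can also change the counts.

```ruby
# Reuse one result hash so the measurement itself does not allocate a new Hash.
counts = {}

# Baseline measurement, then a second measurement with nothing in between.
baseline    = ObjectSpace.count_objects(counts)[:T_ARRAY]
second_call = ObjectSpace.count_objects(counts)[:T_ARRAY]

# Before the fix, the first call could intern key symbols lazily and, on rare
# occasions, allocate an array, making this delta non-zero.
delta = second_call - baseline
puts "T_ARRAY delta between back-to-back calls: #{delta}"
```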
2025-07-14	YJIT: Set code mem permissions in bulk	Kunshan Wang
	Some GC modules, notably MMTk, support parallel GC, i.e. multiple GC threads work in parallel during a GC. Currently, when two GC threads scan two iseq objects simultaneously while YJIT is enabled, both threads will attempt to borrow `CodeBlock::mem_block`, which will result in a panic.
	This commit makes one part of the change. We now set the YJIT code memory to writable in bulk before the reference-updating phase, and reset it to executable in bulk after the reference-updating phase. Previously, YJIT lazily set memory pages writable while updating object references embedded in JIT-compiled machine code, and set the memory back to executable by calling `mark_all_executable`. This approach is inherently unfriendly to parallel GC because (1) it borrows `CodeBlock::mem_block`, and (2) it sets the whole `CodeBlock` as executable, which races with other GC threads that are updating other iseq objects. It also has performance overhead due to the frequent invocation of system calls.
	We now set the permission of all the code memory in bulk before and after the reference-updating phase. Multiple GC threads can now perform raw memory writes in parallel. We should also see a performance improvement during moving GC because of the reduced number of `mprotect` system calls.
2025-06-30	Inline ASAN poison functions when ASAN is not enabled	Peter Zhu
	The ASAN poison functions were always defined in gc.c, even if ASAN was not enabled. This caused function calls to happen all the time, even when ASAN is not enabled. This commit defines these functions as empty macros when ASAN is not enabled.
2025-06-27	Extract Ractor safe table used for frozen strings	Peter Zhu
	This commit extracts the Ractor safe table used for frozen strings into ractor_safe_table.c, which will allow it to be used elsewhere, including for the global symbol table.
2025-06-26	variable.c: Refactor `generic_field_set` / `generic_ivar_set`	Jean Boussier
	These two functions are very similar and can share most of their logic.
2025-06-25	Move RUBY_ATOMIC_VALUE_LOAD to ruby_atomic.h	Peter Zhu
	Deduplicates RUBY_ATOMIC_VALUE_LOAD by moving it to ruby_atomic.h.
2025-06-23	Ensure `RCLASS_CLASSEXT_TBL` accessor is always used.	Jean Boussier

2025-06-17	Refactor generic fields to use `T_IMEMO/fields` objects.	Jean Boussier
	Followup: https://github.com/ruby/ruby/pull/13589
	This simplifies a lot of things: as we no longer need to manually manage the memory, we can use the Read-Copy-Update pattern and avoid numerous race conditions.
	Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
	Notes: Merged: https://github.com/ruby/ruby/pull/13626
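The general shape of the Read-Copy-Update pattern mentioned here can be sketched in plain Ruby. This is a conceptual illustration only, not the VM's C implementation. The class `CopyOnWriteTable` and its methods are hypothetical names, and a Mutex stands in for whatever writer-side synchronization the VM actually uses.

```ruby
# Conceptual RCU-style table: readers never lock, writers publish new copies.
class CopyOnWriteTable
  def initialize
    @snapshot   = {}.freeze   # the currently published, immutable version
    @write_lock = Mutex.new   # serializes writers only
  end

  # Read: grab the current snapshot and look the key up. No lock is taken.
  def [](key)
    @snapshot[key]
  end

  # Write: copy the current snapshot, update the copy, then publish it with
  # a single reference assignment. Old snapshots are reclaimed by the GC.
  def store(key, value)
    @write_lock.synchronize do
      updated = @snapshot.dup   # dup of a frozen Hash is unfrozen
      updated[key] = value
      @snapshot = updated.freeze
    end
    value
  end
end

table = CopyOnWriteTable.new
table.store(:ivar_a, 1)
table.store(:ivar_b, 2)
puts table[:ivar_a]
puts table[:ivar_b]
```

Because a reader only ever holds a reference to a complete, frozen snapshot, there is no manually managed memory for it to race on, which is the property the commit message relies on.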
2025-06-17	Update vm->self location and mark it in vm.c for consistency	Satoshi Tagomori
	Notes: Merged: https://github.com/ruby/ruby/pull/13630
2025-06-15	Fix typo in rb_bug message for unreachable code	ydah
	Notes: Merged: https://github.com/ruby/ruby/pull/13620
2025-06-13	Get rid of FL_EXIVAR	Jean Boussier
	Now that the shape_id gives us all the same information, it's no longer needed. Notes: Merged: https://github.com/ruby/ruby/pull/13612
2025-06-13	Use the `shape_id` rather than `FL_EXIVAR`	Jean Boussier
	We still keep setting `FL_EXIVAR` so that `rb_shape_verify_consistency` can detect discrepancies. Notes: Merged: https://github.com/ruby/ruby/pull/13612
2025-06-13	Enforce consistency between shape_id and FL_EXIVAR	Jean Boussier
	The FL_EXIVAR is a bit redundant with the shape_id. Now that the `shape_id` is embedded in all objects on all archs, we can cheaply check if an object has any fields with a simple bitmask. Notes: Merged: https://github.com/ruby/ruby/pull/13612
2025-06-12	Turn `rb_classext_t.fields` into a T_IMEMO/class_fields	Jean Boussier
	This behaves almost exactly like a T_OBJECT; the layout is entirely compatible.
	This aims to solve two problems. First, it solves the problem of namespaced classes having a single `shape_id`. Now each namespaced classext has an object that can hold the namespace-specific shape. Second, it opens the door to later making class instance variable writes atomic, and hence to reading class variables without locking the VM. In the future, in multi-ractor mode, we can do the write on a copy of the `fields_obj` and then atomically swap it.
	Considerations:
	- Right now the `RClass` shape_id is always synchronized, but with namespaces we should likely mark classes that have multiple namespaces with a specific shape flag.
	Notes: Merged: https://github.com/ruby/ruby/pull/13411
2025-06-09	Take file and line in GC VM locks	Peter Zhu
	This commit adds file and line to GC VM locking functions for debugging purposes and adds upper case macros to pass __FILE__ and __LINE__. Notes: Merged: https://github.com/ruby/ruby/pull/13550
2025-06-09	Get rid of `gen_fields_tbl.fields_count`	Jean Boussier
	This data is redundant because the shape already contains both the length and capacity of the object's fields. So it both wastes space and creates the possibility of a desync between the two. We also do not need to initialize everything to Qundef; this seems to be a left-over from pre-shape instance variables. Notes: Merged: https://github.com/ruby/ruby/pull/13561
2025-06-09	Optimize callcache invalidation for refinements	alpaca-tc
	Fixes [Bug #21201]
	This change addresses a performance regression where defining methods inside `refine` blocks caused severe slowdowns. The issue was due to `rb_clear_all_refinement_method_cache()` triggering a full object space scan via `rb_objspace_each_objects` to find and invalidate affected callcaches, which is very inefficient.
	To fix this, I introduce `vm->cc_refinement_table` to track callcaches related to refinements. This allows us to invalidate only the necessary callcaches without scanning the entire heap, resulting in a significant performance improvement.
	Notes: Merged: https://github.com/ruby/ruby/pull/13077
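For readers unfamiliar with the trigger, the operation in question is an ordinary method definition inside a `refine` block. The sketch below shows that shape of code. The module and method names (`Shout`, `Greeter`, `shout`) are hypothetical, and the example is not a reproduction of the regression itself.

```ruby
module Shout
  refine String do
    # Defining a method inside `refine` is the kind of operation the commit
    # message says previously led to a full object space scan.
    def shout
      upcase + "!"
    end
  end
end

module Greeter
  using Shout   # activate the refinement for the rest of this module body

  def self.greet(name)
    "hello #{name}".shout
  end
end

puts Greeter.greet("ruby")
```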
2025-06-07	Simplify `rb_gc_rebuild_shape`	Jean Boussier
	Now that there are no longer multiple shape roots, all we need to do when moving an object from one slot to another is to update the `heap_index` part of the shape_id. Since this never needs to create a shape transition, it will always work and never result in a complex shape. Notes: Merged: https://github.com/ruby/ruby/pull/13556
2025-06-07	ignore confirming belonging while finalizer	Koichi Sasada
	A finalizer registered in Ractor A can be invoked in B.
	```ruby
	require "tempfile"
	r = Ractor.new{
	  10_000.times{|i|
	    Tempfile.new(["file_to_require_from_ractor#{i}", ".rb"])
	  }
	}
	sleep 0.1
	```
	For example, the above script makes tempfiles which have finalizers on Ractor r, but at the end of the process, the main Ractor will invoke the finalizers, which violates the belonging check. This patch just ignores the belonging check to avoid CI failures. Of course, it violates Ractor's isolation and is the wrong workaround. This issue will be solved with Ractor local GC.
	Notes: Merged: https://github.com/ruby/ruby/pull/13542
2025-06-06	fix `rp(obj)` for any object	Koichi Sasada
	Now `rp(obj)` doesn't work if the `obj` is out-of-heap because of `asan_unpoisoning_object()`, so this patch solves it. Also add pointer information and type information to show. Notes: Merged: https://github.com/ruby/ruby/pull/13534
2025-06-05	Get rid of `rb_shape_t.flags`	Jean Boussier
	Now all flags are only in the `shape_id_t`, and can all be checked without needing to dereference a pointer. Notes: Merged: https://github.com/ruby/ruby/pull/13515
2025-06-04	Remove dead rb_malloc_info_show_results	Peter Zhu
	Notes: Merged: https://github.com/ruby/ruby/pull/13516
2025-06-02	Make FrozenCore a plain T_CLASS	John Hawthorn
	Notes: Merged: https://github.com/ruby/ruby/pull/13458
2025-05-31	`Ractor::Port`	Koichi Sasada
	* Added `Ractor::Port`
	  * `Ractor::Port#receive` (support multi-threads)
	  * `Ractor::Port#close`
	  * `Ractor::Port#closed?`
	* Added some methods
	  * `Ractor#join`
	  * `Ractor#value`
	  * `Ractor#monitor`
	  * `Ractor#unmonitor`
	* Removed some methods
	  * `Ractor#take`
	  * `Ractor.yield`
	* Change the spec
	  * `Ractor.select`
	You can wait for multiple sequences of messages with `Ractor::Port`.
	```ruby
	ports = 3.times.map{ Ractor::Port.new }
	ports.map.with_index do |port, ri|
	  Ractor.new port,ri do |port, ri|
	    3.times{|i| port << "r#{ri}-#{i}"}
	  end
	end
	p ports.each{|port| pp 3.times.map{port.receive}}
	```
	In this example, we use 3 ports, and 3 Ractors send messages to them respectively. We can receive a series of messages from each port.
	You can use `Ractor#value` to get the last value of a Ractor's block:
	```ruby
	result = Ractor.new do
	  heavy_task()
	end.value
	```
	You can wait for the termination of a Ractor with `Ractor#join` like this:
	```ruby
	Ractor.new do
	  some_task()
	end.join
	```
	`#value` and `#join` are similar to `Thread#value` and `Thread#join`.
	To implement `#join`, `Ractor#monitor` (and `Ractor#unmonitor`) is introduced.
	This commit changes the `Ractor.select()` method. It now only accepts ports or Ractors, and returns when a port receives a message or a Ractor terminates.
	We remove `Ractor.yield` and `Ractor#take` because:
	* `Ractor::Port` supports most of the similar use cases in a simpler manner.
	* Removing them significantly simplifies the code.
	We also change the internal thread scheduler code (thread_pthread.c):
	* During barrier synchronization, we keep the `ractor_sched` lock to avoid deadlocks. This lock is released by `rb_ractor_sched_barrier_end()`, which is called at the end of operations that require the barrier.
	* Fix potential deadlock issues by checking interrupts just before setting UBF.
	https://bugs.ruby-lang.org/issues/21262
	Notes: Merged: https://github.com/ruby/ruby/pull/13445
2025-05-29	Read {max_iv,variation}_count from prime classext	John Hawthorn
	MAX_IV_COUNT is a hint which determines the size of variable width allocation we should use for a given class. We don't need to scope this by namespace: if we end up with larger builtin objects on some namespaces, that isn't a user-visible problem, just extra memory use. Similarly, variation_count is used to track if a given object has had too many branches in shapes it has used, and to use too_complex when that happens. That's also just a hint, so we can use the same value across namespaces without it being visible to users. Previously variation_count was being incremented (written to) on the RCLASS_EXT_READABLE ext, which seems incorrect if we wanted it to be different across namespaces. Notes: Merged: https://github.com/ruby/ruby/pull/13434
2025-05-27	Rename `rb_shape_set_shape_id` to `rb_obj_set_shape_id`	Jean Boussier
	Notes: Merged: https://github.com/ruby/ruby/pull/13450
2025-05-27	Refactor `rb_shape_too_complex_p` to take a `shape_id_t`.	Jean Boussier
	Notes: Merged: https://github.com/ruby/ruby/pull/13450