ruby-changes:67063
From: Jeremy <ko1@a...>
Date: Sat, 7 Aug 2021 02:15:19 +0900 (JST)
Subject: [ruby-changes:67063] 1a05dc03f9 (master): Make backtrace generation work outward from current frame
https://git.ruby-lang.org/ruby.git/commit/?id=1a05dc03f9

From 1a05dc03f953830564c272665c47a61e53550f3e Mon Sep 17 00:00:00 2001
From: Jeremy Evans <code@j...>
Date: Wed, 21 Jul 2021 16:44:56 -0700
Subject: Make backtrace generation work outward from current frame

This fixes multiple bugs found in the partial backtrace optimization
added in 3b24b7914c16930bfadc89d6aff6326a51c54295. These bugs occur
when passing a start argument to caller where the start argument lands
on an iseq frame without a pc.

Before this commit, the following code results in the same line being
printed twice, both for the #each method:

```ruby
def a; [1].group_by { b } end
def b; puts(caller(2, 1).first, caller(3, 1).first) end
a
```

After this commit and in Ruby 2.7, the lines are different, with the
first line being for each and the second for group_by.

Before this commit, the following code can either segfault or result
in an infinite loop:

```ruby
def foo
  caller_locations(2, 1).inspect # segfault
  caller_locations(2, 1)[0].path # infinite loop
end

1.times.map { 1.times.map { foo } }
```

After this commit, this code works correctly.

This commit completely refactors the backtrace handling. Instead of
processing the backtrace from the outermost frame working in, process
it from the innermost frame working out. This is much faster for
partial backtraces, since you only access the control frames you need
to in order to construct the backtrace.

To handle cfunc frames in the new design, they start out with no
location information. We increment a counter for each cfunc frame
added. When an iseq frame with pc is accessed, after adding the iseq
backtrace location, we use that iseq location for all of the directly
preceding cfunc backtrace locations. If the last backtrace line is a
cfunc frame, we continue scanning for iseq frames until the end
control frame, and use the location information from the first one for
the trailing cfunc frames in the backtrace.
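[Editorial aside, not part of the commit message: the "cfunc frames
borrow the location of the nearest iseq frame with a pc" behavior
described above is observable from plain Ruby. A minimal sketch (the
helper name `probe` is invented for illustration): a cfunc frame such
as Array#each has no pc of its own, so the location it reports is the
call site in the surrounding iseq frame.]

```ruby
def probe
  # Frame 1 is the block below; frame 2 is the cfunc frame for Array#each.
  caller_locations(2, 1).first
end

loc = nil
call_line = __LINE__ + 1
[1].each { loc = probe }

p loc.path == __FILE__    # => true; the cfunc frame reports the call site's file
p loc.lineno == call_line # => true; ...and the line of the [1].each call
```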
As only rb_ec_partial_backtrace_object uses the new backtrace
implementation, remove all of the function pointers and inline the
functions. This makes the process easier to understand.

Restore the Ruby 2.7 implementation of backtrace_each and use it for
all the other functions that called backtrace_each other than
rb_ec_partial_backtrace_object. All other cases requested the entire
backtrace, so there is no advantage to using the new algorithm for
those. Additionally, there are implicit assumptions in the other code
that the backtrace processing works inward instead of outward.

Remove the cfunc/iseq union in rb_backtrace_location_t, and remove the
prev_loc member for cfunc. Both cfunc and iseq types can now have iseq
and pc entries, so the location information can be accessed the same
way for each. This avoids the need for an extra backtrace location
entry to store an iseq backtrace location if the final entry in the
backtrace is a cfunc. This is also what fixes the segfault and
infinite loop issues in the above bugs.

Here's Ruby pseudocode for the new algorithm, where start and length
are the arguments to caller or caller_locations:

```ruby
end_cf = VM.end_control_frame.next
cf = VM.start_control_frame
size = VM.num_control_frames - 2
bt = []
cfunc_counter = 0

if length.nil? || length > size
  length = size
end

while cf != end_cf && bt.size != length
  if cf.iseq?
    if cf.instruction_pointer?
      if start > 0
        start -= 1
      else
        bt << cf.iseq_backtrace_entry
        # backpatch the cfunc entries directly before this iseq entry
        cfunc_counter.times do |i|
          bt[-2 - i].loc = cf.loc
        end
        cfunc_counter = 0
      end
    end
  elsif cf.cfunc?
    if start > 0
      start -= 1
    else
      bt << cf.cfunc_backtrace_entry
      cfunc_counter += 1
    end
  end

  cf = cf.prev
end

# if the backtrace ends with cfunc frames, keep scanning outward for the
# first iseq frame with a pc and use its location for the trailing entries
if cfunc_counter > 0
  while cf != end_cf
    if cf.iseq? && cf.instruction_pointer?
      cfunc_counter.times do |i|
        bt[-1 - i].loc = cf.loc
      end
      break
    end
    cf = cf.prev
  end
end
```

With the following benchmark, which uses a call depth of around 100
(common in many Ruby applications):

```ruby
require 'benchmark/ips'

class T
  def test(depth, &block)
    if depth == 0
      yield self
    else
      test(depth - 1, &block)
    end
  end
  def array
    Array.new
  end
  def first
    caller_locations(1, 1)
  end
  def full
    caller_locations
  end
end

t = T.new
t.test((ARGV.first || 100).to_i) do
  Benchmark.ips do |x|
    x.report('caller_loc(1, 1)') { t.first }
    x.report('caller_loc') { t.full }
    x.report('Array.new') { t.array }
    x.compare!
  end
end
```

Results before commit:

```
Calculating -------------------------------------
    caller_loc(1, 1)    281.159k (± 0.7%) i/s -      1.426M in   5.073055s
          caller_loc     15.836k (± 2.1%) i/s -     79.450k in   5.019426s
           Array.new      1.852M (± 2.5%) i/s -      9.296M in   5.022511s

Comparison:
           Array.new:  1852297.5 i/s
    caller_loc(1, 1):   281159.1 i/s - 6.59x  (± 0.00) slower
          caller_loc:    15835.9 i/s - 116.97x  (± 0.00) slower
```

Results after commit:

```
Calculating -------------------------------------
    caller_loc(1, 1)    562.286k (± 0.8%) i/s -      2.858M in   5.083249s
          caller_loc     16.402k (± 1.0%) i/s -     83.200k in   5.072963s
           Array.new      1.853M (± 0.1%) i/s -      9.278M in   5.007523s

Comparison:
           Array.new:  1852776.5 i/s
    caller_loc(1, 1):   562285.6 i/s - 3.30x  (± 0.00) slower
          caller_loc:    16402.3 i/s - 112.96x  (± 0.00) slower
```

This shows that the speed of caller_locations(1, 1) has roughly
doubled, and the speed of caller_locations with no arguments has
improved slightly. So this new algorithm is significantly faster, much
simpler, and fixes bugs in the previous algorithm.
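[Editorial aside, not part of the commit message: the start/length
semantics the pseudocode relies on can be sketched with a hypothetical
call chain (the method names below are invented): start counts frames
outward from the current one, and length caps how many entries are
returned.]

```ruby
def grandchild
  # start = 1 skips grandchild's own frame; length = 2 caps the result
  caller_locations(1, 2)
end

def child
  $child_line = __LINE__ + 1
  grandchild
end

def parent
  $parent_line = __LINE__ + 1
  child
end

locs = parent
p locs.size                      # => 2 (child and parent; <main> is cut off)
p locs[0].lineno == $child_line  # => true; innermost requested frame
p locs[1].lineno == $parent_line # => true; next frame outward
```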
Fixes [Bug #18053]
---
 test/ruby/test_backtrace.rb |  28 +++
 vm_backtrace.c              | 476 +++++++++++++++++---------------------
 2 files changed, 209 insertions(+), 295 deletions(-)

diff --git a/test/ruby/test_backtrace.rb b/test/ruby/test_backtrace.rb
index 742463a..aa79db2 100644
--- a/test/ruby/test_backtrace.rb
+++ b/test/ruby/test_backtrace.rb
@@ -170,6 +170,34 @@ class TestBacktrace < Test::Unit::TestCase https://github.com/ruby/ruby/blob/trunk/test/ruby/test_backtrace.rb#L170
     end
   end
 
+  def test_caller_limit_cfunc_iseq_no_pc
+    def self.a; [1].group_by { b } end
+    def self.b
+      [
+        caller_locations(2, 1).first.base_label,
+        caller_locations(3, 1).first.base_label
+      ]
+    end
+    assert_equal({["each", "group_by"]=>[1]}, a)
+  end
+
+  def test_caller_location_inspect_cfunc_iseq_no_pc
+    def self.foo
+      @res = caller_locations(2, 1).inspect
+    end
+    @line = __LINE__ + 1
+    1.times.map { 1.times.map { foo } }
+    assert_equal("[\"#{__FILE__}:#{@line}:in `times'\"]", @res)
+  end
+
+  def test_caller_location_path_cfunc_iseq_no_pc
+    def self.foo
+      @res = caller_locations(2, 1)[0].path
+    end
+    1.times.map { 1.times.map { foo } }
+    assert_equal(__FILE__, @res)
+  end
+
   def test_caller_locations
     cs = caller(0);
     locs = caller_locations(0).map{|loc|
       loc.to_s
diff --git a/vm_backtrace.c b/vm_backtrace.c
index ac620c6..fc9f701 100644
--- a/vm_backtrace.c
+++ b/vm_backtrace.c
@@ -119,16 +119,9 @@ typedef struct rb_backtrace_location_struct { https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L119
         LOCATION_TYPE_CFUNC,
     } type;
 
-    union {
-        struct {
-            const rb_iseq_t *iseq;
-            const VALUE *pc;
-        } iseq;
-        struct {
-            ID mid;
-            struct rb_backtrace_location_struct *prev_loc;
-        } cfunc;
-    } body;
+    const rb_iseq_t *iseq;
+    const VALUE *pc;
+    ID mid;
 } rb_backtrace_location_t;
 
 struct valued_frame_info {
@@ -148,9 +141,13 @@ location_mark_entry(rb_backtrace_location_t *fi) https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L141
 {
     switch (fi->type) {
       case LOCATION_TYPE_ISEQ:
-        rb_gc_mark_movable((VALUE)fi->body.iseq.iseq);
+        rb_gc_mark_movable((VALUE)fi->iseq);
         break;
       case LOCATION_TYPE_CFUNC:
+        if (fi->iseq) {
+            rb_gc_mark_movable((VALUE)fi->iseq);
+        }
+        break;
       default:
         break;
     }
@@ -188,10 +185,10 @@ location_lineno(rb_backtrace_location_t *loc) https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L185
 {
     switch (loc->type) {
       case LOCATION_TYPE_ISEQ:
-        return calc_lineno(loc->body.iseq.iseq, loc->body.iseq.pc);
+        return calc_lineno(loc->iseq, loc->pc);
       case LOCATION_TYPE_CFUNC:
-        if (loc->body.cfunc.prev_loc) {
-            return location_lineno(loc->body.cfunc.prev_loc);
+        if (loc->iseq && loc->pc) {
+            return calc_lineno(loc->iseq, loc->pc);
         }
         return 0;
       default:
@@ -219,9 +216,9 @@ location_label(rb_backtrace_location_t *loc) https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L216
 {
     switch (loc->type) {
       case LOCATION_TYPE_ISEQ:
-        return loc->body.iseq.iseq->body->location.label;
+        return loc->iseq->body->location.label;
       case LOCATION_TYPE_CFUNC:
-        return rb_id2str(loc->body.cfunc.mid);
+        return rb_id2str(loc->mid);
       default:
         rb_bug("location_label: unreachable");
         UNREACHABLE;
@@ -266,9 +263,9 @@ location_base_label(rb_backtrace_location_t *loc) https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L263
 {
     switch (loc->type) {
       case LOCATION_TYPE_ISEQ:
-        return loc->body.iseq.iseq->body->location.base_label;
+        return loc->iseq->body->location.base_label;
       case LOCATION_TYPE_CFUNC:
-        return rb_id2str(loc->body.cfunc.mid);
+        return rb_id2str(loc->mid);
       default:
         rb_bug("location_base_label: unreachable");
         UNREACHABLE;
@@ -291,10 +288,10 @@ location_path(rb_backtrace_location_t *loc) https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L288
 {
     switch (loc->type) {
       case LOCATION_TYPE_ISEQ:
-        return rb_iseq_path(loc->body.iseq.iseq);
+        return rb_iseq_path(loc->iseq);
       case LOCATION_TYPE_CFUNC:
-        if (loc->body.cfun (... truncated)

-- 
ML: ruby-changes@q...
Info: http://www.atdot.net/~ko1/quickml/