ruby-changes:67063
From: Jeremy <ko1@a...>
Date: Sat, 7 Aug 2021 02:15:19 +0900 (JST)
Subject: [ruby-changes:67063] 1a05dc03f9 (master): Make backtrace generation work outward from current frame
https://git.ruby-lang.org/ruby.git/commit/?id=1a05dc03f9

From 1a05dc03f953830564c272665c47a61e53550f3e Mon Sep 17 00:00:00 2001
From: Jeremy Evans <code@j...>
Date: Wed, 21 Jul 2021 16:44:56 -0700
Subject: Make backtrace generation work outward from current frame

This fixes multiple bugs found in the partial backtrace optimization
added in 3b24b7914c16930bfadc89d6aff6326a51c54295. These bugs occur
when passing a start argument to caller where the start argument lands
on an iseq frame without a pc.

Before this commit, the following code results in the same line being
printed twice, both for the #each method:

```ruby
def a; [1].group_by { b } end
def b; puts(caller(2, 1).first, caller(3, 1).first) end
a
```

After this commit and in Ruby 2.7, the lines are different, with the
first line being for each and the second for group_by.

Before this commit, the following code can either segfault or result
in an infinite loop:

```ruby
def foo
  caller_locations(2, 1).inspect # segfault
  caller_locations(2, 1)[0].path # infinite loop
end

1.times.map { 1.times.map { foo } }
```

After this commit, this code works correctly.

This commit completely refactors the backtrace handling. Instead of
processing the backtrace from the outermost frame working in, process
it from the innermost frame working out. This is much faster for
partial backtraces, since you only access the control frames you need
to in order to construct the backtrace.

To handle cfunc frames in the new design, they start out with no
location information. We increment a counter for each cfunc frame
added. When an iseq frame with pc is accessed, after adding the iseq
backtrace location, we use that iseq location for all of the directly
preceding cfunc backtrace locations. If the last backtrace line is a
cfunc frame, we continue scanning for iseq frames until the end
control frame, and use the location information from the first one for
the trailing cfunc frames in the backtrace.
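[Editorial aside, not part of the commit message: the "cfunc frames
borrow the location of the nearest iseq frame with a pc" behavior
described above is observable from plain Ruby. A minimal sketch (the
helper name `probe` is invented for illustration): a cfunc frame such
as Array#each has no pc of its own, so the location it reports is the
call site in the surrounding iseq frame.]

```ruby
def probe
  # Frame 1 is the block below; frame 2 is the cfunc frame for Array#each.
  caller_locations(2, 1).first
end

loc = nil
call_line = __LINE__ + 1
[1].each { loc = probe }

p loc.path == __FILE__    # => true; the cfunc frame reports the call site's file
p loc.lineno == call_line # => true; ...and the line of the [1].each call
```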
As only rb_ec_partial_backtrace_object uses the new backtrace
implementation, remove all of the function pointers and inline the
functions. This makes the process easier to understand.

Restore the Ruby 2.7 implementation of backtrace_each and use it for
all the other functions that called backtrace_each other than
rb_ec_partial_backtrace_object. All other cases requested the entire
backtrace, so there is no advantage to using the new algorithm for
those. Additionally, there are implicit assumptions in the other code
that the backtrace processing works inward instead of outward.

Remove the cfunc/iseq union in rb_backtrace_location_t, and remove the
prev_loc member for cfunc. Both cfunc and iseq types can now have iseq
and pc entries, so the location information can be accessed the same
way for each. This avoids the need for an extra backtrace location
entry to store an iseq backtrace location if the final entry in the
backtrace is a cfunc. This is also what fixes the segfault and
infinite loop issues in the above bugs.

Here's Ruby pseudocode for the new algorithm, where start and length
are the arguments to caller or caller_locations:

```ruby
end_cf = VM.end_control_frame.next
cf = VM.start_control_frame
size = VM.num_control_frames - 2
bt = []
cfunc_counter = 0

if length.nil? || length > size
  length = size
end

while cf != end_cf && bt.size != length
  if cf.iseq?
    if cf.instruction_pointer?
      if start > 0
        start -= 1
      else
        bt << cf.iseq_backtrace_entry
        # backpatch the cfunc entries directly before this iseq entry
        cfunc_counter.times do |i|
          bt[-2 - i].loc = cf.loc
        end
        cfunc_counter = 0
      end
    end
  elsif cf.cfunc?
    if start > 0
      start -= 1
    else
      bt << cf.cfunc_backtrace_entry
      cfunc_counter += 1
    end
  end

  cf = cf.prev
end

# if the backtrace ends with cfunc frames, keep scanning outward for the
# first iseq frame with a pc and use its location for the trailing entries
if cfunc_counter > 0
  while cf != end_cf
    if cf.iseq? && cf.instruction_pointer?
      cfunc_counter.times do |i|
        bt[-1 - i].loc = cf.loc
      end
      break
    end
    cf = cf.prev
  end
end
```

With the following benchmark, which uses a call depth of around 100
(common in many Ruby applications):

```ruby
require 'benchmark/ips'

class T
  def test(depth, &block)
    if depth == 0
      yield self
    else
      test(depth - 1, &block)
    end
  end
  def array
    Array.new
  end
  def first
    caller_locations(1, 1)
  end
  def full
    caller_locations
  end
end

t = T.new
t.test((ARGV.first || 100).to_i) do
  Benchmark.ips do |x|
    x.report('caller_loc(1, 1)') { t.first }
    x.report('caller_loc') { t.full }
    x.report('Array.new') { t.array }
    x.compare!
  end
end
```

Results before commit:

```
Calculating -------------------------------------
    caller_loc(1, 1)    281.159k (± 0.7%) i/s -      1.426M in   5.073055s
          caller_loc     15.836k (± 2.1%) i/s -     79.450k in   5.019426s
           Array.new      1.852M (± 2.5%) i/s -      9.296M in   5.022511s

Comparison:
           Array.new:  1852297.5 i/s
    caller_loc(1, 1):   281159.1 i/s - 6.59x  (± 0.00) slower
          caller_loc:    15835.9 i/s - 116.97x  (± 0.00) slower
```

Results after commit:

```
Calculating -------------------------------------
    caller_loc(1, 1)    562.286k (± 0.8%) i/s -      2.858M in   5.083249s
          caller_loc     16.402k (± 1.0%) i/s -     83.200k in   5.072963s
           Array.new      1.853M (± 0.1%) i/s -      9.278M in   5.007523s

Comparison:
           Array.new:  1852776.5 i/s
    caller_loc(1, 1):   562285.6 i/s - 3.30x  (± 0.00) slower
          caller_loc:    16402.3 i/s - 112.96x  (± 0.00) slower
```

This shows that the speed of caller_locations(1, 1) has roughly
doubled, and the speed of caller_locations with no arguments has
improved slightly. So this new algorithm is significantly faster, much
simpler, and fixes bugs in the previous algorithm.
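[Editorial aside, not part of the commit message: the start/length
semantics the pseudocode relies on can be sketched with a hypothetical
call chain (the method names below are invented): start counts frames
outward from the current one, and length caps how many entries are
returned.]

```ruby
def grandchild
  # start = 1 skips grandchild's own frame; length = 2 caps the result
  caller_locations(1, 2)
end

def child
  $child_line = __LINE__ + 1
  grandchild
end

def parent
  $parent_line = __LINE__ + 1
  child
end

locs = parent
p locs.size                      # => 2 (child and parent; <main> is cut off)
p locs[0].lineno == $child_line  # => true; innermost requested frame
p locs[1].lineno == $parent_line # => true; next frame outward
```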
Fixes [Bug #18053]
---
 test/ruby/test_backtrace.rb |  28 +++
 vm_backtrace.c              | 476 +++++++++++++++++---------------------
 2 files changed, 209 insertions(+), 295 deletions(-)

diff --git a/test/ruby/test_backtrace.rb b/test/ruby/test_backtrace.rb
index 742463a..aa79db2 100644
--- a/test/ruby/test_backtrace.rb
+++ b/test/ruby/test_backtrace.rb
@@ -170,6 +170,34 @@ class TestBacktrace < Test::Unit::TestCase https://github.com/ruby/ruby/blob/trunk/test/ruby/test_backtrace.rb#L170
     end
   end
 
+  def test_caller_limit_cfunc_iseq_no_pc
+    def self.a; [1].group_by { b } end
+    def self.b
+      [
+        caller_locations(2, 1).first.base_label,
+        caller_locations(3, 1).first.base_label
+      ]
+    end
+    assert_equal({["each", "group_by"]=>[1]}, a)
+  end
+
+  def test_caller_location_inspect_cfunc_iseq_no_pc
+    def self.foo
+      @res = caller_locations(2, 1).inspect
+    end
+    @line = __LINE__ + 1
+    1.times.map { 1.times.map { foo } }
+    assert_equal("[\"#{__FILE__}:#{@line}:in `times'\"]", @res)
+  end
+
+  def test_caller_location_path_cfunc_iseq_no_pc
+    def self.foo
+      @res = caller_locations(2, 1)[0].path
+    end
+    1.times.map { 1.times.map { foo } }
+    assert_equal(__FILE__, @res)
+  end
+
   def test_caller_locations
     cs = caller(0);
     locs = caller_locations(0).map{|loc|
       loc.to_s
diff --git a/vm_backtrace.c b/vm_backtrace.c
index ac620c6..fc9f701 100644
--- a/vm_backtrace.c
+++ b/vm_backtrace.c
@@ -119,16 +119,9 @@ typedef struct rb_backtrace_location_struct { https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L119
         LOCATION_TYPE_CFUNC,
     } type;
 
-    union {
-        struct {
-            const rb_iseq_t *iseq;
-            const VALUE *pc;
-        } iseq;
-        struct {
-            ID mid;
-            struct rb_backtrace_location_struct *prev_loc;
-        } cfunc;
-    } body;
+    const rb_iseq_t *iseq;
+    const VALUE *pc;
+    ID mid;
 } rb_backtrace_location_t;
 
 struct valued_frame_info {
@@ -148,9 +141,13 @@ location_mark_entry(rb_backtrace_location_t *fi) https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L141
 {
     switch (fi->type) {
       case LOCATION_TYPE_ISEQ:
-        rb_gc_mark_movable((VALUE)fi->body.iseq.iseq);
+        rb_gc_mark_movable((VALUE)fi->iseq);
         break;
       case LOCATION_TYPE_CFUNC:
+        if (fi->iseq) {
+            rb_gc_mark_movable((VALUE)fi->iseq);
+        }
+        break;
       default:
         break;
     }
@@ -188,10 +185,10 @@ location_lineno(rb_backtrace_location_t *loc) https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L185
 {
     switch (loc->type) {
       case LOCATION_TYPE_ISEQ:
-        return calc_lineno(loc->body.iseq.iseq, loc->body.iseq.pc);
+        return calc_lineno(loc->iseq, loc->pc);
       case LOCATION_TYPE_CFUNC:
-        if (loc->body.cfunc.prev_loc) {
-            return location_lineno(loc->body.cfunc.prev_loc);
+        if (loc->iseq && loc->pc) {
+            return calc_lineno(loc->iseq, loc->pc);
         }
         return 0;
       default:
@@ -219,9 +216,9 @@ location_label(rb_backtrace_location_t *loc) https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L216
 {
     switch (loc->type) {
       case LOCATION_TYPE_ISEQ:
-        return loc->body.iseq.iseq->body->location.label;
+        return loc->iseq->body->location.label;
       case LOCATION_TYPE_CFUNC:
-        return rb_id2str(loc->body.cfunc.mid);
+        return rb_id2str(loc->mid);
       default:
         rb_bug("location_label: unreachable");
         UNREACHABLE;
@@ -266,9 +263,9 @@ location_base_label(rb_backtrace_location_t *loc) https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L263
 {
     switch (loc->type) {
       case LOCATION_TYPE_ISEQ:
-        return loc->body.iseq.iseq->body->location.base_label;
+        return loc->iseq->body->location.base_label;
       case LOCATION_TYPE_CFUNC:
-        return rb_id2str(loc->body.cfunc.mid);
+        return rb_id2str(loc->mid);
       default:
         rb_bug("location_base_label: unreachable");
         UNREACHABLE;
@@ -291,10 +288,10 @@ location_path(rb_backtrace_location_t *loc) https://github.com/ruby/ruby/blob/trunk/vm_backtrace.c#L288
 {
     switch (loc->type) {
       case LOCATION_TYPE_ISEQ:
-        return rb_iseq_path(loc->body.iseq.iseq);
+        return rb_iseq_path(loc->iseq);
       case LOCATION_TYPE_CFUNC:
-        if (loc->body.cfun (... truncated)

-- 
ML: ruby-changes@q...
Info: http://www.atdot.net/~ko1/quickml/