[前][次][番号順一覧][スレッド一覧]

ruby-changes:65914

From: Jeremy <ko1@a...>
Date: Thu, 22 Apr 2021 02:49:43 +0900 (JST)
Subject: [ruby-changes:65914] 50c54d40a8 (master): Evaluate multiple assignment left hand side before right hand side

https://git.ruby-lang.org/ruby.git/commit/?id=50c54d40a8

From 50c54d40a81bb2a4794a6be5f1861152900b4fed Mon Sep 17 00:00:00 2001
From: Jeremy Evans <code@j...>
Date: Wed, 21 Apr 2021 10:49:19 -0700
Subject: Evaluate multiple assignment left hand side before right hand side

In regular assignment, Ruby evaluates the left hand side before
the right hand side.  For example:

```ruby
foo[0] = bar
```

Calls `foo`, then `bar`, then `[]=` on the result of `foo`.

Previously, multiple assignment didn't work this way.  If you did:

```ruby
abc.def, foo[0] = bar, baz
```

Ruby would previously call `bar`, then `baz`, then `abc`, then
`def=` on the result of `abc`, then `foo`, then `[]=` on the
result of `foo`.

This change makes multiple assignment similar to single assignment,
changing the evaluation order of the above multiple assignment code
to calling `abc`, then `foo`, then `bar`, then `baz`, then `def=` on
the result of `abc`, then `[]=` on the result of `foo`.

Implementing this is challenging with the stack-based virtual machine.
We need to keep track of all of the left hand side attribute setter
receivers and setter arguments, and then keep track of the stack level
while handling the assignment processing, so we can issue the
appropriate topn instructions to get the receiver.  Here's an example
of how the multiple assignment is executed, showing the stack and
instructions:

```
self                                      # putself
abc                                       # send
abc, self                                 # putself
abc, foo                                  # send
abc, foo, 0                               # putobject 0
abc, foo, 0, [bar, baz]                   # evaluate RHS
abc, foo, 0, [bar, baz], baz, bar         # expandarray
abc, foo, 0, [bar, baz], baz, bar, abc    # topn 5
abc, foo, 0, [bar, baz], baz, abc, bar    # swap
abc, foo, 0, [bar, baz], baz, def=        # send
abc, foo, 0, [bar, baz], baz              # pop
abc, foo, 0, [bar, baz], baz, foo         # topn 3
abc, foo, 0, [bar, baz], baz, foo, 0      # topn 3
abc, foo, 0, [bar, baz], baz, foo, 0, baz # topn 2
abc, foo, 0, [bar, baz], baz, []=         # send
abc, foo, 0, [bar, baz], baz              # pop
abc, foo, 0, [bar, baz]                   # pop
[bar, baz], foo, 0, [bar, baz]            # setn 3
[bar, baz], foo, 0                        # pop
[bar, baz], foo                           # pop
[bar, baz]                                # pop
```

As multiple assignment must deal with splats, post args, and any level
of nesting, it gets quite a bit more complex than this in non-trivial
cases. To handle this, struct masgn_state is added to keep
track of the overall state of the mass assignment, which stores a linked
list of struct masgn_attrasgn, one for each assigned attribute.

This adds a new optimization that replaces a topn 1/pop instruction
combination with a single swap instruction for multiple assignment
to non-aref attributes.

This new approach isn't compatible with one of the optimizations
previously used, in the case where the multiple assignment return value
was not needed, there was no lhs splat, and one of the left hand side
used an attribute setter.  This removes that optimization. Removing
the optimization allowed for removing the POP_ELEMENT and adjust_stack
functions.

This adds a benchmark to measure how much slower multiple
assignment is with the correct evaluation order.

This benchmark shows:

* 4-9% decrease for attribute sets
* 14-23% decrease for array member sets
* Basically same speed for local variable sets

Importantly, it shows no significant difference between the popped
(where return value of the multiple assignment is not needed) and
!popped (where return value of the multiple assignment is needed)
cases for attribute and array member sets.  This indicates the
previous optimization, which was dropped in the evaluation
order fix and only affected the popped case, is not important to
performance.

Fixes [Bug #4443]
---
 NEWS.md                      |  46 ++++++
 benchmark/masgn.yml          |  29 ++++
 compile.c                    | 363 ++++++++++++++++++++++++++++++-------------
 test/ruby/test_assignment.rb |  58 +++++++
 4 files changed, 391 insertions(+), 105 deletions(-)
 create mode 100644 benchmark/masgn.yml

diff --git a/NEWS.md b/NEWS.md
index 8670a0e..a41cf0b 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -6,6 +6,7 @@ since the **3.0.0** release, except for bug fixes. https://github.com/ruby/ruby/blob/trunk/NEWS.md#L6
 Note that each entry is kept to a minimum, see links for details.
 
 ## Language changes
+
 * Pin operator now takes an expression. [[Feature #17411]]
 
     ```ruby
@@ -13,6 +14,50 @@ Note that each entry is kept to a minimum, see links for details. https://github.com/ruby/ruby/blob/trunk/NEWS.md#L14
     #=> [[3, 5], [5, 7], [11, 13]]
     ```
 
+* Multiple assignment evaluation order has been made consistent with
+  single assignment evaluation order.  With single assignment, Ruby
+  uses a left-to-right evaluation order.  With this code:
+
+    ```ruby
+    foo[0] = bar
+    ```
+
+  The following evaluation order is used:
+
+  1. `foo`
+  2. `bar`
+  3. `[]=` called on the result of `foo`
+
+  In Ruby before 3.1.0, multiple assignment did not follow this
+  evaluation order.  With this code:
+
+    ```ruby
+    foo[0], bar.baz = a, b
+    ```
+
+  Versions of Ruby before 3.1.0 would evaluate in the following
+  order
+
+  1. `a`
+  2. `b`
+  3. `foo`
+  4. `[]=` called on the result of `foo`
+  5. `bar`
+  6. `baz=` called on the result of `bar`
+
+  Starting in Ruby 3.1.0, evaluation order is now consistent with
+  single assignment, with the left hand side being evaluated before
+  the right hand side:
+
+  1. `foo`
+  2. `bar`
+  3. `a`
+  4. `b`
+  5. `[]=` called on the result of `foo`
+  6. `baz=` called on the result of `bar`
+
+  [[Bug #4443]]
+
 ## Command line options
 
 ## Core classes updates
@@ -96,6 +141,7 @@ Excluding feature bug fixes. https://github.com/ruby/ruby/blob/trunk/NEWS.md#L141
 ## Miscellaneous changes
 
 
+[Bug #4443]: https://bugs.ruby-lang.org/issues/4443
 [Feature #12194]: https://bugs.ruby-lang.org/issues/12194
 [Feature #14256]: https://bugs.ruby-lang.org/issues/14256
 [Feature #15198]: https://bugs.ruby-lang.org/issues/15198
diff --git a/benchmark/masgn.yml b/benchmark/masgn.yml
new file mode 100644
index 0000000..4be9333
--- /dev/null
+++ b/benchmark/masgn.yml
@@ -0,0 +1,29 @@ https://github.com/ruby/ruby/blob/trunk/benchmark/masgn.yml#L1
+prelude: |
+  a = [nil] * 3
+  b = Class.new{attr_writer :a, :b, :c}.new
+  c, d, e, f = nil, nil, nil, nil
+benchmark:
+  array2_2: "c = (a[0], a[1] = 1, 2)"
+  array2_3: "c = (a[0], a[1] = 1, 2, 3)"
+  array3_2: "c = (a[0], a[1], a[2] = 1, 2)"
+  array3_3: "c = (a[0], a[1], a[2] = 1, 2, 3)"
+  attr2_2: "c = (b.a, b.b = 1, 2)"
+  attr2_3: "c = (b.a, b.b = 1, 2, 3)"
+  attr3_2: "c = (b.a, b.b, b.c = 1, 2)"
+  attr3_3: "c = (b.a, b.b, b.c = 1, 2, 3)"
+  lvar2_2: "c = (d, e = 1, 2)"
+  lvar2_3: "c = (d, e = 1, 2, 3)"
+  lvar3_2: "c = (d, e, f = 1, 2)"
+  lvar3_3: "c = (d, e, f = 1, 2, 3)"
+  array2_2p: "(a[0], a[1] = 1, 2; nil)"
+  array2_3p: "(a[0], a[1] = 1, 2, 3; nil)"
+  array3_2p: "(a[0], a[1], a[2] = 1, 2; nil)"
+  array3_3p: "(a[0], a[1], a[2] = 1, 2, 3; nil)"
+  attr2_2p: "(b.a, b.b = 1, 2; nil)"
+  attr2_3p: "(b.a, b.b = 1, 2, 3; nil)"
+  attr3_2p: "(b.a, b.b, b.c = 1, 2; nil)"
+  attr3_3p: "(b.a, b.b, b.c = 1, 2, 3; nil)"
+  lvar2_2p: "(d, e = 1, 2; nil)"
+  lvar2_3p: "(d, e = 1, 2, 3; nil)"
+  lvar3_2p: "(d, e, f = 1, 2; nil)"
+  lvar3_3p: "(d, e, f = 1, 2, 3; nil)"
diff --git a/compile.c b/compile.c
index 7e1724c..e6dfc76 100644
--- a/compile.c
+++ b/compile.c
@@ -1099,19 +1099,6 @@ LAST_ELEMENT(LINK_ANCHOR *const anchor) https://github.com/ruby/ruby/blob/trunk/compile.c#L1099
 }
 
 static LINK_ELEMENT *
-POP_ELEMENT(ISEQ_ARG_DECLARE LINK_ANCHOR *const anchor)
-{
-    LINK_ELEMENT *elem = anchor->last;
-    anchor->last = anchor->last->prev;
-    anchor->last->next = 0;
-    verify_list("pop", anchor);
-    return elem;
-}
-#if CPDEBUG < 0
-#define POP_ELEMENT(anchor) POP_ELEMENT(iseq, (anchor))
-#endif
-
-static LINK_ELEMENT *
 ELEM_FIRST_INSN(LINK_ELEMENT *elem)
 {
     while (elem) {
@@ -4596,27 +4583,163 @@ when_splat_vals(rb_iseq_t *iseq, LINK_ANCHOR *const cond_seq, const NODE *vals, https://github.com/ruby/ruby/blob/trunk/compile.c#L4583
     return COMPILE_OK;
 }
 
+/* Multiple Assignment Handling
+ *
+ * In order to handle evaluation of multiple assignment such that the left hand side
+ * is evaluated before the right hand side, we need to process the left hand side
+ * and see if there are any attributes that need to be assigned.  If so, we add
+ * instructions to evaluate the receiver of any assigned attributes before we
+ * process the right hand side.
+ *
+ * For a multiple assignment such as:
+ *
+ *   l1.m1, l2[0] = r3, r4
+ *
+ * We start off evaluating l1 and l2, then we evaluate r3 and r4, then we
+ * assign the result of r3 to l1.m1, and then the result of r4 to l2.m2.
+ * On the VM stack, this looks like:
+ *
+ *     self                               # putself
+ *     l1                                 # send
+ *     l1, self                           # putself
+ *     l1, l2                             # send
+ *     l1, l2, 0                          # putobject 0
+ *     l1, l2, 0, [r3, r4]                # after evaluation of RHS
+ *     l1, l2, 0, [r3, r4], r4, r3        # expandarray
+ *     l1, l2, 0, [r3, r4], r4, r3, l1    # topn 5
+ *     l1, l2, 0, [r3, r4], r4, l1, r3    # swap
+ *     l1, l2, 0, [r3, r4], r4, m1=       # send
+ *     l1, l2, 0, [r3, r4], r4            # pop
+ *     l1, l2, 0, [r3, r4], r4, l2        # topn 3
+ *     l1, l2, 0, [r3, r4], r4, l2, 0     # topn 3
+ *     l1, l2, 0, [r3, r4], r4, l2, 0, r4 # topn 2
+ *     l1, l2, 0, [r3, r4], r4, []=       # send
+ *     l1, l2, 0, [r3, r4], r4            # pop
+ *     l1, l2, 0, [r3, r4]                # pop
+ *     [r3, r4], l2, 0, [r3, r4]          # (... truncated)

--
ML: ruby-changes@q...
Info: http://www.atdot.net/~ko1/quickml/

[前][次][番号順一覧][スレッド一覧]