ruby-changes:53738
From: naruse <ko1@a...>
Date: Sat, 24 Nov 2018 20:53:25 +0900 (JST)
Subject: [ruby-changes:53738] naruse:r65954 (trunk): Don't use single byte optimization on grapheme clusters
naruse	2018-11-24 20:53:19 +0900 (Sat, 24 Nov 2018)

  New Revision: 65954

  https://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=revision&revision=65954

  Log:
    Don't use single byte optimization on grapheme clusters

    Unicode Text Segmentation considers CRLF as a character.
    [Bug #15337]

  Modified files:
    trunk/string.c
    trunk/test/ruby/test_string.rb
Index: test/ruby/test_string.rb
===================================================================
--- test/ruby/test_string.rb	(revision 65953)
+++ test/ruby/test_string.rb	(revision 65954)
@@ -973,6 +973,7 @@ CODE https://github.com/ruby/ruby/blob/trunk/test/ruby/test_string.rb#L973
 
   def test_each_grapheme_cluster
     [
+      "\u{0D 0A}",
       "\u{20 200d}",
       "\u{600 600}",
       "\u{600 20}",
Index: string.c
===================================================================
--- string.c	(revision 65953)
+++ string.c	(revision 65954)
@@ -8459,7 +8459,7 @@ rb_str_each_grapheme_cluster_size(VALUE https://github.com/ruby/ruby/blob/trunk/string.c#L8459
     rb_encoding *enc = rb_enc_from_index(ENCODING_GET(str));
     const char *ptr, *end;
 
-    if (!rb_enc_unicode_p(enc) || single_byte_optimizable(str)) {
+    if (!rb_enc_unicode_p(enc)) {
         return rb_str_length(str);
     }
 
@@ -8487,7 +8487,7 @@ rb_str_enumerate_grapheme_clusters(VALUE https://github.com/ruby/ruby/blob/trunk/string.c#L8487
     rb_encoding *enc = rb_enc_from_index(ENCODING_GET(str));
     const char *ptr, *end;
 
-    if (!rb_enc_unicode_p(enc) || single_byte_optimizable(str)) {
+    if (!rb_enc_unicode_p(enc)) {
         return rb_str_enumerate_chars(str, ary);
     }
 

--
ML: ruby-changes@q...
Info: http://www.atdot.net/~ko1/quickml/
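
The behavior this change targets can be sketched in Ruby as follows. This is a minimal illustration, assuming a Ruby build that includes this revision; the variable names are hypothetical and are not taken from the commit. An ASCII-only string is single-byte optimizable, so the removed shortcut fell back to counting characters, which splits CRLF in two.

```ruby
# Illustrative sketch only; variable names are assumptions, not from the commit.
crlf = "\u{0D 0A}"                      # same literal as the new test case: CR followed by LF

chars    = crlf.chars                   # per-character view: CR and LF are separate
clusters = crlf.grapheme_clusters       # per-grapheme-cluster view: CRLF stays together

# An ASCII-only string takes the same path, so the fix applies to it as well.
mixed          = "a\r\nb"
mixed_clusters = mixed.each_grapheme_cluster.to_a
mixed_count    = mixed.each_grapheme_cluster.size

p chars
p clusters
p mixed_clusters
p mixed_count
```

With the fix, `clusters` holds a single element while `chars` holds two, and the enumerator size agrees with the enumerated clusters.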