ruby-changes:6778

From: naruse <ko1@a...>
Date: Thu, 31 Jul 2008 20:01:33 +0900 (JST)
Subject: [ruby-changes:6778] Ruby:r18294 (trunk): * transcode.c (get_replacement_character): use U+FFFD as replacement

naruse	2008-07-31 19:59:39 +0900 (Thu, 31 Jul 2008)

  New Revision: 18294

  http://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=rev&revision=18294

  Log:
    * transcode.c (get_replacement_character): use U+FFFD as replacement
      character when converting to Unicode.
    
    * test/ruby/test_transcode.rb (test_unicode_public_review_issue_121):
      rename from test_public_review_issue_121.
    
    * test/ruby/test_transcode.rb (test_unicode_public_review_issue_121):
      enable option2.
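
    Editorial sketch (not part of the commit): the behaviour described in the
    log can be observed from Ruby roughly as follows. The input bytes and the
    expected results are taken from test_unicode_public_review_issue_121 in the
    diff; the constant PR121_INPUT and the helper pr121_hex are hypothetical
    names introduced only for this illustration, and the output assumes a Ruby
    build that includes this change.

```ruby
# Input from Unicode Public Review Issue #121: "a", then three ill-formed
# UTF-8 sequences (F1 80 80, E1 80, C2), then "b".
PR121_INPUT = "\x61\xF1\x80\x80\xE1\x80\xC2\x62"

# Hypothetical helper (not in the commit): transcode the input to +target+,
# replacing invalid sequences, and return the result as a hex string.
def pr121_hex(target)
  converted = PR121_INPUT.encode(target, "UTF-8", invalid: :replace)
  converted.bytes.map { |b| format("%02X", b) }.join(" ")
end

puts pr121_hex("UTF-16BE") # the test expects 00 61 FF FD FF FD FF FD 00 62
puts pr121_hex("UTF-16LE") # the test expects 61 00 FD FF FD FF FD FF 62 00
```

    Each ill-formed sequence yields one U+FFFD ("option 2" in PR-121) rather
    than a "?" encoded in the target encoding.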

  Modified files:
    trunk/ChangeLog
    trunk/test/ruby/test_transcode.rb
    trunk/transcode.c

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 18293)
+++ ChangeLog	(revision 18294)
@@ -1,3 +1,14 @@
+Thu Jul 31 19:54:57 2008  NARUSE, Yui  <naruse@r...>
+
+	* transcode.c (get_replacement_character): use U+FFFD as replacement
+	  character when convert to Unicode.
+
+	* test/ruby/test_transcode.rb (test_unicode_public_review_issue_121):
+	  rename from test_public_review_issue_121.
+
+	* test/ruby/test_transcode.rb (test_unicode_public_review_issue_121):
+	  enable option2.
+
 Thu Jul 31 17:00:10 2008  NARUSE, Yui  <naruse@r...>
 
 	* transcode.c (get_replacement_character): fix: invalid byte sequence
@@ -11,9 +22,9 @@
 Thu Jul 31 15:11:11 2008  Martin Duerst  <duerst@i...>
 
 	* test/ruby/test_transcode.rb: added test_shift_jis
-          (contributed by Yoshihiro Kambayashi) and
-          test_public_review_issue_121
-          (see http://www.unicode.org/review/pr-121.html)
+	  (contributed by Yoshihiro Kambayashi) and
+	  test_public_review_issue_121
+	  (see http://www.unicode.org/review/pr-121.html)
 
 Thu Jul 31 13:18:30 2008  Yusuke Endoh  <mame@t...>
 
Index: test/ruby/test_transcode.rb
===================================================================
--- test/ruby/test_transcode.rb	(revision 18293)
+++ test/ruby/test_transcode.rb	(revision 18294)
@@ -312,16 +312,13 @@
     # check_both_ways("\u9299", "\x1b$(Dd!\x1b(B", "iso-2022-jp-1") # JIS X 0212 8 1
   end
   
-  def test_public_review_issue_121 # see http://www.unicode.org/review/pr-121.html
+  def test_unicode_public_review_issue_121 # see http://www.unicode.org/review/pr-121.html
     # assert_equal("\x00\x61\x00?\x00\x62".force_encoding('UTF-16BE'),
     #   "\x61\xF1\x80\x80\xE1\x80\xC2\x62".encode('UTF-16BE', 'UTF-8', invalid: :replace)) # option 1
-    assert_equal("\x00\x61\x00?\x00?\x00?\x00\x62".force_encoding('UTF-16BE'),
+    assert_equal("\x00\x61\xFF\xFD\xFF\xFD\xFF\xFD\x00\x62".force_encoding('UTF-16BE'),
       "\x61\xF1\x80\x80\xE1\x80\xC2\x62".encode('UTF-16BE', 'UTF-8', invalid: :replace)) # option 2
-    # The next test doesn't work because of a bug in the implementation
-    # but we currently don't plan to fix that bug because we'll rewrite
-    # this stuff a bit anyway.
-    # assert_equal("\x61\x00?\x00?\x00?\x00\x62\x00".force_encoding('UTF-16LE'),
-    #  "\x61\xF1\x80\x80\xE1\x80\xC2\x62".encode('UTF-16LE', 'UTF-8', invalid: :replace)) # option 2
+    assert_equal("\x61\x00\xFD\xFF\xFD\xFF\xFD\xFF\x62\x00".force_encoding('UTF-16LE'),
+      "\x61\xF1\x80\x80\xE1\x80\xC2\x62".encode('UTF-16LE', 'UTF-8', invalid: :replace)) # option 2
     # assert_equal("\x00\x61\x00?\x00?\x00?\x00?\x00?\x00?\x00\x62".force_encoding('UTF-16BE'),
     # "\x61\xF1\x80\x80\xE1\x80\xC2\x62".encode('UTF-16BE', 'UTF-8', invalid: :replace)) # option 3
   end
Index: transcode.c
===================================================================
--- transcode.c	(revision 18293)
+++ transcode.c	(revision 18294)
@@ -137,16 +137,16 @@
 	return "?";
     }
     else if (utf16be_encoding == enc) {
-	return "\x00?";
+	return "\xFF\xFD";
     }
     else if (utf16le_encoding == enc) {
-	return "?\x00";
+	return "\xFD\xFF";
     }
     else if (utf32be_encoding == enc) {
-	return "\x00\x00\x00?";
+	return "\x00\x00\xFF\xFD";
     }
     else if (utf32le_encoding == enc) {
-	return "?\x00\x00\x00";
+	return "\xFD\xFF\x00\x00";
     }
     else {
 	return "?";
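
Editorial sketch (not part of the commit): the byte strings returned by get_replacement_character for the UTF-16 and UTF-32 encodings are simply U+FFFD serialized in each byte order. One way to cross-check the literals in the hunk above is to let Ruby encode the character; replacement_bytes is a hypothetical helper name used only here.

```ruby
# U+FFFD REPLACEMENT CHARACTER as a UTF-8 string.
REPLACEMENT_CHAR = "\uFFFD"

# Hypothetical helper (not in transcode.c): the bytes of U+FFFD in the
# encoding called +name+.
def replacement_bytes(name)
  REPLACEMENT_CHAR.encode(name).bytes
end

%w[UTF-16BE UTF-16LE UTF-32BE UTF-32LE].each do |name|
  hex = replacement_bytes(name).map { |b| format("\\x%02X", b) }.join
  puts "#{name}: #{hex}"
end
# UTF-16BE: \xFF\xFD
# UTF-16LE: \xFD\xFF
# UTF-32BE: \x00\x00\xFF\xFD
# UTF-32LE: \xFD\xFF\x00\x00
```

These match the four new return values in the diff; all other encodings still fall back to "?".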

--
ML: ruby-changes@q...
Info: http://www.atdot.net/~ko1/quickml/
