
ruby-changes:25311

From: kou <ko1@a...>
Date: Sun, 28 Oct 2012 21:43:52 +0900 (JST)
Subject: [ruby-changes:25311] kou:r37363 (trunk): * lib/rexml/parsers/baseparser.rb: Fix a bug that UTF-8 is used

kou	2012-10-28 21:42:37 +0900 (Sun, 28 Oct 2012)

  New Revision: 37363

  http://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=rev&revision=37363

  Log:
    * lib/rexml/parsers/baseparser.rb: Fix a bug where UTF-8 is used
      for UTF-16BE/UTF-16LE encoded XML that doesn't have
      encoding="UTF-16" in its XML declaration.
    * test/rexml/test_document.rb: Add tests for the above change.
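
A minimal, standalone sketch of the behavior described in the log above. The helper names detect_bom_encoding and reported_encoding are hypothetical and are not part of REXML; the BOM check is an assumption about how the source encoding gets detected, while the regular expression mirrors the one added to baseparser.rb in this revision.

```ruby
# Hypothetical helpers for illustration only; not REXML API.

# Guess the UTF-16 variant from a leading byte order mark (assumed
# detection step). Returns nil when no UTF-16 BOM is present.
def detect_bom_encoding(bytes)
  head = bytes.byteslice(0, 2).b
  case head
  when "\xFE\xFF".b then "UTF-16BE"
  when "\xFF\xFE".b then "UTF-16LE"
  end
end

# Mirrors the added check: if the XML declaration names no encoding and
# the source was detected as UTF-16BE or UTF-16LE, report "UTF-16".
# Otherwise the declared value is passed through unchanged.
def reported_encoding(declared, source_encoding)
  if declared.nil? and /\AUTF-16(?:BE|LE)\z/i =~ source_encoding
    "UTF-16"
  else
    declared
  end
end

# Usage: a BOM-prefixed UTF-16LE document with no encoding="..." attribute.
bom = "\ufeff".encode("UTF-16LE").force_encoding("ASCII-8BIT")
xml = %(<?xml version="1.0"?>\n<message>Hello world!</message>\n)
xml = xml.encode("UTF-16LE").force_encoding("ASCII-8BIT")
puts reported_encoding(nil, detect_bom_encoding(bom + xml))  # prints "UTF-16"
```

The sketch only models the name that ends up in the :xmldecl event; it does not transcode or parse the document.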

  Modified files:
    trunk/ChangeLog
    trunk/lib/rexml/parsers/baseparser.rb
    trunk/test/rexml/test_document.rb

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 37362)
+++ ChangeLog	(revision 37363)
@@ -1,3 +1,10 @@
+Sun Oct 28 21:40:13 2012  Kouhei Sutou  <kou@c...>
+
+	* lib/rexml/parsers/baseparser.rb: Fix a bug that UTF-8 is used
+	  for UTF-16XX encoded XML that doesn't have encoding="UTF-16" in
+	  XML declaration.
+	* test/rexml/test_document.rb: Add tests for the above change.
+
 Sun Oct 28 21:37:34 2012  Kouhei Sutou  <kou@c...>
 
 	* test/rexml/test_document.rb: Group tests that they parse
Index: lib/rexml/parsers/baseparser.rb
===================================================================
--- lib/rexml/parsers/baseparser.rb	(revision 37362)
+++ lib/rexml/parsers/baseparser.rb	(revision 37363)
@@ -215,6 +215,9 @@
             if need_source_encoding_update?(encoding)
               @source.encoding = encoding
             end
+            if encoding.nil? and /\AUTF-16(?:BE|LE)\z/i =~ @source.encoding
+              encoding = "UTF-16"
+            end
             standalone = STANDALONE.match(results)
             standalone = standalone[1] unless standalone.nil?
             return [ :xmldecl, version, encoding, standalone ]
Index: test/rexml/test_document.rb
===================================================================
--- test/rexml/test_document.rb	(revision 37362)
+++ test/rexml/test_document.rb	(revision 37363)
@@ -246,5 +246,27 @@
         assert_equal("UTF-16", document.encoding)
       end
     end
+
+    class NoEncodingTest < self
+      def test_utf_16le
+        xml = <<-EOX.encode("UTF-16LE").force_encoding("ASCII-8BIT")
+<?xml version="1.0"?>
+<message>Hello world!</message>
+EOX
+        bom = "\ufeff".encode("UTF-16LE").force_encoding("ASCII-8BIT")
+        document = REXML::Document.new(bom + xml)
+        assert_equal("UTF-16", document.encoding)
+      end
+
+      def test_utf_16be
+        xml = <<-EOX.encode("UTF-16BE").force_encoding("ASCII-8BIT")
+<?xml version="1.0"?>
+<message>Hello world!</message>
+EOX
+        bom = "\ufeff".encode("UTF-16BE").force_encoding("ASCII-8BIT")
+        document = REXML::Document.new(bom + xml)
+        assert_equal("UTF-16", document.encoding)
+      end
+    end
   end
 end

--
ML: ruby-changes@q...
Info: http://www.atdot.net/~ko1/quickml/
