ruby-changes:8400

From: jeg2 <ko1@a...>
Date: Sat, 25 Oct 2008 09:55:01 +0900 (JST)
Subject: [ruby-changes:8400] Ruby:r19931 (trunk): * lib/csv.rb: Fixed a bug in read_to_char() that would slurp

jeg2	2008-10-25 09:54:38 +0900 (Sat, 25 Oct 2008)

  New Revision: 19931

  http://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=rev&revision=19931

  Log:
    * lib/csv.rb:  Fixed a bug in read_to_char() that would slurp
      whole files if the encoding was invalid.  It will now read
      up to 10 bytes ahead to find a valid character boundary or
      give up.  [ruby-core:19465]
    * test/csv/test_features.rb, test/csv/test_table.rb, test/csv/test_row.rb:
      Loosened some tests to check for a compatible? Encoding instead
      of an exact Encoding.  [ruby-core:19470]
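
  The bounded read-ahead described in the first log entry can be sketched
  roughly as follows. This is a simplified, stand-alone illustration and not
  the code from lib/csv.rb: the names read_to_char_sketch and LOOKAHEAD_LIMIT
  are hypothetical, the IO and encoding are passed in as arguments instead of
  being read from instance variables, and the transcoding step to @encoding
  is left out.

```ruby
require "stringio"

# Hypothetical constant mirroring the "10 bytes ahead" from the log message.
LOOKAHEAD_LIMIT = 10

# Hypothetical helper: read at least +bytes+ from +io+, then keep reading one
# byte at a time until the data is valid in +encoding+, the IO is exhausted,
# or LOOKAHEAD_LIMIT extra bytes have been consumed.
def read_to_char_sketch(io, bytes, encoding = Encoding::UTF_8)
  return "" if io.eof?

  data = io.read(bytes).b  # keep the buffer binary while appending
  loop do
    candidate = data.dup.force_encoding(encoding)
    return candidate if candidate.valid_encoding?
    # Give up instead of slurping the rest of the input.
    return candidate if io.eof? || data.bytesize >= bytes + LOOKAHEAD_LIMIT
    data << io.read(1).b
  end
end

# A read that stops in the middle of a two-byte UTF-8 character is extended
# by one byte to reach the character boundary.
split_io = StringIO.new("h\u00E9llo".b)
puts read_to_char_sketch(split_io, 2).bytesize

# Data that never becomes valid is returned after the bounded read-ahead.
broken_io = StringIO.new("\xFF".b * 100)
puts read_to_char_sketch(broken_io, 4).bytesize
```

  The point of the limit is that a stream which never becomes valid costs at
  most a fixed number of extra one-byte reads per call, and the caller still
  gets the (invalid) data back to handle as it sees fit.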

  Modified files:
    trunk/ChangeLog
    trunk/lib/csv.rb
    trunk/test/csv/test_features.rb
    trunk/test/csv/test_row.rb
    trunk/test/csv/test_table.rb

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 19930)
+++ ChangeLog	(revision 19931)
@@ -1,3 +1,13 @@
+Sat Oct 25 09:54:10 2008  James Edward Gray II  <jeg2@r...>
+
+  * lib/csv.rb:  Fixed a bug in read_to_char() that would slurp
+    whole files if the encoding was invalid.  It will now read
+    up to 10 bytes ahead to find a valid character boundary or
+    give up.  [ruby-core:19465]
+  * test/csv/test_features.rb, test/csv/test_table.rb, test/csv/test_row.rb:
+    Loosened some tests to check for a compatible? Encoding instead
+    of an exact Encoding.  [ruby-core:19470]
+
 Sat Oct 25 07:42:49 2008  Eric Hodel  <drbrain@s...>
 
 	* lib/rdoc*: Update to RDoc 2.2.2 r192.
Index: lib/csv.rb
===================================================================
--- lib/csv.rb	(revision 19930)
+++ lib/csv.rb	(revision 19931)
@@ -199,7 +199,7 @@
 # 
 class CSV
   # The version of the installed library.
-  VERSION = "2.4.3".freeze
+  VERSION = "2.4.4".freeze
   
   # 
   # A CSV::Row is part Array and part Hash.  It retains an order for the fields
@@ -1551,7 +1551,7 @@
                   end
     @encoding ||= Encoding.default_internal || Encoding.default_external
     # 
-    # prepare for build safe regular expressions in the target encoding,
+    # prepare for building safe regular expressions in the target encoding,
     # if we can transcode the needed characters
     # 
     @re_esc   =   "\\".encode(@encoding) rescue ""
@@ -2251,10 +2251,11 @@
   end
 
   # 
-  # Reads at least +bytes+ from <tt>@io</tt>, but will read on until the data
-  # read is valid in the ecoding of that data.  This should ensure that it is
-  # safe to use regular expressions on the read data.  The read data will be
-  # returned in <tt>@encoding</tt>.
+  # Reads at least +bytes+ from <tt>@io</tt>, but will read up to 10 bytes ahead
+  # if needed to ensure the data read is valid in the encoding of that data.  This
+  # should ensure that it is safe to use regular expressions on the read data,
+  # unless it is actually a broken encoding.  The read data will be returned in
+  # <tt>@encoding</tt>.
   # 
   def read_to_char(bytes)
     return "" if @io.eof?
@@ -2264,10 +2265,12 @@
       raise unless encoded.valid_encoding?
       return encoded
     rescue  # encoding error or my invalid data raise
-      if @io.eof?
+      if @io.eof? or data.size >= bytes + 10
         return data
       else
-        data += @io.read(1) until data.valid_encoding? or @io.eof?
+        data += @io.read(1) until data.valid_encoding? or
+                                  @io.eof?             or
+                                  data.size >= bytes + 10
         retry
       end
     end
Index: test/csv/test_features.rb
===================================================================
--- test/csv/test_features.rb	(revision 19930)
+++ test/csv/test_features.rb	(revision 19931)
@@ -250,9 +250,11 @@
     end
   end
   
-  def test_inspect_is_ascii_8bit_encoded
+  def test_inspect_encoding_is_ascii_compatible
     CSV.new("one,two,three\n1,2,3\n".encode("UTF-16BE")) do |csv|
-      assert_equal("ASCII-8BIT", csv.inspect.encoding.name)
+      assert( Encoding.compatible?( Encoding.find("US-ASCII"),
+                                    csv.inspect.encoding ),
+              "inspect() was not ASCII compatible." )
     end
   end
   
Index: test/csv/test_row.rb
===================================================================
--- test/csv/test_row.rb	(revision 19930)
+++ test/csv/test_row.rb	(revision 19931)
@@ -296,8 +296,10 @@
     end
   end
   
-  def test_inspect_is_ascii_8bit_encoded
-    assert_equal("ASCII-8BIT", @row.inspect.encoding.name)
+  def test_inspect_encoding_is_ascii_compatible
+    assert( Encoding.compatible?( Encoding.find("US-ASCII"),
+                                  @row.inspect.encoding ),
+            "inspect() was not ASCII compatible." )
   end
   
   def test_inspect_shows_symbol_headers_as_bare_attributes
Index: test/csv/test_table.rb
===================================================================
--- test/csv/test_table.rb	(revision 19930)
+++ test/csv/test_table.rb	(revision 19931)
@@ -400,7 +400,9 @@
     assert(str.include?("mode:#{@table.mode}"), "Mode not shown.")
   end
   
-  def test_inspect_is_us_ascii_encoded
-    assert_equal("US-ASCII", @table.inspect.encoding.name)
+  def test_inspect_encoding_is_ascii_compatible
+    assert( Encoding.compatible?( Encoding.find("US-ASCII"),
+                                  @table.inspect.encoding ),
+            "inspect() was not ASCII compatible." )
   end
 end
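
The loosened assertions in the three test files all rely on
Encoding.compatible? returning nil for incompatible encodings and an Encoding
otherwise. A minimal sketch of that check outside the test suite, assuming a
hypothetical helper name (ascii_compatible_encoding?) and sample strings that
are not part of the commit:

```ruby
# Hypothetical helper: true when the string's encoding is compatible with
# US-ASCII, which is the condition the rewritten tests assert for inspect().
def ascii_compatible_encoding?(string)
  !Encoding.compatible?(Encoding.find("US-ASCII"), string.encoding).nil?
end

# Illustrative sample strings in several encodings.
samples = [
  "one,two".encode("US-ASCII"),
  "one,two".b,                   # ASCII-8BIT
  "one,two".encode("UTF-8"),
  "one,two".encode("UTF-16BE")   # not ASCII compatible
]

samples.each do |string|
  puts "#{string.encoding.name}: #{ascii_compatible_encoding?(string)}"
end
```

Checking compatibility instead of an exact name lets inspect() return either
US-ASCII or ASCII-8BIT (or another ASCII-compatible encoding) without
breaking the tests.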

