ruby-changes:8400
From: jeg2 <ko1@a...>
Date: Sat, 25 Oct 2008 09:55:01 +0900 (JST)
Subject: [ruby-changes:8400] Ruby:r19931 (trunk): * lib/csv.rb: Fixed a bug in read_to_char() that would slurp
jeg2	2008-10-25 09:54:38 +0900 (Sat, 25 Oct 2008)

  New Revision: 19931

  http://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=rev&revision=19931

  Log:
    * lib/csv.rb: Fixed a bug in read_to_char() that would slurp
      whole files if the encoding was invalid. It will now read
      up to 10 bytes ahead to find a valid character boundary or
      give up. [ruby-core:19465]
    * test/csv/test_features.rb, test/csv/test_table.rb, test/csv/test_row.rb:
      Loosened some tests to check for a compatible? Encoding instea
      of an exact Encoding. [ruby-core:19470]

  Modified files:
    trunk/ChangeLog
    trunk/lib/csv.rb
    trunk/test/csv/test_features.rb
    trunk/test/csv/test_row.rb
    trunk/test/csv/test_table.rb

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 19930)
+++ ChangeLog	(revision 19931)
@@ -1,3 +1,13 @@
+Sat Oct 25 09:54:10 2008  James Edward Gray II  <jeg2@r...>
+
+	* lib/csv.rb: Fixed a bug in read_to_char() that would slurp
+	  whole files if the encoding was invalid. It will now read
+	  up to 10 bytes ahead to find a valid character boundary or
+	  give up. [ruby-core:19465]
+	* test/csv/test_features.rb, test/csv/test_table.rb, test/csv/test_row.rb:
+	  Loosened some tests to check for a compatible? Encoding instea
+	  of an exact Encoding. [ruby-core:19470]
+
 Sat Oct 25 07:42:49 2008  Eric Hodel  <drbrain@s...>
 
 	* lib/rdoc*: Update to RDoc 2.2.2 r192.
Index: lib/csv.rb
===================================================================
--- lib/csv.rb	(revision 19930)
+++ lib/csv.rb	(revision 19931)
@@ -199,7 +199,7 @@
 #
 class CSV
   # The version of the installed library.
-  VERSION = "2.4.3".freeze
+  VERSION = "2.4.4".freeze
 
   #
   # A CSV::Row is part Array and part Hash. It retains an order for the fields
@@ -1551,7 +1551,7 @@
     end
     @encoding ||= Encoding.default_internal || Encoding.default_external
     #
-    # prepare for build safe regular expressions in the target encoding,
+    # prepare for building safe regular expressions in the target encoding,
     # if we can transcode the needed characters
     #
     @re_esc = "\\".encode(@encoding) rescue ""
@@ -2251,10 +2251,11 @@
   end
 
   #
-  # Reads at least +bytes+ from <tt>@io</tt>, but will read on until the data
-  # read is valid in the ecoding of that data. This should ensure that it is
-  # safe to use regular expressions on the read data. The read data will be
-  # returned in <tt>@encoding</tt>.
+  # Reads at least +bytes+ from <tt>@io</tt>, but will read up 10 bytes ahead if
+  # needed to ensure the data read is valid in the ecoding of that data. This
+  # should ensure that it is safe to use regular expressions on the read data,
+  # unless it is actually a broken encoding. The read data will be returned in
+  # <tt>@encoding</tt>.
   #
   def read_to_char(bytes)
     return "" if @io.eof?
@@ -2264,10 +2265,12 @@
       raise unless encoded.valid_encoding?
       return encoded
     rescue # encoding error or my invalid data raise
-      if @io.eof?
+      if @io.eof? or data.size >= bytes + 10
         return data
       else
-        data += @io.read(1) until data.valid_encoding? or @io.eof?
+        data += @io.read(1) until data.valid_encoding? or
+                                  @io.eof? or
+                                  data.size >= bytes + 10
         retry
       end
     end
Index: test/csv/test_features.rb
===================================================================
--- test/csv/test_features.rb	(revision 19930)
+++ test/csv/test_features.rb	(revision 19931)
@@ -250,9 +250,11 @@
     end
   end
 
-  def test_inspect_is_ascii_8bit_encoded
+  def test_inspect_encoding_is_ascii_compatible
     CSV.new("one,two,three\n1,2,3\n".encode("UTF-16BE")) do |csv|
-      assert_equal("ASCII-8BIT", csv.inspect.encoding.name)
+      assert( Encoding.compatible?( Encoding.find("US-ASCII"),
+                                    csv.inspect.encoding ),
+              "inspect() was not ASCII compatible." )
     end
   end
 
Index: test/csv/test_row.rb
===================================================================
--- test/csv/test_row.rb	(revision 19930)
+++ test/csv/test_row.rb	(revision 19931)
@@ -296,8 +296,10 @@
     end
   end
 
-  def test_inspect_is_ascii_8bit_encoded
-    assert_equal("ASCII-8BIT", @row.inspect.encoding.name)
+  def test_inspect_encoding_is_ascii_compatible
+    assert( Encoding.compatible?( Encoding.find("US-ASCII"),
+                                  @row.inspect.encoding ),
+            "inspect() was not ASCII compatible." )
   end
 
   def test_inspect_shows_symbol_headers_as_bare_attributes
Index: test/csv/test_table.rb
===================================================================
--- test/csv/test_table.rb	(revision 19930)
+++ test/csv/test_table.rb	(revision 19931)
@@ -400,7 +400,9 @@
     assert(str.include?("mode:#{@table.mode}"), "Mode not shown.")
   end
 
-  def test_inspect_is_us_ascii_encoded
-    assert_equal("US-ASCII", @table.inspect.encoding.name)
+  def test_inspect_encoding_is_ascii_compatible
+    assert( Encoding.compatible?( Encoding.find("US-ASCII"),
+                                  @table.inspect.encoding ),
+            "inspect() was not ASCII compatible." )
   end
 end

--
ML: ruby-changes@q...
Info: http://www.atdot.net/~ko1/quickml/
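
For readers who want to try the bounded read-ahead idea from the log entry in isolation, here is a minimal standalone sketch. It is not the csv.rb code: read_to_boundary and its max_ahead parameter are hypothetical names chosen for illustration, it skips the transcoding step that read_to_char() performs, and it counts the cap in bytes (bytesize) where the patch compares data.size.

```ruby
require "stringio"

# Hypothetical helper (not part of csv.rb): read at least +bytes+ from +io+,
# then keep reading one byte at a time, up to +max_ahead+ extra bytes, until
# the buffer is valid in +encoding+. If no valid boundary turns up within the
# cap, give up and return the (still invalid) data instead of slurping the
# rest of the stream.
def read_to_boundary(io, bytes, encoding, max_ahead = 10)
  return "" if io.eof?

  data = io.read(bytes) # binary string; may end in the middle of a character
  until data.dup.force_encoding(encoding).valid_encoding? ||
        io.eof? ||
        data.bytesize >= bytes + max_ahead
    data << io.read(1)
  end
  data.force_encoding(encoding)
end

# A two-byte read would split the two-byte UTF-8 character U+00E9,
# so one extra byte is pulled in to reach a character boundary.
chunk = read_to_boundary(StringIO.new("h\u00E9llo"), 2, Encoding::UTF_8)

# Data that is never valid UTF-8: the read stops at bytes + max_ahead.
broken = read_to_boundary(StringIO.new(("\xFF" * 50).b), 4, Encoding::UTF_8)
```

In the first call the result should be a valid three-byte string; in the second the caller gets back 14 bytes of invalid data and has to decide what to do with them, which roughly matches the "or give up" behaviour the log describes.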