ruby-changes:13573
From: naruse <ko1@a...>
Date: Fri, 16 Oct 2009 03:19:32 +0900 (JST)
Subject: [ruby-changes:13573] Ruby:r25353 (trunk): * lib/csv.rb (CSV#read_to_char): set encoding and verify data
naruse	2009-10-16 03:19:15 +0900 (Fri, 16 Oct 2009)

  New Revision: 25353

  http://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=rev&revision=25353

  Log:
    * lib/csv.rb (CSV#read_to_char): set encoding and verify data
      which read from io before encode it to @encoding.

    * lib/csv.rb (CSV#raw_encoding): add to get @io's encoding.

    * lib/csv.rb (CSV#read_io): add to read string and set @io's
      encoding.

  Modified files:
    trunk/ChangeLog
    trunk/lib/csv.rb

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 25352)
+++ ChangeLog	(revision 25353)
@@ -1,3 +1,13 @@
+Fri Oct 16 03:15:52 2009  NARUSE, Yui  <naruse@r...>
+
+	* lib/csv.rb (CSV#read_to_char): set encoding and verify data
+	  which read from io before encode it to @encoding.
+
+	* lib/csv.rb (CSV#raw_encoding): add to get @io's encoding.
+
+	* lib/csv.rb (CSV#read_io): add to read string and set @io's
+	  encoding.
+
 Thu Oct 15 18:26:12 2009  Nobuyoshi Nakada  <nobu@r...>

 	* parse.y (rb_intern3): check symbol table overflow before generate
Index: lib/csv.rb
===================================================================
--- lib/csv.rb	(revision 25352)
+++ lib/csv.rb	(revision 25353)
@@ -1550,12 +1550,7 @@
     # create the IO object we will read from
     @io = if data.is_a? String then StringIO.new(data) else data end
     # honor the IO encoding if we can, otherwise default to ASCII-8BIT
-    @encoding = if @io.respond_to? :internal_encoding
-                  @io.internal_encoding || @io.external_encoding
-                elsif @io.is_a? StringIO
-                  @io.string.encoding
-                end
-    @encoding ||= Encoding.default_internal || Encoding.default_external
+    @encoding = raw_encoding || Encoding.default_internal || Encoding.default_external
     #
     # prepare for building safe regular expressions in the target encoding,
     # if we can transcode the needed characters
@@ -1989,7 +1984,6 @@
           sample = read_to_char(1024)
           sample += read_to_char(1) if sample[-1..-1] == encode_str("\r") and
                                        not @io.eof?
-
           # try to find a standard separator
           if sample =~ encode_re("\r\n?|\n")
             @row_sep = $&
@@ -2272,8 +2266,9 @@
   #
   def read_to_char(bytes)
     return "" if @io.eof?
-    data = @io.read(bytes)
+    data = read_io(bytes)
     begin
+      raise unless data.valid_encoding?
       encoded = encode_str(data)
       raise unless encoded.valid_encoding?
       return encoded
@@ -2281,11 +2276,26 @@
       if @io.eof? or data.size >= bytes + 10
         return data
       else
-        data += @io.read(1)
+        data += read_io(1)
         retry
       end
     end
   end
+
+  private
+  def raw_encoding
+    if @io.respond_to? :internal_encoding
+      @io.internal_encoding || @io.external_encoding
+    elsif @io.is_a? StringIO
+      @io.string.encoding
+    else
+      @io.encoding
+    end
+  end
+
+  def read_io(bytes)
+    @io.read(bytes).force_encoding(raw_encoding)
+  end
 end

 # Another name for CSV::instance().

--
ML: ruby-changes@q...
Info: http://www.atdot.net/~ko1/quickml/
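
Illustration (not part of the commit): the log above describes tagging freshly read bytes with the IO's encoding and checking them before transcoding. The standalone sketch below shows roughly that idea, assuming a byte-limited read can stop in the middle of a multibyte character. The method name read_valid_chunk and its parameters are hypothetical and do not exist in lib/csv.rb; the 10-byte slack simply mirrors the "bytes + 10" bound visible in the diff.

```ruby
require "stringio"

# Hypothetical helper, not taken from lib/csv.rb.
# Reads about `bytes` bytes from `io`, labels them as `raw_encoding`, and
# keeps pulling one more byte while the result is not valid in that encoding.
def read_valid_chunk(io, bytes, raw_encoding)
  return "" if io.eof?
  data = io.read(bytes) # a length-limited read returns raw bytes
  loop do
    # Label a copy with the source encoding without changing any bytes.
    candidate = data.dup.force_encoding(raw_encoding)
    # Stop when the bytes form complete characters, when there is nothing
    # left to read, or when the slack of 10 extra bytes is used up.
    if candidate.valid_encoding? || io.eof? || data.bytesize >= bytes + 10
      return candidate
    end
    data << io.read(1)
  end
end

# "a" is one byte and "\u00E9" is two bytes in UTF-8, so a 2-byte read
# ends in the middle of the second character.
io = StringIO.new("a\u00E9")
chunk = read_valid_chunk(io, 2, Encoding::UTF_8)
puts chunk.valid_encoding?
```

The sketch loops where the committed code uses raise and retry; the loop is only a simplification for illustration and makes no claim about which form the author prefers.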