ruby-changes:13573

From: naruse <ko1@a...>
Date: Fri, 16 Oct 2009 03:19:32 +0900 (JST)
Subject: [ruby-changes:13573] Ruby:r25353 (trunk): * lib/csv.rb (CSV#read_to_char): set encoding and verify data

naruse	2009-10-16 03:19:15 +0900 (Fri, 16 Oct 2009)

  New Revision: 25353

  http://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=rev&revision=25353

  Log:
    * lib/csv.rb (CSV#read_to_char): set the encoding of data read
      from io and verify it before encoding it to @encoding.

    * lib/csv.rb (CSV#raw_encoding): added to get @io's encoding.

    * lib/csv.rb (CSV#read_io): added to read a string from @io and
      tag it with @io's encoding.

  Modified files:
    trunk/ChangeLog
    trunk/lib/csv.rb

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 25352)
+++ ChangeLog	(revision 25353)
@@ -1,3 +1,13 @@
+Fri Oct 16 03:15:52 2009  NARUSE, Yui  <naruse@r...>
+
+	* lib/csv.rb (CSV#read_to_char): set encoding and verify data
+	  which read from io before encode it to @encoding.
+
+	* lib/csv.rb (CSV#raw_encoding): add to get @io's encoding.
+
+	* lib/csv.rb (CSV#read_io): add to read string and set @io's
+	  encoding.
+
 Thu Oct 15 18:26:12 2009  Nobuyoshi Nakada  <nobu@r...>
 
 	* parse.y (rb_intern3): check symbol table overflow before generate
Index: lib/csv.rb
===================================================================
--- lib/csv.rb	(revision 25352)
+++ lib/csv.rb	(revision 25353)
@@ -1550,12 +1550,7 @@
     # create the IO object we will read from
     @io       =   if data.is_a? String then StringIO.new(data) else data end
     # honor the IO encoding if we can, otherwise default to ASCII-8BIT
-    @encoding =   if @io.respond_to? :internal_encoding
-                    @io.internal_encoding || @io.external_encoding
-                  elsif @io.is_a? StringIO
-                    @io.string.encoding
-                  end
-    @encoding ||= Encoding.default_internal || Encoding.default_external
+    @encoding = raw_encoding || Encoding.default_internal || Encoding.default_external
     #
     # prepare for building safe regular expressions in the target encoding,
     # if we can transcode the needed characters
@@ -1989,7 +1984,6 @@
             sample =  read_to_char(1024)
             sample += read_to_char(1) if sample[-1..-1] == encode_str("\r") and
                                          not @io.eof?
-
             # try to find a standard separator
             if sample =~ encode_re("\r\n?|\n")
               @row_sep = $&
@@ -2272,8 +2266,9 @@
   #
   def read_to_char(bytes)
     return "" if @io.eof?
-    data = @io.read(bytes)
+    data = read_io(bytes)
     begin
+      raise unless data.valid_encoding?
       encoded = encode_str(data)
       raise unless encoded.valid_encoding?
       return encoded
@@ -2281,11 +2276,26 @@
       if @io.eof? or data.size >= bytes + 10
         return data
       else
-        data += @io.read(1)
+        data += read_io(1)
         retry
       end
     end
   end
+
+  private
+  def raw_encoding
+    if @io.respond_to? :internal_encoding
+      @io.internal_encoding || @io.external_encoding
+    elsif @io.is_a? StringIO
+      @io.string.encoding
+    else
+      @io.encoding
+    end
+  end
+
+  def read_io(bytes)
+    @io.read(bytes).force_encoding(raw_encoding)
+  end
 end
 
 # Another name for CSV::instance().
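
The patch above makes read_to_char tag each chunk with the IO's encoding and check it before transcoding, reading one more byte at a time when a chunk ends partway through a multibyte character. A minimal standalone sketch of that idea follows. The names io_encoding and read_whole_chars are hypothetical helpers written for this illustration and are not part of csv.rb; the UTF-8 default target is also an assumption, and the single loop only approximates the begin/rescue/retry structure of the real method.

```ruby
require "stringio"

# Hypothetical helper: same lookup order as raw_encoding in the patch.
def io_encoding(io)
  if io.respond_to?(:internal_encoding)
    io.internal_encoding || io.external_encoding
  elsif io.is_a?(StringIO)
    io.string.encoding
  else
    io.encoding
  end
end

# Hypothetical helper: read up to `bytes` bytes, tag them with the IO's
# encoding, and pull in one extra byte at a time while the chunk is not
# valid in that encoding. As in the patch, give up at EOF or once 10
# extra bytes have been read.
def read_whole_chars(io, bytes, target = Encoding::UTF_8)
  return "" if io.eof?
  enc  = io_encoding(io)
  data = io.read(bytes).force_encoding(enc)
  until data.valid_encoding? || io.eof? || data.bytesize >= bytes + 10
    data += io.read(1).force_encoding(enc)
  end
  # Transcode only when the bytes are valid; otherwise return them as read.
  data.valid_encoding? ? data.encode(target) : data
end

# "a" is 1 byte and "\u00E9" is 2 bytes in UTF-8, so asking for 2 bytes
# would cut the second character in half without the extra read.
io    = StringIO.new("a\u00E9,1\n")
chunk = read_whole_chars(io, 2)
puts chunk.bytesize          # 3: one extra byte was read
puts chunk.valid_encoding?   # true
```

Without the force_encoding step, IO#read with a length argument returns a binary (ASCII-8BIT) string, so a validity check on it says nothing about whether the bytes form complete characters in the IO's own encoding.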

--
ML: ruby-changes@q...
Info: http://www.atdot.net/~ko1/quickml/
