ruby-changes:50779

usa	2018-03-28 18:13:59 +0900 (Wed, 28 Mar 2018)

  New Revision: 62970

  https://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=revision&revision=62970

  Log:
    merge revision(s) 62960-62965:
    
    webrick: use IO.copy_stream for multipart response
    
    Use the new Proc response body feature to generate a multipart
    range response dynamically.  We use a flat array to minimize
    object overhead as much as possible, since many ranges may fit
    into an HTTP request header.
    
    * lib/webrick/httpservlet/filehandler.rb (multipart_body): new method
      (make_partial_content): use multipart_body
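    
    The filehandler.rb change itself is not part of the diff below,
    so the following is only a rough sketch of the underlying idea:
    copying several byte ranges of a file with IO.copy_stream instead
    of reading each range into a string.  The helper name copy_ranges
    and the [first, last] pair layout are assumptions made for
    illustration; multipart boundaries and part headers are omitted.

```ruby
require "tempfile"

# Hypothetical helper (not from the commit): copy each inclusive
# [first, last] byte range of the file at +path+ to +out+.
def copy_ranges(path, ranges, out)
  File.open(path, "rb") do |f|
    ranges.each do |first, last|
      # copy_length and src_offset are the 3rd and 4th arguments
      IO.copy_stream(f, out, last - first + 1, first)
    end
  end
  out
end
```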
    ------------------------------------------------------------------------
    r62960 | normal | 2018-03-28 17:06:23 +0900 (Wed, 28 Mar 2018) | 13 lines
    
    webrick/httprequest: limit request headers size
    
    We use the same 112 KB limit started (AFAIK) by Mongrel, Thin,
    and Puma to prevent malicious users from using up all the memory
    with a single request.  This also limits the damage done by
    excessive ranges in multipart Range: requests.
    
    Due to the way we rely on IO#gets and the desire to keep
    the code simple, the actual maximum header may be 4093 bytes
    larger than 112 KB, but we're splitting hairs at that point.
    
    * lib/webrick/httprequest.rb: define MAX_HEADER_LENGTH
      (read_header): raise when headers exceed max length
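    
    As a rough illustration of the accounting described above, the
    sketch below counts header bytes line by line and raises once the
    total passes the limit.  HeadersTooLarge and read_limited_header
    are hypothetical names standing in for
    HTTPStatus::RequestEntityTooLarge and read_header; unlike the real
    code, the request line is not included in the count.

```ruby
require "stringio"

MAX_HEADER_LENGTH = 112 * 1024 # same value as the constant in the diff

# Hypothetical stand-in for WEBrick::HTTPStatus::RequestEntityTooLarge
class HeadersTooLarge < StandardError; end

# Read header lines up to the blank separator line, summing their
# sizes.  Each gets call returns at most 4096 bytes, so the limit is
# only checked after a line (or line fragment) has been read.
def read_limited_header(io, limit = MAX_HEADER_LENGTH)
  total = 0
  raw = []
  while (line = io.gets("\n", 4096))
    break if line == "\r\n" || line == "\n"
    total += line.bytesize
    raise HeadersTooLarge, "headers too large" if total > limit
    raw << line
  end
  raw
end
```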
    ------------------------------------------------------------------------
    r62961 | normal | 2018-03-28 17:06:28 +0900 (Wed, 28 Mar 2018) | 9 lines
    
    webrick/httpservlet/cgihandler: reduce memory use
    
    WEBrick::HTTPRequest#body can be passed a block to process the
    body in chunks.  Use this feature to avoid building a giant
    string in memory.
    
    * lib/webrick/httpservlet/cgihandler.rb (do_GET):
      avoid reading entire request body into memory
      (do_POST is aliased to do_GET, so it handles bodies)
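    
    A minimal sketch of the block form, using a stand-in object rather
    than a real WEBrick::HTTPRequest: ChunkedSource and copy_body are
    hypothetical names, and the only behaviour assumed is that #body
    yields successive chunks to the block.

```ruby
require "stringio"

# Hypothetical stand-in for a request whose #body yields chunks.
class ChunkedSource
  def initialize(io, chunk_size)
    @io = io
    @chunk_size = chunk_size
  end

  # Yield the body piece by piece instead of returning one big string.
  def body
    while (chunk = @io.read(@chunk_size))
      yield chunk
    end
    nil
  end
end

# Forward every chunk to +out+ as it arrives; returns bytes written.
def copy_body(req, out)
  written = 0
  req.body { |chunk| written += out.write(chunk) }
  written
end
```

    Only one chunk is held in memory at a time, which is the point of
    replacing cgi_in.write(req.body) in the diff below.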
    ------------------------------------------------------------------------
    r62962 | normal | 2018-03-28 17:06:34 +0900 (Wed, 28 Mar 2018) | 7 lines
    
    webrick/httprequest: raise correct exception
    
    "BadRequest" alone does not resolve correctly; it is in the
    HTTPStatus namespace.
    
    * lib/webrick/httprequest.rb (read_chunked): use correct exception
    * test/webrick/test_httpserver.rb (test_eof_in_chunk): new test
    ------------------------------------------------------------------------
    r62963 | normal | 2018-03-28 17:06:39 +0900 (Wed, 28 Mar 2018) | 9 lines
    
    webrick/httprequest: use InputBufferSize for chunked requests
    
    While WEBrick::HTTPRequest#body provides a Proc interface
    for streaming large request bodies, clients must not force
    the server to use an excessively large chunk size.
    
    * lib/webrick/httprequest.rb (read_chunked): limit each
      read and block.call to :InputBufferSize in config.
    * test/webrick/test_httpserver.rb (test_big_chunks): new test
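    
    The splitting logic can be sketched on its own as below.  The
    method name each_piece and the use of ArgumentError are
    assumptions for illustration; the real code raises
    HTTPStatus::BadRequest and takes the size from @buffer_size.

```ruby
require "stringio"

# Deliver one declared chunk of +chunk_size+ bytes in pieces of at
# most +buffer_size+ bytes, so a client cannot force a huge read.
def each_piece(io, chunk_size, buffer_size)
  remaining = chunk_size
  while remaining > 0
    sz = [remaining, buffer_size].min
    data = io.read(sz)
    if data.nil? || data.bytesize != sz
      raise ArgumentError, "bad chunk data size."
    end
    yield data
    remaining -= sz
  end
end
```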
    ------------------------------------------------------------------------
    r62964 | normal | 2018-03-28 17:06:44 +0900 (Wed, 28 Mar 2018) | 9 lines
    
    webrick: add test for Digest auth-int
    
    No changes to the actual code; this is a new test for
    a feature for which no tests existed.  I don't understand
    the Digest authentication code well at all, but this is
    necessary for the subsequent change.
    
    * test/webrick/test_httpauth.rb (test_digest_auth_int): new test
      (credentials_for_request): support bodies with POST
    ------------------------------------------------------------------------
    r62965 | normal | 2018-03-28 17:06:49 +0900 (Wed, 28 Mar 2018) | 18 lines
    
    webrick/httpauth/digestauth: stream req.body
    
    WARNING! WARNING! WARNING!  LIKELY BROKEN CHANGE
    
    Pass a proc to WEBrick::HTTPRequest#body to avoid reading a
    potentially large request body into memory during
    authentication.
    
    WARNING! this will completely break apps that want to do
    something with the body besides calculating the MD5 digest
    of it.
    
    Also, keep in mind that probably nobody uses "auth-int".
    Servers such as Apache, lighttpd, nginx don't seem to
    support it; nor does curl when using POST/PUT bodies;
    and we didn't have tests for it until now...
    
    * lib/webrick/httpauth/digestauth.rb (_authenticate): stream req.body
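    
    For reference, an incremental MD5 over a stream can be sketched as
    follows.  streamed_hexdigest is a hypothetical helper that mirrors
    the loop in the test's credentials_for_request; it is not the code
    in digestauth.rb.

```ruby
require "digest/md5"
require "stringio"

# Feed the body to the digest chunk by chunk; the result equals
# Digest::MD5.hexdigest of the whole body, without ever holding the
# whole body in memory.
def streamed_hexdigest(io, chunk_size = 16_384)
  digest = Digest::MD5.new
  while (buf = io.read(chunk_size))
    digest.update(buf)
  end
  digest.hexdigest
end
```

    Note that the stream is consumed by the loop, which is the hazard
    the warning above describes: after authentication, nothing is left
    of the body for the application to read.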

  Modified directories:
    branches/ruby_2_3/
  Modified files:
    branches/ruby_2_3/ChangeLog
    branches/ruby_2_3/lib/webrick/httpauth/digestauth.rb
    branches/ruby_2_3/lib/webrick/httprequest.rb
    branches/ruby_2_3/lib/webrick/httpservlet/cgihandler.rb
    branches/ruby_2_3/test/webrick/test_httpauth.rb
    branches/ruby_2_3/test/webrick/test_httpserver.rb
    branches/ruby_2_3/version.h
Index: ruby_2_3/test/webrick/test_httpauth.rb
===================================================================
--- ruby_2_3/test/webrick/test_httpauth.rb	(revision 62969)
+++ ruby_2_3/test/webrick/test_httpauth.rb	(revision 62970)
@@ -4,6 +4,7 @@ require "net/http" https://github.com/ruby/ruby/blob/trunk/ruby_2_3/test/webrick/test_httpauth.rb#L4
 require "tempfile"
 require "webrick"
 require "webrick/httpauth/basicauth"
+require "stringio"
 require_relative "utils"
 
 class TestWEBrickHTTPAuth < Test::Unit::TestCase
@@ -211,12 +212,97 @@ class TestWEBrickHTTPAuth < Test::Unit:: https://github.com/ruby/ruby/blob/trunk/ruby_2_3/test/webrick/test_httpauth.rb#L212
     }
   end
 
+  def test_digest_auth_int
+    log_tester = lambda {|log, access_log|
+      log.reject! {|line| /\A\s*\z/ =~ line }
+      pats = [
+        /ERROR Digest wb auth-int realm: no credentials in the request\./,
+        /ERROR WEBrick::HTTPStatus::Unauthorized/,
+        /ERROR Digest wb auth-int realm: foo: digest unmatch\./
+      ]
+      pats.each {|pat|
+        assert(!log.grep(pat).empty?, "webrick log doesn't have expected error: #{pat.inspect}")
+        log.reject! {|line| pat =~ line }
+      }
+      assert_equal([], log)
+   }
+    TestWEBrick.start_httpserver({}, log_tester) {|server, addr, port, log|
+      realm = "wb auth-int realm"
+      path = "/digest_auth_int"
+
+      Tempfile.create("test_webrick_auth_int") {|tmpfile|
+        tmpfile.close
+        tmp_pass = WEBrick::HTTPAuth::Htdigest.new(tmpfile.path)
+        tmp_pass.set_passwd(realm, "foo", "Hunter2")
+        tmp_pass.flush
+
+        htdigest = WEBrick::HTTPAuth::Htdigest.new(tmpfile.path)
+        users = []
+        htdigest.each{|user, pass| users << user }
+        assert_equal %w(foo), users
+
+        auth = WEBrick::HTTPAuth::DigestAuth.new(
+          :Realm => realm, :UserDB => htdigest,
+          :Algorithm => 'MD5',
+          :Logger => server.logger,
+          :Qop => %w(auth-int),
+        )
+        server.mount_proc(path){|req, res|
+          auth.authenticate(req, res)
+          res.body = "bbb"
+        }
+        Net::HTTP.start(addr, port) do |http|
+          post = Net::HTTP::Post.new(path)
+          params = {}
+          data = 'hello=world'
+          body = StringIO.new(data)
+          post.content_length = data.bytesize
+          post['Content-Type'] = 'application/x-www-form-urlencoded'
+          post.body_stream = body
+
+          http.request(post) do |res|
+            assert_equal('401', res.code, log.call)
+            res["www-authenticate"].scan(DIGESTRES_) do |key, quoted, token|
+              params[key.downcase] = token || quoted.delete('\\')
+            end
+             params['uri'] = "http://#{addr}:#{port}#{path}"
+          end
+
+          body.rewind
+          cred = credentials_for_request('foo', 'Hunter3', params, body)
+          post['Authorization'] = cred
+          post.body_stream = body
+          http.request(post){|res|
+            assert_equal('401', res.code, log.call)
+            assert_not_equal("bbb", res.body, log.call)
+          }
+
+          body.rewind
+          cred = credentials_for_request('foo', 'Hunter2', params, body)
+          post['Authorization'] = cred
+          post.body_stream = body
+          http.request(post){|res| assert_equal("bbb", res.body, log.call)}
+        end
+      }
+    }
+  end
+
   private
-  def credentials_for_request(user, password, params)
+  def credentials_for_request(user, password, params, body = nil)
     cnonce = "hoge"
     nonce_count = 1
     ha1 = "#{user}:#{params['realm']}:#{password}"
-    ha2 = "GET:#{params['uri']}"
+    if body
+      dig = Digest::MD5.new
+      while buf = body.read(16384)
+        dig.update(buf)
+      end
+      body.rewind
+      ha2 = "POST:#{params['uri']}:#{dig.hexdigest}"
+    else
+      ha2 = "GET:#{params['uri']}"
+    end
+
     request_digest =
       "#{Digest::MD5.hexdigest(ha1)}:" \
       "#{params['nonce']}:#{'%08x' % nonce_count}:#{cnonce}:#{params['qop']}:" \
Index: ruby_2_3/test/webrick/test_httpserver.rb
===================================================================
--- ruby_2_3/test/webrick/test_httpserver.rb	(revision 62969)
+++ ruby_2_3/test/webrick/test_httpserver.rb	(revision 62970)
@@ -435,4 +435,71 @@ class TestWEBrickHTTPServer < Test::Unit https://github.com/ruby/ruby/blob/trunk/ruby_2_3/test/webrick/test_httpserver.rb#L435
     s&.shutdown
     th&.join
   end
+
+  def test_gigantic_request_header
+    log_tester = lambda {|log, access_log|
+      assert_equal 1, log.size
+      assert log[0].include?('ERROR headers too large')
+    }
+    TestWEBrick.start_httpserver({}, log_tester){|server, addr, port, log|
+      server.mount('/', WEBrick::HTTPServlet::FileHandler, __FILE__)
+      TCPSocket.open(addr, port) do |c|
+        c.write("GET / HTTP/1.0\r\n")
+        junk = -"X-Junk: #{' ' * 1024}\r\n"
+        assert_raise(Errno::ECONNRESET, Errno::EPIPE) do
+          loop { c.write(junk) }
+        end
+      end
+    }
+  end
+
+  def test_eof_in_chunk
+    log_tester = lambda do |log, access_log|
+      assert_equal 1, log.size
+      assert log[0].include?('ERROR bad chunk data size')
+    end
+    TestWEBrick.start_httpserver({}, log_tester){|server, addr, port, log|
+      server.mount_proc('/', ->(req, res) { res.body = req.body })
+      TCPSocket.open(addr, port) do |c|
+        c.write("POST / HTTP/1.1\r\nHost: example.com\r\n" \
+                "Transfer-Encoding: chunked\r\n\r\n5\r\na")
+        c.shutdown(Socket::SHUT_WR) # trigger EOF in server
+        res = c.read
+        assert_match %r{\AHTTP/1\.1 400 }, res
+      end
+    }
+  end
+
+  def test_big_chunks
+    nr_out = 3
+    buf = 'big' # 3 bytes is bigger than 2!
+    config = { :InputBufferSize => 2 }.freeze
+    total = 0
+    all = ''
+    TestWEBrick.start_httpserver(config){|server, addr, port, log|
+      server.mount_proc('/', ->(req, res) {
+        err = []
+        ret = req.body do |chunk|
+          n = chunk.bytesize
+          n > config[:InputBufferSize] and err << "#{n} > :InputBufferSize"
+          total += n
+          all << chunk
+        end
+        ret.nil? or err << 'req.body should return nil'
+        (buf * nr_out) == all or err << 'input body does not match expected'
+        res.header['connection'] = 'close'
+        res.body = err.join("\n")
+      })
+      TCPSocket.open(addr, port) do |c|
+        c.write("POST / HTTP/1.1\r\nHost: example.com\r\n" \
+                "Transfer-Encoding: chunked\r\n\r\n")
+        chunk = "#{buf.bytesize.to_s(16)}\r\n#{buf}\r\n"
+        nr_out.times { c.write(chunk) }
+        c.write("0\r\n\r\n")
+        head, body = c.read.split("\r\n\r\n")
+        assert_match %r{\AHTTP/1\.1 200 OK}, head
+        assert_nil body
+      end
+    }
+  end
 end
Index: ruby_2_3/version.h
===================================================================
--- ruby_2_3/version.h	(revision 62969)
+++ ruby_2_3/version.h	(revision 62970)
@@ -1,6 +1,6 @@ https://github.com/ruby/ruby/blob/trunk/ruby_2_3/version.h#L1
 #define RUBY_VERSION "2.3.7"
 #define RUBY_RELEASE_DATE "2018-03-28"
-#define RUBY_PATCHLEVEL 448
+#define RUBY_PATCHLEVEL 449
 
 #define RUBY_RELEASE_YEAR 2018
 #define RUBY_RELEASE_MONTH 3
Index: ruby_2_3/lib/webrick/httprequest.rb
===================================================================
--- ruby_2_3/lib/webrick/httprequest.rb	(revision 62969)
+++ ruby_2_3/lib/webrick/httprequest.rb	(revision 62970)
@@ -414,9 +414,13 @@ module WEBrick https://github.com/ruby/ruby/blob/trunk/ruby_2_3/lib/webrick/httprequest.rb#L414
 
     MAX_URI_LENGTH = 2083 # :nodoc:
 
+    # same as Mongrel, Thin and Puma
+    MAX_HEADER_LENGTH = (112 * 1024) # :nodoc:
+
     def read_request_line(socket)
       @request_line = read_line(socket, MAX_URI_LENGTH) if socket
-      if @request_line.bytesize >= MAX_URI_LENGTH and @request_line[-1, 1] != LF
+      @request_bytes = @request_line.bytesize
+      if @request_bytes >= MAX_URI_LENGTH and @request_line[-1, 1] != LF
         raise HTTPStatus::RequestURITooLarge
       end
       @request_time = Time.now
@@ -435,6 +439,9 @@ module WEBrick https://github.com/ruby/ruby/blob/trunk/ruby_2_3/lib/webrick/httprequest.rb#L439
       if socket
         while line = read_line(socket)
           break if /\A(#{CRLF}|#{LF})\z/om =~ line
+          if (@request_bytes += line.bytesize) > MAX_HEADER_LENGTH
+            raise HTTPStatus::RequestEntityTooLarge, 'headers too large'
+          end
           @raw_header << line
         end
       end
@@ -502,12 +509,16 @@ module WEBrick https://github.com/ruby/ruby/blob/trunk/ruby_2_3/lib/webrick/httprequest.rb#L509
     def read_chunked(socket, block)
       chunk_size, = read_chunk_size(socket)
       while chunk_size > 0
-        data = read_data(socket, chunk_size) # read chunk-data
-        if data.nil? || data.bytesize != chunk_size
-          raise BadRequest, "bad chunk data size."
-        end
+        begin
+          sz = [ chunk_size, @buffer_size ].min
+          data = read_data(socket, sz) # read chunk-data
+          if data.nil? || data.bytesize != sz
+            raise HTTPStatus::BadRequest, "bad chunk data size."
+          end
+          block.call(data)
+        end while (chunk_size -= sz) > 0
+
         read_line(socket)                    # skip CRLF
-        block.call(data)
         chunk_size, = read_chunk_size(socket)
       end
       read_header(socket)                    # trailer + CRLF
Index: ruby_2_3/lib/webrick/httpauth/digestauth.rb
===================================================================
--- ruby_2_3/lib/webrick/httpauth/digestauth.rb	(revision 62969)
+++ ruby_2_3/lib/webrick/httpauth/digestauth.rb	(revision 62970)
@@ -235,9 +235,11 @@ module WEBrick https://github.com/ruby/ruby/blob/trunk/ruby_2_3/lib/webrick/httpauth/digestauth.rb#L235
           ha2 = hexdigest(req.request_method, auth_req['uri'])
           ha2_res = hexdigest("", auth_req['uri'])
         elsif auth_req['qop'] == "auth-int"
-          ha2 = hexdigest(req.request_method, auth_req['uri'],
-                          hexdigest(req.body))
-          ha2_res = hexdigest("", auth_req['uri'], hexdigest(res.body))
+          body_digest = @h.new
+          req.body { |chunk| body_digest.update(chunk) }
+          body_digest = body_digest.hexdigest
+          ha2 = hexdigest(req.request_method, auth_req['uri'], body_digest)
+          ha2_res = hexdigest("", auth_req['uri'], body_digest)
         end
 
         if auth_req['qop'] == "auth" || auth_req['qop'] == "auth-int"
Index: ruby_2_3/lib/webrick/httpservlet/cgihandler.rb
===================================================================
--- ruby_2_3/lib/webrick/httpservlet/cgihandler.rb	(revision 62969)
+++ ruby_2_3/lib/webrick/httpservlet/cgihandler.rb	(revision 62970)
@@ -65,9 +65,7 @@ module WEBrick https://github.com/ruby/ruby/blob/trunk/ruby_2_3/lib/webrick/httpservlet/cgihandler.rb#L65
           cgi_in.write("%8d" % dump.bytesize)
           cgi_in.write(dump)
 
-          if req.body and req.body.bytesize > 0
-            cgi_in.write(req.body)
-          end
+          req.body { |chunk| cgi_in.write(chunk) }
         ensure
           cgi_in.close
           status = $?.exitstatus
Index: ruby_2_3/ChangeLog
===================================================================
--- ruby_2_3/ChangeLog	(revision 62969)
+++ ruby_2_3/ChangeLog	(revision 62970)
@@ -1,3 +1,86 @@ https://github.com/ruby/ruby/blob/trunk/ruby_2_3/ChangeLog#L1
+Wed Mar 28 18:04:37 2018  Eric Wong  <normalperson@y...>
+
+	webrick: use IO.copy_stream for multipart response
+
+	Use the new Proc response body feature to generate a multipart
+	range response dynamically.  We use a flat array to minimize
+	object overhead as much as possible; as many ranges may fit
+	into an HTTP request header.
+
+	* lib/webrick/httpservlet/filehandler.rb (multipart_body): new method
+	  (make_partial_content): use multipart_body
+
+	webrick/httprequest: limit request headers size
+
+	We use the same 112 KB limit started (AFAIK) by Mongrel, Thin,
+	and Puma to prevent malicious users from using up all the memory
+	with a single request.  This also limits the damage done by
+	excessive ranges in multipart Range: requests.
+
+	Due to the way we rely on IO#gets and the desire to keep
+	the code simple, the actual maximum header may be 4093 bytes
+	larger than 112 KB, but we're splitting hairs at that point.
+
+	* lib/webrick/httprequest.rb: define MAX_HEADER_LENGTH
+	  (read_header): raise when headers exceed max length
+
+	webrick/httpservlet/cgihandler: reduce memory use
+
+	WEBrick::HTTPRequest#body can be passed a block to process the
+	body in chunks.  Use this feature to avoid building a giant
+	string in memory.
+
+	* lib/webrick/httpservlet/cgihandler.rb (do_GET):
+	  avoid reading entire request body into memory
+	  (do_POST is aliased to do_GET, so it handles bodies)
+
+	webrick/httprequest: raise correct exception
+
+	"BadRequest" alone does not resolve correctly, it is in the
+	HTTPStatus namespace.
+
+	* lib/webrick/httprequest.rb (read_chunked): use correct exception
+	* test/webrick/test_httpserver.rb (test_eof_in_chunk): new test
+
+	webrick/httprequest: use InputBufferSize for chunked requests
+
+	While WEBrick::HTTPRequest#body provides a Proc interface
+	for streaming large request bodies, clients must not force
+	the server to use an excessively large chunk size.
+
+	* lib/webrick/httprequest.rb (read_chunk_size): limit each
+	  read and block.call to :InputBufferSize in config.
+	* test/webrick/test_httpserver.rb (test_big_chunks): new test
+
+	webrick: add test for Digest auth-int
+
+	No changes to the actual code, this is a new test for
+	a feature for which no tests existed.  I don't understand
+	the Digest authentication code well at all, but this is
+	necessary for the subsequent change.
+
+	* test/webrick/test_httpauth.rb (test_digest_auth_int): new test
+	  (credentials_for_request): support bodies with POST
+
+	webrick/httpauth/digestauth: stream req.body
+
+	WARNING! WARNING! WARNING!  LIKELY BROKEN CHANGE
+
+	Pass a proc to WEBrick::HTTPRequest#body to avoid reading a
+	potentially large request body into memory during
+	authentication.
+
+	WARNING! this will break apps completely which want to do
+	something with the body besides calculating the MD5 digest
+	of it.
+
+	Also, keep in mind that probably nobody uses "auth-int".
+	Servers such as Apache, lighttpd, nginx don't seem to
+	support it; nor does curl when using POST/PUT bodies;
+	and we didn't have tests for it until now...
+
+	* lib/webrick/httpauth/digestauth.rb (_authenticate): stream req.body
+
 Wed Mar 28 15:48:30 2018  Kazuki Yamaguchi <k@r...>
 
 	backport some changes from openssl gem v2.0.6 and v2.0.7.
Index: ruby_2_3
===================================================================
--- ruby_2_3	(revision 62969)
+++ ruby_2_3	(revision 62970)

Property changes on: ruby_2_3
___________________________________________________________________
Modified: svn:mergeinfo
## -0,0 +0,1 ##
   Merged /trunk:r62960-62965
