ruby-changes:53536
From: shyouhei <ko1@a...>
Date: Fri, 16 Nov 2018 11:34:04 +0900 (JST)
Subject: [ruby-changes:53536] shyouhei:r65752 (trunk): enc/unicode.c: 'a' is bigger than 'A'
shyouhei 2018-11-16 11:34:00 +0900 (Fri, 16 Nov 2018)

  New Revision: 65752

  https://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=revision&revision=65752

  Log:
    enc/unicode.c: 'a' is bigger than 'A'

    In ASCII, 'a' is bigger than 'A', which means 'A' - 'a' is a
    negative number (-32, to be precise).  In C, 'a' and 'A' both
    have type signed int (cf: ISO/IEC 9899:1990 section 6.1.3.4),
    so 'A' - 'a' is also a signed int.  It is `(signed int)-32`.

    The problem is that OnigCodePoint is unsigned int.  Adding a
    negative number to a variable of type OnigCodePoint (`code`
    here) introduces an unintended implicit conversion,
    `(unsigned)(signed)-32`, which is 4,294,967,264 for a 32-bit
    unsigned int.  Adding this value to `code` then wraps around,
    and the result ends up being the expected codepoint.

    This series of operations is not a serious problem, but
    because `code >= 'a'` holds, we can compute
    `(code - 'a') + 'A'` instead and avoid the detour.

    See also: https://github.com/k-takata/Onigmo/pull/107

  Modified files:
    trunk/enc/unicode.c
Index: enc/unicode.c
===================================================================
--- enc/unicode.c	(revision 65751)
+++ enc/unicode.c	(revision 65752)
@@ -683,8 +683,10 @@ onigenc_unicode_case_map(OnigCaseFoldTyp https://github.com/ruby/ruby/blob/trunk/enc/unicode.c#L683
 		    MODIFIED;
 		    if (flags & ONIGENC_CASE_FOLD_TURKISH_AZERI && code == 'i')
 			code = I_WITH_DOT_ABOVE;
-		    else
-			code += 'A' - 'a';
+		    else {
+			code -= 'a';
+			code += 'A';
+		    }
 		}
 	    }
 	    else if (code >= 'A' && code <= 'Z') {

--
ML: ruby-changes@q...
Info: http://www.atdot.net/~ko1/quickml/
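
The arithmetic described in the log can be checked in isolation. The following is only a minimal sketch, not code from enc/unicode.c: `codepoint_t`, `upcase_old`, `upcase_new`, `converted_delta` and `check_equivalence` are hypothetical names introduced here, and an ASCII execution character set is assumed (so that 'A' - 'a' is -32).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for OnigCodePoint, which the log says is unsigned int. */
typedef unsigned int codepoint_t;

/* Old form: 'A' - 'a' is the int -32 (assuming ASCII); it is converted to
 * unsigned int before the addition, so the sum relies on wraparound. */
codepoint_t upcase_old(codepoint_t code)
{
    code += 'A' - 'a';
    return code;
}

/* New form: subtract first, then add. No intermediate wraparound occurs
 * as long as the caller guarantees code >= 'a'. */
codepoint_t upcase_new(codepoint_t code)
{
    code -= 'a';
    code += 'A';
    return code;
}

/* The value that -32 takes once converted to unsigned int. */
unsigned int converted_delta(void)
{
    return (unsigned int)('A' - 'a');
}

/* Both forms agree for every lowercase ASCII letter. */
void check_equivalence(void)
{
    codepoint_t c;

    /* UINT_MAX - 31 is 4,294,967,264 when unsigned int is 32 bits wide. */
    assert(converted_delta() == UINT_MAX - 31u);

    for (c = 'a'; c <= 'z'; c++)
        assert(upcase_old(c) == upcase_new(c));
}
```

Calling `check_equivalence()` from a test program should pass silently. Unsigned wraparound is well defined in C, so both forms produce the same codepoint; the rewritten form simply avoids passing through the large intermediate value.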