util:charset:codepoints: codepoint_cmpi warning about non-transitivity
author: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Thu, 4 Apr 2024 01:56:16 +0000 (14:56 +1300)
committer: Andrew Bartlett <abartlet@samba.org>
Wed, 10 Apr 2024 22:56:33 +0000 (22:56 +0000)
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625

Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
lib/util/charset/codepoints.c

index 68b7b08ee5024a1c651e923332eeeaf9134d9bdc..80226278faf4d5b9904778db7fb5d20479f99cfe 100644
@@ -16480,6 +16480,18 @@ _PUBLIC_ bool isupper_m(codepoint_t val)
 */
 _PUBLIC_ int codepoint_cmpi(codepoint_t c1, codepoint_t c2)
 {
+       /*
+        * FIXME: this is unsuitable for use in a sort, as the
+        * comparison is intransitive.
+        *
+        * The problem is toupper_m() is only called on equality case,
+        * which has strange effects.
+        *
+        *    Consider {'a', 'A', 'B'}.
+        *     'a' == 'A'
+        *     'a' >  'B'  (lowercase letters come after upper)
+        *     'A' <  'B'
+        */
        if (c1 == c2 ||
            toupper_m(c1) == toupper_m(c2)) {
                return 0;