bee_version_compare: Rewrite compare_version_strings() #53

Merged on Jul 8, 2022 (8 commits)

donald (Contributor) commented Jul 6, 2022

The current algorithm is complicated and broken.

First, it removes common prefixes, so that, for example, '1.5.8' and
'1.5.9' are compared numerically as 8 and 9. After that, '0's are
skipped, so that, for example, '1.5.09' can compare equal to '1.5.9'.

In the next step, removed digits are restored, so that '1.5.1134' and
'1.5.1211' are compared as 1134 and 1211 and not as 34 and 11.

However, the removal of '0's is done independently for both values, so
that a different number of '0's might be removed, while the undo is done
in sync, so that the same number of digits is 'restored'.

As a result, '1.101' compares less than '1.11':

a             b

1.101         1.11  # original
01            1     # after removal of common prefix
1             1     # after removal of '0's
01            11    # after restoration of digits

Additionally, the code tries to sort digits after non-digit characters
but fails to handle the case where only the second string starts with a
digit.

Rewrite algorithm from scratch.

The new algorithm advances through both strings, comparing them at their
current position:

  • if both strings start with a digit
    • compare numerically
    • if equal, skip over digits and loop
  • otherwise if the first character is different, compare character values
  • otherwise if both strings end, result is 0 (equal)
  • otherwise, advance to next character and loop
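
The steps above can be sketched in C roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name compare_version_sketch is hypothetical, and using strtoull() for the numeric comparison is an assumption made for brevity (very long digit runs would saturate at ULLONG_MAX).

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical sketch; returns <0, 0 or >0 in the style of strcmp(). */
static int compare_version_sketch(const char *a, const char *b)
{
    for (;;) {
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
            /* Both strings start with a digit: compare the digit runs
             * numerically. Assumption: the values fit in unsigned long long. */
            char *end_a, *end_b;
            unsigned long long na = strtoull(a, &end_a, 10);
            unsigned long long nb = strtoull(b, &end_b, 10);
            if (na != nb)
                return na < nb ? -1 : 1;
            /* Equal values (e.g. "09" and "9"): skip both digit runs. */
            a = end_a;
            b = end_b;
        } else if (*a != *b) {
            /* First characters differ: compare character values. This also
             * covers the case where only one string has ended. */
            return (unsigned char)*a < (unsigned char)*b ? -1 : 1;
        } else if (*a == '\0') {
            /* Both strings ended at the same time: equal. */
            return 0;
        } else {
            /* Same non-digit character: advance and loop. */
            a++;
            b++;
        }
    }
}
```

With this sketch, '1.101' compares greater than '1.11' and '1.5.09' compares equal to '1.5.9'. The earlier branches are safe on the terminating '\0' because it is not a digit and compares lower than any other character, so the end-of-string check can come late in the loop.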

We intentionally don't sort digits after non-digits, as the old code
tried, because this makes more sense for the case when we have a hex
string (e.g. from a git hash) as part of the version string.

To make the code more readable, the character pointers are dereferenced
multiple times and some conditionals are evaluated multiple times. We
can trust the compiler to optimize this away.

Comparing 'bee list --available' with the old and the new algorithm
produces the following differences:

firefox-9.0.1-0.x86_64          firefox-9.0.1-0.x86_64
firefox-10.0.2-0.x86_64         firefox-10.0.2-0.x86_64
firefox-102.0-0.x86_64       <
firefox-11.0-0.x86_64           firefox-11.0-0.x86_64
firefox-12.0-0.x86_64           firefox-12.0-0.x86_64

firefox-94.0.1-0.x86_64         firefox-94.0.1-0.x86_64
firefox-94.0.2-0.x86_64         firefox-94.0.2-0.x86_64
                             >  firefox-102.0-0.x86_64
firefox_current-4-0.x86_64      firefox_current-4-0.x86_64
firefox_current-5-0.x86_64      firefox_current-5-0.x86_64

java-1.7.0_13-0.x86_64          java-1.7.0_13-0.x86_64
java-1.7.0_17-0.x86_64          java-1.7.0_17-0.x86_64
                             >  java-1.8.0_11-0.x86_64
                             >  java-1.8.0_45-0.x86_64
java-1.8.0_101-0.x86_64         java-1.8.0_101-0.x86_64
java-1.8.0_102-0.x86_64         java-1.8.0_102-0.x86_64
java-1.8.0_102-1.x86_64         java-1.8.0_102-1.x86_64
java-1.8.0_11-0.x86_64       <
java-1.8.0_45-0.x86_64       <
java-1.8.0_121-0.x86_64         java-1.8.0_121-0.x86_64
java-1.8.0_131-0.x86_64         java-1.8.0_131-0.x86_64

mxq-0.0_p106_8a0ad87-0.x86_64           mxq-0.0_p106_8a0ad87-0.x86_64
mxq-0.0_p107_eaf8146-0.x86_64           mxq-0.0_p107_eaf8146-0.x86_64
mxq-0.0_p108_f56522b-0.x86_64         <
mxq-0.0_p108_0b7afc1-0.x86_64           mxq-0.0_p108_0b7afc1-0.x86_64
mxq-0.0_p108_4e83c8d-0.x86_64           mxq-0.0_p108_4e83c8d-0.x86_64
mxq-0.0_p109_ed52e7f-0.x86_64         | mxq-0.0_p108_f56522b-0.x86_64
mxq-0.0_p109_621fea8-0.x86_64           mxq-0.0_p109_621fea8-0.x86_64
                                      > mxq-0.0_p109_ed52e7f-0.x86_64
mxq-0.0_p110_baba367-0.x86_64           mxq-0.0_p110_baba367-0.x86_64
mxq-0.0_p111_38cc7db-0.x86_64           mxq-0.0_p111_38cc7db-0.x86_64

mxq-0.1.1_p1_4840161-0.x86_64           mxq-0.1.1_p1_4840161-0.x86_64
mxq-0.1.2_p0_d21c6ae-0.x86_64           mxq-0.1.2_p0_d21c6ae-0.x86_64
mxq-0.1.2_p1_efa8ec4-0.x86_64         <
mxq-0.1.2_p1_9ccc12f-0.x86_64           mxq-0.1.2_p1_9ccc12f-0.x86_64
                                      > mxq-0.1.2_p1_efa8ec4-0.x86_64
mxq-0.1.3_p0_0c3032c-0.x86_64           mxq-0.1.3_p0_0c3032c-0.x86_64
mxq-0.1.4_p0_c936ba0-0.x86_64           mxq-0.1.4_p0_c936ba0-0.x86_64

mxstartup-2.9_p1_0e23fa8-0.x86_64       mxstartup-2.9_p1_0e23fa8-0.x86_64
mxstartup-2.9_p2_e915bc3-0.x86_64       mxstartup-2.9_p2_e915bc3-0.x86_64
mxstartup-2.9_p3_d919716-0.x86_64     <
mxstartup-2.9_p3_21347ab-0.x86_64       mxstartup-2.9_p3_21347ab-0.x86_64
                                      > mxstartup-2.9_p3_d919716-0.x86_64
mxstartup-2.10_p0_d919716-0.x86_64      mxstartup-2.10_p0_d919716-0.x86_64
mxstartup-2.11_p0_4ee8fed-0.x86_64      mxstartup-2.11_p0_4ee8fed-0.x86_64

Fixes #52

donald force-pushed the fix-52 branch 2 times, most recently from d828d75 to 2c0b280 (July 6, 2022 11:26)

donald (Contributor, Author) commented Jul 7, 2022

The original version tried to sort digits after characters. This is now added to the new version as well. The comment with the example strings is fixed and the commit message was modified slightly.

donald force-pushed the fix-52 branch 2 times, most recently from 955711e to 4f14b0a (July 7, 2022 08:59)

donald (Contributor, Author) commented Jul 7, 2022

  • Back to alphabetical sort if only one string starts with a digit, which makes more sense
  • Move the end of string detection toward the end of the loop, as the prior code doesn't fail with \0 and the end-of-string case is the exception. (Suggested by @niclas)

GCC can warn on implicit fallthroughs. Mark them with a comment that GCC
recognizes.

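
As an illustration of such a marker, GCC's -Wimplicit-fallthrough warning (which -Wextra enables) accepts a comment like /* fall through */ placed directly before the next case label. The function below is a hypothetical example and is not taken from the bee sources.

```c
/* Hypothetical helper: maps a verbosity level to a set of flag bits. */
static int verbosity_flags(int level)
{
    int flags = 0;

    switch (level) {
    case 2:
        flags |= 2;
        /* fall through */
    case 1:
        flags |= 1;
        break;
    default:
        break;
    }
    return flags;
}
```

Without the comment, compiling with -Wextra would report the transition from case 2 into case 1 as a possible mistake.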
Enable -Wextra. Exclude -Wno-override-init for now, because bee_getopt.h
relies on it. See [1] for similar issue in mxq.

[1]: mariux64/mxq#131
Successfully merging this pull request may close these issues.

beeversion -max sorts incorrectly