Ticket #46286

AI likely weakened by integer overflow

Open Date: 2022-12-16 08:09 Last Update: 2023-05-22 18:04

Reporter:
Owner:
Type:
Status:
Closed
Component:
MileStone:
Priority:
5 - Medium
Severity:
5 - Medium
Resolution:
Fixed
File:
6

Details

AI desire to kill likely weakened by integer overflow in ai/default/aiunit.c:kill_desire()

Depending on input values, instead of returning a big positive value for desire to kill, it returns a negative one.

Even if freeciv is compiled on 64 bit systems, the integer calculation is performed 32 bit.

Random example from a real game:

input values for kill_desire():

int benefit = 7824
int attack =  6400
int loss = 50
int vuln = 12544
int victim_count = 3
SHIELD_WEIGHTING = 17 by #define

Current formula:

desire = ((benefit * attack - loss * vuln) * victim_count * SHIELD_WEIGHTING / (attack + vuln * victim_count));

result: desire = -40270

Ive got 2 ideas how so solve it:

1. Mitigate the overflow, cap the value of (benefit * attack - loss * vuln) * victim_count so that the overflow wont happen, like:

desire = (benefit * attack - loss * vuln) * victim_count;`

if (desire < (INT_MAX / SHIELD_WEIGHTING)) { /* mitigate signed integer overflow */
  desire *= SHIELD_WEIGHTING;
} else {
  desire = INT_MAX;
}

result: desire = 48770

Theoretically the overflow could still happen before multiplying with SHIELD_WEIGHTING, but it would be an improvement over the more easily triggered overflows happening right now. The cases ive seen in real game, would be mitigated.

2. Fix the overflow for systems, that have long to be 64 bit:

desire = (long) (benefit * attack - loss * vuln) * victim_count * SHIELD_WEIGHTING / (attack + vuln * victim_count);

result: desire = 57271

I quickly went over the source code, and i dont see any 64 bit stuff happening. Also i dont know how you handle portability and such in freeciv. Thats why i didnt propose the use of int64_t here. Can you find better solutions?

Ticket History (3/54 Histories)

2022-12-16 08:09 Updated by: mortmann
  • New Ticket "AI likely weakened by integer overflow" created
2022-12-16 08:14 Updated by: mortmann
  • Details Updated
2022-12-16 08:24 Updated by: cazfi
2022-12-16 10:48 Updated by: cazfi
Comment

To avoid introducing 64 bit values to older branches, but to improve the situation on those too, I guess it would be easy to at least turn the value to an unsigned one?

2022-12-16 20:18 Updated by: mortmann
Comment

i did some measurement with unsigned int:

desire = (unsigned int) (benefit * attack - loss * vuln) * victim_count * SHIELD_WEIGHTING / (attack + vuln * victim_count);

amount of kill_desire() calls: 20300

amount of overflows with current code: 195 (accounts for 0.96 percent of kill_desire() calls)

amount of fixed overflows with unsigned int: 113 (accounts for 57.95 percent of overflows with current code)

looks pretty good to me.

2022-12-23 08:49 Updated by: cazfi
Comment

Do you plan to make a patch yourself?

2022-12-24 02:57 Updated by: mortmann
Comment

not yet, because "unsigned int" is no final solution.

so you prefer to have "unsigned int" now and leave a ticket open for a real solution in the future? then i would prepare the PR/patch.

or should we instead work towards a real solution now?

2022-12-25 09:38 Updated by: cazfi
Comment

Reply To mortmann

not yet, because "unsigned int" is no final solution. so you prefer to have "unsigned int" now and leave a ticket open for a real solution in the future? then i would prepare the PR/patch. or should we instead work towards a real solution now?

We want to improve situation in stable branches (S3_1 and S3_0 at least, maybe even S2_6). Not sure if "real solution" is going to be acceptable to those, so likely we should go by the two steps approach of first having "unsigned int", and later more extensive one. That also means two tickets; for one can't have ticket listed in resolved tickets in 3.0.6 release notes if it in fact isn't resolved. You can either keep this as the "complete solution" ticket (-> split 'unsigned int' part to new ticket), or the "initial solution" ticket (-> open a new ticket about further changes) - both are fine for me.

2023-01-13 08:47 Updated by: mortmann
Comment

We will keep this ticket for the complete solution.

For the split ticket see https://osdn.net/projects/freeciv/ticket/46513

2023-01-13 09:27 Updated by: cazfi
Comment

Next question I have (I could have a look myself, but I want opinions) is if that's something we should not be using integer types at all, but a floating point number? Like, can there be desires like 0.1 and 0.9, which current integer using approach makes 0 in both cases (like they were equal, and not one on them nine times higher than the other)

2023-01-22 02:30 Updated by: cazfi
Comment

Reply To cazfi

Next question I have (I could have a look myself, but I want opinions) is if that's something we should not be using integer types at all, but a floating point number? Like, can there be desires like 0.1 and 0.9, which current integer using approach makes 0 in both cases (like they were equal, and not one on them nine times higher than the other)

Related: All the callers, except process_attacker_want(), assign value returned from this function to variable of type adv_want, and even process_attacker_want() just has the variable types wrong -> #46530

2023-02-03 00:34 Updated by: cazfi
Comment

Keeping S3_0 target for now, though the mitigation patch went in on another ticket.

2023-02-28 21:42 Updated by: alain_bkr
Comment

i vote for float it seems the simplest change, and may fix other similar issues elsewhere in the code.

All current CPU have floating point unit (since 1995 ?)

2023-03-11 22:00 Updated by: cazfi
2023-03-13 00:25 Updated by: alain_bkr
Comment

I had a lot of fun with this !

I discovered that clang can detect runtime overflow with the additional option "-fsanitize=undefined"

On my toto.c test program it gives very nice error message , without killing the program which simply continue

toto.c:78:27: runtime error: signed integer overflow: 5000000 * 1000 cannot be represented in type 'int'

And i have trouble with gcc 11.3 , because it overflows unsigned int, but not int !

I'll try to recompile with clang and note the places where overflow occurs.

2023-03-13 08:56 Updated by: alain_bkr
Comment

I compiled with clang14

mkdir build
cd build

# CLANG 
# https://releases.llvm.org/14.0.0/tools/clang/docs/UsersManual.html 
export CC=clang
export CPP=clang-cpp-14
export CXX=clang++
export LD=lld 

export CFLAGS='-g -O2  -fsanitize=undefined -Wno-cast-align -Wno-string-plus-int'
export CXXFLAGS='-g -O2  -fsanitize=undefined -Wno-cast-align -Wno-string-plus-int'
export LDFLAGS='-fsanitize=undefined -Wno-cast-align -Wno-string-plus-int'

../autogen.sh
../configure --enable-gitrev --enable-debug=yes --enable-gprof --enable-client=gtk3.22 --program-transform-name=s:freeciv:clangFC:g --enable-ai-static=classic

The interesting options is -fsanitize=undefined (which must be added to link options too if understood correctly)

i needed to add -Wno-cast-align -Wno-string-plus-int , just to compile because i don't know how to fix the associated warnings.

It quickly gave int overflow in server, while playing several turns from a saved game

FC3clang-serv.log:../../common/city.c:2866:21: runtime error: signed integer overflow: -999999985 * 100 cannot be represented in type 'int'
FC3clang-serv.log:../../common/city.c:2873:25: runtime error: signed integer overflow: -999999990 * 100 cannot be represented in type 'int'
FC3clang-serv.log:../../common/city.c:2880:37: runtime error: signed integer overflow: -999999985 * 100 cannot be represented in type 'int'

FC3clang-serv.log:> ../../../ai/default/daimilitary.c:1489:20: runtime error: signed integer overflow: 100 * 776736900 cannot be represented in type 'int'
FC3clang-serv.log:> ../../../ai/default/daimilitary.c:1489:20: runtime error: signed integer overflow: 100 * 128890609 cannot be represented in type 'int'

2023-03-13 09:03 Updated by: cazfi
Comment

Tip: Instead of setting those environment variables before running configure, set them as configure parameters (e.g. ../configure CC=clang-14 CXX=clang++-14 ). The benefit is that if some code change causes 'make' to reconfigure the build tree, those parameters are used also in that reconfigure.

2023-03-13 17:55 Updated by: alain_bkr
Comment

Thanks for the tip, i have modified my script.

I launched WW1 scenario , and will let it run for several days (half an hour per turn so 50 turns per day, it uses only one core , even when i tried tex or older threaded ai but this is another topic)

I got another overflow, even with the mitigation of using unsigned int

../../../ai/default/aiunit.c:343:36: runtime error: signed integer overflow: 12260 * 4124961 cannot be represented in type 'int'

I guess we should use float for intermediate computation, and then we will see if the end result overflows.

  desire = (  (float) benefit * attack - loss * vuln ) * (float) victim_count * 
           (float) SHIELD_WEIGHTING / (attack + vuln * victim_count);

2023-03-13 17:59 Updated by: alain_bkr
Comment

I found a gcc option : -ftrapv

It will coredump on overflow, thus easy to backtrace, and check values because the problem may be somewhere before with wrong values.

I found the values for city.c overflow very strange : -999999985 -999999990

2023-03-14 08:37 Updated by: alain_bkr
  • File S3_0_fix_int_overflow_daimilitary.patch (File ID: 11839) is attached
2023-03-14 08:38 Updated by: alain_bkr
  • File S3_0_fix_int_overflow_daimilitary.patch (File ID: 11840) is attached
2023-03-14 08:38 Updated by: alain_bkr
  • File S3_0_fix_int_overflow_daimilitary.patch (File ID: 11841) is attached
2023-03-14 08:38 Updated by: alain_bkr
  • File S3_0_fix_int_overflow_daimilitary.patch (File ID: 11840) is deleted
2023-03-14 08:40 Updated by: alain_bkr
  • File S3_0_fix_int_overflow_aiunit.patch (File ID: 11842) is attached
2023-03-14 08:41 Updated by: alain_bkr
Comment

i attached 2 patches for daimilitary and aiunit for S3_0.

I did nothing for the weird overflow at in city.c with WW1 scenario, because values seems very strange, i'll try to investigate more.

2023-03-14 09:09 Updated by: alain_bkr
Comment

The patch fixes one problem but :

I relaunched WW1 at turn 40 , and i got :

 ../../../ai/default/daimilitary.c:1489:20: runtime error: 1,94115e+11 is outside the range of representable values of type 'int'

and i have similar message from a saved game (with big map and several nations very strong compared to other, so very dangerous.

 ../../../ai/default/daimilitary.c:1489:20: runtime error: 7,76737e+10 is outside the range of representable values of type 'int'
(Edited, 2023-03-14 09:11 Updated by: alain_bkr)
2023-03-14 09:22 Updated by: alain_bkr
Comment

another overflow :

  ../../../ai/default/daimilitary.c:1078:21: runtime error: signed integer overflow: 486 * 4897369 cannot be represented in type 'int' 

2023-03-23 22:21 Updated by: alain_bkr
  • File S3_0_fix_int_overflow_daimilitary.patch (File ID: 11839) is deleted
2023-03-23 22:21 Updated by: alain_bkr
  • File S3_0_fix_int_overflow_daimilitary.patch (File ID: 11841) is deleted
2023-03-23 22:21 Updated by: alain_bkr
  • File S3_0_fix_int_overflow_aiunit.patch (File ID: 11842) is deleted
2023-03-23 22:33 Updated by: alain_bkr
Comment

I deleted my first patch as they don't fix the overflow.

The "best" overflows i got through 38 random games are near 100 * max_int , is also the most frequent one (50 times)

  1 > ../../../ai/default/daimilitary.c:1489:20: runtime error: 2.14184e+11 is outside the range of representable values of type 'int'    ###  ~ 100x max_int !

other are a bit less frequent

1 > ../../../ai/default/daimilitary.c:1492:22: runtime error: signed integer overflow:  100 * 28206721 cannot be represented in type 'int'


1 > ../../../ai/default/daimilitary.c:1078:21: runtime error: signed integer overflow:  758 * 8549776 cannot be represented in type 'int'


1 > ../../../ai/default/aiunit.c:343:56: runtime error: signed integer overflow: 5090 * 640000 cannot be represented in type 'int'

I'll send a patch soon for 3.0 , with a real fix if > 2.109 then =2.109 ...

and a trace in output and log to allow further analyses if needed

(Edited, 2023-03-23 22:37 Updated by: alain_bkr)
2023-03-24 09:27 Updated by: alain_bkr
Comment

for this one

./ai/default/daimilitary.c:1492:22: runtime error: signed integer overflow:  100 * 28206721 cannot be represented in type 'int'
I believe the problem is somewhere else The comment in the code several lines above says it is a percentage , so getting more than 109 seems very unexpected.
    /* First determine the danger. It is measured in percents of our
       * defensive strength, capped at 200 + urgency */

several line below the is

   CITY_LOG(LOG_DEBUG, pcity, "m_a_c_d urgency=%d danger=%d num_def=%d "
             "our_def=%d", urgency, danger, num_defenders, our_def);

How can i activate this logging ? ( i commented out line .../server/srv_log.h:88 if (log_do_output_for_level(level)) but i guess there is a cleaner way to do this )

2023-03-24 09:33 Updated by: cazfi
Comment

Which branch you are testing on? Just asking as you've been running other tests on S3_0, and not on S3_1 that this ticket is targeted to (and the code changes on the area you've been quoting might make your results relevant for specific branch only - are you finding recent regressions in S3_1 or something that has been effectively fixed there)

2023-03-24 22:34 Updated by: alain_bkr
Comment

I started from the proposed patch for 3.1 , with unsigned int instead of int, and it is not enough, it does not fix the problem

I checked for 3.0, the current latest stable and it obviously need a fix too, probably with the same patch.

I tried to stay on the same ticket, to avoid duplication and confusion

2023-03-25 01:03 Updated by: cazfi
Comment

Reply To alain_bkr

I tried to stay on the same ticket

That's exactly what we want - we don't clone the tickets for each branch. I was asking just to know how to interpret your results. Thanks.

2023-03-25 21:16 Updated by: alain_bkr
Comment

This one in S3_1 , compiled with clang

$ git log -1
commit 960c56def698ba8b88a711722d819fcbd51a528f (HEAD -> S3_1, origin/S3_1)
Author: Marko Lindqvist <cazfi74@gmail.com>
Date:   Thu Mar 23 00:29:35 2023 +0200

> 
Game saved as Sz37-Ai028-Lm15-20230325-100653-T0040-Y-2050-auto.sav.xz
> ../../common/city.c:2872:21: runtime error: signed integer overflow: -999999985 * 100 cannot be represented in type 'int'
../../common/city.c:2879:25: runtime error: signed integer overflow: -999999989 * 100 cannot be represented in type 'int'
../../common/city.c:2886:37: runtime error: signed integer overflow: -999999985 * 100 cannot be represented in type 'int'

.../...
Game saved as Sz37-Ai028-Lm15-20230325-100653-T0196-Y01450-auto.sav.xz
> ../../../ai/default/daimilitary.c:1784:20: runtime error: signed integer overflow: 100 * 35414401 cannot be represented in type 'int'

Tests are running , i'll compile more results of all errors i have

(Edited, 2023-03-25 21:18 Updated by: alain_bkr)
2023-03-26 05:47 Updated by: alain_bkr
Comment

3_1 , with clang

i post my test scripts in ticket #47694 ,

I have stepped on these overflows , first column is the count of occurence, i have all saved games

      1 ../../../ai/default/aitools.c:378:18: runtime error: signed integer overflow

      6 > ../../../ai/default/aiunit.c:344:36: runtime error: signed integer overflow
      3 > ../../../ai/default/aiunit.c:344:45: runtime error: signed integer overflow

      9 > ../../../ai/default/daimilitary.c:1784:20: runtime error: signed integer overflow

      1 > ../../../common/aicore/path_finding.c:175:32: runtime error: signed integer overflow

      6 > ../../common/city.c:2872:21: runtime error: signed integer overflow          <--- these 3 follow the same pattern as for S3_0
      6 ../../common/city.c:2879:25: runtime error: signed integer overflow
      6 ../../common/city.c:2886:37: runtime error: signed integer overflow

      1 > ../../common/city.c:3156:33: runtime error: signed integer overflow

(Edited, 2023-03-26 06:57 Updated by: alain_bkr)
2023-03-26 18:38 Updated by: cazfi
  • Owner Update from (None) to cazfi
  • Resolution Update from None to Accepted
Comment

Attached patches for the original kill_desire() part (i.e. this ticket). Have to check what else in alain_bkr's logs, and create new tickets accordingly.

Change kill_desire() to return adv_want (which is a float) that the callers expect it to be. Should fix problems in both ends of the scale -> distinction between different sub-integer values, and overflows with very high values.

Targeting S3_1 (3.1.0-beta2) for now. We may backport this to S3_0 after 3.0.7 has been released.

2023-03-27 21:05 Updated by: alain_bkr
Comment

3.1 , the patch is not applied (tests are still running )

 cat overflow.counted
      1 ../../../ai/default/aitools.c:377:33: runtime error: signed integer overflow
      5 ../../../ai/default/aitools.c:378:18: runtime error: signed integer overflow
      1 ../../../ai/default/aitools.c:378:41: runtime error: signed integer overflow
     17 ../../../ai/default/aiunit.c:344:36: runtime error: signed integer overflow
     14 ../../../ai/default/aiunit.c:344:45: runtime error: signed integer overflow
      4 ../../../ai/default/aiunit.c:344:52: runtime error: signed integer overflow
     30 ../../../ai/default/daimilitary.c:1784:20: runtime error: signed integer overflow
      3 ../../../ai/default/daimilitary.c:1787:22: runtime error: signed integer overflow
      4 ../../../common/aicore/path_finding.c:175:25: runtime error: signed integer overflow
      6 ../../../common/aicore/path_finding.c:175:32: runtime error: signed integer overflow
      2 ../../../common/aicore/path_finding.c:175:40: runtime error: signed integer overflow
     17 ../../common/city.c:2872:21: runtime error: signed integer overflow
     17 ../../common/city.c:2879:25: runtime error: signed integer overflow
     17 ../../common/city.c:2886:37: runtime error: signed integer overflow
      4 ../../common/city.c:3156:33: runtime error: signed integer overflow

I attach the list current of all overflows seen in my tests, it looks like

Game saved as Sz02-Ai002-Lm26-20230326-191857-T0052-Y-3949-auto.sav.xz
> ../../../ai/default/daimilitary.c:1784:20: runtime error: signed integer overflow: 100 * 41434969 cannot be represented in type 'int'

I have not yet tested if i can reproduce the overflow by loading the game ...

2023-04-01 15:05 Updated by: cazfi
Comment

Reply To alain_bkr

3.1 , the patch is not applied (tests are still running )

I guess you had AI at commit 4363db7ef06d5d768bf2236c057f0c5322cbd701 - can't guess the overall commit, but the AI side has not received many commits recently.

2023-04-01 15:11 Updated by: cazfi
Comment

Reply To alain_bkr

I have not yet tested if i can reproduce the overflow by loading the game ...

I hope to start working on improving that reproducibility, at least in dedicated test systems (e.g. saving some extra data that's not part of regular savegames), soon: #44736

2023-04-02 15:37 Updated by: cazfi
  • Resolution Update from Accepted to None
Comment

Reply To alain_bkr

3.1 , the patch is not applied (tests are still running )

17 ../../../ai/default/aiunit.c:344:36: runtime error: signed integer overflow
14 ../../../ai/default/aiunit.c:344:45: runtime error: signed integer overflow
4 ../../../ai/default/aiunit.c:344:52: runtime error: signed integer overflow

This is the kill_desire(), and the line numbers show the weakness of my patch (it won't help) - the calculations is still being made in integer math even when we will assign the result to a float.

(Edited, 2023-04-02 15:38 Updated by: cazfi)
2023-04-02 15:51 Updated by: cazfi
Comment

Reply To alain_bkr

1 ../../../ai/default/aitools.c:377:33: runtime error: signed integer overflow
5 ../../../ai/default/aitools.c:378:18: runtime error: signed integer overflow
1 ../../../ai/default/aitools.c:378:41: runtime error: signed integer overflow

-> #47746

17 ../../../ai/default/aiunit.c:344:36: runtime error: signed integer overflow
14 ../../../ai/default/aiunit.c:344:45: runtime error: signed integer overflow
4 ../../../ai/default/aiunit.c:344:52: runtime error: signed integer overflow

This ticket

30 ../../../ai/default/daimilitary.c:1784:20: runtime error: signed integer overflow
3 ../../../ai/default/daimilitary.c:1787:22: runtime error: signed integer overflow

-> #47747

4 ../../../common/aicore/path_finding.c:175:25: runtime error: signed integer overflow
6 ../../../common/aicore/path_finding.c:175:32: runtime error: signed integer overflow
2 ../../../common/aicore/path_finding.c:175:40: runtime error: signed integer overflow

No ticket about exactly these yet, but #47731 touches the area, and is assumed to end up as a baseline for further fixes.

17 ../../common/city.c:2872:21: runtime error: signed integer overflow
17 ../../common/city.c:2879:25: runtime error: signed integer overflow
17 ../../common/city.c:2886:37: runtime error: signed integer overflow
4 ../../common/city.c:3156:33: runtime error: signed integer overflow

-> #47748

2023-04-03 12:03 Updated by: cazfi
  • Resolution Update from None to Accepted
Comment

With the now attached patches the calculation should happen with floats from the beginning. Also fixes some accuracy losing float -> int -> float conversions.

2023-04-05 04:31 Updated by: cazfi
Comment

Backport to S3_0 attached. I guess that the form this took in the end is sufficient also to that branch, and given effects outside kill_desire() itself even should be included as a bug fix.

2023-04-05 12:35 Updated by: cazfi
Comment

Pushed to later branches, S3_0 waiting for 3.0.7 release to get out of the way.

2023-04-08 16:44 Updated by: cazfi
  • Status Update from Open to Closed
  • Resolution Update from Accepted to Fixed
2023-05-22 18:04 Updated by: None
Comment

Another example of why we should use long ints for move rate stuff.

Edit

Please login to add comment to this ticket » Login