Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[POGO] (rare?) crash apparently during SENSE_ACTOR_ACTIONS #45

Open
musliner opened this issue Dec 6, 2021 · 14 comments
Open

[POGO] (rare?) crash apparently during SENSE_ACTOR_ACTIONS #45

musliner opened this issue Dec 6, 2021 · 14 comments

Comments

@musliner
Copy link

musliner commented Dec 6, 2021

musliner@openmind0:~/openmind/code/test/results/pal-tournament-v2-nonov-2021-12-05-222307-openmind0$

Just after we PLACE_TREE_TAP successfully (included below), we sent SENSE_ACTOR_ACTIONS and, possibly as a result or before getting/handling that, PAL crashed with all this:

2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] [04:40:47] [BOT API THREAD/INFO]: [AGENT]PLACE_TREE_TAP^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] [04:40:47] [Client thread/INFO]: [edu.utd.minecraft.mod.polycraft.aitools.BotAPI:onClientTick:643]: PLACE_TREE_TAP^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] [04:40:47] [Client thread/INFO]: Running Command on Client side: edu.utd.minecraft.mod.polycraft.aitools.commands.APICommandPlaceBlock^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] [04:40:47] [BOT API THREAD/INFO]: [CLIENT]{"goal":{"goalType":"ITEM","goalAchieved":false,"Distribution":"Uninformed"},"command_result":{"command":"PLACE_TREE_TAP","argument":"","result":"SUCCESS","message":"Block polycraft:tree_tap placed","stepCost":300.0},"step":59,"gameOver":false}^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] [04:40:47] [BOT API THREAD/INFO]: [SCORE]step:60,totalCost:28182.297,totalReward:64000.0,adjustedReward:35817.703^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] [04:40:47] [Server thread/ERROR]: Exception caught during firing event net.minecraftforge.fml.common.gameevent.TickEvent$ServerTickEvent@71f7b82f:^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] java.lang.NullPointerException^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at edu.utd.minecraft.mod.polycraft.aitools.observations.ObservationPlayerInventory.getObservation(ObservationPlayerInventory.java:56) ~[ObservationPlayerInventory.class:?]^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at edu.utd.minecraft.mod.polycraft.experiment.tutorial.ExperimentTutorial.getObservations(ExperimentTutorial.java:461) ~[ExperimentTutorial.class:?]^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at edu.utd.minecraft.mod.polycraft.experiment.tutorial.TutorialFeatureEntity.updateLastSenseAll(TutorialFeatureEntity.java:113) ~[TutorialFeatureEntity.class:?]^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at edu.utd.minecraft.mod.polycraft.experiment.tutorial.TutorialFeatureEntity.moveTowardsTarget(TutorialFeatureEntity.java:323)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at edu.utd.minecraft.mod.polycraft.experiment.tutorial.TutorialFeatureEntityPogoist.onServerTickUpdate(TutorialFeatureEntityPogoist.java:323)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at edu.utd.minecraft.mod.polycraft.experiment.tutorial.ExperimentTutorial.onServerTickUpdate(ExperimentTutorial.java:311)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at edu.utd.minecraft.mod.polycraft.experiment.tutorial.TutorialManager.onServerTickUpdate(TutorialManager.java:104)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at edu.utd.minecraft.mod.polycraft.proxy.CommonProxy.onServerTick(CommonProxy.java:449)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at net.minecraftforge.fml.common.eventhandler.ASMEventHandler_23_CommonProxy_onServerTick_ServerTickEvent.invoke(.dynamic)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at net.minecraftforge.fml.common.eventhandler.ASMEventHandler.invoke(ASMEventHandler.java:49)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at net.minecraftforge.fml.common.eventhandler.EventBus.post(EventBus.java:140)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at net.minecraftforge.fml.common.FMLCommonHandler.onPostServerTick(FMLCommonHandler.java:253)^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at net.minecraft.server.MinecraftServer.tick(MinecraftServer.java:649)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at net.minecraft.server.integrated.IntegratedServer.tick(IntegratedServer.java:152)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at net.minecraft.server.MinecraftServer.run(MinecraftServer.java:481)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] at java.lang.Thread.run(Thread.java:748)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] ^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] ^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] A detailed walkthrough of the error, its code path and all known details is as follows:^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] ---------------------------------------------------------------------------------------^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] ^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] -- System Details --^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] Details:^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] Minecraft Version: 1.8.9^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] Operating System: Linux (amd64) version 4.15.0-163-generic^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] Java Version: 1.8.0_162, Oracle Corporation^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] Java VM Version: OpenJDK 64-Bit Server VM (mixed mode), Oracle Corporation^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] Memory: 1478655984 bytes (1410 MB) / 2271215616 bytes (2166 MB) up to 28631367680 bytes (27305 MB)^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] JVM Flags: 0 total; ^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] IntCache: cache: 0, tcache: 0, allocated: 0, tallocated: 0^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] FML: MCP 9.19 Powered by Forge 11.15.1.2318 4 mods loaded, 4 mods active^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] States: 'U' = Unloaded 'L' = Loaded 'C' = Constructed 'H' = Pre-initialized 'I' = Initialized 'J' = Post-initialized 'A' = Available 'D' = Disabled 'E' = Errored^M
^M
2021-12-06_04:40:48: PAL: [2021-12-05 22:40:47] UCHIJAAAA mcp{9.19} [Minecraft Coder Pack] (minecraft.jar) ^M
^M
Traceback (most recent call last):^M
File "/openmind/code/test/LaunchTournament.py", line 847, in ^M
pal.execute()^M
File "/openmind/code/test/LaunchTournament.py", line 423, in execute^M
if self._check_ended(str(next_line)):^M
File "/openmind/code/test/LaunchTournament.py", line 206, in _check_ended^M
data_dict = json.loads(json_text)^M
File "/usr/lib/python3.6/json/init.py", line 354, in loads^M
return _default_decoder.decode(s)^M
File "/usr/lib/python3.6/json/decoder.py", line 339, in decode^M
obj, end = self.raw_decode(s, idx=_w(s, 0).end())^M
File "/usr/lib/python3.6/json/decoder.py", line 355, in raw_decode^M
obj, end = self.scan_once(s, idx)^M
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)^M

@musliner
Copy link
Author

musliner commented Dec 6, 2021

Note that this was with me testing 1s delay between handling the PLACE success and requesting SENSE_ACTOR_ACTIONS. Normally we do 2s delay, so that might be why we don't see this normally.

@musliner
Copy link
Author

musliner commented Dec 6, 2021

Timestamps from executive log suggest the crash (at 22:40:47) happened before we sent the SENSE_ACTOR_ACTIONS:

[2021-12-05 22:40:47] RESULT: sent domain command PLACE_TREE_TAP^M
[2021-12-05 22:40:47] openmind: getting the message: (command-response success "Block polycraft:tree_tap placed"
300.000000 "False" "False") from bridge.^M
[2021-12-05 22:40:47] Telling planner ((:type :command-response)(:plan-name t0-g20-p1)(:step-index 22)(:command "PLACE_TREE_TAP")(:result success)(:message "Block polycraft:tree_tap placed")(:step-cost 300.000000)(:gameover "False")(:goalachieved "False"))^M
[2021-12-05 22:40:47] RESULT: domain command result success Block polycraft:tree_tap placed 300.000000 False False^M
[2021-12-05 22:40:47] incrementing stepcost for plan t0-g20-p1 index 22 with cur-cost 558.188480 by cost 300.000000^M
[2021-12-05 22:40:48] RESULT: sent domain command SENSE_ACTOR_ACTIONS^M

as expected, the bridge log agrees, the command from exec was sent at :48

[2021-12-05 22:40:47] prsinterface[944463] send_message: Sending message to openmind: (command-response SUCCESS "Block polycraft:tree_tap placed" 300.0 "False" "False")^M
[2021-12-05 22:40:47] prsinterface[944463] main: Waiting for msg from OPRS^M
[2021-12-05 22:40:48] Received message from b'openmind': b'SENSE_ACTOR_ACTIONS'; sending to Polycraft^M
[2021-12-05 22:40:48] Stripped parens, uppercased first two words, and sending to Polycraft: SENSE_ACTOR_ACTIONS^M
[2021-12-05 22:40:48] prsinterface[945166] main: Trying to get a msg from pcw^M

the outer pal-tournament log (in original ticket body) with wrong-timezone timestamps had been 1s late for a while... (eg here :10 instead of inner :09)
2021-12-06_04:25:10: PAL: [2021-12-05 22:25:09] [04:25:09] [BOT API THREAD/INFO]: [edu.utd.minecraft.mod.polycraft.aitools.BotAPI$1:run:728]: SENSE_ALL NONAV^M

@StephenGss
Copy link
Owner

Do you have more information on this? Like what items the player was holding?
It looks like an item in the player's inventory is causing this error, but I can't reproduce with any of the items.

The game is crashing when the actors generate the pre-world state player inventory. I don't think it ever got your sense_actor_actions command.

@mpatsift
Copy link

mpatsift commented Dec 8, 2021

The last inventory list printed in the pal-tournament.log (before placing the tree tap):

"inventory": {"0":{"item":"minecraft:iron_pickaxe","count":1,"damage":3,"maxdamage":250}, "1":{"item":"polycraft:block_of_platinum","count":1,"damage":0,"maxdamage":0}, "2":{"item":"minecraft:diamond","count":9,"damage":0,"maxdamage":0}, "3":{"item":"minecraft:planks","count":1,"damage":0,"maxdamage":0,"variant":"oak"}, "4":{"item":"minecraft:stick","count":3,"damage":0,"maxdamage":0}, "5":{"item":"polycraft:tree_tap","count":1,"damage":0,"maxdamage":0,"enabled":"true","facing":"down"}, "selectedItem":{"slot":0,"item":"minecraft:iron_pickaxe","count":1,"damage":3,"maxdamage":250}}

Here's the full pal-tournament log
pal-tournament-musliner-2021-12-05-222307-openmind0.log.gz
:

@musliner
Copy link
Author

musliner commented Jan 7, 2022

just saw this PAL crash again in musliner@openmind0:~/openmind/code/test/results/pal-tournament-v2-shared-1-2-1-M-2022-01-07-053806-openmind0$ in case we need more logs

@mpatsift
Copy link

mpatsift commented Feb 2, 2022

Our regression tests hit this (I think) again last night.

I'll post here, if you want it.

prt-suite-2022-02-01-215523-openmind0/pal-tournament-v2-nonov-3-games-2022-02-01-223243-openmind0/

@musliner
Copy link
Author

musliner commented Mar 9, 2022

I'm trying to run 100 no-nov games and this keeps happening...looks like it is not rare. Any ideas?

@StephenGss
Copy link
Owner

Are you seeing the same error from before?
I did an update on the svn, but my code/test/results folder is empty. Are the logs somewhere else?

@musliner
Copy link
Author

musliner commented Mar 9, 2022

we have way too many results to put in svn, so logs only avail on request. Yes the error appears same as in this issue...

pal-tournament.log sez:

2022-03-09_20:19:28: PAL: [2022-03-09 14:19:27] [20:19:27] [BOT API THREAD/INFO]: [AGENT]PLACE_TREE_TAP^M
^M
2022-03-09_20:19:28: PAL: [2022-03-09 14:19:27] [20:19:27] [Client thread/INFO]: [edu.utd.minecraft.mod.polycraft.aitools.BotAPI:onClientTick:643]: PLACE_TREE_TAP^M
^M
2022-03-09_20:19:28: PAL: [2022-03-09 14:19:27] [20:19:27] [Client thread/INFO]: Running Command on Client side: edu.utd.minecraft.mod.polycraft.aitools.commands.APICommandPlaceBlock^M
^M
2022-03-09_20:19:28: PAL: [2022-03-09 14:19:27] [20:19:27] [BOT API THREAD/INFO]: [CLIENT]{"goal":{"goalType":"ITEM","goalAchieved":false,"Distribution":"Uninformed"},"command_result":{"command":"PLACE_TREE_TAP","argument":"","result":"SUCCESS","message":"Block polycraft:tree_tap placed","stepCost":300.0},"step":70,"gameOver":false}^M
^M
2022-03-09_20:19:28: PAL: [2022-03-09 14:19:27] [20:19:27] [BOT API THREAD/INFO]: [SCORE]step:71,totalCost:29452.48,totalReward:64000.0,adjustedReward:34547.52^M
^M
2022-03-09_20:19:28: PAL: [2022-03-09 14:19:27] [20:19:27] [Client thread/INFO]: [edu.utd.minecraft.mod.polycraft.experiment.tutorial.ExperimentTutorial:onClientTickUpdate:431]: [Client to Server] Sending Feature update^M
^M
2022-03-09_20:19:28: PAL: [2022-03-09 14:19:27] [20:19:27] [Server thread/ERROR]: Exception caught during firing event net.minecraftforge.fml.common.gameevent.TickEvent$ServerTickEvent@2f51952d:^M
^M
2022-03-09_20:19:28: PAL: [2022-03-09 14:19:27] java.lang.NullPointerException^M
^M
2022-03-09_20:19:28: PAL: [2022-03-09 14:19:27] at edu.utd.minecraft.mod.polycraft.aitools.observations.ObservationPlayerInventory.getObservation(ObservationPlayerInventory.java:77) ~[ObservationPlayerInventory.class:?]^M
^M
2022-03-09_20:19:28: PAL: [2022-03-09 14:19:27] at edu.utd.minecraft.mod.polycraft.experiment.tutorial.ExperimentTutorial.getObservations(ExperimentTutorial.java:461) ~[ExperimentTutorial.class:?]^M
...
(crash)

@musliner
Copy link
Author

musliner commented Mar 9, 2022

if you want the full thing just let me know, i'll get it attached here, thx

@StephenGss
Copy link
Owner

I'm still unable to replicate it, but I put that code in a try-catch and pushed an update. It shouldn't affect anything if that piece of code fails in this case.
Please update to the latest on release 2.0 and run your test again.

@mpatsift
Copy link

Looks like same issue yesterday (I'm up-to-date wrt release 2.0):
pal-tournament.log

@musliner
Copy link
Author

I think I mentioned in email, but not here, that the try/catch from March10
is not preventing crashing of the tournament....

@mpatsift
Copy link

This still happens once every few-to-several days in our regression tests. No pattern afaict.

Should we assume that this bug is too rare to be worth fixing and the eval code UTD uses will simply re-run tournaments that crash in this way?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants