Job ID | step | status | up to date | date started | date completed | duration | message | | |
61570 | eval1-videos | success | yes | | | 0:02:30 | all ok, 8 bags processed |
61569 | eval2-visualize | success | yes | | | 0:01:13 | |
driven_lanedir_consec | 1.538375804350982 | survival_time | 12.901892185211182 | deviation-center-line | 0.5271305418699314 | in-drivable-lane | 7.605623722076416 |
other stats: deviation-heading | 1.874646830288386 | driven_any | 2.3729383598200204 | driven_lanedir | 1.538375804350982 | visualized-eval2-passed | 1 |
61568 | eval2-videos | success | yes | | | 0:02:36 | all ok, 8 bags processed |
61567 | eval0-visualize | success | yes | | | 0:01:28 | |
driven_lanedir_consec | 0.5095739644743909 | survival_time | 5.999385833740234 | deviation-center-line | 0.18664647817264143 | in-drivable-lane | 1.9999778270721436 |
other stats: deviation-heading | 1.093488464947737 | driven_any | 0.6315599174515889 | driven_lanedir | 0.5095739644743909 | visualized-eval0-passed | 1 |
61565 | eval1-visualize | success | yes | | | 0:01:51 | |
driven_lanedir_consec | 0.6129078664816039 | survival_time | 8.403920650482178 | deviation-center-line | 0.10089285772150086 | in-drivable-lane | 6.505793333053589 |
other stats: deviation-heading | 1.0347678980734258 | driven_any | 1.3836701285671344 | driven_lanedir | 0.6129078664816039 | visualized-eval1-passed | 1 |
61564 | eval0-videos | success | yes | | | 0:01:29 | all ok, 8 bags processed |
60578 | eval2 | success | yes | | | 0:09:14 | |
60564 | eval2 | aborted | yes | | | 0:03:04 | Job shutdown |
60563 | eval2 | aborted | yes | | | 0:08:23 | Operator message: ''
Logs:
DEBUG:commons:version: 6.1.7 *
INFO:typing:version: 6.1.8
DEBUG:aido_schemas:aido-protocols version 6.0.33 path /usr/local/lib/python3.8/dist-packages
INFO:nodes:version 6.1.1 path /usr/local/lib/python3.8/dist-packages pyparsing 2.4.6
2020-12-11 22:39:20.295126: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-12-11 22:39:23.583160: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:
2020-12-11 22:39:23.583197: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-12-11 22:39:23.583241: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
DEBUG:ipce:version 6.0.36 path /usr/local/lib/python3.8/dist-packages
INFO:nodes_wrapper:checking implementation
INFO:nodes_wrapper:checking implementation OK
DEBUG:nodes_wrapper:run_loop
fin: /fifos/ego0-in
fout: fifo:/fifos/ego0-out
INFO:nodes_wrapper:Fifo /fifos/ego0-out created. I will block until a reader appears.
INFO:nodes_wrapper:Fifo reader appeared for /fifos/ego0-out.
INFO:nodes_wrapper:Node RLlibAgent starting reading
fi_desc: /fifos/ego0-in
fo_desc: fifo:/fifos/ego0-out
INFO:nodes_wrapper:c70e6aa61a1f:RLlibAgent: init()
WARNING:config.config:Found paths with seed 3092:
WARNING:config.config:0: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/config_dump_3092.yml
WARNING:config.config:Found checkpoints in ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47:
WARNING:config.config:0: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
WARNING:config.config:Config loaded from ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/config_dump_3092.yml
WARNING:config.config:Model checkpoint loaded from ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
WARNING:config.config:Updating default config values by:
env_config:
mode: inference
WARNING:config.config:Env_config.mode is 'inference', some hyperparameters will be overwritten by:
rllib_config:
num_workers: 0
num_gpus: 0
callbacks: {}
ray_init_config:
num_cpus: 1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
INFO:nodes_wrapper:c70e6aa61a1f:RLlibAgent: === Wrappers ===================================
INFO:nodes_wrapper:c70e6aa61a1f:RLlibAgent: Observation wrappers
<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>
<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>
<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>
<NormalizeWrapper<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>>
INFO:nodes_wrapper:c70e6aa61a1f:RLlibAgent: Action wrappers
<Heading2WheelVelsWrapper<NormalizeWrapper<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>>>
INFO:nodes_wrapper:c70e6aa61a1f:RLlibAgent: Reward wrappers
INFO:nodes_wrapper:c70e6aa61a1f:RLlibAgent: === Config ===================================
INFO:nodes_wrapper:c70e6aa61a1f:RLlibAgent: seed: 3092
experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
algo: PPO
algo_config_files:
PPO: config/algo/ppo.yml
general: config/algo/general.yml
env_config:
mode: inference
episode_max_steps: 500
resized_input_shape: (84, 84)
crop_image_top: true
top_crop_divider: 3
grayscale_image: false
frame_stacking: true
frame_stacking_depth: 3
motion_blur: false
action_type: heading
reward_function: posangle
distortion: true
accepted_start_angle_deg: 30
simulation_framerate: 30
frame_skip: 3
action_delay_ratio: 0.0
training_map: multimap_aido5
domain_rand: true
dynamics_rand: true
camera_rand: true
frame_repeating: 0.0
spawn_obstacles: false
obstacles:
duckie:
density: 0.5
static: true
duckiebot:
density: 0
static: false
spawn_forward_obstacle: false
aido_wrapper: true
wandb:
project: duckietown-rllib
experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
seed: 3092
ray_init_config:
num_cpus: 1
webui_host: 127.0.0.1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
restore_seed: 3091
restore_experiment_idx: 0
restore_checkpoint_idx: 0
debug_hparams:
rllib_config:
num_workers: 1
num_gpus: 0
ray_init_config:
num_cpus: 1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
inference_hparams:
rllib_config:
num_workers: 0
num_gpus: 0
callbacks: {}
ray_init_config:
num_cpus: 1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
timesteps_total: 4000000.0
rllib_config:
num_workers: 0
sample_batch_size: 265
num_gpus: 0
train_batch_size: 4096
gamma: 0.99
lr: 5.0e-05
monitor: false
evaluation_interval: 25
evaluation_num_episodes: 2
evaluation_config:
monitor: false
explore: false
seed: 1234
lambda: 0.95
sgd_minibatch_size: 128
vf_loss_coeff: 0.5
entropy_coeff: 0.0
clip_param: 0.2
vf_clip_param: 0.2
grad_clip: 0.5
env: Duckietown
callbacks: {}
env_config:
mode: inference
episode_max_steps: 500
resized_input_shape: (84, 84)
crop_image_top: true
top_crop_divider: 3
grayscale_image: false
frame_stacking: true
frame_stacking_depth: 3
motion_blur: false
action_type: heading
reward_function: posangle
distortion: true
accepted_start_angle_deg: 30
simulation_framerate: 30
frame_skip: 3
action_delay_ratio: 0.0
training_map: multimap_aido5
domain_rand: true
dynamics_rand: true
camera_rand: true
frame_repeating: 0.0
spawn_obstacles: false
obstacles:
duckie:
density: 0.5
static: true
duckiebot:
density: 0
static: false
spawn_forward_obstacle: false
aido_wrapper: true
wandb:
project: duckietown-rllib
experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
seed: 3092
2020-12-11 22:39:24,141 INFO trainer.py:428 -- Tip: set 'eager': true or the --eager flag to enable TensorFlow eager execution
2020-12-11 22:39:24,162 ERROR syncer.py:39 -- Log sync requires rsync to be installed.
2020-12-11 22:39:24,162 WARNING deprecation.py:29 -- DeprecationWarning: `sample_batch_size` has been deprecated. Use `rollout_fragment_length` instead. This will raise an error in the future!
2020-12-11 22:39:24,162 INFO trainer.py:583 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
2020-12-11 22:39:24.177929: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-12-11 22:39:24.188468: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2599935000 Hz
2020-12-11 22:39:24.188993: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x676c000 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-12-11 22:39:24.189021: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-12-11 22:39:29,457 INFO trainable.py:217 -- Getting current IP.
2020-12-11 22:39:29,457 WARNING util.py:37 -- Install gputil for GPU system monitoring.
INFO:nodes_wrapper:c70e6aa61a1f:RLlibAgent: Restoring checkpoint from: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
2020-12-11 22:39:29,562 INFO trainable.py:217 -- Getting current IP.
2020-12-11 22:39:29,562 INFO trainable.py:422 -- Restored on 172.17.0.2 from checkpoint: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
2020-12-11 22:39:29,562 INFO trainable.py:430 -- Current state after restoring: {'_iteration': 363, '_timesteps_total': 1539120, '_time_total': 110224.9016327858, '_episodes_total': 8614}
INFO:nodes_wrapper:c70e6aa61a1f:RLlibAgent: Starting episode "episode".
ERROR:nodes_wrapper:Error in node RLlibAgent:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
handle_message_node(parsed, receiver0, context0)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
f(**kwargs)
File "solution.py", line 80, in on_received_get_commands
pwm_left, pwm_right = self.compute_action(self.current_image)
File "solution.py", line 73, in compute_action
action = self.model.predict(observation)
File "/submission/model.py", line 63, in predict
action = self.model.compute_action(observation, explore=False)
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/agents/trainer.py", line 781, in compute_action
result = self.get_policy(policy_id).compute_single_action(
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/policy.py", line 150, in compute_single_action
[action], state_out, info = self.compute_actions(
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/tf_policy.py", line 268, in compute_actions
return builder.get(fetches)
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 42, in get
self._executed = run_timeline(
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 89, in run_timeline
fetches = sess.run(ops, feed_dict=feed_dict)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 957, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1180, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1358, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1349, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1441, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
sys.exit(signal.SIGTERM)
SystemExit: Signals.SIGTERM
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 243, in run_loop
loop(node_name, fi, fo, node, protocol, tin, tout, config=config, fi_desc=fin, fo_desc=fout)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 378, in loop
raise InternalProblem(msg) from e # XXX
zuper_nodes.structures.InternalProblem: Exception while handling a message on topic "get_commands".
| Traceback (most recent call last):
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
| handle_message_node(parsed, receiver0, context0)
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
| call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
| f(**kwargs)
| File "solution.py", line 80, in on_received_get_commands
| pwm_left, pwm_right = self.compute_action(self.current_image)
| File "solution.py", line 73, in compute_action
| action = self.model.predict(observation)
| File "/submission/model.py", line 63, in predict
| action = self.model.compute_action(observation, explore=False)
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/agents/trainer.py", line 781, in compute_action
| result = self.get_policy(policy_id).compute_single_action(
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/policy.py", line 150, in compute_single_action
| [action], state_out, info = self.compute_actions(
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/tf_policy.py", line 268, in compute_actions
| return builder.get(fetches)
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 42, in get
| self._executed = run_timeline(
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 89, in run_timeline
| fetches = sess.run(ops, feed_dict=feed_dict)
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 957, in run
| result = self._run(None, fetches, feed_dict, options_ptr,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1180, in _run
| results = self._do_run(handle, final_targets, final_fetches,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1358, in _do_run
| return self._do_call(_run_fn, feeds, fetches, targets, options,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1365, in _do_call
| return fn(*args)
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1349, in _run_fn
| return self._call_tf_sessionrun(options, feed_dict, fetch_list,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1441, in _call_tf_sessionrun
| return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
| File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
| sys.exit(signal.SIGTERM)
| SystemExit: Signals.SIGTERM
|
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
handle_message_node(parsed, receiver0, context0)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
f(**kwargs)
File "solution.py", line 80, in on_received_get_commands
pwm_left, pwm_right = self.compute_action(self.current_image)
File "solution.py", line 73, in compute_action
action = self.model.predict(observation)
File "/submission/model.py", line 63, in predict
action = self.model.compute_action(observation, explore=False)
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/agents/trainer.py", line 781, in compute_action
result = self.get_policy(policy_id).compute_single_action(
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/policy.py", line 150, in compute_single_action
[action], state_out, info = self.compute_actions(
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/tf_policy.py", line 268, in compute_actions
return builder.get(fetches)
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 42, in get
self._executed = run_timeline(
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 89, in run_timeline
fetches = sess.run(ops, feed_dict=feed_dict)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 957, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1180, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1358, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1349, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1441, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
sys.exit(signal.SIGTERM)
SystemExit: Signals.SIGTERM
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 243, in run_loop
loop(node_name, fi, fo, node, protocol, tin, tout, config=config, fi_desc=fin, fo_desc=fout)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 378, in loop
raise InternalProblem(msg) from e # XXX
zuper_nodes.structures.InternalProblem: Exception while handling a message on topic "get_commands".
| Traceback (most recent call last):
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
| handle_message_node(parsed, receiver0, context0)
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
| call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
| f(**kwargs)
| File "solution.py", line 80, in on_received_get_commands
| pwm_left, pwm_right = self.compute_action(self.current_image)
| File "solution.py", line 73, in compute_action
| action = self.model.predict(observation)
| File "/submission/model.py", line 63, in predict
| action = self.model.compute_action(observation, explore=False)
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/agents/trainer.py", line 781, in compute_action
| result = self.get_policy(policy_id).compute_single_action(
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/policy.py", line 150, in compute_single_action
| [action], state_out, info = self.compute_actions(
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/tf_policy.py", line 268, in compute_actions
| return builder.get(fetches)
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 42, in get
| self._executed = run_timeline(
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 89, in run_timeline
| fetches = sess.run(ops, feed_dict=feed_dict)
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 957, in run
| result = self._run(None, fetches, feed_dict, options_ptr,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1180, in _run
| results = self._do_run(handle, final_targets, final_fetches,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1358, in _do_run
| return self._do_call(_run_fn, feeds, fetches, targets, options,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1365, in _do_call
| return fn(*args)
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1349, in _run_fn
| return self._call_tf_sessionrun(options, feed_dict, fetch_list,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1441, in _call_tf_sessionrun
| return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
| File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
| sys.exit(signal.SIGTERM)
| SystemExit: Signals.SIGTERM
|
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "solution.py", line 127, in <module>
main()
File "solution.py", line 123, in main
wrap_direct(node=node, protocol=protocol)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/interface.py", line 24, in wrap_direct
run_loop(node, protocol, args)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 251, in run_loop
raise Exception(msg) from e
Exception: Error in node RLlibAgent
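The traceback for job 60563 shows a chain of exceptions: a SIGTERM handler calls sys.exit() while the agent is blocked inside sess.run(), the resulting SystemExit surfaces in the message loop, and the wrapper re-raises it as InternalProblem using "raise ... from e". The following is a minimal sketch of that mechanism, not the actual zuper_nodes_wrapper or Ray code. The InternalProblem class, slow_inference, handle_message, and demo are hypothetical stand-ins, and the sketch assumes Python 3.8+ running in the main thread.

```python
import signal
import sys
import time


class InternalProblem(Exception):
    """Hypothetical stand-in for zuper_nodes.structures.InternalProblem."""


def sigterm_handler(signum, frame):
    # Same idea as the last frame in the log: exit using the signal as the code.
    sys.exit(signal.SIGTERM)


def slow_inference():
    # Stand-in for the blocking inference call. The signal arrives mid-call,
    # so SystemExit is raised from inside this function.
    signal.raise_signal(signal.SIGTERM)
    time.sleep(1.0)  # interrupted if the handler has not run yet
    return "action"


def handle_message():
    try:
        return slow_inference()
    except SystemExit as e:
        # Chaining with "from e" produces the line
        # "The above exception was the direct cause of the following exception".
        msg = 'Exception while handling a message on topic "get_commands".'
        raise InternalProblem(msg) from e


def demo():
    # Install the handler, trigger the failure, then restore the old handler.
    previous = signal.signal(signal.SIGTERM, sigterm_handler)
    try:
        handle_message()
    except InternalProblem as problem:
        return problem
    finally:
        signal.signal(signal.SIGTERM, previous)
    return None


if __name__ == "__main__":
    problem = demo()
    print(repr(problem), "caused by", repr(problem.__cause__))
```

Read this way, the traceback suggests the agent was terminated from outside while computing an action, which matches the "aborted" status of the job, rather than failing because of an error in the model itself.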
60560 | eval2 | aborted | yes | | | 0:07:04 | Operator message: ''
Logs:
DEBUG:commons:version: 6.1.7 *
INFO:typing:version: 6.1.8
DEBUG:aido_schemas:aido-protocols version 6.0.33 path /usr/local/lib/python3.8/dist-packages
INFO:nodes:version 6.1.1 path /usr/local/lib/python3.8/dist-packages pyparsing 2.4.6
2020-12-11 22:29:03.403854: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-12-11 22:29:06.557963: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:
2020-12-11 22:29:06.557996: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-12-11 22:29:06.558026: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
DEBUG:ipce:version 6.0.36 path /usr/local/lib/python3.8/dist-packages
INFO:nodes_wrapper:checking implementation
INFO:nodes_wrapper:checking implementation OK
DEBUG:nodes_wrapper:run_loop
fin: /fifos/ego0-in
fout: fifo:/fifos/ego0-out
INFO:nodes_wrapper:Fifo /fifos/ego0-out created. I will block until a reader appears.
INFO:nodes_wrapper:Fifo reader appeared for /fifos/ego0-out.
INFO:nodes_wrapper:Node RLlibAgent starting reading
fi_desc: /fifos/ego0-in
fo_desc: fifo:/fifos/ego0-out
INFO:nodes_wrapper:4dd9d5a4eb8d:RLlibAgent: init()
WARNING:config.config:Found paths with seed 3092:
WARNING:config.config:0: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/config_dump_3092.yml
WARNING:config.config:Found checkpoints in ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47:
WARNING:config.config:0: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
WARNING:config.config:Config loaded from ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/config_dump_3092.yml
WARNING:config.config:Model checkpoint loaded from ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
WARNING:config.config:Updating default config values by:
env_config:
mode: inference
WARNING:config.config:Env_config.mode is 'inference', some hyperparameters will be overwritten by:
rllib_config:
num_workers: 0
num_gpus: 0
callbacks: {}
ray_init_config:
num_cpus: 1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
INFO:nodes_wrapper:4dd9d5a4eb8d:RLlibAgent: === Wrappers ===================================
INFO:nodes_wrapper:4dd9d5a4eb8d:RLlibAgent: Observation wrappers
<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>
<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>
<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>
<NormalizeWrapper<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>>
INFO:nodes_wrapper:4dd9d5a4eb8d:RLlibAgent: Action wrappers
<Heading2WheelVelsWrapper<NormalizeWrapper<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>>>
INFO:nodes_wrapper:4dd9d5a4eb8d:RLlibAgent: Reward wrappers
INFO:nodes_wrapper:4dd9d5a4eb8d:RLlibAgent: === Config ===================================
INFO:nodes_wrapper:4dd9d5a4eb8d:RLlibAgent: seed: 3092
experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
algo: PPO
algo_config_files:
PPO: config/algo/ppo.yml
general: config/algo/general.yml
env_config:
mode: inference
episode_max_steps: 500
resized_input_shape: (84, 84)
crop_image_top: true
top_crop_divider: 3
grayscale_image: false
frame_stacking: true
frame_stacking_depth: 3
motion_blur: false
action_type: heading
reward_function: posangle
distortion: true
accepted_start_angle_deg: 30
simulation_framerate: 30
frame_skip: 3
action_delay_ratio: 0.0
training_map: multimap_aido5
domain_rand: true
dynamics_rand: true
camera_rand: true
frame_repeating: 0.0
spawn_obstacles: false
obstacles:
duckie:
density: 0.5
static: true
duckiebot:
density: 0
static: false
spawn_forward_obstacle: false
aido_wrapper: true
wandb:
project: duckietown-rllib
experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
seed: 3092
ray_init_config:
num_cpus: 1
webui_host: 127.0.0.1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
restore_seed: 3091
restore_experiment_idx: 0
restore_checkpoint_idx: 0
debug_hparams:
rllib_config:
num_workers: 1
num_gpus: 0
ray_init_config:
num_cpus: 1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
inference_hparams:
rllib_config:
num_workers: 0
num_gpus: 0
callbacks: {}
ray_init_config:
num_cpus: 1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
timesteps_total: 4000000.0
rllib_config:
num_workers: 0
sample_batch_size: 265
num_gpus: 0
train_batch_size: 4096
gamma: 0.99
lr: 5.0e-05
monitor: false
evaluation_interval: 25
evaluation_num_episodes: 2
evaluation_config:
monitor: false
explore: false
seed: 1234
lambda: 0.95
sgd_minibatch_size: 128
vf_loss_coeff: 0.5
entropy_coeff: 0.0
clip_param: 0.2
vf_clip_param: 0.2
grad_clip: 0.5
env: Duckietown
callbacks: {}
env_config:
mode: inference
episode_max_steps: 500
resized_input_shape: (84, 84)
crop_image_top: true
top_crop_divider: 3
grayscale_image: false
frame_stacking: true
frame_stacking_depth: 3
motion_blur: false
action_type: heading
reward_function: posangle
distortion: true
accepted_start_angle_deg: 30
simulation_framerate: 30
frame_skip: 3
action_delay_ratio: 0.0
training_map: multimap_aido5
domain_rand: true
dynamics_rand: true
camera_rand: true
frame_repeating: 0.0
spawn_obstacles: false
obstacles:
duckie:
density: 0.5
static: true
duckiebot:
density: 0
static: false
spawn_forward_obstacle: false
aido_wrapper: true
wandb:
project: duckietown-rllib
experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
seed: 3092
2020-12-11 22:29:07,019 INFO trainer.py:428 -- Tip: set 'eager': true or the --eager flag to enable TensorFlow eager execution
2020-12-11 22:29:07,036 ERROR syncer.py:39 -- Log sync requires rsync to be installed.
2020-12-11 22:29:07,037 WARNING deprecation.py:29 -- DeprecationWarning: `sample_batch_size` has been deprecated. Use `rollout_fragment_length` instead. This will raise an error in the future!
2020-12-11 22:29:07,037 INFO trainer.py:583 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
2020-12-11 22:29:07.050348: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-12-11 22:29:07.061066: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2599935000 Hz
2020-12-11 22:29:07.061705: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7972150 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-12-11 22:29:07.061754: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-12-11 22:29:12,040 INFO trainable.py:217 -- Getting current IP.
2020-12-11 22:29:12,041 WARNING util.py:37 -- Install gputil for GPU system monitoring.
INFO:nodes_wrapper:4dd9d5a4eb8d:RLlibAgent: Restoring checkpoint from: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
2020-12-11 22:29:12,106 INFO trainable.py:217 -- Getting current IP.
2020-12-11 22:29:12,106 INFO trainable.py:422 -- Restored on 172.17.0.2 from checkpoint: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
2020-12-11 22:29:12,106 INFO trainable.py:430 -- Current state after restoring: {'_iteration': 363, '_timesteps_total': 1539120, '_time_total': 110224.9016327858, '_episodes_total': 8614}
INFO:nodes_wrapper:4dd9d5a4eb8d:RLlibAgent: Starting episode "episode".
ERROR:nodes_wrapper:Error in node RLlibAgent:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
handle_message_node(parsed, receiver0, context0)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
f(**kwargs)
File "solution.py", line 80, in on_received_get_commands
pwm_left, pwm_right = self.compute_action(self.current_image)
File "solution.py", line 73, in compute_action
action = self.model.predict(observation)
File "/submission/model.py", line 63, in predict
action = self.model.compute_action(observation, explore=False)
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/agents/trainer.py", line 781, in compute_action
result = self.get_policy(policy_id).compute_single_action(
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/policy.py", line 150, in compute_single_action
[action], state_out, info = self.compute_actions(
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/tf_policy.py", line 268, in compute_actions
return builder.get(fetches)
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 42, in get
self._executed = run_timeline(
File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 89, in run_timeline
fetches = sess.run(ops, feed_dict=feed_dict)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 957, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1180, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1358, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1349, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1441, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
sys.exit(signal.SIGTERM)
SystemExit: Signals.SIGTERM
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 243, in run_loop
loop(node_name, fi, fo, node, protocol, tin, tout, config=config, fi_desc=fin, fo_desc=fout)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 378, in loop
raise InternalProblem(msg) from e # XXX
zuper_nodes.structures.InternalProblem: Exception while handling a message on topic "get_commands".
| Traceback (most recent call last):
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
| handle_message_node(parsed, receiver0, context0)
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
| call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
| f(**kwargs)
| File "solution.py", line 80, in on_received_get_commands
| pwm_left, pwm_right = self.compute_action(self.current_image)
| File "solution.py", line 73, in compute_action
| action = self.model.predict(observation)
| File "/submission/model.py", line 63, in predict
| action = self.model.compute_action(observation, explore=False)
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/agents/trainer.py", line 781, in compute_action
| result = self.get_policy(policy_id).compute_single_action(
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/policy.py", line 150, in compute_single_action
| [action], state_out, info = self.compute_actions(
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/policy/tf_policy.py", line 268, in compute_actions
| return builder.get(fetches)
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 42, in get
| self._executed = run_timeline(
| File "/usr/local/lib/python3.8/dist-packages/ray/rllib/utils/tf_run_builder.py", line 89, in run_timeline
| fetches = sess.run(ops, feed_dict=feed_dict)
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 957, in run
| result = self._run(None, fetches, feed_dict, options_ptr,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1180, in _run
| results = self._do_run(handle, final_targets, final_fetches,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1358, in _do_run
| return self._do_call(_run_fn, feeds, fetches, targets, options,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1365, in _do_call
| return fn(*args)
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1349, in _run_fn
| return self._call_tf_sessionrun(options, feed_dict, fetch_list,
| File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/session.py", line 1441, in _call_tf_sessionrun
| return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
| File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
| sys.exit(signal.SIGTERM)
| SystemExit: Signals.SIGTERM
|
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "solution.py", line 127, in <module>
main()
File "solution.py", line 123, in main
wrap_direct(node=node, protocol=protocol)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/interface.py", line 24, in wrap_direct
run_loop(node, protocol, args)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 251, in run_loop
raise Exception(msg) from e
Exception: Error in node RLlibAgent
| | | No reset possible |
60557 | eval2 | aborted | yes | | | 0:06:42 | Job shutdown | | | No reset possible |
60554 | eval2 | aborted | yes | | | 0:04:13 | Job shutdown | | | No reset possible |
60551 | eval1 | success | yes | | | 0:06:24 | | | | No reset possible |
60550 | eval1 | aborted | yes | | | 0:00:33 | Job shutdown | | | No reset possible |
60547 | eval0 | success | yes | | | 0:07:28 | | | | No reset possible |
60546 | eval0 | timeout | yes | | - | - | | - | - | No reset possible |
60545 | eval0 | aborted | yes | | | 0:06:15 | Operator message: ''
Logs:
DEBUG:commons:version: 6.1.7 *
INFO:typing:version: 6.1.8
DEBUG:aido_schemas:aido-protocols version 6.0.33 path /usr/local/lib/python3.8/dist-packages
INFO:nodes:version 6.1.1 path /usr/local/lib/python3.8/dist-packages pyparsing 2.4.6
2020-12-11 21:10:57.484773: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-12-11 21:11:00.240024: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:
2020-12-11 21:11:00.240064: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-12-11 21:11:00.240112: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
DEBUG:ipce:version 6.0.36 path /usr/local/lib/python3.8/dist-packages
INFO:nodes_wrapper:checking implementation
INFO:nodes_wrapper:checking implementation OK
DEBUG:nodes_wrapper:run_loop
fin: /fifos/ego0-in
fout: fifo:/fifos/ego0-out
INFO:nodes_wrapper:Fifo /fifos/ego0-out created. I will block until a reader appears.
INFO:nodes_wrapper:Fifo reader appeared for /fifos/ego0-out.
INFO:nodes_wrapper:Node RLlibAgent starting reading
fi_desc: /fifos/ego0-in
fo_desc: fifo:/fifos/ego0-out
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: init()
WARNING:config.config:Found paths with seed 3092:
WARNING:config.config:0: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/config_dump_3092.yml
WARNING:config.config:Found checkpoints in ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47:
WARNING:config.config:0: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
WARNING:config.config:Config loaded from ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/config_dump_3092.yml
WARNING:config.config:Model checkpoint loaded from ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
WARNING:config.config:Updating default config values by:
env_config:
mode: inference
WARNING:config.config:Env_config.mode is 'inference', some hyperparameters will be overwritten by:
rllib_config:
num_workers: 0
num_gpus: 0
callbacks: {}
ray_init_config:
num_cpus: 1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: === Wrappers ===================================
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: Observation wrappers
<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>
<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>
<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>
<NormalizeWrapper<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>>
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: Action wrappers
<Heading2WheelVelsWrapper<NormalizeWrapper<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>>>
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: Reward wrappers
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: === Config ===================================
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: seed: 3092
experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
algo: PPO
algo_config_files:
PPO: config/algo/ppo.yml
general: config/algo/general.yml
env_config:
mode: inference
episode_max_steps: 500
resized_input_shape: (84, 84)
crop_image_top: true
top_crop_divider: 3
grayscale_image: false
frame_stacking: true
frame_stacking_depth: 3
motion_blur: false
action_type: heading
reward_function: posangle
distortion: true
accepted_start_angle_deg: 30
simulation_framerate: 30
frame_skip: 3
action_delay_ratio: 0.0
training_map: multimap_aido5
domain_rand: true
dynamics_rand: true
camera_rand: true
frame_repeating: 0.0
spawn_obstacles: false
obstacles:
duckie:
density: 0.5
static: true
duckiebot:
density: 0
static: false
spawn_forward_obstacle: false
aido_wrapper: true
wandb:
project: duckietown-rllib
experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
seed: 3092
ray_init_config:
num_cpus: 1
webui_host: 127.0.0.1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
restore_seed: 3091
restore_experiment_idx: 0
restore_checkpoint_idx: 0
debug_hparams:
rllib_config:
num_workers: 1
num_gpus: 0
ray_init_config:
num_cpus: 1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
inference_hparams:
rllib_config:
num_workers: 0
num_gpus: 0
callbacks: {}
ray_init_config:
num_cpus: 1
memory: 2097152000
object_store_memory: 209715200
redis_max_memory: 209715200
local_mode: true
timesteps_total: 4000000.0
rllib_config:
num_workers: 0
sample_batch_size: 265
num_gpus: 0
train_batch_size: 4096
gamma: 0.99
lr: 5.0e-05
monitor: false
evaluation_interval: 25
evaluation_num_episodes: 2
evaluation_config:
monitor: false
explore: false
seed: 1234
lambda: 0.95
sgd_minibatch_size: 128
vf_loss_coeff: 0.5
entropy_coeff: 0.0
clip_param: 0.2
vf_clip_param: 0.2
grad_clip: 0.5
env: Duckietown
callbacks: {}
env_config:
mode: inference
episode_max_steps: 500
resized_input_shape: (84, 84)
crop_image_top: true
top_crop_divider: 3
grayscale_image: false
frame_stacking: true
frame_stacking_depth: 3
motion_blur: false
action_type: heading
reward_function: posangle
distortion: true
accepted_start_angle_deg: 30
simulation_framerate: 30
frame_skip: 3
action_delay_ratio: 0.0
training_map: multimap_aido5
domain_rand: true
dynamics_rand: true
camera_rand: true
frame_repeating: 0.0
spawn_obstacles: false
obstacles:
duckie:
density: 0.5
static: true
duckiebot:
density: 0
static: false
spawn_forward_obstacle: false
aido_wrapper: true
wandb:
project: duckietown-rllib
experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
seed: 3092
2020-12-11 21:11:03,758 INFO trainer.py:428 -- Tip: set 'eager': true or the --eager flag to enable TensorFlow eager execution
2020-12-11 21:11:03,777 ERROR syncer.py:39 -- Log sync requires rsync to be installed.
2020-12-11 21:11:03,778 WARNING deprecation.py:29 -- DeprecationWarning: `sample_batch_size` has been deprecated. Use `rollout_fragment_length` instead. This will raise an error in the future!
2020-12-11 21:11:03,778 INFO trainer.py:583 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
2020-12-11 21:11:03.793170: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-12-11 21:11:03.800627: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2599935000 Hz
2020-12-11 21:11:03.801289: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x90734d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-12-11 21:11:03.801341: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-12-11 21:11:08,963 INFO trainable.py:217 -- Getting current IP.
2020-12-11 21:11:08,964 WARNING util.py:37 -- Install gputil for GPU system monitoring.
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: Restoring checkpoint from: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
2020-12-11 21:11:09,040 INFO trainable.py:217 -- Getting current IP.
2020-12-11 21:11:09,040 INFO trainable.py:422 -- Restored on 172.17.0.2 from checkpoint: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
2020-12-11 21:11:09,040 INFO trainable.py:430 -- Current state after restoring: {'_iteration': 363, '_timesteps_total': 1539120, '_time_total': 110224.9016327858, '_episodes_total': 8614}
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: Starting episode "episode".
ERROR:nodes_wrapper:Error in node RLlibAgent:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
handle_message_node(parsed, receiver0, context0)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
f(**kwargs)
File "solution.py", line 61, in on_received_observations
new_image = jpg2rgb(camera.jpg_data)
File "solution.py", line 113, in jpg2rgb
im = im.convert('RGB')
File "/usr/local/lib/python3.8/dist-packages/PIL/Image.py", line 902, in convert
self.load()
File "/usr/local/lib/python3.8/dist-packages/PIL/ImageFile.py", line 261, in load
n, err_code = decoder.decode(b)
File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
sys.exit(signal.SIGTERM)
SystemExit: Signals.SIGTERM
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 243, in run_loop
loop(node_name, fi, fo, node, protocol, tin, tout, config=config, fi_desc=fin, fo_desc=fout)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 378, in loop
raise InternalProblem(msg) from e # XXX
zuper_nodes.structures.InternalProblem: Exception while handling a message on topic "observations".
| Traceback (most recent call last):
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
| handle_message_node(parsed, receiver0, context0)
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
| call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
| File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
| f(**kwargs)
| File "solution.py", line 61, in on_received_observations
| new_image = jpg2rgb(camera.jpg_data)
| File "solution.py", line 113, in jpg2rgb
| im = im.convert('RGB')
| File "/usr/local/lib/python3.8/dist-packages/PIL/Image.py", line 902, in convert
| self.load()
| File "/usr/local/lib/python3.8/dist-packages/PIL/ImageFile.py", line 261, in load
| n, err_code = decoder.decode(b)
| File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
| sys.exit(signal.SIGTERM)
| SystemExit: Signals.SIGTERM
|
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "solution.py", line 127, in <module>
main()
File "solution.py", line 123, in main
wrap_direct(node=node, protocol=protocol)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/interface.py", line 24, in wrap_direct
run_loop(node, protocol, args)
File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 251, in run_loop
raise Exception(msg) from e
Exception: Error in node RLlibAgent
| | | No reset possible |
60543 | eval0 | aborted | yes | | | 0:03:49 | Job shutdown | | | No reset possible |
60542 | eval0 | aborted | yes | | | 0:02:04 | Job shutdown | | | No reset possible |
60541 | eval0 | aborted | yes | | | 0:06:49 | Job shutdown | | | No reset possible |