
Asynchronous Reinforcement Learning for Franka Robotic Arm

This is an implementation of asynchronous reinforcement learning for the Franka robotic arm. The repo consists of two parts: a vision-based Franka environment, built on the OpenAI Gym framework and Franka ROS Interface, and an asynchronous learning architecture for Soft Actor-Critic (SAC). (Our SAC implementation is partly borrowed from here.)
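
In the asynchronous (parallel) mode, environment interaction and gradient updates run in separate processes, so the real-time control loop is never blocked by learning. Below is a minimal, self-contained sketch of that actor/learner pattern using Python multiprocessing; the toy scalar environment, the stand-in "policy", and all names are illustrative assumptions, not the repo's actual classes:

    import multiprocessing as mp
    import random
    import time

    def actor(transition_q, policy_q, num_steps=200):
        # Environment-interaction process: act with the newest available
        # policy, push transitions to the learner, never block on updates.
        policy = policy_q.get()                      # initial "weights" (a scalar gain here)
        obs = 0.0
        for _ in range(num_steps):
            action = policy * obs + random.gauss(0.0, 0.1)
            next_obs = obs + action                  # stand-in for env.step(action)
            transition_q.put((obs, action, -abs(next_obs), next_obs))
            obs = next_obs
            while not policy_q.empty():              # adopt fresher weights if published
                policy = policy_q.get()

    def learner(transition_q, policy_q, num_updates=200):
        # Learning process: drain transitions into a replay buffer and update
        # concurrently; a real learner would take SAC gradient steps here.
        replay, policy = [], -0.5
        for step in range(num_updates):
            while not transition_q.empty():
                replay.append(transition_q.get())
            policy *= 0.999                          # stand-in for one gradient update
            if step % 50 == 0:
                policy_q.put(policy)                 # publish fresh weights to the actor
            time.sleep(0.001)
        print(f"learner consumed {len(replay)} transitions")

    if __name__ == "__main__":
        transition_q, policy_q = mp.Queue(), mp.Queue()
        policy_q.put(-0.5)                           # seed the actor with initial weights
        procs = [mp.Process(target=actor, args=(transition_q, policy_q)),
                 mp.Process(target=learner, args=(transition_q, policy_q))]
        for p in procs:
            p.start()
        for p in procs:
            p.join()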

Trained results (video recordings; frames dropped from the GIFs):

  • Franka-Reacher-2: recording of the initial policy
  • UR-Reacher-6: recording of the learned behaviour after about 400 episodes

Required Packages

Use pip3, not conda, to install the PyTorch packages.

  • Python 3.7+
  • NumPy 1.19.5
  • PyTorch 1.9.0 + CUDA 11.1
  • torchvision 0.10.0
  • torchaudio 0.9.0
  • OpenCV 4.1.2.30
  • Matplotlib 3.3.4
  • SenseAct 0.1.2
  • Gym 0.17.3
  • termcolor 1.1.0
  • Latest NVIDIA driver on Linux

Required Hardware

  • Graphics card with CUDA 11.1 support
  • USB camera

Instructions

To run without the SenseAct communicator

  1. Initialize a Python virtual environment: python3 -m venv venv
  2. Install dependencies: pip install -r requirement.txt
  3. Install ROS dependencies
  4. Activate the local virtual environment: source venv/bin/activate
  5. In the same terminal, go to your ~/catkin_ws and run ./franka.sh remote
  6. Go back to the project root and run python franka_train.py --async_mode for parallel mode, or omit the flag for serial mode. The available arguments are defined in this file; a sketch of what this switch amounts to follows this list.
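
As a rough illustration of the serial/parallel switch mentioned in step 6: only the --async_mode flag name comes from the command above; the structure and messages here are hypothetical, not the repo's actual entry point:

    import argparse

    def main():
        # Hypothetical sketch of the serial-vs-parallel launch switch.
        parser = argparse.ArgumentParser()
        parser.add_argument("--async_mode", action="store_true",
                            help="run environment interaction and learning in parallel")
        args = parser.parse_args()
        if args.async_mode:
            print("parallel mode: actor and learner run in separate processes")
        else:
            print("serial mode: alternate between environment steps and updates")

    if __name__ == "__main__":
        main()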

Misc

  1. Test your camera feed with v4l2-ctl -d /dev/video0 --list-formats-ext (a quick OpenCV check follows this list)
  2. If your virtual env complains about ROS dependencies, extend your PYTHONPATH so the interpreter can find them, e.g. export PYTHONPATH="$PYTHONPATH:<path-to-root>"
  3. If you have issues with the environment, check it with <root>/debug_scripts/collect_env.py
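
As a complementary check to v4l2-ctl, the following sketch grabs a single frame through OpenCV from within the virtual environment; the device index 0 is an assumption, adjust it to match your /dev/video* node:

    import cv2

    # Grab one frame from the first USB camera to confirm OpenCV can read it.
    cap = cv2.VideoCapture(0)          # 0 -> /dev/video0; change if needed
    ok, frame = cap.read()
    cap.release()
    if ok:
        print("camera OK, frame shape:", frame.shape)
    else:
        print("failed to read a frame; check the device index and permissions")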

To run with SenseAct communicator

  • TBD

Output format

The console output has the following form:

| train | E: 1 | S: 1000 | D: 0.8 s | R: 0.0000 | BR: 0.0000 | ALOSS: 0.0000 | CLOSS: 0.0000 | NUM: 0.0000

and a training entry decodes as:

E - total number of episodes 
S - total number of environment steps
D - duration in seconds of 1 episode
R - episode reward
BR - average reward of sampled batch
ALOSS - average loss of actor
CLOSS - average loss of critic
NUM - number of gradient updates performed so far
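
The line format is regular enough to parse if you want to plot training curves; a small sketch (the regex assumes exactly the separators shown above):

    import re

    # Parse one console log line into a dict keyed by the legend above.
    PATTERN = re.compile(
        r"\| (?P<mode>\w+) \| E: (?P<E>\d+) \| S: (?P<S>\d+) \| D: (?P<D>[\d.]+) s "
        r"\| R: (?P<R>[-\d.]+) \| BR: (?P<BR>[-\d.]+) \| ALOSS: (?P<ALOSS>[-\d.]+) "
        r"\| CLOSS: (?P<CLOSS>[-\d.]+) \| NUM: (?P<NUM>[-\d.]+)"
    )

    line = "| train | E: 1 | S: 1000 | D: 0.8 s | R: 0.0000 | BR: 0.0000 " \
           "| ALOSS: 0.0000 | CLOSS: 0.0000 | NUM: 0.0000"
    match = PATTERN.match(line)
    if match:
        entry = match.groupdict()
        print(entry["E"], entry["R"])   # -> 1 0.0000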

Troubleshooting

  1. For Python 3.7+ you will run into this issue:

    multiprocessing: TypeError: cannot pickle 'weakref' object
    

    Please check out this thread and follow the fix suggested here. (The general shape of such workarounds is sketched after this list.)

  2. The fmq library is not compatible with Python 3 out of the box; please follow this fix.
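
The exact fix for the weakref error lives in the thread linked in item 1, but the general shape of such workarounds is to strip unpicklable members (like weakrefs) at the process boundary; a purely illustrative sketch:

    import pickle
    import weakref

    class EnvHandle:
        # Illustrative only: objects holding weakrefs cannot be pickled
        # across a multiprocessing boundary, so drop them when pickling
        # and rebuild them on the other side if needed.
        def __init__(self, parent=None):
            self._parent_ref = weakref.ref(parent) if parent is not None else None

        def __getstate__(self):
            state = self.__dict__.copy()
            state["_parent_ref"] = None    # drop the unpicklable weakref
            return state

    if __name__ == "__main__":
        handle = EnvHandle(parent=EnvHandle())
        pickle.loads(pickle.dumps(handle))   # succeeds: __getstate__ removed the weakref
        print("pickled OK")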
