[1] Mahyar Abdeetedal. Kuka rl. https://github.com/mahyaret/kuka_rl/tree/master, 2023.
[2] Volodymyr Mnih, Adrià Puigdomènech Badia, et al. Asynchronous Methods for Deep Reinforcement Learning. arXiv preprint arXiv:1602.01783, 2016.
[3] Volodymyr Mnih, Koray Kavukcuoglu, et al. Playing Atari with Deep Reinforcement Learning. arXiv preprint arXiv:1312.5602, 2013.
[4] John Schulman, Filip Wolski, et al. Proximal Policy Optimization Algorithms. arXiv preprint arXiv:1707.06347, 2017.