Publications

Journal Articles

2024

Fast Kinodynamic Planning on the Constraint Manifold With Deep Neural Networks

Piotr Kicki, Puze Liu, Davide Tateo, Haitham Bou-Ammar, Krzysztof Walas, Piotr Skrzypczyński, and Jan Peters

IEEE Transactions on Robotics (T-RO), vol. 40, pp. 277–297, 2024

Motion planning is a mature area of research in robotics with many well-established methods based on optimization or sampling the state space, suitable for solving kinematic motion planning. However, when dynamic motions under constraints are needed and computation time is limited, fast kinodynamic planning on the constraint manifold is indispensable. In recent years, learning-based solutions have become alternatives to classical approaches, but they still lack comprehensive handling of complex constraints, such as planning on a lower dimensional manifold of the task space while considering the robot’s dynamics. This article introduces a novel learning-to-plan framework that exploits the concept of constraint manifold, including dynamics, and neural planning methods. Our approach generates plans satisfying an arbitrary set of constraints and computes them in a short constant time, namely the inference time of a neural network. This allows the robot to plan and replan reactively, making our approach suitable for dynamic environments. We validate our approach on two simulated tasks and in a demanding real-world scenario, where we use a Kuka LBR Iiwa 14 robotic arm to perform the hitting movement in robotic air hockey.
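A BibTeX entry for this article can be assembled from the citation details listed above. This is a sketch only: the citation key is a placeholder that does not come from the source, and no DOI is included because none is given here.

```
@article{TRO_2024_Kinodynamic_Planning,
  title = {Fast Kinodynamic Planning on the Constraint Manifold With Deep Neural Networks},
  author = {Kicki, Piotr and Liu, Puze and Tateo, Davide and Bou-Ammar, Haitham and Walas, Krzysztof and Skrzypczy{\'n}ski, Piotr and Peters, Jan},
  journal = {IEEE Transactions on Robotics},
  volume = {40},
  pages = {277--297},
  year = {2024},
}
```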

Safe Reinforcement Learning on the Constraint Manifold: Theory and Applications

Puze Liu, Haitham Bou-Ammar, Jan Peters, and Davide Tateo

Submitted to the IEEE Transactions on Robotics (T-RO), 2024

Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. However, most existing approaches are trained in well-tuned simulators and subsequently deployed on real robots without online fine-tuning. In this setting, the simulation’s realism seriously impacts the deployment’s success rate. Instead, learning with real-world interaction data offers a promising alternative: it not only eliminates the need for a fine-tuned simulator but also applies to a broader range of tasks where accurate modeling is infeasible. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and non-linear, making safety challenging to guarantee in learning systems. In this paper, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the Constraint Manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method’s effectiveness in a real-world Robot Air Hockey task, showing that our method can handle high-dimensional tasks with complex constraints. Videos of the real robot experiments are available on the project website.
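Since the entry above is listed as submitted rather than published, one way to cite it is with an `@unpublished` entry built from the details given here. This is a sketch only: the citation key is a placeholder that does not come from the source, and no arXiv identifier is included because none is given in this list.

```
@unpublished{TRO_2024_Safe_RL_Constraint_Manifold,
  title = {Safe Reinforcement Learning on the Constraint Manifold: Theory and Applications},
  author = {Liu, Puze and Bou-Ammar, Haitham and Peters, Jan and Tateo, Davide},
  note = {Submitted to the IEEE Transactions on Robotics (T-RO)},
  year = {2024},
}
```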

2023

Composable energy policies for reactive motion generation and reinforcement learning

Julen Urain, Anqi Li, Puze Liu, Carlo D’Eramo, and Jan Peters

The International Journal of Robotics Research, vol. 42, pp. 827–858, 2023

In this work, we introduce composable energy policies (CEP), a novel framework for multi-objective motion generation. We frame the problem of composing multiple policy components from a probabilistic view. We consider a set of stochastic policies represented in arbitrary task spaces, where each policy represents a distribution of the actions to solve a particular task. Then, we aim to find the action in the configuration space that optimally satisfies all the policy components. The presented framework allows the fusion of motion generators from different sources: optimal control, data-driven policies, motion planning, and handcrafted policies. Classically, the problem of multi-objective motion generation is solved by the composition of a set of deterministic policies, rather than stochastic policies. However, there are common situations where different policy components have conflicting behaviors, leading to oscillations or the robot getting stuck in an undesirable state. While our approach is not directly able to solve the conflicting policies problem, we claim that modeling each policy as a stochastic policy allows more expressive representations for each component in contrast with the classical reactive motion generation approaches. In some tasks, such as reaching a target in a cluttered environment, we show experimentally that CEP's additional expressivity allows us to model policies that reduce these conflicting behaviors. One field that benefits from these reactive motion generators is robot reinforcement learning. Integrating these policy architectures with reinforcement learning allows us to include a set of inductive biases in the learning problem. These inductive biases guide the reinforcement learning agent towards informative regions or improve collision safety while exploring. In our work, we show how to integrate our proposed reactive motion generator as a structured policy for reinforcement learning. Combining the reinforcement learning agent exploration with the prior-based CEP, we can improve the learning performance and explore more safely.
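A BibTeX entry for the journal version can be assembled from the citation details listed above. This is a sketch only: the citation key is a placeholder that does not come from the source, and no DOI is included because none is given here.

```
@article{IJRR_2023_CEP,
  title = {Composable energy policies for reactive motion generation and reinforcement learning},
  author = {Urain, Julen and Li, Anqi and Liu, Puze and D'Eramo, Carlo and Peters, Jan},
  journal = {The International Journal of Robotics Research},
  volume = {42},
  pages = {827--858},
  year = {2023},
}
```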

Conference Papers

2023

Safe Reinforcement Learning of Dynamic High-Dimensional Robotic Tasks: Navigation, Manipulation, Interaction

Puze Liu, Kuo Zhang, Davide Tateo, Snehal Jauhri, Zhiyuan Hu, Jan Peters, and Georgia Chalvatzaki

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2023

Safety is a crucial property of every robotic platform: any control policy should always comply with actuator limits and avoid collisions with the environment and humans. In reinforcement learning, safety is even more fundamental for exploring an environment without causing any damage. While there are many proposed solutions to the safe exploration problem, only a few of them can deal with the complexity of the real world. This paper introduces a new formulation of safe exploration for reinforcement learning of various robotic tasks. Our approach applies to a wide class of robotic platforms and enforces safety even under complex collision constraints learned from data by exploring the tangent space of the constraint manifold. Our proposed approach achieves state-of-the-art performance in simulated high-dimensional and dynamic tasks while avoiding collisions with the environment. We show safe real-world deployment of our learned controller on a TIAGo++ robot, achieving remarkable performance in manipulation and human-robot interaction tasks.
```
@inproceedings{ICRA_2022_Redsdf_Atacom,
  title = {Safe Reinforcement Learning of Dynamic High-Dimensional Robotic Tasks: Navigation, Manipulation, Interaction},
  author = {Liu, Puze and Zhang, Kuo and Tateo, Davide and Jauhri, Snehal and Hu, Zhiyuan and Peters, Jan and Chalvatzaki, Georgia},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  publisher = {IEEE},
  year = {2023},
}
```

2022

Regularized Deep Signed Distance Fields for Reactive Motion Generation

Puze Liu, Kuo Zhang, Davide Tateo, Snehal Jauhri, Jan Peters, and Georgia Chalvatzaki

In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022

Autonomous robots should operate in real-world dynamic environments and collaborate with humans in tight spaces. A key component for allowing robots to leave structured lab and manufacturing settings is their ability to evaluate collisions with the world around them online and in real time. Distance-based constraints are fundamental for enabling robots to plan their actions and act safely, protecting both humans and their hardware. However, different applications require different distance resolutions, leading to various heuristic approaches for measuring distance fields w.r.t. obstacles, which are computationally expensive and hinder their application in dynamic obstacle avoidance use-cases. We propose Regularized Deep Signed Distance Fields (ReDSDF), a single neural implicit function that can compute smooth distance fields at any scale, with fine-grained resolution over high-dimensional manifolds and articulated bodies like humans, thanks to our effective data generation and a simple inductive bias during training. We demonstrate the effectiveness of our approach in representative simulated tasks for whole-body control (WBC) and safe Human-Robot Interaction (HRI) in shared workspaces. Finally, we provide proof of concept of a real-world application in an HRI handover task with a mobile manipulator robot.
```
@inproceedings{IROS_2022_ReDSDF,
  title = {Regularized Deep Signed Distance Fields for Reactive Motion Generation},
  author = {Liu, Puze and Zhang, Kuo and Tateo, Davide and Jauhri, Snehal and Peters, Jan and Chalvatzaki, Georgia},
  booktitle = {Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  publisher = {IEEE},
  year = {2022},
}
```

Robot Reinforcement Learning on the Constraint Manifold
Best Paper Award Finalist

Puze Liu, Davide Tateo, Haitham Bou-Ammar, and Jan Peters

In Proceedings of the 5th Conference on Robot Learning (CoRL), vol. 164, pp. 1357–1366, 2022

Reinforcement learning in robotics is extremely challenging due to many practical issues, including safety, mechanical constraints, and wear and tear. Typically, these issues are not considered in the machine learning literature. One crucial problem in applying reinforcement learning in the real world is Safe Exploration, which requires satisfying physical and safety constraints throughout the learning process. To explore in such a safety-critical environment, leveraging known information such as robot models and constraints is beneficial to provide more robust safety guarantees. Exploiting this knowledge, we propose a novel method to learn robotics tasks in simulation efficiently while satisfying the constraints during the learning process.
```
@inproceedings{CORL_2021_Learning_on_the_Manifold,
  title = {Robot Reinforcement Learning on the Constraint Manifold},
  author = {Liu, Puze and Tateo, Davide and Bou-Ammar, Haitham and Peters, Jan},
  booktitle = {Proceedings of the 5th Conference on Robot Learning (CoRL)},
  pages = {1357--1366},
  year = {2022},
  editor = {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume = {164},
  series = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
  comments = {Best Paper Award Finalist},
}
```

Dimensionality Reduction and Prioritized Exploration for Policy Search

Marius Memmel, Puze Liu, Davide Tateo, and Jan Peters

In Proceedings of The 25th International Conference on Artificial Intelligence and Statistics (AISTATS), vol. 151, pp. 2134–2157, 2022

Black-box policy optimization is a class of reinforcement learning algorithms that explores and updates the policies at the parameter level. This class of algorithms is widely applied in robotics with movement primitives or non-differentiable policies. Furthermore, these approaches are particularly relevant where exploration at the action level could cause actuator damage or other safety issues. However, black-box optimization does not scale well with the increasing dimensionality of the policy, leading to high demand for samples, which are expensive to obtain in real-world systems. In many practical applications, policy parameters do not contribute equally to the return. Identifying the most relevant parameters allows narrowing down the exploration and speeding up the learning. Furthermore, updating only the effective parameters requires fewer samples, improving the scalability of the method. We present a novel method to prioritize the exploration of effective parameters and cope with full covariance matrix updates. Our algorithm learns faster than recent approaches and requires fewer samples to achieve state-of-the-art results. To select the effective parameters, we consider both the Pearson correlation coefficient and the Mutual Information. We showcase the capabilities of our approach on the Relative Entropy Policy Search algorithm in several simulated environments, including robotics simulations. Code is available at https://git.ias.informatik.tu-darmstadt.de/ias_code/aistats2022/dr-creps.
```
@inproceedings{AISTATS_22_DR-CREPS,
  title = {Dimensionality Reduction and Prioritized Exploration for Policy Search},
  author = {Memmel, Marius and Liu, Puze and Tateo, Davide and Peters, Jan},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics (AISTATS)},
  pages = {2134--2157},
  year = {2022},
  editor = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume = {151},
  series = {Proceedings of Machine Learning Research},
  publisher = {PMLR},
}
```

2021

Efficient and Reactive Planning for High Speed Robot Air Hockey
Best Entertainment and Amusement Paper Award Finalist

Puze Liu, Davide Tateo, Haitham Bou-Ammar, and Jan Peters

In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 586–593, 2021

Highly dynamic robotic tasks require high-speed and reactive robots. These tasks are particularly challenging due to the physical constraints, hardware limitations, and the high uncertainty of dynamics and sensor measurements. To face these issues, it is crucial to design robotics agents that generate precise and fast trajectories and react immediately to environmental changes. Air hockey is an example of this kind of task. Due to the environment’s characteristics, it is possible to formalize the problem and derive clean mathematical solutions. For these reasons, this environment is perfect for pushing the performance of currently available general-purpose robotic manipulators to the limit. Using two Kuka Iiwa 14 robots, we show how to design a policy for general-purpose robotic manipulators for the air hockey game. We demonstrate that a real robot arm can perform fast-hitting movements and that the two robots can play against each other on a medium-size air hockey table in simulation.
```
@inproceedings{IROS_2021_Air_Hockey,
  title = {Efficient and Reactive Planning for High Speed Robot Air Hockey},
  author = {Liu, Puze and Tateo, Davide and Bou-Ammar, Haitham and Peters, Jan},
  year = {2021},
  booktitle = {2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  pages = {586--593},
  doi = {10.1109/IROS51168.2021.9636263},
  comments = {Best Entertainment and Amusement Paper Award Finalist},
}
```

Composable Energy Policies for Reactive Motion Generation and Reinforcement Learning

Julen Urain, Anqi Li, Puze Liu, Carlo D’Eramo, and Jan Peters

In Robotics: Science and Systems XVII (R:SS 2021), July 2021

```
@inproceedings{RSS_21_CEP,
  title = {Composable Energy Policies for Reactive Motion Generation and Reinforcement Learning},
  author = {Urain, Julen and Li, Anqi and Liu, Puze and D'Eramo, Carlo and Peters, Jan},
  booktitle = {Robotics: Science and Systems XVII (R:SS 2021)},
  editor = {Dylan A. Shell and Marc Toussaint and M. Ani Hsieh},
  month = jul,
  year = {2021},
  doi = {10.15607/RSS.2021.XVII.052},
}
```

Workshop Papers

2022

ReDSDF: Regularized Deep Signed Distance Fields for Robotics

Puze Liu, Kuo Zhang, Davide Tateo, Snehal Jauhri, Jan Peters, and Georgia Chalvatzaki

ICRA Workshop: Motion Planning with Implicit Neural Representations of Geometry, 2022

```
@techreport{ICRA_Workshop_ReDSDF,
  author = {Liu, Puze and Zhang, Kuo and Tateo, Davide and Jauhri, Snehal and Peters, Jan and Chalvatzaki, Georgia},
  title = {ReDSDF: Regularized Deep Signed Distance Fields for Robotics},
  workshop = {ICRA Workshop: Motion Planning with Implicit Neural Representations of Geometry},
  year = {2022},
}
```