Simulation to Real. #2 [Dynamic-Randomization]


Intro

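The model I built earlier has been 3D-printed and assembled, but the wires getting in the way turned out to be a more serious problem than I expected. While looking for a fix, I remodeled the frame of the Motor Base and am now building a Motor Case that reduces the influence of the wires somewhat. I should be able to start fabrication as soon as I am at school on a weekday and show the finished version the following day. I will come back to the details once the model is complete.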

Before building the new model, I came across a paper on an interesting topic and want to introduce it here.

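I was organizing the theory around a recent paper from KAIST [1] before focusing on the model, when I found Sim-to-Real Transfer of Robotic Control with Dynamics Randomization [2] among its references. I read it to better understand the KAIST paper [1]. Its interesting claim is that, among the ways to narrow the gap between simulation and reality, randomizing the dynamics helps the policy cope with calibration error (small measurement and modeling errors).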

Background

Before reading the core part, I used the background section of the paper to review DRL (deep reinforcement learning) once more.

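Light green marks what I mainly read.

Yellow marks parts I only skimmed for reference.
Red marks what I consider the most important points.

The blue and dark red underlines are just from reading practice.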

1) DRL-Base

2) Hindsight Experience Replay
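For readers who have not met Hindsight Experience Replay before, the core trick can be sketched in a few lines: a failed episode is stored a second time with its goal replaced by a state that was actually reached, so a sparse reward still produces some successful samples. This is only a minimal sketch of the "final" relabeling strategy, not the setup from [2]. The `Transition` tuple, the `sparse_reward` function, its tolerance, and the simplification that goals live in the same space as states are all my own assumptions for illustration.

```python
from typing import List, NamedTuple, Tuple


class Transition(NamedTuple):
    """Hypothetical goal-conditioned transition (names are illustrative)."""
    state: Tuple[float, ...]
    action: int
    next_state: Tuple[float, ...]
    goal: Tuple[float, ...]
    reward: float


def sparse_reward(achieved: Tuple[float, ...], goal: Tuple[float, ...],
                  tol: float = 0.05) -> float:
    """Assumed sparse reward: 0 when within `tol` of the goal, otherwise -1."""
    dist = max(abs(a - g) for a, g in zip(achieved, goal))
    return 0.0 if dist <= tol else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """'final' strategy: treat the last reached state as if it had been the goal."""
    new_goal = episode[-1].next_state
    return [
        t._replace(goal=new_goal, reward=sparse_reward(t.next_state, new_goal))
        for t in episode
    ]


# A 1-D toy episode that never reaches the original goal at 1.0.
original_goal = (1.0,)
positions = [(0.0,), (0.1,), (0.2,), (0.3,)]
episode = [
    Transition(s, 0, s_next, original_goal, sparse_reward(s_next, original_goal))
    for s, s_next in zip(positions[:-1], positions[1:])
]
relabeled = relabel_with_final_goal(episode)

# Both versions would go into the replay buffer of an off-policy learner.
replay_buffer = episode + relabeled
```

In the original episode every reward is -1, while in the relabeled copy the last transition reaches its (substituted) goal and gets reward 0.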

Method

This section explains how dynamics randomization works. I will excerpt only the essential parts.

The summary is highlighted in light green.

1) State & Action

To apply this myself, I needed to review again which parts of the system to take into account and which states the policy receives.

2) Dynamics Randomization

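I touched on this with Isaac-Gym as well: even inside the simulator alone there are many tunable parameters. Reading this paper, I learned that adding randomization to the factors I care about narrows the gap between Sim and Real further for the task I want.
It seems best to apply the randomization on top of a solid grounding in dynamics and basic control.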
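As a rough illustration of the idea, here is a minimal sketch that draws a fresh set of dynamics parameters at the start of every episode. The parameter names, their ranges, and the `DynamicsParams` class are hypothetical values I picked for an Acrobot-style model; they are not taken from the paper [2] or from the Isaac-Gym API, and applying them to an actual simulator is left out.

```python
import random
from dataclasses import dataclass


@dataclass
class DynamicsParams:
    """Hypothetical dynamics parameters for a two-link Acrobot-style model."""
    link_mass: float       # kg
    joint_damping: float   # N*m*s/rad
    motor_gain: float      # scale factor on the commanded torque
    action_latency: float  # seconds between command and actuation


# Assumed (low, high) ranges; in practice these should bracket the values
# measured on the real hardware, which is where dynamics knowledge matters.
PARAM_RANGES = {
    "link_mass": (0.08, 0.12),
    "joint_damping": (0.001, 0.01),
    "motor_gain": (0.8, 1.2),
    "action_latency": (0.0, 0.02),
}


def sample_dynamics(rng: random.Random) -> DynamicsParams:
    """Draw one parameter set, uniformly and independently within each range."""
    return DynamicsParams(
        **{name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}
    )


rng = random.Random(0)
for episode in range(3):
    params = sample_dynamics(rng)  # resampled at the start of every episode
    # A real training loop would write `params` into the simulator here
    # and then collect a rollout with the current policy.
    print(episode, params)
```

Because the policy never sees the same dynamics twice, it cannot overfit to one particular simulator setting. The ranges are the main design choice: too narrow and the real robot falls outside them, too wide and the task becomes needlessly hard.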

Conclusion

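The paper continues with more material, and reading further would surely give me more ideas, but since I plan to train with PPO I will stop here for now. Later sections add recurrent policies with LSTM; for the details, I recommend reading the original paper listed in the references below.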

[Reference]

[1] Concurrent Training of a Control Policy and a State Estimator for Dynamic and Robust Legged Locomotion. https://arxiv.org/abs/2202.05481

[2] Sim-to-Real Transfer of Robotic Control with Dynamics Randomization. https://arxiv.org/abs/1710.06537
