
Training and Play

fourier_lab provides a training environment built on NVIDIA Isaac Sim and Isaac Lab. It integrates the rsl_rl reinforcement learning framework to train the Fourier GR3 robot to walk on complex terrain.

Installation Guide

  1. Install Ubuntu 22.04

  2. Conda Environment Configuration

    # Install Miniconda
    cd ~/Downloads
    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    bash Miniconda3-latest-Linux-x86_64.sh

    # Create training environment
    conda create -n fourier-lab python=3.11 -y
    conda activate fourier-lab
  3. Dependency Installation

    # Navigate to the project directory
    cd path/to/your/project

    # Update pip
    pip install --upgrade pip

    # Install Isaac Sim
    pip install "isaacsim[all,extscache]==5.1.0" --extra-index-url https://pypi.nvidia.com

    # Install torch and torchvision
    pip install -U torch==2.7.0 torchvision==0.22.0 --index-url https://download.pytorch.org/whl/cu128

    # Create project folder and navigate to it
    mkdir GRX_humanoid && cd GRX_humanoid

    # Clone repositories
    git clone https://github.com/FFTAI/fourier_lab.git
    git clone https://github.com/isaac-sim/IsaacLab.git
    cd fourier_lab

    # Install Isaac Lab
    ../IsaacLab/isaaclab.sh -i

    # Install rsl_rl
    ../IsaacLab/isaaclab.sh -p -m pip install -e rsl_rl

    # Verify that rsl_rl is installed
    ../IsaacLab/isaaclab.sh -p -m pip show rsl-rl-lib

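    # Install the project extension
    cd exts/GRX_humanoid
    python -m pip install -e .

    # Return to the fourier_lab root; the commands below use paths relative to it
    cd ../..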

Instructions for Use

  • Before the first run, convert the URDF to USD. The conversion can be skipped when retraining the same model.
  • train.py trains the policy. play.py replays the current policy in the simulation environment to check the resulting motion, and exports the policy.
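The "convert once, then skip" rule can be sketched as a small shell helper. This is a minimal sketch, not part of fourier_lab: the function name `convert_if_missing` and the `CONVERT_CMD` override are hypothetical, and the helper assumes that an existing USD file still matches the URDF it was generated from.

```shell
# Run the URDF-to-USD conversion only when the USD file does not exist yet.
# Usage: convert_if_missing <urdf_path> <usd_path>
# CONVERT_CMD is a hypothetical override (for example "echo" for a dry run);
# by default the converter script from this guide is called.
convert_if_missing() {
    urdf="$1"
    usd="$2"
    if [ -f "$usd" ]; then
        echo "skip: $usd already exists"
    else
        ${CONVERT_CMD:-python scripts/tools/convert_urdf.py} "$urdf" "$usd" --merge-joints
    fi
}
```

If the URDF changes, delete the old USD file first so that the conversion runs again.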

WBC: Differences Between LOWER / FULL / MASK

  • WBC LOWER: Controls and trains only the lower-limb and torso-related joints. It is typically used when prioritizing basic standing and walking stability, and it usually converges faster.
  • WBC FULL: Controls and trains all body joints (including the arms). It is suitable for tasks that require full-body motion capability and coordinated control.
  • WBC MASK: Built on FULL, it uses a mask mechanism so that during walking, when there is no arm-task command, the robot can swing its arms naturally.
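The three variants differ only in the task name passed to train.py and play.py. A minimal sketch of that mapping follows; the task names are the ones used in the commands of this guide, while the function names `task_for_variant` and `train_command` are hypothetical helpers. The sketch only prints the command and does not start training.

```shell
# Map a WBC variant to the training task name used in this guide.
# Usage: task_for_variant <LOWER|FULL|MASK>
task_for_variant() {
    case "$1" in
        LOWER) echo "PPV211HumanoidRoughEnvCfg_WBC_LOWER" ;;
        FULL)  echo "PPV211HumanoidRoughEnvCfg_WBC_FULL" ;;
        MASK)  echo "PPV233_Mask_WBC" ;;
        *)     echo "unknown variant: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the training command for a variant.
train_command() {
    task="$(task_for_variant "$1")" || return 1
    echo "../IsaacLab/isaaclab.sh -p scripts/rsl_rl/train.py --task $task --headless"
}

train_command LOWER
```

In the commands of this guide, the matching play task is the training task name with `_Play` appended.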
  1. GR3 WBC LOWER Training and Demonstration (suitable for GR3)

    # Convert urdf to usd
    python scripts/tools/convert_urdf.py models/gr3v2_1_1/basic_urdf/gr3v2_1_1_lower.urdf exts/GRX_humanoid/GRX_humanoid/assets/Robots/ppv211_humanoid_lower.usd --merge-joints

    # Start training
    ../IsaacLab/isaaclab.sh -p scripts/rsl_rl/train.py --task PPV211HumanoidRoughEnvCfg_WBC_LOWER --headless

    # Demonstration test
    ../IsaacLab/isaaclab.sh -p scripts/rsl_rl/play.py --task PPV211HumanoidRoughEnvCfg_WBC_LOWER_Play

    # View training logs
    tensorboard --logdir logs/rsl_rl/ppv211_humanoid_rough_wbc_lower/
  2. GR3 WBC FULL Training and Demonstration (suitable for GR3)

    # Convert urdf to usd
    python scripts/tools/convert_urdf.py models/gr3v2_1_1/basic_urdf/gr3v2_1_1_noArmColli.urdf exts/GRX_humanoid/GRX_humanoid/assets/Robots/ppv211_noArmCollision.usd --merge-joints

    # Start training
    ../IsaacLab/isaaclab.sh -p scripts/rsl_rl/train.py --task PPV211HumanoidRoughEnvCfg_WBC_FULL --headless

    # Demonstration test
    ../IsaacLab/isaaclab.sh -p scripts/rsl_rl/play.py --task PPV211HumanoidRoughEnvCfg_WBC_FULL_Play

    # View training logs
    tensorboard --logdir logs/rsl_rl/ppv211_humanoid_rough_wbc_full/
  3. GR3 WBC MASK Training and Demonstration (suitable for GR3)

    # Convert urdf to usd
    python scripts/tools/convert_urdf.py models/gr3v2_3_3/basic_urdf/gr3v2_3_3_noArmColli.urdf exts/GRX_humanoid/GRX_humanoid/assets/Robots/ppv233_noArmCollision.usd --merge-joints

    # Start training
    ../IsaacLab/isaaclab.sh -p scripts/rsl_rl/train.py --task PPV233_Mask_WBC --headless

    # Demonstration test
    ../IsaacLab/isaaclab.sh -p scripts/rsl_rl/play.py --task PPV233_Mask_WBC_Play

    # View training logs
    tensorboard --logdir logs/rsl_rl/ppv233_mask_wbc_full/
  4. Export Policy

    • When running play.py, the policy network model will be automatically exported to the log directory of the corresponding task, e.g., logs/rsl_rl/ppv211_humanoid_rough_wbc_lower.
    • This policy model can be used for subsequent deployment on the real robot.
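To locate the exported files after running play.py, one option is to search the task's log directory. This is a rough sketch under an assumption: it expects exported file names to start with `policy`, which this guide does not confirm, so adjust the pattern to whatever play.py writes. The function name `list_exported_policies` is hypothetical.

```shell
# List candidate exported policy files under a task's log directory.
# Assumption: exported files have names starting with "policy".
# Usage: list_exported_policies <log_dir>
list_exported_policies() {
    find "$1" -type f -name 'policy*' | sort
}

# Example (run from the fourier_lab root after play.py has finished):
#   list_exported_policies logs/rsl_rl/ppv211_humanoid_rough_wbc_lower
```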

[!NOTE]

  • To reuse the above commands for a different model, change only the file paths and task names that contain the model version, such as gr3v2_1_1 and ppv211.
  • On an NVIDIA RTX 4090, training takes about 6 seconds per iteration on average.
  • Completing 5000 iterations therefore takes about 8 hours. Training can be stopped earlier or extended depending on reward growth and training objectives.
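The 8-hour figure follows from the per-iteration time: 5000 iterations at 6 seconds each is 30000 seconds, or roughly 8.3 hours. A small sketch for estimating other run lengths is shown below; the function name `estimate_hours` is hypothetical, and the result is rounded down to whole hours because shell arithmetic is integer-only.

```shell
# Estimate wall-clock training time in whole hours (integer division).
# Usage: estimate_hours <iterations> <seconds_per_iteration>
estimate_hours() {
    echo $(( $1 * $2 / 3600 ))
}

estimate_hours 5000 6   # prints 8
```

On other GPUs, measure the seconds per iteration from the first few iterations and substitute that value.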

Thank you for your interest in the Fourier Intelligence GR3 robot project!

We hope these resources will provide strong support for your robot development!