Get started with RL-CP Fusion in minutes
This guide will help you train your first agent using RL-CP Fusion with a D4RL dataset.
Before you begin, make sure you have Python and pip installed (the D4RL environments also require a working MuJoCo setup).
First, install the required packages:
pip install torch gym d4rl numpy tensorboardX

Here's a complete example to train an agent on the HalfCheetah environment:
from conformal_sac.agent_wrapper import SACAgent
# Initialize the agent
agent = SACAgent(
    env_name="halfcheetah-medium-v2",
    offline=True,
    iteration=100000,
    seed=42,
    learning_rate=3e-4,
    gamma=0.99,
    tau=0.005,
    batch_size=256,
    log_interval=2000,
    alpha_q=100,
    q_alpha_update_freq=50,
)
# Train the agent
agent.train()
# Evaluate the trained agent
score = agent.evaluate(eval_episodes=10)
print(f"Final evaluation score: {score}")

Let's break down what's happening in the code above:
We create a new SACAgent instance with specific hyperparameters:
- env_name: The D4RL environment to use
- offline: Set to True for offline learning
- iteration: Number of training iterations
- seed: Random seed for reproducibility

The train() method runs the offline training loop for the configured number of iterations, logging metrics every log_interval steps.
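To make gamma and tau concrete, here is a minimal sketch of how these two hyperparameters are typically used in SAC-style updates (an illustration of the standard technique, not RL-CP Fusion's actual implementation): gamma discounts the bootstrapped target value, and tau controls the Polyak (soft) update of the target network.

```python
import torch

def td_target(reward, next_value, done, gamma=0.99):
    # gamma discounts the bootstrapped value of the next state;
    # done masks out the bootstrap term at episode boundaries.
    return reward + gamma * (1.0 - done) * next_value

def soft_update(target_net, source_net, tau=0.005):
    # tau blends a small fraction of the online network's weights
    # into the target network each step (Polyak averaging).
    for t_param, s_param in zip(target_net.parameters(), source_net.parameters()):
        t_param.data.copy_(tau * s_param.data + (1.0 - tau) * t_param.data)
```

With tau=0.005, the target network tracks the online network slowly, which stabilizes the Q-learning targets.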
The evaluate() method runs the trained policy for multiple episodes and returns the average reward.
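An evaluation loop of this kind generally averages the episodic return of the policy over several rollouts. Here is a minimal sketch assuming an old-style Gym env and a policy callable (both names are illustrative, not the library's API):

```python
def evaluate_policy(env, policy, eval_episodes=10):
    """Average total reward of `policy` over `eval_episodes` rollouts."""
    total = 0.0
    for _ in range(eval_episodes):
        state, done = env.reset(), False
        while not done:
            action = policy(state)  # act deterministically at eval time
            state, reward, done, _ = env.step(action)
            total += reward
    return total / eval_episodes
```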
You can monitor the training progress using TensorBoard:
tensorboard --logdir ./exp-SAC_dual_Q_network

Now that you've trained your first agent, you can: