Action

class Action(BaseModel):

Represents an action taken in an environment.

This class defines the input context, the LLM-generated output, and metadata required for verification and tracking within an RL framework.

Parameters:

  • llm_response (str): The response generated by the LLM.
  • metadata (Dict[str, Any]): Additional metadata such as model parameters, prompt details, or response confidence scores.
  • timestamp (datetime): The timestamp when the action was generated (UTC).
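
As a rough illustration of the fields above, the following sketch builds an Action by hand. To stay runnable without the library installed, it uses a plain dataclass as a stand-in for the real pydantic model and mirrors only the three documented fields. The metadata keys and values are hypothetical, and the real class may define defaults or validation not shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass
class Action:
    """Stand-in with the documented fields; the real class is a pydantic BaseModel."""

    llm_response: str
    metadata: Dict[str, Any]
    timestamp: datetime


action = Action(
    llm_response="The answer is 42.",
    # Hypothetical metadata keys, chosen only for illustration.
    metadata={"model": "example-model", "temperature": 0.2},
    # Timezone-aware timestamp, matching the documented UTC convention.
    timestamp=datetime.now(timezone.utc),
)

print(action.llm_response)
```

Passing a timezone-aware datetime keeps the UTC convention explicit instead of relying on a naive local time.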

Observation

class Observation(BaseModel):

Environment observation.

Parameters:

  • question: The question posed to the LLM.
  • context: Additional context for the question.
  • metadata: Optional metadata about the observation.
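
The parameter list above gives no types, so the sketch below assumes a string question, a dictionary context, and an optional dictionary for metadata. These types, and the `None` default, are assumptions made for illustration. As before, a dataclass stands in for the real pydantic model.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Observation:
    """Stand-in with the documented fields; the field types are assumed."""

    question: str
    context: Dict[str, Any]
    metadata: Optional[Dict[str, Any]] = None  # assumed default for the optional field


observation = Observation(
    question="What is the capital of France?",
    # Hypothetical context payload.
    context={"difficulty": "easy", "source": "geography-quiz"},
)

print(observation.question)
```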

StepResult

class StepResult(BaseModel):

Result of an environment step.

Parameters:

  • observation: The next observation.
  • reward: Dictionary of reward scores for different aspects.
  • done: Whether the episode is complete.
  • info: Additional information about the step.

as_tuple

def as_tuple(self):

Returns all fields of the model as a tuple, in declaration order.
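
A minimal sketch of how as_tuple might be used, assuming the fields are declared in the order listed under Parameters (observation, reward, done, info). The dataclass below is a stand-in for the real pydantic model, and its as_tuple is one plausible implementation rather than the library's own. The reward keys are hypothetical, and the observation is a plain dictionary for brevity.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Tuple


@dataclass
class StepResult:
    """Stand-in with the documented fields; the real class is a pydantic BaseModel."""

    observation: Any
    reward: Dict[str, float]
    done: bool
    info: Dict[str, Any]

    def as_tuple(self) -> Tuple[Any, ...]:
        # dataclasses.fields() yields fields in declaration order.
        return tuple(getattr(self, f.name) for f in fields(self))


result = StepResult(
    observation={"question": "What is 2 + 2?"},
    reward={"correctness": 1.0, "format": 0.5},  # hypothetical reward aspects
    done=True,
    info={"steps_taken": 1},
)

# Unpack in the familiar (observation, reward, done, info) style.
observation, reward, done, info = result.as_tuple()
total_reward = sum(reward.values())
print(total_reward, done)
```

Returning a tuple lets callers unpack a step in one line, which matches the convention used by many RL environment loops.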