Active learning

Reinforcement Learning from Human Feedback (RLHF)

Mina Parham

AI Engineer

Human in the loop systems

A diagram of an LLM with output evaluated by a human reviewer.

Human in the loop systems

A diagram of an LLM producing a large volume of output, all of it evaluated by a human reviewer.

Human in the loop systems

A diagram of an LLM with a randomly chosen sample of its output evaluated by a human reviewer.

Human in the loop systems

A diagram of an LLM with actively chosen data evaluated by a human reviewer.
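The difference between randomly chosen and actively chosen review data can be sketched under a fixed review budget. This is a minimal illustration only: the function names `select_random` and `select_active`, the `confidences` list, and the budget of 2 are assumptions made for the example, not part of the course material.

```python
import random

def select_random(confidences, budget, seed=0):
    """Pick `budget` outputs for human review uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(len(confidences)), budget))

def select_active(confidences, budget):
    """Pick the `budget` outputs the model is least confident about."""
    ranked = sorted(range(len(confidences)), key=lambda i: confidences[i])
    return sorted(ranked[:budget])

# Hypothetical model confidence for six outputs
confidences = [0.99, 0.52, 0.97, 0.61, 0.95, 0.55]

random_batch = select_random(confidences, budget=2)
active_batch = select_active(confidences, budget=2)

print("random choice:", random_batch)
print("active choice:", active_batch)
```

With the same budget, the active choice always spends reviewer time on the outputs the model is least sure about, while the random choice may land on outputs the model already handles well.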

Active learning in RLHF

The RLHF process without the reward model part.

Active learning in RLHF

The full RLHF process

Active learning

An icon of documents representing input data.

Active learning

An icon of documents representing data going into a model.

Active learning

An icon of documents representing data going into a model, and an arrow with the label "model confident" going towards the output.

Active learning

An icon of documents representing data going into a model, an arrow with the label "model confident" going towards the output, and a parallel arrow going towards a human with labels: "model unsure" and "human reviews and corrects".

Active learning

An icon of documents representing data going into a model, an arrow with the label "model confident" going towards the output, a parallel arrow going towards a human with labels "model unsure" and "human reviews and corrects", and a prediction output.
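The routing in this diagram can be sketched as a simple confidence threshold. This is a minimal illustration, not a definitive implementation: the function name `route_predictions`, the threshold of 0.8, and the example probabilities are all assumptions made for the sketch.

```python
def route_predictions(probabilities, threshold=0.8):
    """Split predictions into auto-accepted outputs and items for human review.

    `probabilities` holds one list of class probabilities per input.
    """
    accepted, needs_review = [], []
    for idx, probs in enumerate(probabilities):
        confidence = max(probs)
        label = probs.index(confidence)
        if confidence >= threshold:
            # "model confident": the prediction goes straight to the output
            accepted.append((idx, label))
        else:
            # "model unsure": a human reviews and corrects this item
            needs_review.append(idx)
    return accepted, needs_review

# Hypothetical class probabilities for four inputs
probabilities = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90], [0.48, 0.52]]
accepted, needs_review = route_predictions(probabilities)

print("accepted (index, label):", accepted)
print("sent to human review:", needs_review)
```

The threshold controls the trade-off: a higher value sends more items to the reviewer, a lower value lets more predictions through unchecked.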

Active learning pipeline with low confidence

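from modAL.models import ActiveLearner
from modAL.uncertainty import uncertainty_sampling
from sklearn.linear_model import LogisticRegression

# Initialize the learner with the data labeled so far
learner = ActiveLearner(
    estimator=LogisticRegression(),
    query_strategy=uncertainty_sampling,
    X_training=X_labeled,
    y_training=y_labeled
)

  • Uncertainty sampling: selects the points where the model's confidence is lowest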
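As a rough illustration of what "confidence is lowest" means, the sketch below computes a least-confidence score (1 minus the highest predicted class probability) and picks the sample with the largest score. The helper name `least_confident_index` and the example probabilities are assumptions made for this sketch; it is written in plain Python and does not call modAL.

```python
def least_confident_index(probabilities):
    """Return the index of the sample whose top class probability is lowest."""
    # Least-confidence score: 1 minus the highest predicted class probability
    uncertainties = [1.0 - max(probs) for probs in probabilities]
    return max(range(len(uncertainties)), key=uncertainties.__getitem__)

# Hypothetical predicted class probabilities for a pool of four unlabeled samples
pool_probabilities = [[0.9, 0.1], [0.6, 0.4], [0.51, 0.49], [0.2, 0.8]]

query_idx = least_confident_index(pool_probabilities)
print("sample to send for labeling:", query_idx)
```

The third sample is selected because its two class probabilities are nearly equal, so a human label there is likely to be the most informative.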

Active learning pipeline with low confidence

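import numpy as np

# Active learning loop
# Assumes y holds the human-provided labels for X_unlabeled, row for row
for _ in range(10):
    # Pick the pool sample the model is least confident about
    query_idx, _ = learner.query(X_unlabeled)
    # Teach only the new sample; teach() adds it to the existing training data
    learner.teach(X_unlabeled[query_idx], y[query_idx])
    X_labeled = np.vstack((X_labeled, X_unlabeled[query_idx]))
    y_labeled = np.append(y_labeled, y[query_idx])
    # Remove the queried sample from the pool so it is not selected again
    X_unlabeled = np.delete(X_unlabeled, query_idx, axis=0)
    y = np.delete(y, query_idx, axis=0)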

Let's practice!
