Reinforcement Learning from Human Feedback (RLHF)
Mina Parham
AI Engineer





Preference data preference_df with sources 'Journalist', 'Social Media Influencer', and 'Marketing Professional':

This sample data is easy to aggregate by grouping on 'id'. First define a majority-vote helper:
from collections import Counter

def majority_vote(df):
    # Count each (chosen, rejected) pair and return the most common one.
    votes = Counter(zip(df['chosen'], df['rejected']))
    return max(votes, key=votes.get)
Then apply it per prompt id:
df_majority = preference_df.groupby(['id']).apply(majority_vote)
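As a self-contained sketch, majority voting can be run end to end on a toy preference_df (the DataFrame values below are invented for illustration):

```python
from collections import Counter

import pandas as pd

# Hypothetical preference data: three sources label the same prompt ids.
preference_df = pd.DataFrame({
    'id': [1, 1, 1, 2, 2, 2],
    'source': ['Journalist', 'Social Media Influencer',
               'Marketing Professional'] * 2,
    'chosen':   ['A', 'A', 'B', 'C', 'D', 'C'],
    'rejected': ['B', 'B', 'A', 'D', 'C', 'D'],
})

def majority_vote(df):
    # Count each (chosen, rejected) pair and return the most common one.
    votes = Counter(zip(df['chosen'], df['rejected']))
    return max(votes, key=votes.get)

# One (chosen, rejected) pair per id, decided by majority.
df_majority = preference_df.groupby('id').apply(majority_vote)
print(df_majority[1])  # ('A', 'B'): two of three sources agree
print(df_majority[2])  # ('C', 'D')
```

Ties go to whichever pair `max` encounters first, so with an even number of annotators you may want an explicit tie-breaking rule.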
Preference data preference_df2 with the same three experts:

Use preference_df2 to identify the unreliable source:
df_majority = preference_df2.groupby('id').apply(majority_vote)
disagreements = {source: 0 for source in preference_df2['source'].unique()}
for _, row in preference_df2.iterrows():
    if (row['chosen'], row['rejected']) != df_majority[row['id']]:
        disagreements[row['source']] += 1
detect_unreliable_source = max(disagreements, key=disagreements.get)
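The snippet above assumes preference_df2 already exists; a runnable sketch with invented toy data (all labels hypothetical, with one source deliberately contrarian) might look like:

```python
from collections import Counter

import pandas as pd

# Hypothetical preference_df2 in which 'Social Media Influencer'
# always disagrees with the other two sources.
preference_df2 = pd.DataFrame({
    'id': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'source': ['Journalist', 'Social Media Influencer',
               'Marketing Professional'] * 3,
    'chosen':   ['A', 'B', 'A', 'C', 'D', 'C', 'E', 'F', 'E'],
    'rejected': ['B', 'A', 'B', 'D', 'C', 'D', 'F', 'E', 'F'],
})

def majority_vote(df):
    votes = Counter(zip(df['chosen'], df['rejected']))
    return max(votes, key=votes.get)

df_majority = preference_df2.groupby('id').apply(majority_vote)

# Count how often each source deviates from the per-id majority.
disagreements = {source: 0 for source in preference_df2['source'].unique()}
for _, row in preference_df2.iterrows():
    if (row['chosen'], row['rejected']) != df_majority[row['id']]:
        disagreements[row['source']] += 1

detect_unreliable_source = max(disagreements, key=disagreements.get)
print(detect_unreliable_source)  # Social Media Influencer
```

The source with the most disagreements is flagged; in practice you would compare disagreement rates rather than raw counts if sources labeled different numbers of examples.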