Archive | Mun Hou's Blog

Um..! 17 posts in total. Keep on posting.

2026 (1 post)

01-14

From PPO to GDPO — Understanding the Evolution of Reinforcement Learning Algorithms

2023 (7 posts)

08-14

Byte-Pair-Encoding (BPE)

08-12

Paper Study: LLaMA 2 Open Foundation and Fine-Tuned Chat Models

08-11

Paper Study: LLaMA Open and Efficient Foundation Language Models

03-02

Proximal Policy Optimization Algorithms (PPO and PPO2)

03-01

Off-Policy Gradient Methods

03-01

Policy Gradient Methods

03-01

Importance Sampling

2022 (2 posts)

12-02

Detecting Out-of-Distribution Samples with kNN

02-09

Simple Implementation of Rendezvous Architecture for Machine Learning Services
