Fair Reinforcement Learning for Just AI
ICLR 2026 Publication
⚖️ Democratic Alignment: Seamlessly incorporates multiple, competing sets of values from different agents, moving past the "one-size-fits-all" limitation of traditional RLHF.
📦 Black-Box Policy Optimization: Operates as a wrapper around standard policy optimization algorithms, removing direct dependency on the total number of states or actions.
🚀 Orders of Magnitude Faster: Drastically reduces sample complexity and is orders of magnitude more computationally efficient than prior tabular methods.
Link: https://github.com/EzgiKorkmaz/fair-reinforcement-learning