Basically, the question above: in RL, people typically encode the state as a tensor of planes with "channels", as in the original AlphaZero paper. These channels are typically one-hot encoded rather than binary encoded. By this I mean that if there are, say, 6 piece types in chess, instead of encoding each one as a 3-bit vector (since 2^3 = 8 > 6), people use a 6-bit vector with exactly one bit set at a time. This seems wasteful, so there must be a deeper reason why it is done.
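To make the two encodings concrete, here is a minimal sketch in NumPy of the contrast described above. The board size, piece ids, and function names are all illustrative assumptions, not taken from any particular AlphaZero implementation:

```python
import numpy as np

# Hypothetical toy setup: a 4x4 board whose cells hold a piece id in 0..5
# (6 piece types), or -1 for an empty square.
rng = np.random.default_rng(0)
board = rng.integers(-1, 6, size=(4, 4))

def one_hot_planes(board, n_types=6):
    """One plane per piece type: plane k is 1 exactly where piece k sits."""
    planes = np.zeros((n_types, *board.shape), dtype=np.float32)
    for k in range(n_types):
        planes[k] = (board == k)
    return planes

def binary_planes(board, n_bits=3):
    """Compact encoding: 3 planes hold the bits of the piece id
    (2^3 = 8 >= 7 values once 'empty' gets its own code).
    Here ids are shifted by 1 so that all-zero bits mean 'empty'."""
    ids = np.where(board >= 0, board + 1, 0)
    planes = np.zeros((n_bits, *board.shape), dtype=np.float32)
    for b in range(n_bits):
        planes[b] = (ids >> b) & 1
    return planes

oh = one_hot_planes(board)  # shape (6, 4, 4) -- one channel per piece type
bi = binary_planes(board)   # shape (3, 4, 4) -- bit-packed, fewer channels
```

The binary version does use fewer channels, which is the "wasteful" intuition in the question; the answers below address why the one-hot form is still preferred.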
Mechanical sympathy. Using one-hot encoding, you can optimize the model to predict probabilities directly (log-odds, etc.), and you can take the argmax for a discrete prediction. It's much more meaningful and easier to train a model for. Consider what you'd have to do if you encoded pieces as discrete integers. If there's no inherent ordering (i.e., confusing index 2 with index 3 is not a smaller error than confusing index 1 with index 5), then it's wrong to learn integers as if their magnitude had some inherent meaning. – arpad, Sep 4, 2025
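The comment's point can be sketched with made-up numbers. With one-hot targets, the model emits one score per class, a softmax turns the scores into probabilities, and argmax gives the discrete prediction; with integer-label regression, squared error imposes an ordering the piece ids do not have:

```python
import numpy as np

# One-hot targets: one logit per class, softmax -> probabilities, argmax -> class.
logits = np.array([0.2, 2.5, -1.0, 0.3, 0.1, -0.5])  # hypothetical model output
probs = np.exp(logits) / np.exp(logits).sum()
pred = int(np.argmax(probs))  # index of the most probable class

# Integer-label regression instead implies a spurious ordering: under squared
# error, predicting 3 for a true label 2 looks "closer" than predicting 5,
# even though piece ids carry no such order.
true_label = 2
err_near = (3 - true_label) ** 2  # small penalty
err_far = (5 - true_label) ** 2   # large penalty, for no principled reason
```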