When we split a dataset into training, validation, and test sets, I remember Andrew Ng's course saying that the validation set is used for hyperparameter tuning (see this article: https://towardsdatascience.com/why-do-we-need-a-validation-set-in-addition-to-training-and-test-sets-5cf4a65550e0).
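To make the setup concrete, here is a minimal sketch of the workflow as I understand it. The toy data, the tiny k-NN classifier, and the names `three_way_split`, `knn_predict`, and `accuracy` are my own illustration (assumptions, not taken from the course or the article): the model is fit on the training set, the hyperparameter `k` is chosen by its score on the validation set, and the test set is touched only once at the end.

```python
import random


def three_way_split(data, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle a copy of the data and cut it into train / validation / test lists."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (1-D features)."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if 2 * votes > len(neighbours) else 0


def accuracy(train, eval_set, k):
    """Fraction of eval_set points classified correctly by k-NN fit on train."""
    correct = sum(knn_predict(train, x, k) == y for x, y in eval_set)
    return correct / len(eval_set)


# Hypothetical toy data: label is 1 when x > 0.5, with about 10% label noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

train, val, test = three_way_split(data)

# Hyperparameter tuning: pick the k with the best *validation* accuracy.
candidate_ks = [1, 3, 5, 9, 15]
best_k = max(candidate_ks, key=lambda k: accuracy(train, val, k))

# Final, one-time estimate on data that played no role in choosing k.
test_acc = accuracy(train, test, best_k)
print("best k:", best_k, "test accuracy:", test_acc)
```

Here `accuracy(train, val, k)` is the quantity used for tuning; swapping `val` for `train` in that call is the alternative I am asking about.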
My question is: why not use the training data for hyperparameter tuning, since it contains more data?