I have a dataset that I want to perform a regression on. However, some of the columns are not in numerical form, for example the extra classes column. What I want to know is whether changing "no" to 0 and "yes" to 1 will affect the results, given that the health and absences columns are on a 1-5 scale (1 being poor, 5 being great) and the grade column is on a 1-20 scale. Also, for the study time column, I was going to change "2 to 5 hours" to something suitable (for example 2), "5 to 10 hours" to 5, and so on.
Hi @Charlotte, welcome to CV! Regarding changing no to 0 and yes to 1, there is no issue there. Most modelling packages will do something similar under the hood anyway. Re "I was going to change the 2 to 5 hours to something suitable, for example 2": maybe? You could leave it as a categorical predictor too, which is more flexible. Setting "2-5 hours" to 2 and "5-10 hours" to 5, for example, means you are forcing a linear relationship onto the left hand side of the study time variable. – Alex J, commented Feb 21, 2024 at 21:32
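A minimal sketch of the point in the comment above, in plain Python. The grades, the "under 2 hours" category, and its code of 0 are hypothetical, made up purely for illustration. Treating study time as categorical amounts to fitting one mean per category, whereas a numeric code constrains the fitted grades to lie on a straight line in that code.

```python
from collections import defaultdict


def fit_line(x, y):
    """Least-squares intercept and slope for a single numeric predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sum_squared_error(predicted, observed):
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


# Hypothetical data: grades rise and then level off with study time.
study_time = ["under 2 hours", "under 2 hours", "2 to 5 hours",
              "2 to 5 hours", "5 to 10 hours", "5 to 10 hours"]
grade = [8, 10, 14, 16, 15, 15]

# Numeric coding: each category replaced by an assumed lower bound.
numeric_code = {"under 2 hours": 0, "2 to 5 hours": 2, "5 to 10 hours": 5}
x = [numeric_code[s] for s in study_time]
intercept, slope = fit_line(x, grade)
numeric_pred = [intercept + slope * xi for xi in x]

# Categorical coding: the least-squares fit is the mean grade per category.
groups = defaultdict(list)
for s, g in zip(study_time, grade):
    groups[s].append(g)
group_mean = {s: sum(v) / len(v) for s, v in groups.items()}
categorical_pred = [group_mean[s] for s in study_time]

sse_numeric = sum_squared_error(numeric_pred, grade)
sse_categorical = sum_squared_error(categorical_pred, grade)
print("SSE with numeric code:    ", sse_numeric)
print("SSE with categorical code:", sse_categorical)
```

With these invented numbers the category means are not collinear in the numeric code, so the straight line cannot reproduce them and its error is larger. The categorical version spends one extra parameter to buy that flexibility.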
1 Answer
Whenever you have a categorical variable, you can label it however you like without changing the substance of your statistical analysis. Changing the Yes/No labelling for a binary variable to 0/1 does not change the variable to a non-binary variable, and it should not cause any problems so long as your formulae key off that variable appropriately. (Make sure you check your spreadsheet formulae and your regression code to confirm that no inputs expect the Yes/No format.)
While the change you are proposing is allowable, it is usually better to keep descriptive labels for a binary variable rather than numeric binary coding. All standard statistical programming software should be able to handle regression inputs that are categorical variables with any labelling you wish to use.
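As a small check of the relabelling point, the sketch below fits a one-predictor least-squares line in plain Python under two different numeric codings of a Yes/No column. The grades and the helper names are invented for illustration; with real data you would normally pass the column to your statistics package's regression routine instead.

```python
def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fitted_values(x, y):
    intercept, slope = fit_simple_ols(x, y)
    return [intercept + slope * xi for xi in x]


# Hypothetical data for illustration only.
extra_classes = ["no", "yes", "yes", "no", "yes", "no"]
grade = [10, 15, 14, 9, 16, 11]

coding_a = {"no": 0, "yes": 1}  # the coding proposed in the question
coding_b = {"no": 1, "yes": 0}  # the same labels, swapped

x_a = [coding_a[v] for v in extra_classes]
x_b = [coding_b[v] for v in extra_classes]

coef_a = fit_simple_ols(x_a, grade)
coef_b = fit_simple_ols(x_b, grade)
fit_a = fitted_values(x_a, grade)
fit_b = fitted_values(x_b, grade)

print("coefficients, coding a:", coef_a)
print("coefficients, coding b:", coef_b)
print("same fitted values:",
      all(abs(a - b) < 1e-9 for a, b in zip(fit_a, fit_b)))
```

The intercept and slope change with the coding because they are expressed relative to whichever label is mapped to 0, but the fitted grade for each student is the same, which is the sense in which the labelling does not change the substance of the analysis.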
^ Ha! Tell me about isomorphism without saying "isomorphism"! (+1) – Galen, commented Feb 22, 2024 at 5:03
