The following is a simplified version of what I'm trying to do. I have the following vector:

wage = 1:10 # Generate a sequence from 1 to 10

And I want to create another vector wage_level such that:

(i) wage_level is "low" if wage is less than 5

(ii) wage_level is "normal" if wage is equal to 5

(iii) wage_level is "high" if wage is greater than 5

I know I can use nested ifelse statements to do it. However, as noted above, this is only a simplified version of what I really want to do: I have about 15 alternatives.

Edit

The answer provided below makes use of the cut() function, which works well in many cases. However, it does not seem to "work" in my case. A detailed explanation follows.

I was able to use the cut() function to create the wage_level vector:

wage = runif(10, 1, 10) # Randomly generate 10 values between 1 and 10

# Here I use the cut() function
wage_level = cut(wage,
                 breaks = c(1, 4, 6, 10),
                 labels = c("low", "normal", "high"),
                 include.lowest = TRUE)
> wage
[1] 5.522422 4.793292 8.161671 5.480415 1.396909 3.403013 4.940242 7.762142 6.364159 4.603998

> wage_level
[1] normal normal high   normal low    low    normal high   high   normal
Levels: low normal high

Now, let's suppose I want to use the wage_level vector to create another vector (the rating vector) using the cut() function. The condition to create the rating vector is as follows:

(i) rating is 1 if wage_level is equal to "low"

(ii) rating is 2 if wage_level is equal to "normal"

(iii) rating is 3 if wage_level is equal to "high"

My problem is that using the cut() function does not make the rating vector a numeric vector with the values of my choice. The following code does not work:

rating = cut(as.numeric(wage_level),
             breaks = c(0, 1, 2, 3),
             labels = c(1.2, 6.5, 8.9),
             include.lowest = TRUE)

> as.numeric(rating)
 [1] 2 2 3 2 1 1 2 3 3 2

I mainly have two problems here:

(i) I would have preferred a way to use the actual strings (i.e. "low", "normal" and "high") instead of the label indices.

(ii) The values in the rating vector have nothing to do with the values I specified.

Any other method to achieve the desired result?

Thank you very much for your help :)
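
On why as.numeric(rating) returns 2 2 3 ...: cut() returns a factor, and as.numeric() on a factor gives the integer level codes, not the labels. One possible way around both problems, sketched here under the assumption that each level maps to exactly one value, is a named lookup vector indexed by the level strings. The name rating_values is hypothetical; the values 1.2, 6.5 and 8.9 are the ones used in the code above.

```r
# Rebuild wage_level from the wage values printed in the question
wage <- c(5.522422, 4.793292, 8.161671, 5.480415, 1.396909,
          3.403013, 4.940242, 7.762142, 6.364159, 4.603998)
wage_level <- cut(wage,
                  breaks = c(1, 4, 6, 10),
                  labels = c("low", "normal", "high"),
                  include.lowest = TRUE)

# Hypothetical lookup table: names are the level strings, values are the ratings
rating_values <- c(low = 1.2, normal = 6.5, high = 8.9)

# Index by the level names (as.character), not by the integer codes
rating <- unname(rating_values[as.character(wage_level)])
rating
# [1] 6.5 6.5 8.9 6.5 1.2 1.2 6.5 8.9 8.9 6.5
```

If a factor with numeric-looking labels already exists, such as the rating factor built with cut() above, as.numeric(as.character(rating)) should recover the labels instead of the codes.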

Comments

  • You're probably looking for cut(). Commented Apr 7, 2013 at 19:14
  • @ndoogan is spot on. I've found that when working with a large number of interval breaks, it is helpful to have the breaks and labels be their own variables, often created with seq and paste0. Commented Apr 7, 2013 at 19:19
  • @ndoogan Please, I edited my question, would you please take a look? Commented Apr 12, 2013 at 5:46
  • @RicardoSaporta Please, I edited my question, would you please take a look? Commented Apr 12, 2013 at 5:48
  • @SavedByJESUS Why not create rating using the original numeric variable wage? Or perhaps work with creating wage_level as an ordered factor before using it to create rating. Commented Apr 12, 2013 at 16:24
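
The suggestion in the comments to keep the breaks and labels in their own variables might look like the sketch below. The 15 equal-width bins and the "level_" prefix are assumptions made purely for illustration; the real cut points and names would come from the actual problem.

```r
wage <- 1:10

# Hypothetical: 15 equal-width bins spanning 0 to 10
n_levels <- 15
wage_breaks <- seq(0, 10, length.out = n_levels + 1)  # 16 break points
wage_labels <- paste0("level_", seq_len(n_levels))    # one label per bin

wage_level <- cut(wage,
                  breaks = wage_breaks,
                  labels = wage_labels,
                  include.lowest = TRUE)

# Count how many wages fall into each of the 15 levels
table(wage_level)
```

Keeping the two vectors separate makes it easier to check that there is exactly one more break than there are labels, which cut() requires.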

1 Answer

wage <- 1:10
# Intervals are (0,4], (4,5] and (5,10], so for integers only 5 is "normal"
cut(wage, breaks = c(0, 4, 5, 10), include.lowest = TRUE,
    labels = c("low", "normal", "high"))
# [1] low    low    low    low    normal high   high   high   high   high  
# Levels: low normal high

What if the vector isn't ordered? No difference:

wage <- runif(10,1,10)
wage
# [1] 8.535146 4.964819 7.228050 9.150132 6.369952 8.451137 8.022293 7.621226
# [9] 1.070368 5.931904

cut(wage, breaks = c(0, 4, 5, 10), include.lowest = TRUE,
    labels = c("low", "normal", "high"))
# [1] high   normal high   high   high   high   high   high   low    high  

Note, though, that the "normal" level is applied to every value in the interval (4, 5], not only to exactly 5. If you're really working with real numbers, looking for exactly 5 might be an odd choice.
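
To make the boundaries concrete: by default cut() uses intervals that are open on the left and closed on the right, so the breaks c(0, 4, 5, 10) define (0, 4], (4, 5] and (5, 10]. A small check, with probe values chosen here only for illustration:

```r
# Probe values on and around the break points (illustrative, not from the question)
probe <- c(4, 4.5, 5, 5.0001)

probe_level <- cut(probe,
                   breaks = c(0, 4, 5, 10),
                   labels = c("low", "normal", "high"),
                   include.lowest = TRUE)

as.character(probe_level)
# low, normal, normal, high
```

So 4 still counts as "low", while anything above 4 and up to 5 inclusive is "normal".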


3 Comments

Thank you very much for your answer @ndoogan. This is very helpful; however, what should I do if the data is not arranged in an increasing order just like in the example? What if I had wage = runif(10, 1, 10)?
Why do you think that matters?
@SavedByJESUS I agree with themel. Try it and see. It will work fine.
