I have a pandas data frame with 50k rows. I’m trying to add a new column that is a randomly generated integer from 1 to 5.
If I wanted 50k unique random numbers, I’d use:
df1['randNumCol'] = random.sample(range(50000), len(df1))  # range replaces Python 2's xrange
but for this I’m not sure how to do it.
Side note: in R, I’d do:
sample(1:5, 50000, replace = TRUE)
One solution is to use
import numpy as np
df1['randNumCol'] = np.random.randint(1, 6, df1.shape[0])  # one value per row; the upper bound 6 is exclusive
Or, if the numbers are non-consecutive, you can use this (albeit slower):
df1['randNumCol'] = np.random.choice([1, 9, 20], df1.shape[0])
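As a rough end-to-end sketch of both approaches: the DataFrame below is a hypothetical stand-in for the 50k-row frame in the question, and the column names `value` and `randChoiceCol` are made up for illustration. It uses `len(df1)` for the row count, which is equivalent to `df1.shape[0]`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 50k-row DataFrame from the question.
df1 = pd.DataFrame({"value": range(50000)})

# One random integer per row; randint's upper bound (6) is exclusive,
# so the values fall in 1..5.
df1["randNumCol"] = np.random.randint(1, 6, len(df1))

# Non-consecutive values: draw with replacement from an explicit list.
df1["randChoiceCol"] = np.random.choice([1, 9, 20], len(df1))

# Quick sanity checks on the generated columns.
print(df1["randNumCol"].min(), df1["randNumCol"].max())
print(sorted(df1["randChoiceCol"].unique()))
```

Passing a single integer as the size gives a one-dimensional array with one entry per row, which is what a single column assignment expects.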
In order to make the results reproducible, you can set the seed with np.random.seed before generating the numbers.
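A minimal sketch of seeding, assuming NumPy's legacy global seed function `np.random.seed`; the seed value 0 and the small 10-row frame are arbitrary choices for illustration, not taken from the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame, just to keep the example quick to run.
df1 = pd.DataFrame({"value": range(10)})

np.random.seed(0)  # arbitrary seed; any fixed integer works
first = np.random.randint(1, 6, len(df1))

np.random.seed(0)  # resetting the same seed replays the same sequence
second = np.random.randint(1, 6, len(df1))

df1["randNumCol"] = first

# Both draws are identical because the seed was reset before each one.
print((first == second).all())
```

The seed has to be set before the random call it should affect; seeding afterwards does not change numbers that were already drawn.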
Answered By – Matt
Answer Checked By – Robin (AngularFixing Admin)