Using np.select with a column with ranges

Issue

I have this code:

import numpy as np
import pandas as pd

df = pd.DataFrame({'r': {0: '01', 1: '02', 2: '03', 3: '04', 4: ''},
                   'an': {0: 'a', 1: 'b,c', 2: '', 3: 'c,a,b', 4: ''}})

yielding the following dataframe:

    r   an
0   01  a
1   02  b,c
2   03  
3   04  c,a,b
4       

Using np.select, the desired output is as follows:

    r   an    s
0   01  a     13
1   02  b,c   [88,753]
2   03  
3   04  c,a,b [789,48,89] 
4       

I tried using the following code:

conditions=[
     (df['an']=='a')&(df['r']=='01'),
     (df['an']=='b')&(df['r']=='01'),
     (df['an']=='c')&(df['r']=='01'),
     (df['an']=='d')&(df['r']=='01'),
     (df['an']=='')&(df['r']=='01'),
     (df['an']=='a')&(df['r']=='02'),
     (df['an']=='b')&(df['r']=='02'),
     (df['an']=='c')&(df['r']=='02'),
     (df['an']=='d')&(df['r']=='02'),
     (df['an']=='')&(df['r']=='02'),
     (df['an']=='a')&(df['r']=='03'),
     (df['an']=='b')&(df['r']=='03'),
     (df['an']=='c')&(df['r']=='03'),
     (df['an']=='d')&(df['r']=='03'),
     (df['an']=='')&(df['r']=='03'),
     (df['an']=='a')&(df['r']=='04'),
     (df['an']=='b')&(df['r']=='04'),
     (df['an']=='c')&(df['r']=='04'),
     (df['an']=='d')&(df['r']=='04'),
     (df['an']=='')&(df['r']=='04')
      ]
      
choices=[
    13,
    75,
    6,
    89,
    '-',
    45,
    88,
    753,
    75,
    '-',
    0.2,
    15,
    79,
    63,
    '-',
    48,
    89,
    789,
    15,
    '-',
    ]
    
df['s']=np.select(conditions, choices)

Unfortunately, the code above only returned the desired output for row 0 (a single value); for the other rows it returned 0.
Is it possible to use np.select with a range of values?
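
For what it's worth, the zeros are np.select's default value: each condition compares the whole string in 'an' for equality, so 'b,c' never equals 'b' or 'c', no condition is true for that row, and the default (0) is used. A minimal sketch on a two-row frame with shortened, made-up conditions (assumed for illustration, not the full list above):

```python
import numpy as np
import pandas as pd

# Two rows are enough to show the effect (illustrative data)
df = pd.DataFrame({'r': ['01', '02'], 'an': ['a', 'b,c']})

conditions = [
    (df['an'] == 'a') & (df['r'] == '01'),
    (df['an'] == 'b') & (df['r'] == '02'),  # 'b,c' == 'b' is False
]
choices = [13, 88]

# True where at least one condition matched the row
matched = pd.concat(conditions, axis=1).any(axis=1)

# Rows with no matching condition fall back to the default
result = np.select(conditions, choices, default=0)

print(matched.tolist())
print(result.tolist())
```

So np.select itself cannot split 'b,c' into parts; the string has to be split before any lookup happens.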

Solution

IIUC, use a container (Series/DataFrame/dictionary) to contain the matches, then reference them using a loop:

# mapping the references, can be any value
df_map = pd.DataFrame({'a': ['sa01', 'sa02', 'sa03', 'sa04'],
                       'b': ['sb01', 'sb02', 'sb03', 'sb04'],
                       'c': ['sc01', 'sc02', 'sc03', 'sc04'],
                       'd': ['sd01', 'sd02', 'sd03', 'sd04'],
                        '': ['s01', 's02', 's03', 's04'],     # optional
                       }, index=['01', '02', '03', '04']
                       )
# derive a dictionary
# (you could also manually define the dictionary
#  if not all combinations are needed)
d = df_map.stack().to_dict()
# {('01', 'a'): 'sa01',
#  ('01', 'b'): 'sb01',
#  ('01', 'c'): 'sc01',
#  ('01', 'd'): 'sd01',
#  ('01',  ''): 's01',
#  ('02', 'a'): 'sa02',
#  ...}

# map the values: look up each (r, letter) pair;
# keep a list only when 'an' holds several letters
df['s'] = [l if len(l:=[d.get((r, e)) for e in s.split(',')])>1 else l[0]
           for r,s in zip(df['r'], df['an'])]

output:

    r     an                   s
0  01      a                sa01
1  02    b,c        [sb02, sc02]
2  03                        s03
3  04  c,a,b  [sc04, sa04, sb04]
4                           None
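
To get the numbers from the question rather than the placeholder strings, the dictionary can be written by hand, as the comment in the code suggests. A sketch, assuming the values pair up with (r, an) in the order of the question's choices list, and that a missing pair should give an empty string (the helper name lookup and that default are my assumptions):

```python
import pandas as pd

df = pd.DataFrame({'r': ['01', '02', '03', '04', ''],
                   'an': ['a', 'b,c', '', 'c,a,b', '']})

# (r, an) -> value, copied from the question's choices list
d = {
    ('01', 'a'): 13,  ('01', 'b'): 75, ('01', 'c'): 6,   ('01', 'd'): 89,
    ('02', 'a'): 45,  ('02', 'b'): 88, ('02', 'c'): 753, ('02', 'd'): 75,
    ('03', 'a'): 0.2, ('03', 'b'): 15, ('03', 'c'): 79,  ('03', 'd'): 63,
    ('04', 'a'): 48,  ('04', 'b'): 89, ('04', 'c'): 789, ('04', 'd'): 15,
}

def lookup(r, an, default=''):
    """Return one value for a single letter, a list for several."""
    vals = [d.get((r, e), default) for e in an.split(',')]
    return vals if len(vals) > 1 else vals[0]

df['s'] = [lookup(r, an) for r, an in zip(df['r'], df['an'])]
print(df)
```

Rows with an empty 'an' end up with the default here; pass default='-' to lookup if the dash from the original choices is wanted instead.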

Answered By – mozway

Answer Checked By – Candace Johnson (AngularFixing Volunteer)
