How is the smooth dice loss differentiable?


I am training a U-Net in Keras by minimizing the dice_loss function that is commonly used for this problem (adapted from here and here):

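from keras import backend as K

def dsc(y_true, y_pred):
    smooth = 1.  # added to both numerator and denominator
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    # The element-wise product acts as a soft intersection
    intersection = K.sum(y_true_f * y_pred_f)
    score = (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
    return score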

def dice_loss(y_true, y_pred):
    return (1 - dsc(y_true, y_pred))

This implementation differs from the traditional Dice loss because it has a smoothing term that supposedly makes it “differentiable”. I don’t understand how adding the smooth term, rather than something like 1e-7 in the denominator, makes it better, because it actually changes the loss values. I checked this by evaluating a trained U-Net on a test set with a regular Dice implementation, as follows:

import numpy as np

def dice(im1, im2):
    # Built-in bool and float: the np.bool and np.float aliases were removed in NumPy 1.24
    im1 = np.asarray(im1).astype(bool)
    im2 = np.asarray(im2).astype(bool)
    intersection = np.logical_and(im1, im2)
    return float(2. * intersection.sum()) / (im1.sum() + im2.sum() + 1e-7)

Can someone explain why the smooth dice loss is conventionally used?


Adding smooth to the loss does not make it differentiable. What makes it differentiable is:

  1. Relaxing the threshold on the prediction: you do not cast y_pred to np.bool, but leave it as a continuous value between 0 and 1.
  2. Replacing set operations: you do not use operations such as np.logical_and, but rather the element-wise product, which approximates the non-differentiable intersection operation.
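The two points above can be checked numerically. The following is a minimal NumPy sketch, not the Keras code from the question: the helper names soft_dice, hard_dice and finite_diff_grad and the toy arrays are made up for illustration. It estimates the gradient with respect to y_pred by central finite differences, which should be non-zero for the soft version and zero for the thresholded one as long as no prediction sits near the threshold.

```python
import numpy as np

def soft_dice(y_true, y_pred, smooth=1.0):
    """Soft Dice: continuous predictions, product instead of logical AND."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (y_true.sum() + y_pred.sum() + smooth)

def hard_dice(y_true, y_pred, threshold=0.5, eps=1e-7):
    """Regular Dice: predictions are thresholded, intersection is a set operation."""
    y_true = np.asarray(y_true).astype(bool).ravel()
    y_pred = (np.asarray(y_pred, dtype=float) > threshold).ravel()
    intersection = np.logical_and(y_true, y_pred).sum()
    return 2.0 * intersection / (y_true.sum() + y_pred.sum() + eps)

def finite_diff_grad(fn, y_true, y_pred, h=1e-4):
    """Central finite-difference estimate of d fn / d y_pred (illustrative helper)."""
    y_pred = np.asarray(y_pred, dtype=float)
    grad = np.zeros_like(y_pred)
    for i in range(y_pred.size):
        step = np.zeros_like(y_pred)
        step.flat[i] = h
        grad.flat[i] = (fn(y_true, y_pred + step) - fn(y_true, y_pred - step)) / (2 * h)
    return grad

# Toy data (assumed for illustration); no prediction is within h of the threshold.
y_true = np.array([1.0, 1.0, 0.0, 0.0])
y_pred = np.array([0.9, 0.3, 0.2, 0.6])

soft_grad = finite_diff_grad(soft_dice, y_true, y_pred)  # non-zero everywhere
hard_grad = finite_diff_grad(hard_dice, y_true, y_pred)  # zero: thresholding hides small changes
print("soft:", soft_grad)
print("hard:", hard_grad)
```

Because the thresholded score is piecewise constant in y_pred, gradient descent gets no signal from it, regardless of what constant is added to the denominator.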

You only add smooth to avoid division by zero when neither y_pred nor y_true contains any foreground pixels.
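As a rough illustration of the empty-mask case, here is a small NumPy sketch (the function dice_score and its eps_only flag are hypothetical, written only for this comparison). It evaluates the score on two all-background masks with no smoothing, with smooth in both numerator and denominator as in the question's dsc, and with a small epsilon in the denominator only.

```python
import numpy as np

def dice_score(y_true, y_pred, smooth=0.0, eps_only=False):
    """Dice score on flat arrays.

    smooth is added to the denominator, and also to the numerator
    unless eps_only is True (hypothetical flag for this comparison).
    """
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    intersection = np.sum(y_true * y_pred)
    numerator = 2.0 * intersection + (0.0 if eps_only else smooth)
    denominator = y_true.sum() + y_pred.sum() + smooth
    return numerator / denominator

empty = np.zeros(4)  # neither mask contains foreground pixels

# No smoothing: 0 / 0, which NumPy evaluates to nan (warning silenced here).
with np.errstate(divide="ignore", invalid="ignore"):
    unsmoothed = dice_score(empty, empty, smooth=0.0)

# smooth in numerator and denominator: 1 / 1, a perfect score for a correct empty prediction.
smoothed = dice_score(empty, empty, smooth=1.0)

# Small epsilon in the denominator only: 0 / 1e-7, scored as a complete miss.
eps_only = dice_score(empty, empty, smooth=1e-7, eps_only=True)

print(unsmoothed, smoothed, eps_only)
```

Under these assumptions, both variants avoid the division by zero, but they disagree on how a correctly predicted empty mask is scored, which is one reason the reported values differ between the two implementations.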

Answered By – Shai

Answer Checked By – Clifford M. (AngularFixing Volunteer)
