How do I write a custom F1 loss function with weighted averaging in Keras?


Nikhil Mishra

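I am trying to do multi-class classification in Keras. So far I have been using categorical_crossentropy as the loss function. But since the required metric is weighted F1, I am not sure categorical_crossentropy is the best choice of loss. I tried to implement a weighted F1 score in Keras using sklearn.metrics.f1_score, but I get errors because of conversion problems between tensors and scalars.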

Something like this:

def f1_loss(y_true, y_pred):
   return 1 - f1_score(np.argmax(y_true, axis=1), np.argmax(y_pred, axis=1), average='weighted')

followed by

 model.compile(loss=f1_loss, optimizer=opt)

How can I write this loss function in Keras?

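Edit:

The shape of y_true and y_pred is (n_samples, n_classes), which in my case is (n_samples, 4).

y_true and y_pred are both tensors, so sklearn's f1_score cannot work on them directly. I need a function that computes the weighted F1 on tensors.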

Daniel Möller

The variable names are self-explanatory:

from keras import backend as K  # assumes the Keras 2.x-style backend API

def f1_weighted(true, pred):  # both with shape (batch, 4)

    # For use as a metric, include these two lines; for a loss, leave them out.
    # They round 'pred' to exact zeros and ones.
    # predLabels = K.argmax(pred, axis=-1)
    # pred = K.one_hot(predLabels, 4)

    ground_positives = K.sum(true, axis=0)       # = TP + FN
    pred_positives = K.sum(pred, axis=0)         # = TP + FP
    true_positives = K.sum(true * pred, axis=0)  # = TP
    # all with shape (4,)

    precision = (true_positives + K.epsilon()) / (pred_positives + K.epsilon())
    recall = (true_positives + K.epsilon()) / (ground_positives + K.epsilon())
    # precision = 1 where pred_positives == 0; recall = 1 where ground_positives == 0
    # both with shape (4,)

    f1 = 2 * (precision * recall) / (precision + recall + K.epsilon())
    # not sure whether this last epsilon is necessary:
    # mathematically it is not, but it may help avoid numerical instability
    # still with shape (4,)

    # weight each class's F1 by its share of the true samples in the batch
    weighted_f1 = f1 * ground_positives / K.sum(ground_positives)
    weighted_f1 = K.sum(weighted_f1)

    return 1 - weighted_f1  # for a metric, return only 'weighted_f1'
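To see what the function above computes without needing Keras, here is a minimal plain-Python sketch of the same arithmetic on a tiny 3-class batch. The helper names (`weighted_f1`, `harden`), the toy data, and the `EPS` value standing in for `K.epsilon()` are all assumptions made for illustration; this is not part of the original answer.

```python
EPS = 1e-7  # stands in for K.epsilon(); the exact value is an assumption


def weighted_f1(true, pred, eps=EPS):
    """Weighted F1 over one batch; `true` and `pred` are lists of per-class rows."""
    n_classes = len(true[0])
    # Column sums, mirroring K.sum(..., axis=0)
    ground_pos = [sum(row[c] for row in true) for c in range(n_classes)]  # TP + FN
    pred_pos = [sum(row[c] for row in pred) for c in range(n_classes)]    # TP + FP
    true_pos = [sum(t[c] * p[c] for t, p in zip(true, pred))
                for c in range(n_classes)]                                # TP

    total = sum(ground_pos)
    score = 0.0
    for gp, pp, tp in zip(ground_pos, pred_pos, true_pos):
        precision = (tp + eps) / (pp + eps)
        recall = (tp + eps) / (gp + eps)
        f1 = 2 * precision * recall / (precision + recall + eps)
        score += f1 * gp / total  # weight by the class's share of true samples
    return score


def harden(pred):
    """Replace each row by a one-hot vector at its argmax (the 'metric' variant)."""
    hard = []
    for row in pred:
        best = row.index(max(row))
        hard.append([1.0 if c == best else 0.0 for c in range(len(row))])
    return hard


# Toy batch: true classes are 0, 0, 1, 2; the second sample is misclassified as 1.
true = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
pred = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.2, 0.6, 0.2], [0.1, 0.1, 0.8]]

soft_score = weighted_f1(true, pred)          # computed on raw probabilities
hard_score = weighted_f1(true, harden(pred))  # computed on one-hot predictions
print(soft_score, hard_score)
```

With one-hot predictions the result is the usual weighted F1 for this batch (about 0.75 here). With raw probabilities the same formulas give a "soft" value that changes smoothly as the predictions change, which is why the rounding lines are left out when the function is used as a loss.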

Important notes:

This loss is computed per batch (like any Keras loss).

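So if you use small batches, the value will be unstable from one batch to the next, and you may get poor results. Use large batches, big enough to contain a significant number of samples from every class.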
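The batch effect can be illustrated with a small plain-Python sketch. It computes a hard-label weighted F1 on eight samples at once and then on batches of two. The function name `weighted_f1_hard`, the toy labels, and the convention of scoring an empty class as 0 are assumptions for illustration only.

```python
def weighted_f1_hard(y_true, y_pred, n_classes):
    """Weighted F1 from integer labels (no smoothing epsilon; empty classes score 0)."""
    total = len(y_true)
    score = 0.0
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        gp = sum(1 for t in y_true if t == c)  # TP + FN
        pp = sum(1 for p in y_pred if p == c)  # TP + FP
        f1 = 2 * tp / (gp + pp) if (gp + pp) > 0 else 0.0
        score += f1 * gp / total               # weight by class support
    return score


# Imbalanced toy data: six samples of class 0, two of class 1.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0]

# Score on all eight samples at once.
full = weighted_f1_hard(y_true, y_pred, 2)

# Score on consecutive batches of two samples.
batch_size = 2
per_batch = [
    weighted_f1_hard(y_true[i:i + batch_size], y_pred[i:i + batch_size], 2)
    for i in range(0, len(y_true), batch_size)
]
mean_of_batches = sum(per_batch) / len(per_batch)

print(full, per_batch, mean_of_batches)
```

On this data the full-set score is 0.75, while the per-batch scores range from about 0.67 to 1.0 and average to about 0.83. Most batches contain only one class, so the per-batch values are not a reliable estimate of the score on the whole set.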

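Because this loss collapses the batch dimension (it returns a single value for the whole batch), you will not be able to use some Keras features that depend on the batch dimension, such as sample weights.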
