Question about the classifier used for IntentAccuracyDailyDialog. #71

zhangjf-nlp · 2024-04-24T11:00:50Z

According to the source code of class IntentAccuracyDailyDialog(BaseMetric), the intent likelihood of utterances on DailyDialog is computed by rajkumarrrk/roberta-daily-dialog-intent-classifier.

However, according to the config.json of this classifier, it is used for emotion classification, with four labels: joy, optimism, anger, and sadness, while the intent labels on DailyDialog should be Inform, Questions, Directives, and Commissive instead.
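A minimal sketch of the check described above, for anyone who wants to reproduce it. It compares the label names in a `config.json`-style dictionary against the four DailyDialog intent names. The helper names and the id-to-label ordering in `reported_config` are hypothetical; only the set of label names (joy, optimism, anger, sadness) comes from the observation above.

```python
# Intent names as listed in this issue, lower-cased for comparison.
EXPECTED_INTENTS = {"inform", "questions", "directives", "commissive"}


def label_names(config: dict) -> set:
    """Return the lower-cased label names from a config's id2label mapping."""
    return {str(name).lower() for name in config.get("id2label", {}).values()}


def looks_like_intent_head(config: dict) -> bool:
    """True only if the label names are exactly the four intent names."""
    return label_names(config) == EXPECTED_INTENTS


# Assumption: the ordering of ids here is illustrative, not copied from the
# actual config.json; only the four label names are taken from the report.
reported_config = {
    "id2label": {"0": "joy", "1": "optimism", "2": "anger", "3": "sadness"}
}

if __name__ == "__main__":
    print("label names:", sorted(label_names(reported_config)))
    print("matches intent labels:", looks_like_intent_head(reported_config))
```

A mismatch in label names alone does not prove the weights were never fine-tuned on intents, since the names could be left over from a base checkpoint, which is why the empirical check below matters.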

So my question is: has this classifier actually been fine-tuned for intent classification on DailyDialog utterances?

Empirically, I observe that this classifier's predictions on the ground-truth utterances in DailyDialog are unbalanced and poorly aligned with the labelled intent distribution, as shown below.

classification results on test set

| | label-0 | label-1 | label-2 | label-3 | Intent Accuracy |
|---|---|---|---|---|---|
| classification on ground truth | 0.7102 | 0.0055 | 0.0275 | 0.2071 | 0.6147 |
| intent labels in DailyDialog | 0.4988 | 0.2231 | 0.1565 | 0.1213 | - |
| classification on SFT generation | 0.5363 | 0.1591 | 0.0944 | 0.2100 | 0.4034 |
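To put a number on the mismatch, here is a rough sketch that computes per-label gaps and half the L1 distance between the rows of the table. The values are copied from the table as reported (the first row sums to about 0.95, so nothing is renormalized). The function names are hypothetical, and it assumes the label-0 to label-3 columns refer to the same ids in every row.

```python
LABELS = ["label-0", "label-1", "label-2", "label-3"]

# Rows copied from the table above, used as reported.
pred_on_ground_truth = [0.7102, 0.0055, 0.0275, 0.2071]
dataset_intent_labels = [0.4988, 0.2231, 0.1565, 0.1213]
pred_on_sft_generation = [0.5363, 0.1591, 0.0944, 0.2100]


def per_label_gap(p, q):
    """Signed difference p[i] - q[i] for each label."""
    return [a - b for a, b in zip(p, q)]


def half_l1(p, q):
    """Half the L1 distance; equals total variation when both sum to 1."""
    return 0.5 * sum(abs(d) for d in per_label_gap(p, q))


def largest_gap_label(p, q):
    """Name of the label with the largest absolute difference."""
    gaps = [abs(d) for d in per_label_gap(p, q)]
    return LABELS[gaps.index(max(gaps))]


if __name__ == "__main__":
    for name, row in [
        ("ground truth", pred_on_ground_truth),
        ("SFT generation", pred_on_sft_generation),
    ]:
        print(
            name,
            "half-L1 vs dataset labels:",
            round(half_l1(row, dataset_intent_labels), 4),
            "largest gap at:",
            largest_gap_label(row, dataset_intent_labels),
        )
```

If the classifier were well calibrated for DailyDialog intents, its predictions on the ground-truth utterances would be expected to sit close to the dataset's own label distribution.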
