-
I find it interesting to have probing tasks. It would provide useful insight for people who work on XAI and interpretability.
Could you please open a discussion instead of an issue?
-
I will convert this to a discussion, but feel free to keep it going.
-
I've suggested (#562) incorporating Linguistic Probing tasks as outlined by Conneau et al. These tasks serve as a method for evaluating how accurately embeddings capture linguistic characteristics of sentences. They fall into three categories: Surface Information, Syntactic Information, and Semantic Information; please see the paper for further details. All probing tasks are cast as classification tasks, and detailed descriptions of them are on this page.
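To make the setup concrete, here is a minimal sketch of a probing evaluation in the spirit of Conneau et al.: a simple classifier is trained on frozen sentence embeddings to predict a linguistic property (here a hypothetical 6-bin sentence-length label, as in their SentLen task). The random embeddings and labels are stand-ins for a real embedding model and a real probing dataset; nothing here reflects the actual MTEB API.

```python
# Sketch of a probing task cast as classification: fit a shallow probe
# on frozen sentence embeddings and report its accuracy. The embeddings
# and labels below are random stand-ins, not real data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_sentences, dim, n_classes = 1000, 64, 6

# In a real run, embeddings come from the model under evaluation and
# labels (e.g. sentence-length bins) from the probing dataset.
embeddings = rng.normal(size=(n_sentences, dim))
labels = rng.integers(0, n_classes, size=n_sentences)

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.2, random_state=0
)

probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
accuracy = accuracy_score(y_test, probe.predict(X_test))
print(f"probe accuracy: {accuracy:.3f}")
```

With random embeddings the probe should land near chance (1/6 here); the interesting signal is how far above chance a real model's embeddings get.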
Discussion in PR #562
As pointed out by @x-tabdeveloping in #562, not all of these tasks fit a classification framing naturally, such as sentence length prediction or, most notably, the Word Count dataset, which would amount to 1000-way classification. Instead, we could evaluate some tasks differently, for example using word retrieval instead of classification for Word Count, as proposed by @x-tabdeveloping. Let's use this discussion to work out possible evaluation strategies for probing tasks. Please share your thoughts.
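One alternative to 1000-way classification that might be worth discussing is treating Word Count as a regression probe and reporting mean absolute error. The sketch below is purely illustrative, again with random stand-in embeddings and labels; the retrieval-based formulation from the discussion would be another option entirely.

```python
# Sketch of a Word Count style probe framed as regression rather than
# 1000-way classification: predict the count from frozen embeddings
# and score with mean absolute error. All data here are random stand-ins.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 64))        # stand-in embeddings
word_counts = rng.integers(1, 1001, size=1000)  # stand-in count labels

X_tr, X_te, y_tr, y_te = train_test_split(
    embeddings, word_counts, test_size=0.2, random_state=0
)
reg = Ridge().fit(X_tr, y_tr)
mae = mean_absolute_error(y_te, reg.predict(X_te))
print(f"word-count probe MAE: {mae:.1f}")
```

A regression framing sidesteps the huge label space, though it does change what "capturing word count" means compared to exact classification, which is part of what needs deciding here.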
Probing as a new task category in MMTEB
Regardless of the evaluation strategy, probing is fundamentally different from the task categories we currently have, as the paper's abstract beautifully puts it: "Although much effort has recently been devoted to training high-quality sentence embeddings, we still have a poor understanding of what they are capturing. 'Downstream' tasks, often based on sentence classification, are commonly used to evaluate the quality of sentence representations. The complexity of the tasks makes it however difficult to infer what kind of information is present in the representations. We introduce here 10 probing tasks designed to capture simple linguistic features of sentences, and we use them to study embeddings generated by three different encoders trained in eight distinct ways, uncovering intriguing properties of both encoders and training methods." Therefore I propose adding Probing as a new task category. @Muennighoff @KennethEnevoldsen @isaac-chung @imenelydiaker and anyone else I have missed tagging.