Another interesting research paper: "Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks."
Ref: https://arxiv.org/abs/2307.02477
The paper starts from the observation that recent language models show impressive performance on a wide variety of tasks, and asks whether the reasoning skills behind that performance are transferable. It tests this by introducing counterfactual variants of familiar tasks. Some of the use cases, such as the drawing and music tasks, are attached.
The authors observe a consistent and substantial performance degradation on these counterfactual tasks across LMs (GPT-4, GPT-3.5, Claude, PaLM-2).
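To make the idea of a counterfactual variant concrete, here is a minimal Python sketch of comparing a default task with a counterfactual version of it. It is an illustration under stated assumptions, not the paper's evaluation code: two-digit addition in base 10 versus base 9 is chosen here purely as an example, and `base10_only_solver` is a hypothetical stand-in for a model that has memorized the familiar setting instead of applying the general procedure.

```python
def to_base(n: int, base: int) -> str:
    """Render a non-negative integer as a digit string in the given base (2-10)."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % base))
        n //= base
    return "".join(reversed(digits))


def add_in_base(a: str, b: str, base: int) -> str:
    """Ground truth: add two digit strings interpreted in `base`."""
    return to_base(int(a, base) + int(b, base), base)


def accuracy(solver, problems, base: int) -> float:
    """Fraction of problems where the solver matches the ground truth."""
    correct = sum(solver(a, b, base) == add_in_base(a, b, base) for a, b in problems)
    return correct / len(problems)


def base10_only_solver(a: str, b: str, base: int) -> str:
    """Hypothetical stand-in for a model: ignores the stated base
    and always adds as if the numbers were decimal."""
    return str(int(a) + int(b))


# Illustrative problems; every digit is valid in both base 9 and base 10.
problems = [("27", "62"), ("14", "35"), ("58", "21"), ("12", "34")]

default_acc = accuracy(base10_only_solver, problems, base=10)
counterfactual_acc = accuracy(base10_only_solver, problems, base=9)

print(f"default (base 10):       {default_acc:.2f}")
print(f"counterfactual (base 9): {counterfactual_acc:.2f}")
```

A solver that really carries out addition would score the same under both conditions. The stand-in only gets the base 9 problem without a carry right, so the gap between the two scores is the kind of degradation the post describes.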