Seeing All Sides: Multi-Perspective In-Context Learning for Subjective NLP
Muscato, Benedetta; Gezici, Gizem; Giannotti, Fosca
2026
Abstract
Modern language models excel at factual reasoning but struggle with value diversity: the multiplicity of plausible human perspectives. Tasks such as hate speech or sexism detection expose this limitation, because in these tasks human disagreement reflects the diversity of perspectives that models need to account for, rather than dataset noise. In this paper, we explore whether multi-perspective in-context learning (ICL) can align large language models (LLMs) with this diversity without parameter updates. We evaluate four LLMs on five datasets across three languages (English, Arabic, Italian), considering three label-space representations (aggregated hard, disaggregated hard, and disaggregated soft) and five demonstration selection and ordering strategies. Our multi-perspective approach outperforms standard prompting on aggregated English labels, while disaggregated soft predictions better align with human judgments on the Arabic and Italian datasets. These findings highlight the importance of perspective-aware LLMs for reducing bias and polarization, while also revealing the challenges of applying ICL to socially sensitive tasks. We further probe model faithfulness using eXplainable AI (XAI), offering insights into how LLMs handle human disagreement.

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| 2026.findings-eacl.137.pdf | Open access | Published version | Creative Commons | 923.4 kB | Adobe PDF |
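
The three label-space representations named in the abstract can be illustrated with a minimal Python sketch. The abstract does not define them, so the definitions here are assumptions based on common usage: aggregated hard as a majority vote, disaggregated hard as the unmerged per-annotator labels, and disaggregated soft as the share of annotators choosing each label. The function names, the tie-breaking rule, and the example votes are hypothetical and are not taken from the paper.

```python
from collections import Counter


def aggregated_hard(votes):
    """Assumed definition: a single majority label.

    Ties are broken by taking the label that sorts first,
    which is an arbitrary choice made for this sketch.
    """
    counts = Counter(votes)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)


def disaggregated_hard(votes):
    """Assumed definition: keep every annotator's label, unmerged."""
    return list(votes)


def disaggregated_soft(votes, labels):
    """Assumed definition: fraction of annotators choosing each label."""
    counts = Counter(votes)
    return {label: counts[label] / len(votes) for label in labels}


# Hypothetical annotations for one text from three annotators.
votes = ["sexist", "not_sexist", "sexist"]
labels = ["not_sexist", "sexist"]

majority = aggregated_hard(votes)
per_annotator = disaggregated_hard(votes)
distribution = disaggregated_soft(votes, labels)
```

Under these assumptions, the aggregated label discards the dissenting annotator, while the soft distribution keeps the disagreement as a one-third versus two-thirds split.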