Under Review: Pragmatic Implicature Processing in ChatGPT
PsyArXiv Preprints, 2023
Recommended citation: Qiu, Z., Duan, X., & Cai, Z. G. (2023). Pragmatic Implicature Processing in ChatGPT. PsyArXiv. https://psyarxiv.com/qtbh9/
Recent large language models (LLMs) and LLM-driven chatbots, such as ChatGPT, have sparked debate regarding whether these artificial systems can develop human-like linguistic capacities. We examined this issue by investigating whether ChatGPT resembles humans in its ability to enrich literal meanings of utterances with pragmatic implicatures. Humans not only distinguish implicatures from truth-conditional meanings of utterances but also compute implicatures contingent on the communicative context. In three preregistered experiments (https://osf.io/4bcx9/), we assessed whether ChatGPT resembles humans in the computation of pragmatic implicatures. Experiment 1 investigated generalized conversational implicatures (GCIs); for example, the utterance “She walked into the bathroom. The window was open.” has the conversational implicature that the window is located in the bathroom, while the truth-conditional (literal) meaning of the utterance allows for the possibility that the window is located elsewhere. Humans demonstrate their ability to distinguish GCIs from truth-conditional meanings by inhibiting the computation of GCIs when explicitly instructed to focus on the literal sense of the utterances. We tested whether ChatGPT could also inhibit the computation of GCIs as humans do. Experiments 2 and 3 investigated whether the communicative context modulates how ChatGPT computes a specific type of GCI, namely scalar implicatures (SIs). For humans, the sentence “Julie had found a crab or a starfish” implies that Julie did not find both a crab and a starfish, even though the sentence’s literal meaning allows for this possibility. Moreover, this implicature is argued to be more available when the word “or” is in the information focus (e.g., as a reply to the question “What had Julie found?”) than when it is in the information background (e.g., as a reply to the question “Who had found a crab or a starfish?”).
Experiment 2 tested whether ChatGPT shows similar sensitivity to information structure when computing SIs. Experiment 3 focused on a different contextual aspect, investigating whether face-threatening and face-boosting contexts have different effects on how ChatGPT computes SIs. Previous research has shown that human interlocutors compute more SIs in face-boosting contexts (e.g., interpreting the utterance “Some people loved your poem” as meaning “Not all people loved your poem”) than in face-threatening contexts; we tested whether ChatGPT exhibits a similar tendency. In all three experiments, ChatGPT did not display human-like flexibility in switching between pragmatic and semantic processing and failed to show the well-established effects of communicative context on the SI rate. Overall, our experiments demonstrate that although ChatGPT parallels or even surpasses humans in many linguistic tasks, it still does not closely resemble human beings in the computation of GCIs. We attribute this discrepancy to differences between humans and machines in how GCIs are acquired and in the computational resources available to them.