PeCoQ corpus

PeCoQ consists of complex questions with answers and SPARQL queries.

The PeCoQ corpus consists of 10,000 complex questions with their answers and corresponding SPARQL queries. The corpus was extracted automatically by an algorithm from FarsBase, a Persian knowledge graph. Each question has at least two paraphrases written by linguists. The questions are divided into six categories by type of complexity: multi-entity, multi-hop, temporal, superlative, comparative, and aggregate. The corpus contains 432 unique relations and 2,787 unique entities.
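As a rough illustration of the structure described above, one record could be modeled and sanity-checked as in the sketch below. The class name, field names, and helper functions are all hypothetical; they are assumptions made for illustration and do not reflect the released file format, which should be checked before use.

```python
from collections import Counter
from dataclasses import dataclass, field

# The six complexity categories named in the corpus description.
CATEGORIES = {
    "multi-entity", "multi-hop", "temporal",
    "superlative", "comparative", "aggregate",
}


@dataclass
class PecoqRecord:
    # Hypothetical schema: these field names are assumptions,
    # not the layout of the released corpus.
    question: str
    answer: str
    sparql: str
    category: str
    paraphrases: list[str] = field(default_factory=list)


def validate(record: PecoqRecord) -> list[str]:
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    if record.category not in CATEGORIES:
        problems.append(f"unknown category: {record.category}")
    if len(record.paraphrases) < 2:
        problems.append("fewer than 2 paraphrases")
    if not record.sparql.strip():
        problems.append("missing SPARQL query")
    return problems


def category_counts(records):
    """Tally how many records fall into each complexity category."""
    return Counter(r.category for r in records)
```

A loader for the actual files would map each entry onto such a record, after which `validate` can flag entries that do not match the description (for example, fewer than two paraphrases) and `category_counts` gives the distribution across the six categories.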
Contributors:
Romina Etezadi

Reference information:

PeCoQ: A Dataset for Persian Complex Question Answering over Knowledge Graph (https://arxiv.org/abs/2106.14167)

Natural Language Processing Laboratory

Shahid Chamran Highway, Velenjak

021-29904171

nlp@sbu.ac.ir