Type: Voice, annotation
Total collection: 400 hours
Collection instructions: The content is multi-person conversation. Each audio group contains a conversation between at least two people. Across the 400 hours, the total effective duration of each speaker must not exceed 0.5 hours, and at least 800 speakers must be collected in total.
Group size distribution: 2 people : 3 people : 4 people : 5 people = 4:3:2:1
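The quantities above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the specification: it assumes the 400 hours counts effective speech summed over all speakers, and it assumes the 4:3:2:1 ratio is applied to recorded hours (the specification does not say whether the ratio refers to hours or to the number of groups). The function names are hypothetical.

```python
import math

TOTAL_HOURS = 400                 # total collection, from the specification
MAX_HOURS_PER_SPEAKER = 0.5       # per-speaker cap, from the specification
# Group size -> ratio weight (2 people : 3 : 4 : 5 = 4:3:2:1).
GROUP_SIZE_RATIO = {2: 4, 3: 3, 4: 2, 5: 1}


def min_speakers(total_hours, max_hours_per_speaker):
    """Fewest speakers needed if nobody exceeds the per-speaker cap."""
    return math.ceil(total_hours / max_hours_per_speaker)


def hours_by_group_size(total_hours, ratio):
    """Split the total hours by ratio weight (ASSUMPTION: ratio is over hours)."""
    parts = sum(ratio.values())
    return {size: total_hours * weight / parts for size, weight in ratio.items()}


print(min_speakers(TOTAL_HOURS, MAX_HOURS_PER_SPEAKER))
print(hours_by_group_size(TOTAL_HOURS, GROUP_SIZE_RATIO))
```

Under these assumptions, the 0.5-hour cap alone already implies the stated minimum of 800 speakers.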
Collection equipment: mainstream mobile phone brands (iOS, Android, etc.) or dedicated recording equipment
Recording environment: indoor or outdoor, quiet or noisy. Telephone access is acceptable. Noise must not affect voice recognition.
Recording parameters: 16 kHz sampling rate, 16-bit
Recording distance: 10cm to 2m
Recording language: UAE Arabic
Accent: Abu Dhabi and Dubai
Nationality distribution: United Arab Emirates
Gender distribution: male to female ratio 7:3
Age distribution: 20-30 years old: 30-40 years old: 40-50 years old = 1:1:1
Recorded content: non-preset text dialogues on fixed topics, covering various aspects of life and work, including family, parties, leisure activities and holiday travel.
Speaker ID: Each speaker has a globally unique speaker ID that can be used to locate that speaker's audio files and the corresponding text.
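The recording parameters listed above (16 kHz, 16-bit) can be verified automatically on delivery. The following is a rough sketch only: it assumes the audio is delivered as PCM WAV files, which the specification does not state, and `check_wav_parameters` is a hypothetical helper name. It uses Python's standard `wave` module.

```python
import wave

EXPECTED_SAMPLE_RATE_HZ = 16_000   # 16 kHz, from the specification
EXPECTED_SAMPLE_WIDTH_BYTES = 2    # 16-bit samples = 2 bytes per sample


def check_wav_parameters(path):
    """Return a list of problems found in the WAV header (empty if none).

    ASSUMPTION: the file is an uncompressed PCM WAV file.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
    if rate != EXPECTED_SAMPLE_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_SAMPLE_RATE_HZ} Hz")
    if width != EXPECTED_SAMPLE_WIDTH_BYTES:
        problems.append(f"sample width is {width * 8} bit, expected 16 bit")
    return problems
```

A call such as `check_wav_parameters("some_recording.wav")` (the file name is a placeholder) returns an empty list when both parameters match. Other requirements, such as recording distance or noise level, cannot be read from the file header and need separate review.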