The Gender Identification (GI) problem is concerned with determining the gender of an author from a given text. It has numerous applications in fields such as forensics, literature, security, marketing, and trade. Because of its importance, researchers have put extensive effort into identifying gender from text in many languages. Unfortunately, the Bangla language has not received comparable attention, despite being the 7th most spoken language in the world. In this work, we explore gender identification from Bangla social media text. Specifically, we consider two approaches to feature extraction. The first is the Bag-of-Words (BOW) approach, and the second computes features from sentiment and emotions. There is a common stereotype that female authors write in a more emotional way than male authors. One goal of this work is to examine whether this stereotype holds for the Bangla language.