Both policy and research benefit from a better understanding of individuals' jobs. As large-scale administrative records are increasingly used to represent labor market activity, new automatic methods for classifying jobs will become necessary. We developed an automatic job offer classifier using a dataset collected from Bumeran, the largest job bank in Mexico (https://www.bumeran.com.mx/, last visited 19-01-2022). We applied machine learning algorithms such as Support Vector Machines, Naive Bayes, Logistic Regression, and Random Forest, as well as deep learning with Long Short-Term Memory (LSTM) networks. Using these algorithms, we trained multi-class models to classify each job offer into one of 23 classes, which are not uniformly distributed: Sales, Administration, Call Center, Technology, Trades, Human Resources, Logistics, Marketing, Health, Gastronomy, Financing, Secretary, Production, Engineering, Education, Design, Legal, Construction, Insurance, Communication, Management, Foreign Trade, and Mining. To handle the class imbalance, we used the SMOTE, Geometric-SMOTE, and ADASYN synthetic oversampling algorithms. The proposed convolutional neural network architecture achieved the best results when combined with the Geometric-SMOTE algorithm.
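To illustrate the idea behind synthetic oversampling, the following is a minimal sketch of the basic SMOTE interpolation step in plain Python. It is not the implementation used in this work: the function name `smote_oversample`, the parameter `k`, and the toy two-dimensional feature vectors are assumptions made for illustration, and Geometric-SMOTE and ADASYN differ in how they place or allocate the synthetic points.

```python
import random
from collections import Counter


def smote_oversample(X, y, k=3, seed=0):
    """Grow every class to the size of the largest one by interpolating
    between a sample and one of its k nearest same-class neighbours."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # a single point has no neighbour to interpolate with
        for _ in range(target - n):
            base = rng.choice(members)
            # Same-class neighbours of the base point, nearest first
            others = [m for m in members if m is not base]
            others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
            neighbour = rng.choice(others[:k])
            gap = rng.random()  # position along the segment base -> neighbour
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy vectors standing in for vectorised job-offer text
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [0.2, 0.4], [0.4, 0.2], [1.0, 1.0], [1.2, 0.9]]
y = ["Sales"] * 6 + ["Mining"] * 2

X_res, y_res = smote_oversample(X, y)
print(Counter(y_res))
```

In this toy example the minority class "Mining" is grown from two samples to six, and every synthetic point lies on the segment between the two original "Mining" points. In practice, oversampling of this kind is usually applied only to the training split, so that evaluation data contains no synthetic samples.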