Imagining a world without negativity is unrealistic today, since bad news spreads more virally than good news. Although removing negativity altogether is impractical in real life, a system built with Machine Learning and Natural Language Processing techniques can identify news items with a negative tone and filter them out, so that only news with a positive tone (good news) reaches the end user. In this work, around two lakh (200,000) news items were used for training and testing, with a combination of rule-based and data-driven approaches. VADER, together with a filtration method, was used as the annotation tool, followed by a statistical Machine Learning approach that used a Document Term Matrix (representation) and a Support Vector Machine (classification). Deep Learning methods were then introduced to make the system more reliable, first with Doc2Vec and finally with a Convolutional Neural Network (CNN), which yielded better results than the other modules tested. It achieved a training accuracy of 96% and a test accuracy above 85% on both internal and external news data.
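As a rough illustration of the annotation and representation steps described above, the following minimal Python sketch labels headlines with a lexicon score, filters out ambiguous ones, and builds a document-term matrix from the rest. It is not the authors' implementation: `TOY_LEXICON` is a hypothetical stand-in for VADER's far larger lexicon, the `threshold` value and the rule of dropping near-neutral items are assumptions (the abstract does not specify the filtration method), and the headlines are invented examples.

```python
from collections import Counter

# Hypothetical toy lexicon; VADER's real lexicon is much larger and also
# handles negation, intensifiers, and punctuation.
TOY_LEXICON = {
    "good": 1.0, "win": 1.0, "rescued": 1.0,
    "bad": -1.0, "crash": -1.0, "dies": -1.0,
}


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (tok.strip(".,!?").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def annotate(text, threshold=0.5):
    """Return 'pos', 'neg', or None when the score is too weak to trust.

    Dropping weakly scored items is an assumed filtration rule.
    """
    score = sum(TOY_LEXICON.get(tok, 0.0) for tok in tokenize(text))
    if score >= threshold:
        return "pos"
    if score <= -threshold:
        return "neg"
    return None


def document_term_matrix(docs):
    """Build a raw-count document-term matrix with a sorted vocabulary."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    index = {term: j for j, term in enumerate(vocab)}
    matrix = []
    for doc in docs:
        row = [0] * len(vocab)
        for tok, count in Counter(tokenize(doc)).items():
            row[index[tok]] = count
        matrix.append(row)
    return vocab, matrix


# Invented example headlines.
headlines = [
    "Local team win brings good cheer",
    "Market crash leaves bad outlook",
    "Council meets on Tuesday",
    "Hikers rescued after storm",
]

labels = [annotate(h) for h in headlines]
kept = [(h, y) for h, y in zip(headlines, labels) if y is not None]
vocab, dtm = document_term_matrix([h for h, _ in kept])
good_news = [h for h, y in kept if y == "pos"]

print(labels)
print(good_news)
```

In a fuller pipeline, the rows of `dtm` and the kept labels would serve as training data for a classifier such as a Support Vector Machine, and only items predicted as positive would be shown to the end user.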