Fashion is now among the largest industries worldwide, for it represents human history and helps tell the world's story. As a result of the Fourth Industrial Revolution, the Internet has become an increasingly important source of fashion information. However, with the growing number of web pages and volume of social data, it is nearly impossible for humans to keep up manually with the ongoing evolution and continuously changing content in this domain. Proper management and exploitation of big data can pave the way for substantial growth of the global economy as well as citizen satisfaction. Handling e-commerce fashion websites with big data and machine learning technologies has therefore become a challenge for computer scientists. This paper first proposes a scalable focused web crawler engine, built on distributed computing platforms, to extract and process fashion data from e-commerce websites. It then describes the role of the proposed platform in developing a disentangled feature extraction method that employs deep convolutional generative adversarial networks (DCGANs) for content-based image indexing and retrieval. Finally, state-of-the-art solutions are compared, and the results of the proposed approach are analyzed on a standard dataset. For the real-life implementation of the proposed solution, a web-based application is developed on the Apache Storm, Kafka, Solr, and Milvus platforms to create a fashion search engine called SnapMode.