With the continuous development of Internet technology, online shopping has permeated many aspects of daily life, profoundly changing individual experiences and offering convenience that traditional commerce cannot match. Several leading e-commerce platforms in China have large user bases and continuously expand the range of products they offer, covering many aspects of daily life. During online shopping, consumers enter or click on the types and features of products they are interested in, browse the corresponding product pages, and obtain a range of information about the products. This information often includes product images, descriptions, prices, versions, etc. Consumers then integrate the information from various products to decide whether to purchase. This information also presents new opportunities for product marketing. In this context, e-commerce platforms have gradually become centralized pools of massive data, encompassing comprehensive information about merchants, users, products, logistics, and so on. If these data can be harnessed to create greater value in business scenarios, they can improve many aspects of online shopping and provide a better experience for merchants, e-commerce platforms, and consumers.
E-commerce platforms possess a vast number of high-dimensional product images that carry rich information and present product details visually, making it convenient for merchants to sell and for users to buy goods. Image data has become the primary content carrier in the current e-commerce sales process, playing a crucial role in how consumers perceive and understand the products being sold. Meanwhile, textual descriptions remain key to conveying information about the functionality and effects of products. Therefore, if a connection can be established between product images and textual information, so that product tags and basic descriptions are generated automatically from product images, it can greatly facilitate product management for merchants and contribute to the platform's centralized analysis of massive product data. Such an approach helps keep product images consistent with actual product features and reinforces the connection between images, text, and actual features during search, thereby enhancing the shopping experience.
Among the diverse categories of online shopping products, clothing accounts for a significant proportion of online sales because it is convenient to purchase and has a broad audience. Online consumers can quickly browse a large number of clothes in different styles, designs, and brands from different stores, avoiding problems such as limited sizes and styles and limited exposure to products that may occur in offline shopping. Additionally, consumers can assess the visual effect of clothing from the model try-on images displayed by merchants. Compared with other categories, clothing products rely more on their visual effect when worn, which makes clothing images the primary basis for consumers' shopping decisions and gives image data greater significance in clothing sales.
This paper aims to construct a model that takes clothing product images as input and uses deep learning algorithms to decode and analyze them in order to extract image features. We then recognize and classify the product along various dimensions and subdivisions, generate multiple tags to describe it, and finally produce a comprehensive description of the clothing. Because clothing products come in diverse styles, it is essential to construct a suitable tag system to classify them effectively. This involves extracting and refining tags from a large number of clothing images; the tags are then categorized into two types: one describing the overall characteristics of the clothing product and the other describing the category to which the clothing belongs. In the subsequent model construction, we mainly face three challenges. First, the collected images come from different merchants and vary in lighting, angle, clarity, etc., which may introduce unrelated factors that affect the classification results. Second, the model needs to maintain classification accuracy even though some clothing categories have limited samples and the categories are unevenly distributed. Third, the model may face larger data volumes and dimensions in actual application scenarios, so computation time and cost must be considered. The first challenge can be alleviated by methods such as adding image noise and image preprocessing. To address the latter two challenges, and to balance accuracy against training time, we propose using a transfer learning framework to construct a convolutional neural network (CNN) model. Having first learned from a large amount of general image data, the model can then focus on learning from a relatively small number of clothing product images and obtain accurate training results quickly. Thus, we only need to adjust the last layer of the CNN model and inherit the remaining pre-trained parameters from the chosen framework.
After comparing the structures, training performance, and time costs of various CNN models, we ultimately selected GoogLeNet, VGGNet, and ResNet as the transfer learning frameworks.
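The idea of adjusting only the last layer while inheriting the pre-trained parameters can be illustrated with a minimal sketch. It is not the model used in this paper: the "backbone" here is a small fixed random linear map with ReLU, standing in for a pre-trained GoogLeNet, VGGNet, or ResNet without its final layer, and all names and sizes (`backbone`, `train_head`, `IN_DIM`, `FEAT_DIM`, `NUM_CLASSES`) are hypothetical. Only the new softmax layer is trained, and the backbone weights are never updated.

```python
import math
import random

# Hypothetical sizes chosen only for illustration.
IN_DIM, FEAT_DIM, NUM_CLASSES = 4, 6, 3

# Stand-in for pre-trained parameters: fixed weights that training never touches.
_rng = random.Random(0)
FROZEN_W = [[_rng.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(FEAT_DIM)]


def backbone(x):
    """Frozen feature extractor (toy substitute for a pre-trained CNN body)."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in FROZEN_W]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def head_logits(W, b, feats):
    """Output of the new last layer for one feature vector."""
    return [sum(w * f for w, f in zip(W[k], feats)) + b[k] for k in range(len(b))]


def train_head(samples, labels, epochs=100, lr=0.05):
    """Train only the last layer (W, b) with SGD on cross-entropy loss."""
    W = [[0.0] * FEAT_DIM for _ in range(NUM_CLASSES)]
    b = [0.0] * NUM_CLASSES
    # Features are computed once because the backbone is fixed.
    feats = [backbone(x) for x in samples]
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            probs = softmax(head_logits(W, b, f))
            for k in range(NUM_CLASSES):
                # Gradient of cross-entropy with respect to logit k.
                grad = probs[k] - (1.0 if k == y else 0.0)
                b[k] -= lr * grad
                for j in range(FEAT_DIM):
                    W[k][j] -= lr * grad * f[j]
    return W, b


def predict(W, b, x):
    """Return the index of the highest-scoring class for input x."""
    logits = head_logits(W, b, backbone(x))
    return max(range(NUM_CLASSES), key=lambda k: logits[k])
```

Because the backbone is fixed, its features can be computed once and reused in every epoch, which is one reason this form of transfer learning keeps training time low when few clothing images are available.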
Finally, through model training, accurate classification is achieved on four groups of tags representing attributes, styles, seasons, and clothing categories. We then designed a product for the subsequent application of the model: a label generation and management system based on product image recognition, which predicts classifications across these dimensions for input clothing images. This system can bring convenience to merchants, platform administrators, and consumers.
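As a rough illustration of how predictions over the four tag groups could be turned into labels and a short description, consider the sketch below. The tag vocabulary, the function names `generate_tags` and `describe`, and the choice of one tag per group are assumptions made for this example; the actual tag system and output format of the system may differ.

```python
# Hypothetical tag vocabulary for the four groups named in the text.
TAG_GROUPS = {
    "attribute": ["solid color", "striped", "printed"],
    "style": ["casual", "formal", "sporty"],
    "season": ["spring/autumn", "summer", "winter"],
    "category": ["T-shirt", "dress", "coat", "trousers"],
}


def generate_tags(scores):
    """Pick the highest-scoring tag in each group.

    `scores` maps each group name to a list of model scores,
    one score per tag in that group (assumed single-label per group).
    """
    tags = {}
    for group, names in TAG_GROUPS.items():
        group_scores = scores[group]
        if len(group_scores) != len(names):
            raise ValueError(f"expected {len(names)} scores for group '{group}'")
        best = max(range(len(names)), key=lambda i: group_scores[i])
        tags[group] = names[best]
    return tags


def describe(tags):
    """Combine the four predicted tags into a basic product description."""
    return f"{tags['season']} {tags['style']} {tags['category']}, {tags['attribute']}"


# Example scores, as they might come from four classification outputs.
example_scores = {
    "attribute": [0.1, 0.7, 0.2],
    "style": [0.8, 0.1, 0.1],
    "season": [0.2, 0.5, 0.3],
    "category": [0.6, 0.2, 0.1, 0.1],
}
example_tags = generate_tags(example_scores)
print(example_tags)
print(describe(example_tags))
```

Keeping each group as a separate score list means a tag can be added to one group without affecting the others.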