Large-Scale Coarse-to-Fine Object Retrieval Ontology and Deep Local Multitask Learning (Part 5)
In the previous post, we discussed the offline phase in detail. In this post, we will explore the online phase in detail.
1. States in Online Phase
The online phase of the CFOR system corresponds to the demonstration in Figure 1.
2. Technical Details
The functions used in this phase are described as follows:
- (i) detector(imgQuery): an object in the query image is automatically detected using a trained detector. In this function, we adopt YOLO (version 3) to identify fashion items. The detected items are then refined by the region identification model, which is trained by the "classifyModel" function.
- (ii) infor_extract(states, obj, onto, classifyModels, multitaskModel): for each query object, all attribute learning models trained in function "multitaskModel" and coarse classification models trained in function "classifyModel" are run. We extract the region ⟶ category ⟶ attributes chain and the features needed at each stage of the ontology.
- (iii) query_expansion(infor, feat): query expansion based on the mean vector is used to rerank the retrieval results.
- (iv) compute_sim_score(database, infor, feat): for each pair of features, asymmetric distance is used to measure the dissimilarity between the query and each sample in the database.
- (v) ranking(scorelist, database, top_k, GPU_search = True): based on the scores between the query and all samples in the database obtained from function "compute_sim_score," ranking is applied; smaller is better.
- (vi) retrieval(indexes, score_list, database, GPU_search = True): the retrieval process contains three steps: global retrieval, fine-grained retrieval, and query expansion. For global retrieval, the global features of the query object obtained from function "infor_extract" and the features of the samples in the database are passed to function "ranking" to get the first top-m retrieval results. For fine-grained retrieval, the attribute features of the query object obtained from function "infor_extract" and the features of the samples in the first top-m retrieval results are passed to function "ranking" to get the second top-k retrieval results. For query expansion, the mean vector is computed from the second top-k retrieval results, and each feature of the second top-k retrieval results is passed to function "ranking" to get the final top-k retrieval results, i.e., query expansion-based reranking. A runnable sketch of this three-step retrieval is given after this list.
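The following is a minimal, self-contained sketch of the three-step retrieval described in (vi), using NumPy and plain L2 distance in place of the CFOR system's actual feature extractors and asymmetric distance computation. The array names, feature dimensions, and database size are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
db_global = rng.normal(size=(10_000, 512))  # global features of database items (assumed)
db_attr = rng.normal(size=(10_000, 128))    # attribute features of database items (assumed)
q_global = rng.normal(size=512)             # global feature of the query object (assumed)
q_attr = rng.normal(size=128)               # attribute feature of the query object (assumed)
top_m, top_k = 100, 10

# Step 1: global retrieval -- rank the whole database by distance
# to the query's global feature and keep the top-m candidates.
d_global = np.linalg.norm(db_global - q_global, axis=1)
idx_m = np.argsort(d_global)[:top_m]

# Step 2: fine-grained retrieval -- rerank only the top-m candidates
# using the attribute features and keep the top-k.
d_attr = np.linalg.norm(db_attr[idx_m] - q_attr, axis=1)
idx_k = idx_m[np.argsort(d_attr)[:top_k]]

# Step 3: query expansion -- average the attribute features of the top-k
# results with the query feature and rerank the top-k by distance to the mean.
mean_vec = np.mean(np.vstack([db_attr[idx_k], q_attr[None, :]]), axis=0)
d_qe = np.linalg.norm(db_attr[idx_k] - mean_vec, axis=1)
final_idx = idx_k[np.argsort(d_qe)]
print(final_idx)  # final top-k ranking after query-expansion-based reranking
```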
As described in Figure 1, the online phase of the CFOR system contains three stages that run in real time. They are given as follows.
3. Prediction Stage
This stage takes advantage of the object ontology and the classification models obtained from the offline phase and then makes predictions from coarse to fine for each query image:
Fine-grained information in terms of regions, categories, and attributes gives a customer more options for forming a full semantic query. The object is predicted from coarse to fine: the region, category, and attributes are predicted in turn based on the object ontology and a local MDNN. The retrieval system then uses the extracted semantic information, i.e., the category and attributes, to search in detail.
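The coarse-to-fine prediction can be pictured as chaining a region classifier, a per-region category classifier, and a per-category attribute model. The sketch below is only illustrative: the dictionary-based lookup and the lambda stand-ins are hypothetical placeholders for the models trained in the offline phase.

```python
def predict_coarse_to_fine(obj_img, region_model, category_models, attribute_models):
    # Coarse-to-fine prediction: region first, then the category classifier
    # for that region, then the attribute model for that category.
    region = region_model(obj_img)                    # e.g., "top"
    category = category_models[region](obj_img)       # e.g., "blouse"
    attributes = attribute_models[category](obj_img)  # e.g., ["floral", "v-neck"]
    return region, category, attributes

# Hypothetical stand-ins for the trained models (illustrative only).
region_model = lambda img: "top"
category_models = {"top": lambda img: "blouse"}
attribute_models = {"blouse": lambda img: ["floral", "v-neck"]}

print(predict_coarse_to_fine(None, region_model, category_models, attribute_models))
```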
4. Dissimilarity Measuring Stage
This stage takes advantage of the database and the indexing file from the offline phase, together with a dissimilarity measure, to compute scores and then rank, rerank, and release retrieval results for each query image. It is based on the dissimilarity between attribute vectors of the query images and the database images:
Based on a combination of K-nearest-neighbour search in terms of L2 distance and asymmetric distance computation, we take advantage of GPU parallel processing through Faiss to compute the distance from the query image to the necessary samples in the database. The distance, also called the score of each image in the database, is then sorted to rank the dissimilarity. The smaller the score of an image, the more similar it is to the query. Based on the number of retrieval images required or on thresholds, we apply an appropriate cutoff on the score as well as on the number of retrieved images. This kind of measurement is used to compute distances for both deep feature vectors and attribute vectors.
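Below is a minimal sketch of the K-nearest-neighbour search step with Faiss, using an exact L2 flat index; the asymmetric distance computation mentioned above (e.g., via a product-quantized index) is omitted here. The data shapes and the number of neighbours are illustrative assumptions.

```python
import numpy as np
import faiss

d = 128                                            # feature dimension (assumed)
xb = np.random.rand(10_000, d).astype("float32")   # database attribute vectors (assumed)
xq = np.random.rand(1, d).astype("float32")        # query attribute vector (assumed)

index = faiss.IndexFlatL2(d)                       # exact L2 search
# With a GPU build of Faiss, the same index can be moved to the GPU:
# res = faiss.StandardGpuResources()
# index = faiss.index_cpu_to_gpu(res, 0, index)
index.add(xb)

scores, idx = index.search(xq, 20)                 # smaller distance = more similar
print(idx[0])                                      # indices of the top-20 nearest samples
```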
5. Query Expansion Stage
Query expansion is a technique that gathers additional relevant information from the input to increase retrieval performance. Depending on the query expansion algorithm and the data, this information can be relevant images, additional features, descriptions, etc. In this stage, we take advantage of the previous retrieval results and expand the query using the mean vector to rerank the results and improve retrieval performance.
Among many methods, query expansion based on the mean vector is chosen: the mean vector, computed from the features of the retrieval results and the features of the input, helps reduce the bias between the different features considered. Thus, the CFOR system can eliminate unrelated features, i.e., retrieval features with a high gap from the mean vector, which reduces outliers and raises the precision score.
Query expansion based on computing the mean vector is very fast, and it can also take advantage of the Faiss similarity search. Thanks to the statistical nature of the mean vector, query expansion can remove outliers. A sketch of this reranking step is shown below.
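The following sketch illustrates mean-vector query expansion with outlier removal, assuming `query_feat` and `topk_feats` are L2-normalized feature vectors from the earlier retrieval stage. The function name and the threshold value are illustrative assumptions, not parameters reported by the authors.

```python
import numpy as np

def expand_and_rerank(query_feat, topk_feats, outlier_thresh=1.5):
    # Mean vector over the query feature and the current top-k result features.
    mean_vec = np.mean(np.vstack([query_feat[None, :], topk_feats]), axis=0)

    # Distance of each retrieved feature to the mean vector; results whose
    # gap is too large are treated as outliers and dropped.
    gaps = np.linalg.norm(topk_feats - mean_vec, axis=1)
    keep = np.where(gaps <= outlier_thresh)[0]

    # Rerank the remaining results by their distance to the mean vector
    # (smaller is better, as in the rest of the pipeline).
    return keep[np.argsort(gaps[keep])]

# Toy usage with random normalized features (illustrative only).
rng = np.random.default_rng(1)
q = rng.normal(size=128); q /= np.linalg.norm(q)
topk = rng.normal(size=(10, 128)); topk /= np.linalg.norm(topk, axis=1, keepdims=True)
print(expand_and_rerank(q, topk))
```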
Next
In the next post, we will discuss the Fashion Ontology and CFOR system testing in fashion.