Your requirement is clear: in the face vector search results, aggregate by person name (actor_name) so that each actor returns only their single most similar image. Here is the optimized search approach:
Solution
Method 1: Use aggregations (recommended)
```
GET /face_search_test_index/_search
{
  "size": 0,
  "aggs": {
    "actors": {
      "terms": {
        "field": "actor_name.keyword",
        "size": 10,
        "order": { "best_score": "desc" }
      },
      "aggs": {
        "best_score": { "max": { "script": { "source": "_score" } } },
        "top_hit": {
          "top_hits": {
            "size": 1,
            "sort": [ { "_score": { "order": "desc" } } ],
            "_source": { "includes": ["actor_name", "actor_id", "image_path"] }
          }
        }
      }
    }
  },
  "query": {
    "script_score": {
      "query": { "match_all": {} },
      "script": {
        "source": "cosineSimilarity(params.query_vector, 'face_vector') + 1.0",
        "params": {
          "query_vector": [
            -0.03294802084565163, 0.08574431389570236, 0.04574661701917648, -0.03050283156335354,
            -0.06638835370540619, -0.04965103417634964, 0.06007932499051094, -0.17975950241088867,
            0.15759551525115967, -0.15764901041984558, 0.2375360131263733, -0.028777025640010834,
            -0.25363847613334656, -0.10428159683942795, 0.03292582929134369, 0.13914231956005096,
            -0.10663023591041565, -0.14371007680892944, -0.16282042860984802, -0.15108194947242737,
            0.07730557024478912, 0.06031457334756851, -0.013631811365485191, -0.03235295042395592,
            -0.1441582888364792, -0.22718726098537445, -0.09331541508436203, -0.03909553587436676,
            0.02703486755490303, -0.06274473667144775, 0.10269181430339813, 0.09461987763643265,
            -0.23343020677566528, -0.09261009097099304, 0.0035921353846788406, 0.036539267748594284,
            -0.08004020154476166, -0.02440868318080902, 0.15535330772399902, 0.02334958128631115,
            -0.13954328000545502, 0.00402874918654561, -0.028272267431020737, 0.219803124666214,
            0.1736876219511032, -0.015488659963011742, 0.01637592911720276, -0.14725874364376068,
            0.12355087697505951, -0.24118836224079132, 0.005113713443279266, 0.1290103644132614,
            0.10923261940479279, 0.12515988945960999, 0.05419561266899109, -0.09587032347917557,
            0.07226473093032837, 0.12341351062059402, -0.18840639293193817, 0.06482309103012085,
            0.08653104305267334, -0.24938331544399261, -0.019747208803892136, -0.020754650235176086,
            0.09277407824993134, 0.014888722449541092, -0.009055963717401028, -0.12095478177070618,
            0.21467812359333038, -0.11203934997320175, -0.11639110743999481, 0.052483949810266495,
            -0.0846707746386528, -0.14047497510910034, -0.35461661219596863, -0.004548290744423866,
            0.3914399743080139, 0.10970800369977951, -0.18320702016353607, 0.016537625342607498,
            -0.0756637454032898, 0.04670151695609093, 0.09940724074840546, 0.022470811381936073,
            -0.01574861630797386, -0.103361114859581, -0.11547031253576279, 0.08255285769701004,
            0.21767574548721313, -0.11139923334121704, -0.06307178735733032, 0.22926700115203857,
            -0.008651845157146454, 0.02907070517539978, 0.05055316537618637, 0.030209437012672424,
            -0.0844748467206955, 0.01178983598947525, -0.07661733776330948, -0.015207079239189625,
            0.05650537833571434, -0.09215937554836273, 0.02020958811044693, 0.08764813095331192,
            -0.12565577030181885, 0.24529917538166046, -0.01250448264181614, 0.013047449290752411,
            -0.040126584470272064, -0.07195307314395905, -0.052720751613378525, 0.04307105392217636,
            0.14030040800571442, -0.17923066020011902, 0.26793503761291504, 0.1835155487060547,
            -0.06833052635192871, 0.15156754851341248, 0.02274666726589203, 0.08952273428440094,
            -0.07919059693813324, -0.07516366243362427, -0.18298988044261932, -0.12182115018367767,
            0.04411851987242699, 0.05293841287493706, 0.03631458058953285, 0.04240144044160843
          ]
        }
      }
    }
  }
}
```

The script calls Elasticsearch's built-in cosineSimilarity function, which assumes face_vector is mapped as a dense_vector field. 1.0 is added because script_score must not produce negative scores. The best_score sub-aggregation orders the actor buckets by their highest similarity score; without it, terms buckets are ordered by document count.

Method 2: k-NN search + client-side aggregation

If you prefer k-NN search, you can fetch the results first and then aggregate them on the client:

```python
from collections import defaultdict

from elasticsearch import Elasticsearch

# Assumption: a local cluster; adjust the URL and authentication for your setup.
es = Elasticsearch("http://localhost:9200")

# your_vector is the query embedding: a list of floats with the same
# dimension as face_vector, defined elsewhere.

# Run the k-NN search
response = es.search(
    index="face_search_test_index",
    body={
        "knn": {
            "field": "face_vector",
            "query_vector": your_vector,
            "k": 100,  # fetch enough results to aggregate
            "num_candidates": 1000,
        },
        "size": 100,  # return all k hits; the default size is 10
        "_source": ["actor_name", "actor_id", "image_path"],
    },
)

# Client-side aggregation: group hits by actor name
actor_groups = defaultdict(list)
for hit in response["hits"]["hits"]:
    actor_name = hit["_source"]["actor_name"]
    actor_groups[actor_name].append({
        "actor_name": actor_name,  # kept so the final loop prints the right name
        "score": hit["_score"],
        "actor_id": hit["_source"]["actor_id"],
        "image_path": hit["_source"]["image_path"],
    })

# Keep only the highest-scoring result for each actor
top_results = []
for actor_name, results in actor_groups.items():
    # Sort by score in descending order and take the first entry
    top_result = sorted(results, key=lambda x: x["score"], reverse=True)[0]
    top_results.append(top_result)

# Sort the final results by score
top_results = sorted(top_results, key=lambda x: x["score"], reverse=True)

# Print the results
for result in top_results:
    print(f"Actor: {result['actor_name']}, Score: {result['score']}, ID: {result['actor_id']}")
```

Notes

- Field type: aggregate on a keyword field. Either map actor_name as keyword, or use the actor_name.keyword sub-field as in the Method 1 query.
- Performance: Method 1 aggregates on the server, which is usually more efficient for large result sets. Keep in mind, though, that script_score over match_all computes the similarity for every document in the index.
- Similarity threshold: you may need to set a similarity threshold to filter out results that are not similar enough.
- Vector dimension: make sure the query vector has the same dimension as the vectors stored in the index.

Recommendation

For your use case I recommend Method 1 (server-side aggregation), because it is more efficient and reduces the amount of data sent over the network. If the aggregated results need to be more accurate, consider increasing the shard_size parameter of the terms aggregation so that each shard returns enough candidate buckets.

I hope this solves your problem. If you have any other questions, feel free to ask.
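A follow-up on reading Method 1's response: with size set to 0, the per-actor results come back under aggregations rather than hits. The sketch below shows one way to flatten them in Python. It assumes the response is the dict returned by the search and that the aggregation names are actors and top_hit, as in the query. The helper name extract_actor_top_hits and the sample response are hypothetical and only there for illustration.

```python
from typing import Any


def extract_actor_top_hits(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one record per actor bucket, taken from its top_hit sub-aggregation."""
    records = []
    for bucket in response["aggregations"]["actors"]["buckets"]:
        hits = bucket["top_hit"]["hits"]["hits"]
        if not hits:
            continue  # skip empty buckets defensively
        best = hits[0]  # top_hits was requested with size 1
        records.append({
            "actor_name": bucket["key"],
            "score": best["_score"],
            "actor_id": best["_source"]["actor_id"],
            "image_path": best["_source"]["image_path"],
        })
    # Highest similarity first, regardless of bucket order
    return sorted(records, key=lambda r: r["score"], reverse=True)


# Hypothetical, trimmed response used only to exercise the helper
sample_response = {
    "aggregations": {
        "actors": {
            "buckets": [
                {"key": "Actor B", "doc_count": 5, "top_hit": {"hits": {"hits": [
                    {"_score": 1.71, "_source": {"actor_name": "Actor B", "actor_id": 2, "image_path": "b/1.jpg"}}
                ]}}},
                {"key": "Actor A", "doc_count": 3, "top_hit": {"hits": {"hits": [
                    {"_score": 1.93, "_source": {"actor_name": "Actor A", "actor_id": 1, "image_path": "a/7.jpg"}}
                ]}}},
            ]
        }
    }
}

actor_records = extract_actor_top_hits(sample_response)
for record in actor_records:
    print(record["actor_name"], record["score"], record["image_path"])
```

Sorting on the client keeps the output ranked by similarity even when the buckets arrive in a different order.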
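On the similarity-threshold note: one possible way to apply it on the client is to drop low-scoring hits before picking the best image per actor. This is a minimal sketch that works on a list of hits shaped like response["hits"]["hits"]. The function name best_hit_per_actor, the min_score parameter, and the sample hits are hypothetical. The score scale depends on the similarity function in use, so the 0.8 below is only a placeholder value.

```python
from typing import Any


def best_hit_per_actor(hits: list[dict[str, Any]], min_score: float = 0.0) -> list[dict[str, Any]]:
    """Keep the highest-scoring hit per actor_name, ignoring hits below min_score."""
    best: dict[str, dict[str, Any]] = {}
    for hit in hits:
        score = hit["_score"]
        if score < min_score:
            continue  # not similar enough
        name = hit["_source"]["actor_name"]
        if name not in best or score > best[name]["score"]:
            best[name] = {
                "actor_name": name,
                "score": score,
                "actor_id": hit["_source"]["actor_id"],
                "image_path": hit["_source"]["image_path"],
            }
    # Highest similarity first
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)


# Hypothetical hits used only to exercise the function
sample_hits = [
    {"_score": 0.90, "_source": {"actor_name": "Actor A", "actor_id": 1, "image_path": "a/2.jpg"}},
    {"_score": 0.95, "_source": {"actor_name": "Actor A", "actor_id": 1, "image_path": "a/1.jpg"}},
    {"_score": 0.85, "_source": {"actor_name": "Actor B", "actor_id": 2, "image_path": "b/1.jpg"}},
    {"_score": 0.60, "_source": {"actor_name": "Actor C", "actor_id": 3, "image_path": "c/1.jpg"}},
]

filtered = best_hit_per_actor(sample_hits, min_score=0.8)
for row in filtered:
    print(row["actor_name"], row["score"], row["image_path"])
```

Tracking a single best hit per actor avoids building and sorting a full list for each group, which matters little at k = 100 but keeps the logic in one pass.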