Investigation of efficient prompt engineering for video retrieval

Bibliographic Information

Other Title
  • 映像検索における効率的なプロンプトエンジニアリングの検討

Abstract

Vision and Language technologies have advanced in recent years, and the field of video retrieval has developed along with them. In this study, we examined how retrieval accuracy changes when the input query sentence is modified for video retrieval using Vision and Language techniques. We conducted two types of validation: validation 1, “adding a phrase at the beginning of a query sentence,” and validation 2, “adding an important word from the query sentence at the end or the beginning of the sentence.” Validation 2 was effective; in particular, adding the important word at the end of the sentence improved accuracy in many patterns. The results of validation 1 varied depending on the phrase, and the phrases that improved accuracy were particularly effective for CLIP and SLIP. When two query sentences and a prompt-engineered sentence were used in the search, accuracy improved for more query sentences, although fewer cases showed a large improvement.
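
The two query modifications described in the abstract can be sketched as simple string operations. This is only an illustrative sketch, not the authors' implementation: the function names, the example query, the phrase "a video of", and the choice of "guitar" as the important word are all hypothetical, and the abstract does not say how important words were selected.

```python
def prepend_phrase(query: str, phrase: str) -> str:
    """Validation 1 (sketch): add a phrase at the beginning of a query sentence."""
    return f"{phrase.strip()} {query.strip()}"


def add_important_word(query: str, word: str, position: str = "end") -> str:
    """Validation 2 (sketch): repeat an important word at the end or the beginning.

    How the important word is chosen is not given in the abstract,
    so the caller supplies it here (an assumption of this sketch).
    """
    query, word = query.strip(), word.strip()
    if position == "end":
        return f"{query} {word}"
    if position == "beginning":
        return f"{word} {query}"
    raise ValueError("position must be 'end' or 'beginning'")


if __name__ == "__main__":
    # Hypothetical query and prompt choices, for illustration only.
    query = "a man plays a guitar on stage"
    print(prepend_phrase(query, "a video of"))
    print(add_important_word(query, "guitar", position="end"))
    print(add_important_word(query, "guitar", position="beginning"))
```

Each modified sentence would then be passed to the text encoder of the retrieval model in place of, or alongside, the original query.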

Details

  • CRID
    1390298986213401088
  • DOI
    10.11371/wiieej.22.04.0_102
  • ISSN
    2758-9218
    0285-3957
  • Text Lang
    ja
  • Data Source
    • JaLC
  • Abstract License Flag
    Disallowed
