Cognitive psychology has long pointed out that an important part of human knowledge and memory is visual knowledge, which is used for visual (imagery) thinking. Vision-based artificial intelligence (AI) is therefore an unavoidable and significant topic for AI. Following the paper "On Visual Knowledge," this paper discusses five related basic problems: (1) visual knowledge representation; (2) visual recognition; (3) simulation of visual imagery thinking; (4) learning of visual knowledge; (5) multiple knowledge representation. The unique advantages of visual knowledge are its abilities of comprehensive imagery generation, spatio-temporal evolution, and imagery display, which are exactly what symbolic knowledge and deep neural networks lack. Combining AI with computer-aided design, computer graphics, and computer vision will provide an important driving force for new developments of AI in creation, prediction, and human-machine integration. Research on visual knowledge and multiple knowledge representation is key to developing new visual intelligence and to achieving major breakthroughs in AI 2.0. This is a barren, cold, and wet yet fertile "Beidahuang" (Great Northern Wilderness), and also a promising "no man's land" worth exploring through multidisciplinary collaboration.

Yun-he Pan, panyh@zju.edu.cn, et al.
A question that has long perplexed the field of artificial intelligence is whether AI can be creative, or in other words, whether an algorithm's reasoning process can be creative. This paper explores the problem of AI creativity from the perspective of the science of thinking. First, we review research related to reasoning with visual imagery thinking; then, we focus on a particular form of visual knowledge representation, namely the visual scene graph; finally, we describe in detail the construction of visual scene graphs and their potential applications. All the evidence suggests that visual knowledge and visual thinking can not only improve the performance of current AI tasks, but also be used to put machine creativity into practice.
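To make the representation named above concrete, the following is a minimal sketch, assuming nothing beyond the abstract itself: a visual scene graph stored as typed object nodes plus directed <subject, predicate, object> relation triples. All class names and the toy example are illustrative, not taken from the paper.

```python
# Minimal sketch (not from the paper): a visual scene graph as typed nodes
# and directed relation triples, the representation discussed above.
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObjectNode:
    """A detected object: category label, attributes, and a bounding box."""
    node_id: int
    category: str
    attributes: List[str] = field(default_factory=list)
    bbox: tuple = (0.0, 0.0, 0.0, 0.0)  # (x, y, width, height), normalized


@dataclass
class Relation:
    """A directed edge <subject, predicate, object> between two nodes."""
    subject_id: int
    predicate: str
    object_id: int


@dataclass
class SceneGraph:
    nodes: List[ObjectNode] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)

    def triples(self):
        """Yield human-readable <subject, predicate, object> triples."""
        by_id = {n.node_id: n for n in self.nodes}
        for r in self.relations:
            yield (by_id[r.subject_id].category, r.predicate,
                   by_id[r.object_id].category)


# Example: "a person rides a red bicycle"
g = SceneGraph(
    nodes=[ObjectNode(0, "person"), ObjectNode(1, "bicycle", ["red"])],
    relations=[Relation(0, "rides", 1)],
)
print(list(g.triples()))  # [('person', 'rides', 'bicycle')]
```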

To boost research into cognition-level visual understanding, i.e., making accurate inferences based on a thorough understanding of visual details, visual commonsense reasoning (VCR) has been proposed. Compared with traditional visual question answering, which requires models only to select correct answers, VCR requires models to select not only the correct answers but also the correct rationales. Recent research into human cognition has indicated that brain function, or cognition, can be considered a global and dynamic integration of local neuron connectivity, which is helpful in solving specific cognition tasks. Inspired by this idea, we propose to achieve VCR by dynamically reorganizing the visual neuron connectivity that is contextualized by the meaning of questions and answers, and by leveraging directional information to enhance the reasoning ability. Specifically, we first develop a GraphVLAD module to capture and fully model visual content correlations. Then, a contextualization process is proposed to fuse sentence representations with visual neuron representations. Finally, based on the output of the contextualization process, we propose a reasoning process, which includes a ReasonVLAD module, to infer answers and rationales. Experimental results on the VCR dataset and visualization analyses demonstrate the effectiveness of our method.
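As a rough illustration of the "aggregate visual regions, then contextualize with language" idea sketched in the abstract, here is a hedged NumPy toy. It is not the authors' GraphVLAD/ReasonVLAD implementation; the VLAD-style soft-assignment pooling, the fusion step, and all shapes and variable names are assumptions.

```python
# Hedged sketch only: VLAD-style soft aggregation of region features, followed
# by a simple fusion with a question embedding. Shapes and names are assumed.
import numpy as np

rng = np.random.default_rng(0)

def vlad_aggregate(region_feats, centers):
    """Soft-assign region features to cluster centers and pool residuals.

    region_feats: (num_regions, dim) visual features from a detector.
    centers:      (num_clusters, dim) cluster centers ("visual neurons").
    Returns a (num_clusters * dim,) VLAD descriptor.
    """
    # Soft assignment of each region to each center (softmax over similarities).
    logits = region_feats @ centers.T                             # (R, K)
    assign = np.exp(logits - logits.max(axis=1, keepdims=True))
    assign /= assign.sum(axis=1, keepdims=True)

    # Residuals between regions and centers, weighted by the assignment.
    residuals = region_feats[:, None, :] - centers[None, :, :]    # (R, K, D)
    vlad = (assign[:, :, None] * residuals).sum(axis=0)           # (K, D)
    vlad /= np.linalg.norm(vlad) + 1e-8                           # global L2 norm
    return vlad.reshape(-1)

def contextualize(visual_desc, sentence_vec, W):
    """Fuse the pooled visual descriptor with a sentence embedding."""
    return np.tanh(W @ np.concatenate([visual_desc, sentence_vec]))

# Toy run with random features standing in for detector/language-model outputs.
regions = rng.normal(size=(36, 64))       # 36 region features, 64-d
centers = rng.normal(size=(8, 64))        # 8 cluster centers
question = rng.normal(size=128)           # sentence embedding (assumed 128-d)
W = rng.normal(size=(256, 8 * 64 + 128)) * 0.01

joint = contextualize(vlad_aggregate(regions, centers), question, W)
answer_vecs = rng.normal(size=(4, 256))   # 4 candidate answer embeddings
scores = answer_vecs @ joint              # argmax gives the predicted answer
print(scores.argmax())
```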

Yahong Han, Aming Wu, et al.
Object detection is one of the hottest research directions in computer vision; it has already made impressive progress in academia and has many valuable applications in industry. However, mainstream detection methods still have two shortcomings: (1) even a model well trained on large amounts of data generally cannot be used across different kinds of scenes; (2) once a model is deployed, it cannot autonomously evolve along with the accumulated unlabeled scene data. To address these problems, and inspired by theory, we propose a novel scene-adaptive evolution algorithm that can decrease the impact of scene changes through the concept of object groups. First, we extract a large number of object proposals from unlabeled data using a pre-trained detection model. Second, we build a dictionary of object concepts by clustering the proposals, in which each cluster center represents an object prototype. Third, we investigate the relations between different clusters and the object information of different groups, and propose a graph-based group-information propagation strategy to determine the category of an object concept, which can effectively distinguish positive from negative proposals. With these pseudo labels, we can easily fine-tune the pre-trained model. The effectiveness of the proposed method is verified through different experiments, and significant improvements are achieved.
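The pipeline described above (proposal extraction, concept clustering, graph-based propagation, pseudo-label fine-tuning) can be outlined in code. The sketch below is an assumption-laden outline rather than the authors' algorithm: k-means stands in for the object-concept dictionary, a row-normalized similarity graph stands in for the group-information propagation strategy, and `pretrained_detector` / `fine_tune` are hypothetical placeholders.

```python
# Hedged sketch only (not the authors' code): outline of the evolution loop.
import numpy as np
from sklearn.cluster import KMeans

def build_concept_dictionary(proposal_feats, num_concepts=50):
    """Cluster proposal features; each cluster center is an object prototype."""
    km = KMeans(n_clusters=num_concepts, n_init=10, random_state=0)
    return km.fit(proposal_feats)   # km.cluster_centers_ is the dictionary

def propagate_concept_labels(centers, seed_scores, alpha=0.8, iters=20):
    """Spread positive/negative evidence between similar concepts.

    seed_scores: per-concept confidence from the pre-trained detector, in [0, 1].
    Returns smoothed scores; thresholding separates positive from negative concepts.
    """
    # Row-normalized similarity graph over concept prototypes.
    sim = centers @ centers.T
    np.fill_diagonal(sim, 0.0)
    sim = np.maximum(sim, 0.0)
    sim /= sim.sum(axis=1, keepdims=True) + 1e-8

    scores = seed_scores.copy()
    for _ in range(iters):
        scores = alpha * (sim @ scores) + (1 - alpha) * seed_scores
    return scores

# Usage outline (detector calls are hypothetical placeholders):
#   feats, det_scores = pretrained_detector.extract_proposals(unlabeled_images)
#   km = build_concept_dictionary(feats)
#   concept_scores = propagate_concept_labels(km.cluster_centers_,
#                                             seed_scores_per_concept)
#   pseudo_labels = concept_scores[km.labels_] > 0.5   # per-proposal positives
#   fine_tune(pretrained_detector, unlabeled_images, pseudo_labels)
```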

Shiliang Pu, Wei Zhao, et al.
