Everyday knowledge, however trivial it may seem to humans, is extremely broad and complex, and it is an essential component of current and future AI systems. This project focuses on constructing a large-scale knowledge repository through deep integration of language and vision. This statistical and shareable knowledge repository is operationalized as (i) spatial, (ii) temporal, and (iii) frame knowledge.