Approximate world models are coarse descriptions of the elements of a scene, intended for use in selecting and controlling the vision routines of a vision system. In this paper we present a control architecture in which approximate models represent the complex relationships among the objects in the world, allowing the vision routines to be situation- or context-specific. Moreover, because of their reduced accuracy requirements, approximate world models can employ qualitative information such as that provided by linguistic descriptions of the scene. The concept is demonstrated in the development of automatic cameras for a TV studio -- SmartCams. Results are shown in which SmartCams use vision processing of real imagery and information written in the script of a TV show to achieve TV-quality framing.