A Distributed Architecture for a Purposive Computer Vision System


Recent approaches to complex computer vision systems hold that vision should not be treated as a self-contained system, but as one that interacts with its environment through perception and action.

One of these approaches, the Purposive Paradigm for Computer Vision (proposed by Prof. Aloimonos), holds that vision must be considered within the set of tasks an agent must accomplish, and looks to the agent’s purpose for the constraints needed to solve the ill-posed problem of vision. Another fundamental characteristic of this paradigm is the tight integration of visual modules with other AI modules, such as planning, reasoning and learning, since vision is not considered a self-contained problem.

Researchers working within the purposive paradigm believe that general-purpose vision will arise from the organization of several dedicated solutions to different visual tasks. The main problem is therefore how to organize these solutions and define primitive tasks, which puts the focus on architectures for the integration of visual systems.

In this context, we are studying a distributed control architecture for purposive computer vision systems. In this architecture, the vision system’s purpose is translated into a set of behaviors, which are decomposed into specific tasks.

A Multi-Agent approach is used to model purpose, behaviors, tasks and the relationships among them. Purpose is modeled by a society of autonomous agents, each one responsible for a specific visually guided behavior. Tasks are represented by primitive agents, which interact with the autonomous agents.

The agents in the architecture are organized in a society with rules of behavior and an authority structure. This structure enables the agents to decide how the resources of the system are allocated, for example, which agent should have control of a resource at a given moment. How this decision is made depends on the rules defined for the society in a specific implementation, and can be, for example, the result of a competition between agents.

A resource is defined as a part of the system that is shared by the agents and that can be controlled by only one agent at a time. For example, a robotic manipulator in an assembly cell is a resource, and the drive system of a mobile robot is another. On the other hand, a fixed camera (and its data acquisition hardware and software) is not a resource, as all agents can receive the images it captures at the same time. The same camera would be a resource in an active vision system, however, where the agents compete to control the data acquisition process.
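The notion of a resource and of a competition between agents can be illustrated with a minimal Python sketch. It is not the implementation used in the cell: the `Resource` class, the `compete` function, the agent names and the numeric bids are all hypothetical, and the rule "highest bid wins" is only one of the society rules an implementation could choose.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Resource:
    """A shared part of the system that only one agent may control at a time."""
    name: str
    controller: Optional[str] = None  # name of the agent currently in control


def compete(resource: Resource, bids: Dict[str, float]) -> Optional[str]:
    """Give the resource to the agent with the highest bid.

    `bids` maps agent names to a hypothetical urgency value (an assumption
    of this sketch). Ties are broken alphabetically so the result is
    deterministic.
    """
    if not bids:
        resource.controller = None
        return None
    winner = min(bids, key=lambda agent: (-bids[agent], agent))
    resource.controller = winner
    return winner


# Two hypothetical agents compete for the manipulator.
manipulator = Resource("manipulator")
winner = compete(manipulator, {"agent_a": 0.4, "agent_b": 0.7})
```

Here `agent_b` wins the competition and becomes the only controller of the manipulator. A fixed camera would not be modeled this way, since every agent can read its images at once.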

A sketch of the Autonomous Agents Society.

The authority structure of a system is closely tied to the dependence and precedence relations among the behaviors of the autonomous agents in the society. Its definition is based on the linearization of an activity plan in which the behaviors of the autonomous agents are the operators, the resources an autonomous agent needs are the preconditions of each operator, and the accomplishment of the system’s intentions is the goal. How the autonomous agents allocate system resources is left to the implementation of the system itself. It is worth noting that it is the authority structure that relates this architecture to the Subsumption architecture, since it allows agents to suppress each other by taking away resources.
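Suppression by taking away a resource could look like the following Python sketch. It assumes that the authority structure has already been reduced to one integer rank per agent. That reduction, the `AuthorityArbiter` class and the agent names are illustrative assumptions, not part of the architecture as described.

```python
from typing import Dict, Optional


class AuthorityArbiter:
    """Grants a single resource according to a fixed authority ranking."""

    def __init__(self, authority: Dict[str, int]) -> None:
        self.authority = authority         # agent name -> rank, larger wins
        self.holder: Optional[str] = None  # agent currently holding the resource

    def request(self, agent: str) -> bool:
        """Return True if `agent` controls the resource after the request."""
        if self.holder is None or self.holder == agent:
            self.holder = agent
            return True
        if self.authority[agent] > self.authority[self.holder]:
            # The requester outranks the holder: the holder is suppressed
            # by having the resource taken away from it.
            self.holder = agent
            return True
        return False

    def release(self, agent: str) -> None:
        """Free the resource, but only if `agent` is the one holding it."""
        if self.holder == agent:
            self.holder = None


# Hypothetical ranking: agent_high has authority over agent_low.
arbiter = AuthorityArbiter({"agent_low": 1, "agent_high": 2})
arbiter.request("agent_low")             # granted, the resource was free
arbiter.request("agent_high")            # granted, agent_low is suppressed
denied = arbiter.request("agent_low")    # refused while agent_high holds it
```

In this sketch the lower-ranked agent regains the resource only after the higher-ranked one releases it.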

In this society, each autonomous agent is connected to all other agents through a decentralized communication network, and the agents communicate in a previously defined language.
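As a rough illustration of this kind of communication, the Python sketch below gives every agent a mailbox that any other agent can write to directly. It is an in-process stand-in for a real network. The `Message` fields and the performative names are hypothetical and do not reproduce the language actually defined for the society.

```python
import queue
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Message:
    sender: str
    performative: str  # e.g. "request" or "inform" (hypothetical vocabulary)
    content: str


class Network:
    """Fully connected network: each agent can reach every other mailbox."""

    def __init__(self, agent_names: Iterable[str]) -> None:
        self.mailboxes: Dict[str, queue.Queue] = {
            name: queue.Queue() for name in agent_names
        }

    def send(self, receiver: str, message: Message) -> None:
        """Point-to-point delivery to one agent."""
        self.mailboxes[receiver].put(message)

    def broadcast(self, message: Message) -> None:
        """Deliver the message to every agent except its sender."""
        for name, mailbox in self.mailboxes.items():
            if name != message.sender:
                mailbox.put(message)

    def receive(self, agent: str) -> Message:
        """Pop the oldest pending message; raises queue.Empty if none."""
        return self.mailboxes[agent].get_nowait()


network = Network(["agent_a", "agent_b", "agent_c"])
network.broadcast(Message("agent_a", "request", "manipulator"))
network.send("agent_a", Message("agent_b", "inform", "manipulator busy"))
reply = network.receive("agent_a")
```

No agent relays messages for another one here, which is the property the sketch is meant to show. A distributed implementation would replace the queues with a message-passing library.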

As a testbed, the architecture is being implemented on a Flexible Assembly Cell at the Laboratório de Sistemas Integráveis da Escola Politécnica da Universidade de São Paulo. The cell is composed of several workstations, two robotic manipulators and cameras, and performs simple visually guided assembly tasks. Although the chosen domain is an assembly cell, the architecture can be applied to other domains, such as autonomous mobile robots.

One of the manipulator arms at the Cell.

Three behaviors were defined for this application, each one corresponding to an autonomous agent:

Assembler agent: to accomplish an assembly, picking up pieces from the workspace with the manipulator and putting them in a desired location.
Cleaner agent: to clean the workspace, which is a previously defined area where the assembly is made. Unwanted objects that people or another manipulator may have put in this area must be taken away by this agent.
Collision Avoider agent: to avoid collisions of the manipulator with objects (other manipulators, a hand) that move in the workspace, in order to preserve the system’s physical integrity.

The communication between the agents is implemented using the Parallel Virtual Machine (PVM) library. PVM is a system for distributed processing that offers tools for communication between tasks, such as point-to-point messages and broadcasts, and tools to control the spawning of tasks. It was chosen because it simplifies the implementation of message exchanges between the agents, and because it allows the workstation on which a process must run to be specified. In addition, it is widely used in the academic community, which means better support for developers, including an Internet newsgroup where assistance can be found.

One of the cameras at the Flexible Assembly Cell.

One primitive agent is responsible for image acquisition, managing a SunVideo system designed for real-time image acquisition and video compression. The SunVideo consists of a Sun SBus board for SPARCstations with an on-board video compression engine and the XIL Imaging Library, and is used for multimedia applications and video conferences.

Computer Vision Images: The first is the image of the workspace captured with a SunVideo board; the second is its blob colouring segmentation; the last one presents each recognized object in the image in a different color. The input image is simple because it was the first test done with the system.
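The idea behind a blob colouring segmentation, giving every connected region of foreground pixels its own label, can be sketched in a few lines of Python. The version below is a flood-fill connected-component labelling with 4-connectivity on a binary image. It is a stand-in chosen for brevity, not necessarily the algorithm used in the system, and the function name `label_blobs` and the tiny test image are hypothetical.

```python
from collections import deque
from typing import List


def label_blobs(image: List[List[int]]) -> List[List[int]]:
    """Label 4-connected regions of non-zero pixels with 1, 2, 3, ...

    Background pixels (value 0) keep the label 0.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 or labels[r][c] != 0:
                continue  # background, or already part of a labelled blob
            next_label += 1
            labels[r][c] = next_label
            frontier = deque([(r, c)])
            while frontier:  # breadth-first flood fill of the current blob
                y, x = frontier.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and image[ny][nx] != 0 and labels[ny][nx] == 0:
                        labels[ny][nx] = next_label
                        frontier.append((ny, nx))
    return labels


binary = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labelled = label_blobs(binary)
# labelled == [[1, 1, 0, 0], [0, 1, 0, 2], [0, 0, 0, 2]]
```

Each label can then be mapped to a display color, which corresponds to the third image, where every recognized object appears in a different color.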

Publications related to this work:

BIANCHI, R. A. C.; RILLO, A. H. R. C. A distributed control architecture for a purposive computer vision system. In: IEEE SYMPOSIUM ON IMAGE, SPEECH AND NATURAL LANGUAGE SYSTEMS (ISNL) - IEEE INTERNATIONAL JOINT SYMPOSIA ON INTELLIGENCE AND SYSTEMS, 2nd, Rockville, Maryland, 1996. Proceedings. Los Alamitos, CA, IEEE Computer Society Press, 1996. P. 288-294.
BIANCHI, R. A. C.; RILLO, A. H. R. C. A Purposive Computer Vision System: a Multi-Agent Approach. In: WORKSHOP ON CYBERNETIC VISION, 2nd, São Carlos, 1996. Proceedings. Los Alamitos, CA, IEEE Computer Society Press, 1997 (in print).
BIANCHI, R. A. C.; RILLO, A. H. R. C. Uma Arquitetura de Controle para Sistemas Complexos de Visão Computacional. In: SIMPÓSIO BRASILEIRO DE AUTOMAÇÃO INTELIGENTE, 3, Vitória, 1997. Anais. Vitória, Sociedade Brasileira de Automática, 1997. P. 94-99.
BIANCHI, R. A. C.; RILLO, A. H. R. C. Uma Arquitetura de Controle Distribuída para um Sistema de Visão Computacional Propositada. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL, 1, Brasília, 1997. Anais. Brasília, Sociedade Brasileira de Computação, 1997 (in print).

Last updated November 1st, 1998.


Reinaldo Augusto da Costa Bianchi
Work address:
Laboratório de Sistemas Integráveis
Av. Prof. Luciano Gualberto Trav.3 N.158
Cidade Universitária
CEP: 05508-900 - São Paulo - SP
Brazil
Phone: x55-11-818-5530
E-Mail: rbianchi@lsi.usp.br