Cast of characters

What agents would be involved in such an application? Well, first, there's the agent that represents the WhizBang itself. For this demonstration, we'll mock up the WhizBang as shown in Figure 6.1.


Figure 6.1. Screenshot of the WhizBang interface


To "mock up" the ability to detect information like who is around, we have a series of panels on the left hand side for detecting the location, people who are around and the general background noise level. The rest of the panel is for a speech conversation between the user and WhizBang. For this demo, we'll use text.

We also make use of a bogus e-mail retrieval agent. When queried, this agent returns a list of e-mails; currently, the number of e-mails is generated randomly, and the user may receive anywhere between 0 and 30.

There is also a natural language processing (NLP) agent, which "wraps around" a C program that supports an NLP system. This agent, called the MicaBot agent, is responsible for parsing what the user says.

Finally, there is a learning agent that we use to learn when to read e-mail and when to display it.

There are some additional agents available: one is a debugger that allows the user to see everything on the blackboard, and the other is a viewer for the decision trees generated in the process of learning.

Setting up Mobs

In order to set up the system, decisions must first be made about what types of mobs the agents will use to communicate. First of all, to communicate text to and from the user, the type declaration in Figure 6.2 is used.

<typedesc>
  <mobdecl name="text">
    <slot name="utterance"/>
  </mobdecl>
  <mobdecl name="textFromUser">
    <parent name="text"/>
  </mobdecl>
  <mobdecl name="textForUser">
    <slot name="speaker"/>
    <parent name="text"/>
  </mobdecl>
</typedesc>

Figure 6.2. The type declaration for types of text used between agents.


In Figure 6.2, three mob types are declared: a generic text mob with a single slot called "utterance" to store what is being said[3], and two subtypes, textFromUser and textForUser. textForUser has an additional slot, speaker, to describe who is talking.

These mobs are used to communicate between the WhizBang interface and the NLP agent.

The WhizBang interface also uses several other mobs to describe the environment, such as the envNoise mob for noise levels. Every time the noise level changes, the interface writes a new mob to the blackboard, and it does the same for the location of the user and the people around them.
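
The declarations for these environment mobs are not shown here, but judging from the sourcemob and sourceslot attributes used in the learning task configuration below (Figure 6.3), they might look roughly like this. This is a sketch, not the actual demo configuration:

<typedesc>
  <mobdecl name="envLocation">
    <slot name="location"/>
  </mobdecl>
  <mobdecl name="envNoise">
    <slot name="noiseLevel"/>
  </mobdecl>
  <mobdecl name="envWhosAround">
    <slot name="whosAround"/>
  </mobdecl>
</typedesc>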

The e-mail agent listens for only one mob, emailListRequest. It responds with an emailListReply, which contains three slots: count (the number of new e-mails), from, and subject. The latter two are multi-valued. In this particular case, the information is randomly generated.
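
Again as a sketch following the convention of Figure 6.2, the e-mail mobs might be declared like this (the actual declarations ship with the demo; in particular, whether a mobdecl may be written without any slots is an assumption here):

<typedesc>
  <mobdecl name="emailListRequest"/>
  <mobdecl name="emailListReply">
    <slot name="count"/>
    <slot name="from"/>
    <slot name="subject"/>
  </mobdecl>
</typedesc>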

The learning agent is rather complex. First of all, the learning task must be defined, and this is done by way of a learning task configuration. Similar to the way that new types are declared, learning tasks are defined in config/learntask; each file in that directory is read. The configuration file for the learning agent is shown in Figure 6.3.

<learntask
  name="readOrDisplayEmail"
  learner="weka.classifiers.trees.j48.J48"
  datafile="/tmp/readordisplay.arff"
  modelfile="/tmp/readordisplay.mdl">
  
  <attribute name="location" type="discrete"
        sourcemob="envLocation" sourceslot="location" >
    <value label="home"/>
    <value label="office"/>
    <value label="car"/>
  </attribute>

  <attribute name="noiseLevel" type="continuous"
        sourcemob="envNoise" sourceslot="noiseLevel"/>

  <attribute name="whosaround" type="discrete"
        sourcemob="envWhosAround" sourceslot="whosAround">
    <value label="alone"/>
    <value label="withfriends"/>
    <value label="withstrangers"/>
  </attribute>

  <attribute name="numMails" type="continuous"
        sourcemob="emailListReply" sourceslot="count"/>
  
  <class name="readOrDisplayEmail">
    <value label="askuser"/>
    <value label="readmail"/>
    <value label="display"/>
  </class>
</learntask>

Figure 6.3. readordisplay.xml


For a given classification task (in this case "readOrDisplayEmail"), we define a learning algorithm for the task, as well as files in which intermediate results are stored. A learning task consists of a set of attributes and the class the learner is trying to predict; in this case, the possible actions are "askuser", "readmail" and "display". The attributes are things like the noise level, who is around, and how many e-mails were received. For each of these attributes, the learning task extracts the value from information on the blackboard: it listens for any new mobs of the types relevant to this classification task, and thus maintains an idea of the "current" context.

Although this part is very complex, once a learning task is defined, other agents can use it easily. To provide an example to the learner, a learnerTrain mob is written to the blackboard, specifying the learning task and the correct class for the current context. The learner then extracts the appropriate information from the blackboard and stores the current situation as an example.

To have the learner decide what to do given the current context, an agent writes a learnerTest mob with two fields: a requestId, so that when the learning agent replies, the requesting agent knows which request is being answered, and the name of the learning task. The learning agent then replies with a learnerReply mob carrying the copied requestId, the predicted class and the confidence of the prediction.
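
The declarations for these train and test mobs are not shown in the configuration excerpts above, but based on the slots just described, they might look something like the following (the slot names are taken from the prose, not from the actual MICA sources):

<typedesc>
  <mobdecl name="learnerTrain">
    <slot name="learntask"/>
    <slot name="class"/>
  </mobdecl>
  <mobdecl name="learnerTest">
    <slot name="requestId"/>
    <slot name="learntask"/>
  </mobdecl>
  <mobdecl name="learnerReply">
    <slot name="requestId"/>
    <slot name="class"/>
    <slot name="confidence"/>
  </mobdecl>
</typedesc>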

Figure 6.4 shows a sequence diagram for the interactions that occur in this process.


Figure 6.4. Sequence diagram for training the WhizBang


In Figure 6.4, the blackboard is not shown explicitly; rather, the flows of information through the blackboard are shown. The "notes" are simply aids to understanding. So, the user types in "get my mail". The interface writes this as a mob to the blackboard. Since the MicaBot agent (the natural language agent) has registered for textFromUser mobs, it is informed. Similarly, the learning agent is informed whenever new information about the environment is added to the blackboard. The process continues, with the MicaBot agent querying the e-mail agent and generating a textual description for the user.

The MicaBot agent is designed to begin by eliciting user preferences, and eventually to use those elicited preferences to predict what the user would like. Here, it reports to the user, then responds to the user's reply by providing an example to the learning system using a learnerTrain mob.
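
As a rough illustration of this elicit-then-train loop, the core of the MicaBot logic might be sketched as below. This is not the real MICA API: the Mob and Blackboard types are hypothetical stand-ins, and the real agent parses free text through the NLP system rather than matching strings.

// A sketch only: Mob and Blackboard are hypothetical stand-ins for the
// real MICA classes; mob and slot names follow the prose above.
import java.util.HashMap;
import java.util.Map;

public class MicaBotSketch {
    /** A stand-in for a MICA mob: a typed bag of slot values. */
    static class Mob {
        final String type;
        final Map<String, String> slots = new HashMap<String, String>();
        Mob(String type) { this.type = type; }
        Mob set(String name, String value) { slots.put(name, value); return this; }
        String get(String name) { return slots.get(name); }
    }

    /** A stand-in for the blackboard. */
    interface Blackboard { void write(Mob mob); }

    private final Blackboard bb;
    MicaBotSketch(Blackboard bb) { this.bb = bb; }

    /** Called whenever a mob the agent has registered for appears. */
    void onMob(Mob mob) {
        if (mob.type.equals("textFromUser")
                && mob.get("utterance").contains("mail")) {
            // The user asked for mail: query the e-mail agent.
            bb.write(new Mob("emailListRequest"));
        } else if (mob.type.equals("emailListReply")) {
            // Report to the user and elicit a preference.
            bb.write(new Mob("textForUser")
                    .set("speaker", "WhizBang")
                    .set("utterance", "You have " + mob.get("count")
                            + " new e-mails. Should I read or display them?"));
        }
    }

    /** Once the user has answered, feed the answer back as an example. */
    void trainOn(String chosenClass) {  // e.g. "readmail" or "display"
        bb.write(new Mob("learnerTrain")
                .set("learntask", "readOrDisplayEmail")
                .set("class", chosenClass));
    }
}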

To run the demonstration with all of these agents, first kill any blackboards or other agents that are already running (this is not strictly necessary, but it helps to make sure the script will run). Then, in the Mica directory, type java unsw.cse.mica.runner.MicaRunner examples/run/learnemail-run.xml. When the "start all" button is pressed, in addition to the agents mentioned above, two other windows open. The first is the "MICA debugger", a generally useful application that allows you to see all the information on the blackboard. An example of the debugger is shown in Figure 6.5.


Figure 6.5. The MICA debugger


The second window displays the decision tree learnt in the process of answering the question of whether to read or display the e-mail; it is shown in Figure 6.6. Hit the "Reload" button occasionally to refresh the tree.


Figure 6.6. Tree learnt from conversation with user


The above tree shows the concept learnt after a few rounds with the user: the system should read e-mails in the car; at home, it should read them if there are fewer than 18 and display them if there are more.
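
In the textual form J48 uses, those branches would read roughly as follows (only the branches mentioned above are shown; the actual tree in Figure 6.6 may differ, for instance in the exact threshold):

location = car: readmail
location = home
|   numMails <= 18: readmail
|   numMails > 18: display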

For a closer examination, users should consider reading the source code.



[3] The current implementation of MICA will read the slot element of the XML document, but won't actually do anything with it; it's there as a form of documentation.