What agents would be involved in such an application? Well, first, there is the agent that represents the WhizBang itself. For this demonstration, we'll mock up the WhizBang as shown in Figure 6.1.
To "mock up" the ability to detect information like who is around, we have a series of panels on the left hand side for detecting the location, people who are around and the general background noise level. The rest of the panel is for a speech conversation between the user and WhizBang. For this demo, we'll use text.
We also make use of a bogus e-mail retrieval agent. When queried, this agent returns a list of e-mails. Currently, the number of e-mails is generated randomly, and the user may receive anywhere between 0 and 30 e-mails.
There is also a natural language processing agent. This agent "wraps around" a C program that supports an NLP system. This agent, called the MicaBot agent, is responsible for parsing what the user says.
Finally, there is a learning agent that we use to learn when to read e-mail and when to display it.
There are some additional agents available: one is the debugger program that allows the user to see everything on the blackboard, and the other is a viewer for the decision trees generated in the process of learning.
In order to set up the system, decisions must first be made about what types of mobs the agents will use to communicate. First, to communicate text to and from the user, the type declaration in Figure 6.2 is used.
<typedesc>
  <mobdecl name="text">
    <slot name="utterance"/>
  </mobdecl>
  <mobdecl name="textFromUser">
    <parent name="text"/>
  </mobdecl>
  <mobdecl name="textForUser">
    <slot name="speaker"/>
    <parent name="text"/>
  </mobdecl>
</typedesc>
Figure 6.2. The type declaration for types of text used between agents.
In Figure 6.2, three mob types are declared. The first is a generic text mob, with a single slot called "utterance" to store what is being said[3]. There are two subtypes: textFromUser and textForUser. textForUser has an additional slot, speaker, to record who is talking.
These mobs are used to communicate between the WhizBang interface and the Natural Language Processing (NLP) agent.
The WhizBang interface also uses several other mobs to describe the environment, such as the envNoise mob, which describes the noise level. Every time the noise level changes, the interface writes a new mob to the blackboard. The same happens for the location of the user and the people around them.
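The type declarations for these environment mobs are not reproduced in this section. Following the pattern of Figure 6.2, they might look something like the sketch below; the mob and slot names match those referenced by the learning task configuration in Figure 6.3, but the exact declarations may differ.

<typedesc>
  <!-- Sketch only: environment mobs written by the WhizBang interface. -->
  <mobdecl name="envLocation">
    <slot name="location"/>      <!-- e.g. home, office, car -->
  </mobdecl>
  <mobdecl name="envNoise">
    <slot name="noiseLevel"/>    <!-- current background noise level -->
  </mobdecl>
  <mobdecl name="envWhosAround">
    <slot name="whosAround"/>    <!-- e.g. alone, withfriends, withstrangers -->
  </mobdecl>
</typedesc>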
The e-mail agent listens for only one mob: emailListRequest. It then responds with an emailListReply, which contains three slots: count (the number of new e-mails), from and subject. The latter two are multi-valued. In this particular case, the information is randomly generated.
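Again, the corresponding type declarations are not shown here; in the style of Figure 6.2, they might be declared roughly as below. The slot names follow the description above, but the declaration itself is illustrative only.

<typedesc>
  <!-- Sketch only: mobs exchanged with the e-mail agent. -->
  <mobdecl name="emailListRequest"/>
  <mobdecl name="emailListReply">
    <slot name="count"/>     <!-- number of new e-mails -->
    <slot name="from"/>      <!-- sender of each e-mail (multi-valued) -->
    <slot name="subject"/>   <!-- subject of each e-mail (multi-valued) -->
  </mobdecl>
</typedesc>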
The learning agent is rather complex. First of all, the learning task is defined. This is done by way of a learning task configuration. Similar to the way that new types are declared, learning tasks are defined in config/learntask; each file in that directory is read. The configuration file for the learning agent is shown in Figure 6.3.
<learntask name="readOrDisplayEmail"
           learner="weka.classifiers.trees.j48.J48"
           datafile="/tmp/readordisplay.arff"
           modelfile="/tmp/readordisplay.mdl">
  <attribute name="location" type="discrete"
             sourcemob="envLocation" sourceslot="location">
    <value label="home"/>
    <value label="office"/>
    <value label="car"/>
  </attribute>
  <attribute name="noiseLevel" type="continuous"
             sourcemob="envNoise" sourceslot="noiseLevel"/>
  <attribute name="whosaround" type="discrete"
             sourcemob="envWhosAround" sourceslot="whosAround">
    <value label="alone"/>
    <value label="withfriends"/>
    <value label="withstrangers"/>
  </attribute>
  <attribute name="numMails" type="continuous"
             sourcemob="emailListReply" sourceslot="count"/>
  <class name="readOrDisplayEmail">
    <value label="askuser"/>
    <value label="readmail"/>
    <value label="display"/>
  </class>
</learntask>
Figure 6.3. readordisplay.xml
For a given classification task (in this case "readOrDisplayEmail"), we specify a learning algorithm for the task as well as files in which temporary results are stored. A learning task consists of a set of attributes and a final class the learner is trying to predict. In this case, the possible classes are "askuser", "readmail" and "display". The attributes are things like the noise level, who is around and how many e-mails were received. For each of these attributes, the learning task extracts the value from information on the blackboard. It does this by listening for any new mobs of the types relevant to this classification task, and thus maintains an idea of the "current" context.
Although this part is very complex, once a learning task is defined, other agents can use it easily. To provide an example to the learner, a learnerTrain mob is written to the blackboard, specifying the learning task and the actual class, given the current context. The learner then extracts the appropriate information from the blackboard and stores the current situation as an example.
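The declaration of the learnerTrain mob is not reproduced here. In the spirit of Figure 6.2, it might be declared along the following lines; the slot names are assumptions chosen to match the prose above.

<typedesc>
  <!-- Sketch only: a training example for the learning agent. -->
  <mobdecl name="learnerTrain">
    <slot name="learnTask"/>   <!-- which learning task, e.g. readOrDisplayEmail -->
    <slot name="class"/>       <!-- the correct class for the current context -->
  </mobdecl>
</typedesc>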
To ask the learner what it should do given the current context, an agent writes a learnerTest mob with the following fields: a requestId, so that when the learner agent replies, the requesting agent knows which request is being answered; and the learning task. The learning agent then replies with a learnerReply mob containing the copied requestId, the predicted class and the confidence of the prediction.
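Again as an illustrative sketch only, the query and reply mobs might be declared roughly as follows. The requestId slot is named in the text; the other slot names are assumptions.

<typedesc>
  <!-- Sketch only: query and reply mobs for the learning agent. -->
  <mobdecl name="learnerTest">
    <slot name="requestId"/>   <!-- copied into the reply to match it to the request -->
    <slot name="learnTask"/>   <!-- which learning task to consult -->
  </mobdecl>
  <mobdecl name="learnerReply">
    <slot name="requestId"/>   <!-- copied from the learnerTest mob -->
    <slot name="class"/>       <!-- the predicted class -->
    <slot name="confidence"/>  <!-- confidence of the prediction -->
  </mobdecl>
</typedesc>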
Figure 6.4 shows a sequence diagram for the interactions that occur in this process.
In Figure 6.4, the blackboard is not explicitly shown; rather, the flows of information through the blackboard are shown. The "notes" are there to aid understanding. The user types in "get my mail". The interface writes this as a mob to the blackboard. Since the MicaBot agent (the natural language agent) has registered for textFromUser mobs, it is informed. Similarly, the learning agent is informed whenever new information about the environment is added to the blackboard. The process continues, with the MicaBot agent querying the e-mail agent and generating a text description for the user.
The MicaBot agent is designed to begin by eliciting user preferences, and eventually to use the elicited preferences to predict what the user would like. In this case, it tells the user, then responds to the user's reply by providing an example to the learning system using a learnerTrain mob.
To run the demonstration of all of these agents, first kill any blackboards or other agents that are running (this is not strictly necessary, but it helps to make sure the script will run). Then, in the Mica directory, type java unsw.cse.mica.runner.MicaRunner examples/run/learnemail-run.xml. When the "start all" button is pressed, in addition to the agents mentioned above, two other windows are opened. The first is the "MICA debugger". This application is generally useful, and allows you to see all the information on the blackboard. An example of the debugger is shown in Figure 6.5.
The second window displays the decision tree learned in the process of answering the question of whether to read or display the e-mail. You should hit the "Reload" button occasionally to reload the tree. It is shown in Figure 6.6.
The above tree shows the concept learnt after a few rounds with the user. It has learnt that it should read e-mails in the car and, if at home, read e-mails if there are fewer than 18 and display them if there are more than 18.
For a closer examination, users should consider reading the source code.
[3] The current implementation of MICA will read the slot element of the XML document, but won't actually do anything with it. It is there as a form of documentation.