TY - JOUR
AU1 - Copas, C.V
AU2 - Edmonds, E
AB - Abstract Recent progress in planning has enabled this technique to be applied to some significant real-world problems, including the construction of intelligent user interfaces. Previous research in interactive planners has emphasised their dynamism and maintenance advantages. This paper adopts a user-interaction perspective, and explores the theme that a paradigm shift in human–computer interaction is now a prospect: away from the requirement to instruct machines towards a more declarative, goal-based form of interaction. This initiative necessarily involves consideration of the design of goal description languages, and some alternatives are analysed. Some architectural issues associated with embedding planners within a user interface management system are examined, together with some practical implementation issues. Planning is discussed in the context of human–computer interaction specification methods. It is shown that planning formalisms possess advantages of expressiveness, and that executable specifications could usefully incorporate some control aspects from planning. 1 Introduction Planning techniques have long been considered to hold potential for injecting intelligence into interactive systems. The general principle is that interactive planners are the recipients of goals which describe some desired state(s) of a computer-based system. These planners possess knowledge about various actions (typically corresponding to user-level commands), including in particular the preconditions and effects of these actions. The planning task is to search (nondeterministically) for command combinations which will achieve the goal. At that point, the planner may either recommend a course of action to the user, or automatically execute the script which has been generated. 
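The general principle just described can be made concrete with a minimal sketch. It is written in Python purely for illustration, it uses a plain breadth-first forward search for brevity (the planner employed later in the paper searches regressively), and the action names and state facts (`erase_window`, `draw_map`, `draw_legend`, `window_blank`, and so on) are hypothetical stand-ins rather than real GIS commands.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    preconditions: frozenset   # facts that must hold before the action
    add_effects: frozenset     # facts made true by the action
    delete_effects: frozenset  # facts made false by the action


# Hypothetical actions, invented for this sketch only.
ACTIONS = [
    Action("erase_window", frozenset(), frozenset({"window_blank"}),
           frozenset({"map_shown", "legend_shown"})),
    Action("draw_map", frozenset({"window_blank"}), frozenset({"map_shown"}),
           frozenset({"window_blank"})),
    Action("draw_legend", frozenset({"map_shown"}), frozenset({"legend_shown"}),
           frozenset()),
]


def plan(initial, goal, actions=ACTIONS):
    """Return a shortest list of action names reaching the goal, or None."""
    start, goal = frozenset(initial), frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for action in actions:
            if action.preconditions <= state:
                successor = (state - action.delete_effects) | action.add_effects
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, steps + [action.name]))
    return None  # the goal cannot be reached with these actions


print(plan({"window_blank"}, {"map_shown", "legend_shown"}))
```

The caller states only the desired facts; the ordering of commands is inferred from the preconditions and effects. Either the resulting list could be shown to the user as a recommendation, or each name could be dispatched to the application as a script.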
Conceptually, planning may be regarded on the one hand as a constraint satisfaction problem and, on the other hand, as a restricted type of logical deduction in which axioms describing the capabilities and requirements of various actions are employed in conjunction with a specialised inference engine. Planners have historically been hampered by problems of poor expressiveness, poor performance and, to a lesser extent, ambiguous inferential completeness, but recent research progress suggests their potential may be closer to realisation. For example, expressiveness has improved with the advent of algorithms for accommodating conditional action effects [1], disjunctive preconditions, and quantification over dynamic object universes [2]. Performance has simultaneously benefited from improved algorithms, particularly the employment of constraint logic programming techniques, to the point where quasi real-time, interactive planners are being reported in domains such as network searching within the Unix operating system [3], image processing [4] and database integration [5]. One of the aims of this paper is to report on the feasibility of employing interactive planners within another domain: that of user interaction with a geographic information system (GIS). These systems, along with many other so-called high-functionality systems [6] have a poor reputation for usability. As discussed in Section 2, conventional engineering solutions to this problem, such as the construction of graphical user interfaces, suffer from inherent limitations which planners may overcome. More significantly, the little existing work on interactive planners has tended to emphasise the maintenance and dynamism advantages which such declarative systems possess in comparison to more procedural systems, and has only addressed end-user concerns indirectly. 
A further aim of this paper is thus to investigate some human–computer interaction (HCI) issues (in Section 3), with particular reference to the design of goal description languages. Planners have typically been built by artificial intelligence (AI) workers for the purpose of implementing some problem-solving system. However, the underlying knowledge representation (including operators, preconditions and effects) is itself of HCI relevance, given the interest in appropriate specification techniques within fields such as user interface management systems (UIMSs) and CSCW. Planners effectively involve an executable model of causality within some domain, which aligns them with model-based approaches to software development in general, and which gives them a close correspondence in particular with techniques which specify the semantics of state transitions, such as high-level Petri nets (PNs). A subsidiary aim of this paper is to compare and contrast developments in planning with executable specification practices in HCI (Section 4). It is contended that planning offers a number of features which could profitably be incorporated, including a more expressive formalism in many cases, and the possibility of more dynamic run-time control. 2 GIS user interfaces Consider the following simple visualisation task facing some GIS users, which will be used for illustration throughout the remainder of this paper. The system includes a number of data themes, representing roads, elevation, population, etc., with the display currently being blank. The users’ desire could be paraphrased as follows: ‘I would like to see the roads map in plan view, superimposed upon a white background, containing a legend in the bottom right corner and a scale-bar in the top centre’. The expected output of the system is depicted in Fig. 1. [Fig. 1: The output of a GIS visualisation goal.]
It may be objected that this task is undemanding, as it does not involve any particular sophistication in spatial analysis on the part of the user. However, it is a good example for precisely that reason, because even users who have a clear idea of their goals must still translate those goals into a sequence of GIS instructions which is both syntactically correct and semantically coherent. Employing the command-driven interface of the public-domain GIS Grass4.1 [7], seven instructions are necessary for achieving the goal, as depicted in Fig. 2. [Fig. 2: A typical GIS command sequence, or plan.] As may be inferred from Fig. 2, GISs tend to possess a large, relatively primitive command-set out of a concern for general-purpose capability and thus resemble the Unix operating system, or a (spatial) statistics package. A typical response to this usability problem is the construction of menu-driven, graphical interfaces; an example of which is shown in Fig. 3. These have the obvious advantage of eliminating errors of command retrieval and construction, but are not themselves beyond criticism. One feature of menus is that, linguistically, the items are usually imperatives and, in the simplest case, correspond to application commands. Thus, the influence of the command-line lingers. A further design innovation is to supply some iconic representation of the objects which comprise the system’s universe of discourse, thus allowing users to manipulate these in a pseudo-direct fashion. Within the GIS sphere, however, direct manipulation is rare; for example, only experimental systems allow one to perform map overlays by dragging icons into some viewing area [8]. Part of the problem is that it is difficult to represent all of an object’s methods (particularly abstract methods) in a gestural or pictorial fashion. 
More commonly, although there may be some iconic representation of objects, their methods are invoked by selection from some pop-up menu. It could thus be argued that the imperative languages in which most systems are programmed eventually permeate through to the user interface, despite the best efforts of designers to construct various facades. [Fig. 3: A menu-driven, graphical GIS user interface.] It is at this point that planners potentially offer a design alternative. Interactive planners, apart from any considerations of ‘intelligence’, are distinctive because they invert the imperative form of interaction just described. That is, users, instead of issuing numerous instructions in order to achieve their goals, may instead interact with machines in the converse fashion, by describing their goals and relying on the machine to infer the necessary instructions. In other words, the declarative manner in which planners are programmed has the effect of fostering a declarative form of interaction. This concept of the utility of declarative interaction rests upon the assumption that it is easier or at least preferable for users to describe goals rather than generate sets of instructions. It is recognised that planners could be said to foster an interaction style of indirect manipulation, because such support systems intervene between the user and the (representation of the) domain objects. One may anticipate that planners may be perceived as introducing superfluous overheads when supporting the kind of simple and self-evident tasks which currently admit well to direct manipulation or, for that matter, to imperative interaction in general. More specifically, it may be hypothesised that the acceptability of interactive planners will increase as the unit tasks in any domain involve longer sequences of instructions for their completion. 
The most practical scenario is one in which a variety of forms of interaction are available to the user. It may also be anticipated that, if planning technology becomes sufficiently well-understood to be appropriated by the mainstream (in the manner of the relational calculus, for example), then these systems are less likely to be deemed intelligent and may come to be regarded as routine constraint satisfiers! 3 An interactive planner for GIS The work reported here employs the public-domain planner Ucpop4.0 [2], written in Common Lisp. Planners may be distinguished by various features, which merit description at this point. The essential features of Ucpop can be listed as follows:
(i) It is regressive, i.e. search proceeds by selecting operators which can achieve the goal state, then placing the preconditions of these operators onto an agenda of revised goals, until the current state is reached. This strategy is more focused than progressive search methods, and thus has performance advantages in domains where there are a large number of operators compared to the average number of goals involved in any plan.
(ii) It builds plans from first principles, as opposed to the strategy of composing a larger plan from some pre-existing library of plan fragments. This latter approach effectively enables learning or experience to enhance performance, but is often described as hierarchical or abstract planning instead. Planners which cannot work from first principles may suffer from inflexibility because of the assumption that one may anticipate users’ goals and store a compiled response [9].
(iii) It is partially ordered or nonlinear, i.e. if alternate action sequences can achieve the same goal state, then the algorithm avoids committing to any one sequence unnecessarily, with consequent gains in performance, end-user support, and flexibility of execution (in particular, it is possible to infer opportunities for parallel execution).
(iv) It is domain-independent, i.e. the various choices which arise during planning are made without recourse to any domain-specific heuristics, such as ‘always draw maps before displaying legends’. The employment of this general search control strategy preserves completeness, at the cost of some performance. A programmer’s interface allows the incorporation of more specific heuristics, which effectively imparts some of the character of an expert system to the planner.
(v) It assumes that the planner has access to all necessary information about the state of the world, and that action effects are both discrete and deterministic. These restrictions may be regarded as unreasonable within certain real-world domains (which has led to a concern for planners based upon fuzzy or modal logics), but are more reasonable in the case of some artificial software worlds.
The example visualisation goal introduced previously is described, using Ucpop4.0 syntax, in Fig. 4. This example employs existentially quantified, first-order predicates; however, universally quantified goals and negation are also supported. [Fig. 4: A GIS goal, expressed in terms of both first-order predicate logic and natural language (variables are prefixed with ‘?’).] This example was chosen partly because of the comparative length of the plan which is required to satisfy the goal. In a previous imperative interface, this goal was identified as a unit task requiring the most involved macro. An example of a relatively complex operator representation is shown in Fig. 5. [Fig. 5: A planning representation of a GIS command, also expressed in terms of natural language (variables are prefixed with ‘?’).] The entities in this domain are both persistent, e.g. data files, and more ephemeral, e.g. the contents of graphics windows. The main features of this example are, first, conditional effects (e.g. the effects of the command are different depending on whether the window contains any frames) and, secondly, universal quantification over a dynamic object universe (e.g. the above command has the effect of destroying all existing contents of the window, without having to nominate those contents explicitly). Assuming that it is desired for the planner to mediate the user-application interaction in a UIMS fashion, two interfaces require attention. The first is between the planner and the application. It is routine to transform the output of the planner into a series of application callbacks, but deeper discussion is deferred until Section 3.1.2. The main interface concern at this point is with the user. Clearly, after criticising contemporary GIS user interfaces, it would be inconsistent to claim that the predicate logic interface of Fig. 4 represents an advance in usability! In its raw form, this interface poses a number of hurdles for casual users: mastery of Lisp/Ucpop syntax; mastery of the semantics of predicate calculus, including conjunction, negation, and existential/universal quantification; and lack of guidance about the types of goal statements which are possible. It may be recognised that these types of problems are also familiar from the database world, which has the advantage of providing conceptual leverage. For example, it allows one to compare and contrast goal description languages (and techniques) with more familiar database query strategies, despite the fact that plan synthesis is not generally regarded as an information retrieval task. The predicate logic interface of Fig.
4 may be seen as an analogue of SQL: declarative (in comparison to its predecessors), demanding (for inexperienced users), and also limited by its first-order formalism (i.e. it is not possible to pose a meta-query about which predicates are available). However, planners and conventional databases do differ quite markedly in that their underlying formalisms emphasise either the dynamic or structural aspects of some domain, respectively. As a result, whilst the behaviour (i.e. the state transitions) of the GIS domain is explicated by the planning model, the universe of discourse is only implicit. This may, however, be explicated using an entity-relationship (ER) diagram, as shown in Fig. 6. [Fig. 6: An entity-relationship representation of the universe of discourse underlying a GIS domain.] One advantage of the data model of Fig. 6 is naturally that the ontological structure of the domain is revealed, e.g. it is apparent that some predicates function as attributes of entities (position, background-colour) whereas others serve to relate two entities (contains, displayed-in, refers-to). It is also notable that one entity (window) is not present in the natural language goal description of Fig. 4; i.e. this entity is consequential upon the goal of displaying maps. Similarly, one relation (refers-to) is effectively implicit in the natural language description. It would seem important to impress these distinctions upon end-users. As a preliminary measure, the logic-based interface may usefully be augmented with some standard, higher-order predicates, such as ‘entity’, ‘attribute’ and ‘relation’ (neglecting for the moment esoteric modelling issues such as whether attributes may be considered to be a special entity). 
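As a rough illustration of what such higher-order predicates might offer, the following Python sketch records the classification of predicates given above and answers simple meta-queries over it. The attribute and relation names are those quoted in the text; the entity list beyond ‘window’ and ‘map’, and the function names `meta_query` and `classify`, are assumptions made for this sketch and are not taken from Fig. 6.

```python
# Predicates classified as in the text.
ATTRIBUTES = {"position", "background-colour"}
RELATIONS = {"contains", "displayed-in", "refers-to"}
# Assumed entity names; only 'window' and 'map' are named explicitly in the text.
ENTITIES = {"window", "map", "legend", "scale-bar"}

VOCABULARY = {
    "entity": ENTITIES,
    "attribute": ATTRIBUTES,
    "relation": RELATIONS,
}


def meta_query(kind):
    """Answer a higher-order query such as attribute(?x): list all names of that kind."""
    if kind not in VOCABULARY:
        raise ValueError(f"unknown higher-order predicate: {kind}")
    return sorted(VOCABULARY[kind])


def classify(name):
    """Return 'entity', 'attribute' or 'relation' for a known name, else None."""
    for kind, names in VOCABULARY.items():
        if name in names:
            return kind
    return None


print(meta_query("relation"))
print(classify("position"))
```

A goal editor could call `meta_query` to show users which predicates are available before they write any first-order goal.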
This initiative provides the basis for a certain amount of guidance if one then postulates a meta-query facility; however, the problems of mastery of logic remain, and a new problem of meta-query construction arises. Graphical interfaces, alternatively, provide the general features of revealing domain ontologies and reducing problems of syntax in the interaction. An example of this approach for the GIS domain is shown in Fig. 7. [Fig. 7: A form-filling interface for specifying goals to an interactive planner.] Not unexpectedly, this style of interface resembles a form-filling interface to a relational database. In the spirit of deductive databases, details of whether the plan is being ‘retrieved’ or ‘derived’ are suppressed. One design issue which is not immediately apparent from the static example of Fig. 7 is that of dialogue control, which is context-sensitive in more than one respect. First, the data model specifies certain constraints, e.g. that maps but not data can be displayed. This knowledge is used to cause appropriate forms to be displayed, based upon prior selections. Secondly, it is occasionally desirable to impose an order of field filling upon the user, which is achieved using field disabling techniques. These dynamics have at present been achieved simply by writing procedural graphics code, without any prior specification. It is recognised that standard UIMS practice is to construct an executable specification of interaction-object behaviour, and it is intended to investigate planning formalisms for this purpose. Somewhat curiously, one other example of a form-filling interface to a planner [10] appears to be based upon neither an explicit data model nor typed predicates. It is claimed that the form-filling approach overcomes users’ discomfort with logic. 
More precisely, such an approach may be expected to reduce problems of syntax, but the ability of graphics to facilitate a grasp of the semantics of logic is considered in this paper to remain an empirical question. One potentially troublesome feature is the requirement for the user specifically to employ both existentially and universally quantified identifiers. In the relational database field, SQL at the time of writing supports existential but not universal quantifiers, whereas the graphical approach of QBE does not support quantifiers at all. The form-filling interface may be criticised for its linguistic nature, which contrasts with the graphical nature of the ER diagram on which it is based. One progression is to propose that entities in the goal have an iconic representation. For example, if the user wishes to delete data file ‘F’ or close window ‘W’, then conventional graphical interface techniques allow one to establish a relationship between icons and their referents. The planning situation, however, is complicated by the requirement to accommodate quantifiers (e.g. ‘I would like to see a map of some/every data file’). This requires some graphical representation of both anonymous entities and sets; an example of the latter being the palettes employed within interactive drawing packages. A further design issue is the graphical specification of relations and attributes. Conveniently, some of the predicates in the example GIS domain (position, contains, displayed-in), by virtue of their spatial connotations, may be readily defined by drawing. For example, a map icon may be dragged inside a window icon in order to convey that the former is ‘displayed-in’ the latter. Negated predicates, alternatively, are challenging to represent graphically. 
It is ironic that, if this notion of graphical goal specification could be carried to its extreme, then the user interface would resemble an advanced direct manipulation interface to a GIS, albeit augmented with quantifiers and negation. Such an interface must depart further from conventional direct manipulation, however, by being insensitive to the sequence of operations. For example, it must be legitimate to drag a scale-bar followed by a map into some viewing area representation (in order that the planner can infer how to display both these entities), whereas in the actual application the scale-bar would become occluded by this sequence. In other words, an iconic planner interface must support a form of visual, automatic programming, whereas conventional direct manipulation effectively requires the user to perform all programming manually, in a step-by-step fashion. 3.1 Implementation issues The work reported in Section 3 was intended to demonstrate two concepts: (i) contemporary planners possess sufficient expressiveness to support significant tasks within a GIS domain; and (ii) enhancements may be made to the programmer’s interface such that at least satisfactory user interaction becomes feasible. In itself, this demonstration distinguishes this work from most previous reports of interactive planners, such as [11]. However, a variety of further practical considerations must be addressed before contemplating putting this system into production. Performance is one major concern; the less the system responds in real time, the less its suitability as a UIMS component, and the more its potential status becomes relegated to that of on-line help. Other considerations include the feasibility of interfacing the planner to an application, and the software development effort required. 3.1.1 Performance The work reported in this paper employs a restricted, although intentionally challenging, subset of GIS operators. 
A complete GIS might involve 300 operators, and so scalability is obviously an issue. Regressive planners scale up well provided that the application commands tend to have unique effects, suggesting that performance degradation may be as much a function of the compiler as it is of the planning algorithm. The usual assumption made in planning is that the shortest plan (found by breadth-first or best-first search) is of most interest. However, if one postulates that the user may wish to inspect a range of alternative plans, possibly with some associated explanation, then both performance and completeness considerations become even more crucial. Theoretically, a major influence on planner performance is the average branching factor in a domain [2], which broadly corresponds to the number of alternative actions which must be considered at any choice point. Less formally, an ‘ideal’ domain is one in which all actions have unique effects, and no action negates any of the preconditions of other actions. One distinctive feature of this GIS domain appears to be operator complexity, with a rule-of-thumb being that increases in the average number of effects per operator increase the probability of operator inter-dependencies. Apart from the domain itself, a second influence on performance is the type of queries which are posed of that domain. For example, quantifiers in the goal statement tend to increase solution times. As a crude generalisation, our experience is that plans of three steps are synthesised in subjective real time on a Unix workstation. The seven-step plan of Fig. 2 is returned in 2–3 s, as something of an extreme example (although some planning failures may take as long to report). In an image processing domain, it has been indicated that plan lengths of 10 steps may be typical, and that reliance on both domain-dependent search heuristics and preexisting plan libraries is required [4]. 
Without resorting to these measures, other options are available for improving performance: The employment of search heuristics which supplement those of Ucpop, but yet which need not be considered domain-specific, e.g. work on the hardest/easiest goals first, avoid considering action sequences which ‘undo’ each other, use fewest operators, distinguish between ‘primary’ and ‘incidental’ effects. These may be regarded as metaplanning heuristics. Provided they weight choices rather than prohibit avenues of search, completeness is retained. It should also be noted that latitude exists for improved planning algorithms; in particular, the possibility of extending the least commitment approach to incorporate typed operators. By reasoning with classes rather than instances of operators, a planner ought to be able to gain performance in the same way that Ucpop does by reasoning with classes rather than instances of the arguments of those operators. Existing work into typed operators has not had direct performance concerns [12,13]. It may be shown that typed operators depend upon an object taxonomy [9], also a research frontier for planners, which incidentally reinforces the comments about the desirability of data models which were made in the context of user interface construction. One practical means of performance enhancement which has tended to be neglected in the theoretical literature is goal analysis and subsequent modification. Conceptually, a dialogue may be envisaged in which an interactive user is invited to assent to modification of the goal statement, e.g. by the binding of existentially quantified variables, or by the deletion of certain predicates. This initiative raises the dilemma, however, of some analytic process being able to forecast the termination of the planning process, which is theoretically impossible. The most realistic compromise is that the analyser might employ some comparatively naive heuristics, e.g. 
by ascertaining how many different actions will be required for plan solution, and investigating in a preliminary fashion the degree of independence of those actions. It may be noted that the employment of a specific data model supports this spirit of goal pre-processing, in that the model is used to prevent the construction of goal statements which violate designated constraints. The end result is that fewer plan failures occur at run-time. 3.1.2 Application interface One standard assumption of many research planners is that exogenous events do not cause the state of the environment to change, i.e. the world is assumed to be closed. In the case of a single-user application, it is also reasonable to assume that the operating system prevents exogenous users from changing the state of the file system, the graphics display, etc. However, the execution of any plan needs to be followed by a process in which the planner updates its notion of the current application state, as it is unsafe simply to rely on inference for this information. Therefore, the application must provide commands which return state information, in addition to commands which effect state changes. The planner may then reason about how it can obtain state information, alongside reasoning about how it can achieve target goal states. In the GIS domain, the implementation of these principles has proved to be problematic, as the application supplies more facilities for altering its state than it supplies for verifying its state. Regarding the planner as a software robot, it could be said that any artificial entity which interacts with the application is hampered by an imbalance between effectors and sensors, reflecting once again a legacy of imperative applications. One solution is to supplement the application with more state-interrogation routines, but at the cost of some extensive low-level programming. 
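The distinction between effectors and sensors drawn above can be sketched as follows. This is a hypothetical Python illustration only: `FakeGIS`, its `run` and `query` methods, and the fact names are invented for the example and do not correspond to any real GIS interface. The simulated application silently ignores one command, so the state inferred by the planner and the state actually observed diverge.

```python
class FakeGIS:
    """Hypothetical stand-in for an application with effectors and one sensor."""

    def __init__(self):
        self._state = {"window_blank"}

    def run(self, command):
        """Effector: change application state (no result is reported back)."""
        if command == "draw_map":
            self._state = (self._state - {"window_blank"}) | {"map_shown"}
        elif command == "draw_legend":
            pass  # simulated silent failure: the legend is never drawn

    def query(self, fact):
        """Sensor: report whether a single fact currently holds."""
        return fact in self._state


def execute_and_verify(app, plan_steps, predicted_facts):
    """Run the plan, then return the predicted facts the sensors cannot confirm."""
    for command in plan_steps:
        app.run(command)
    confirmed = {fact for fact in predicted_facts if app.query(fact)}
    return set(predicted_facts) - confirmed


unconfirmed = execute_and_verify(
    FakeGIS(), ["draw_map", "draw_legend"], {"map_shown", "legend_shown"}
)
print(unconfirmed)
```

A planner relying on inference alone would record both facts as true; the check above is only possible for facts that the application exposes through some state-interrogation routine.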
It would be ideal if the application could be reprogrammed to signal the planning system after every state change, and this strategy has in fact been adopted for a Unix domain [3]. Without indulging in such modifications, one less than satisfactory approach is to restrict user goals to those which may subsequently be verified by the planner. The discussion so far has assumed that the planner constitutes an intelligent front-end to an imperative application, and thus some perennial UIMS issues arise of how aware the application should be of its user interface, and whether application state is shared or replicated with the front-end [14]. However, it would be premature to conclude that planning is limited to such front-ending endeavours. For example, if developing an object-oriented application from the start, then it is possible to envisage a planner reasoning about object methods rather than about application commands. A further refinement is to address the problems which may occur if the application changes state between the time of planning and the time of execution. In that case, error recovery and replanning are required, generating advanced robotics issues such as how the planning system might become aware of execution errors, and whether it should replan partially or totally. 3.1.3 Software development effort Planners are knowledge-based systems, and so knowledge acquisition is a practical issue which should not be neglected. In contrast to rule-based expert systems construction, however, there are a number of advantages. The latter generally require the encoding of personal experience, which is elusive almost by definition, whereas planners involve the more rationalist enterprise of constructing accurate models of the ‘physics’ of some domain. The level of abstraction of those models is driven by an analysis of prospective goals. 
Some application knowledge may be expected to be found in user manuals and documentation, and thus planner development effectively involves explicating implicit cause-and-effect relationships. Specialised planner development environments are rare. In the case of Ucpop, code may be written using a Lisp-aware text editor and checked for syntactic and basic semantic conformity. A graphical debugger allows the developer to trace reasons for anomalous or failed plans at run-time. Greater scope certainly exists in the area of static analysis of the knowledge-base, for example, by inferring action categories [12], or by depicting networks of action dependencies [15]. 4 Planning and HCI specification 4.1 Issues in dynamics specification and control Planning essentially requires that a knowledge-base containing descriptions of operator (or action) semantics is wedded to a search engine in order to produce problem-solving behaviour. In HCI, action descriptions or representations are also of interest, given the general concern with specifying the dynamics within domains such as UIMS, CSCW and task analysis (TA). Discussions of HCI specification are typically not wide-ranging, and it is occasionally possible to detect the slightly myopic view that each of these domains has unique representation problems, and thus requires a unique formalism. This is not to deny that research has generated some useful, specific abstractions (one example being the notion of roles within CSCW), but rather to question whether the differences between these fields are as deep as is sometimes implied. A second observation which needs to be made at this point is that there is no unqualified enthusiasm for dynamics specification. One long-standing controversy within the UIMS field has been whether the employment of explicit dialogue control models leads to a rigid form of interaction, e.g. [16]. 
Frustration about the lack of user acceptance for systems based upon group work-flow models has existed within the CSCW field almost from its inception, e.g. [17]. TA has been suggested to be something of a HCI panacea but, more recently, reservations have arisen about the sophistication of systems derived from temporally ordered task networks, e.g. [18]. Whilst these problems have their individual features, a common theme also emerges: that specification tends to lead to inflexible systems. Responses to this problem range from the irrational (that specification should be abandoned in the hope that the implementors of the system will make satisfactory design decisions), to the naive (that problems of inflexibility will be solved by more rigorous analysis), to capitulation (that systems should simply possess modeless dynamics, even if analysis does suggest dependencies between actions). A more satisfactory response is that specifications should express constraints rather than hard-coded action sequences, although it could not be said that there is general appreciation of the implications of this view within the HCI field. One implication is that specifications are required to be more declarative, i.e. these should state relations which must be preserved. A second implication is that some constraint solver should be available for generating the dynamics at run-time, as opposed to the strategy of enumerating most of the dynamics at compile-time. The representation employed is obviously a large factor in the success of any constraint solver, and so it is preferable not to consider specification in isolation. In the UIMS dialogue modelling field, it is commonly accepted that event models are more powerful than context-free grammars and state transition networks [19], and this is reflected in the widespread adoption of specifications based upon process algebras. 
These support a relatively sophisticated form of dialogue description, but there is no sense in which that dialogue may be said to be generated in response to constraints. Such reasoning would seem to require at a minimum domain axioms referring to system state. Intuitively, the concept of constraint satisfaction may be seen to be related to the concept of context-sensitive dialogues, a feature potentially supported by rule-based models. It has been shown that a simple rule-based formalism employing propositions (rather than predicates) subsumes the expressiveness of event models [20]. Rule-based systems, however, have been criticised within AI for various reasons, including their lack of structure, and also because they encourage the encoding of a comparatively shallow association between situations and conclusions. Model-based reasoning is seen as a progression, in which deeper, physical knowledge is employed. Planners epitomise the model-based reasoning approach because of the causal relationship which is captured between preconditions and effects. The distinction between planners and some forms of rule-based systems, however, is not as clear as these observations might imply. The operator descriptions contained within planner knowledge bases may be reinterpreted as rules of the general form ‘if preconditions and action is chosen, then effects’. Model-based knowledge may therefore be regarded as a representation discipline which is imposed upon the rule-based tradition. Similarly, the model-based reasoning embodied in contemporary planning algorithms may be regarded as a development of the general-purpose reasoning supplied by theorem provers. That is, planners may be regarded as special-purpose inference engines, which accounts for the occasional attribution that regressive planners, for example, employ backward chaining. Causal action knowledge is also a feature of one UIMS which has influenced later systems considerably, namely UIDE [21]. 
Such model-based UIMS, however, do not employ a planning algorithm but instead reason in a projective fashion, i.e. given a sequence of one or more actions, the system computes the next state of the application (in contrast to planners, which find partially ordered paths between states). Projection algorithms are computationally unremarkable in comparison to planning, as these appear to be deterministic and do not involve search. (It is unclear whether parallel actions are supported, which potentially might require the system to resolve conflicts.) Projection does have the advantage of supporting the provision of advice about the consequences of executing nominated command sequences, and it may be anticipated that projection and planning tend to be reciprocal cognitive activities of the user (as illustrated respectively by two prototypical questions: ‘what if …?’ and ‘how can I …?’). Thus, an ideal UIMS would accommodate both forms of reasoning. Contemporary planners may be further distinguished from UIMS by their expressiveness, with the incorporation of negation, existential and universal quantification, and conditional effects frequently being considered necessary for modelling anything other than toy domains. As indicated previously, one major deficiency of planners is their general disregard of data models, although this paper demonstrates that a hybrid technique is straightforward. Contemporary UIMS take the additional step of employing object-oriented data models, and thus gain expressiveness in that area. 4.2 Relationship of Petri nets Causal knowledge is also an implicit feature of some formalisms which claim no direct heritage in knowledge-based systems. In general, techniques which model the semantics of state transitions, such as high-level PNs, fit into this category. An early comparative review of UIMS formalisms which includes PNs is provided by [22]. 
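Both the projective reasoning of model-based UIMS and the state-transition semantics of high-level PNs rest on the same basic operation: applying the effects of an action to a state in which its preconditions hold. The following is a minimal sketch of projection under simplifying assumptions (propositional states, no conditional effects or quantification). The `Action` class, the `project` function and the GIS-flavoured action names are hypothetical illustrations, not taken from UIDE or from the planner discussed in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hypothetical action model: preconditions plus add/delete effects."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset


def project(state, actions):
    """Answer a 'what if ...?' question: compute the state that results from
    executing `actions` in order. Deterministic; no search is involved."""
    current = frozenset(state)
    for action in actions:
        missing = action.preconditions - current
        if missing:
            raise ValueError(f"{action.name}: unmet preconditions {sorted(missing)}")
        current = (current - action.delete_effects) | action.add_effects
    return current


# Illustrative actions only; the names are assumptions made for this sketch.
open_window = Action("open_window", frozenset(), frozenset({"window_open"}), frozenset())
display_map = Action(
    "display_map", frozenset({"window_open"}), frozenset({"map_displayed"}), frozenset()
)

result = project(set(), [open_window, display_map])
print(sorted(result))
```

Projection walks a nominated sequence once, whereas a planner must search for such a sequence, which is why the two are computationally so different.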
Discussion about PNs is complicated by the fact that the technique is highly fluid and thus provides great opportunity for individualistic extensions. Because of PN diversity, it may not be particularly meaningful to regard these as a formalism in their own right, but instead as a transition network which is augmented with both input and output information for each transition. A schematic example of one particular form of PN, a predicate/transition net, is shown in Fig. 8. This example takes some liberties with PN notation in order to facilitate comparisons with planning and model-based UIMS. Thus, instead of the conventional PN language of places/tokens/transitions, preconditions/effects/actions have been substituted. Fig. 8 Schematic diagram of a predicate/transition net. It may be noted that some applications of PNs within HCI have tended not to exploit their full power. For example, a net containing no choice of either preconditions or effects is effectively a finite state machine. A second common restricted variation is a marked graph, in which every predicate is the effect of one action only, and is itself the precondition of one action only [23]. It is customary for authors to emphasise that PNs explicate the possibility of parallel actions, whereas only deterministic marked graphs do this unambiguously (in other cases, opportunities for parallelism need to be inferred). The casual reader could sometimes gain the impression that PNs are little more than a graphical process description, although PNs describe transition possibilities rather than actual sequences, and the expressive capability of a fully featured predicate/transition net is actually greater. The representation of Fig. 
8 is akin to that employed by either planners or existing model-based UIMS, with the main difference being that the unit of representation is not the individual action but instead a network of actions related by their inter-dependencies. PNs may thus be regarded as the visible output of a dependency analysis of some action knowledge-base, and have the advantage of making control flow more explicit to the reader. However, any execution of the net is most likely based upon a deconstructed store of actions, with their inter-relationships being computed as the need arises (as indeed occurs during planning). For example, in [15] an algorithm is presented which effectively involves joining the ‘nets’ representing individual actions on the basis of common places. These observations provoke the issue of why modellers should be burdened with constructing PNs manually, as is current practice, and whether PNs could be generated automatically from some planning-like knowledge-base. It may also be speculated that an ideal system would provide the modeller with graphical editing facilities for the knowledge-base, suggesting that PNs could also mediate user input. Fig. 9 summarises this discussion regarding the inter-relationship between existing model-based UIMS, planners, and PNs. It may be seen that all three areas share a similar style of knowledge representation (at least implicitly), but are distinguished by the algorithms which exploit that representation. Fig. 9 The inter-relationship between existing model-based UIMS, planners, and Petri nets. In order to illustrate the commonalities between planners and PNs, it was originally intended to represent the GIS domain of this paper in PN form. However, expressiveness problems instantly arose when attempting to represent the semantics of the commands of Fig. 
2, at least when using conventional PN notation. First, there is the problem of conditional effects (for example, the effects of the ‘d.rast’ command of Fig. 5 are different depending upon whether any other maps are already on display). Admittedly, so-called PN ‘emission rules’ address this issue [24], but there has been no indication of such expressive nets being executable. A second expressiveness issue is that of universal quantification; for example, one of the effects of the ‘d.rast’ command of Fig. 5 is that all previous contents of the window are now not displayed, as a means of object destruction. Once again, so-called ‘emptying arcs’ have been proposed in response [25], but with executability remaining uncertain. In conclusion, if PNs were to remain strictly a representational technique, then these are adequate to the task of modelling the GIS domain, but with two reservations. First, modellers familiar with predicate logic could be excused for finding the language of PNs (places, tokens, inhibitor arcs, etc.) to be arcane or superfluous at best, and ontologically ambiguous at worst. Second, it is precisely the executability of PNs which is frequently promoted, yet PN execution algorithms appear to lag developments in specification. The suggestion that the PN technique comes accompanied with specialised execution algorithms is in fact slightly misleading. For example, one frequently cited capability is ‘reachability analysis’, yet this broadly corresponds to the planning task of finding a sequence of actions which will transform the current state to some target state. At its most straightforward, reachability analysis may be regarded as a graph search problem which consequently yields to existing planning techniques, e.g. [26]. Progressive planning has typically been employed and, as indicated previously, this approach is generally not considered to scale up well. 
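The correspondence between reachability analysis and progressive (forward) planning can be made concrete with a small sketch. It assumes a heavily simplified net: states are sets of propositions (so places hold at most one token), and there are no conditional effects, quantification or inhibitor arcs. The `Transition` type, the `reachable_plan` function and the transition names are hypothetical and are not drawn from [26] or from the GIS command set.

```python
from collections import deque
from typing import NamedTuple


class Transition(NamedTuple):
    """A hypothetical transition: input places (pre) and output changes."""
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset


def reachable_plan(initial, goal, transitions):
    """Breadth-first forward search from `initial` for a state containing `goal`.
    Returns a list of transition names, or None if the goal is unreachable."""
    start, goal = frozenset(initial), frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:
            return path
        for t in transitions:
            if t.pre <= state:  # transition is enabled in this state
                nxt = (state - t.delete) | t.add
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, path + [t.name]))
    return None


# Illustrative transitions only; names are assumptions made for this sketch.
transitions = [
    Transition("start_monitor", frozenset(), frozenset({"monitor_on"}), frozenset()),
    Transition("display_raster", frozenset({"monitor_on"}), frozenset({"raster_shown"}), frozenset()),
    Transition("display_legend", frozenset({"raster_shown"}), frozenset({"legend_shown"}), frozenset()),
]

plan = reachable_plan(set(), {"legend_shown"}, transitions)
print(plan)
```

Because every enabled transition is expanded in every visited state, the number of states explored can grow exponentially with the number of propositions, which is one way of seeing why forward search of this kind scales poorly.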
Isolated instances of regressive planning (known as ‘backward reachability’ in PN parlance) have been reported [15,27]. One unique contribution of PN research is the use of matrix algebra to generate reachability solutions; a potentially exciting feature given the performance problems which plague heuristic search. However, one of the reasons that algebraic techniques work well is that these are generally only applied to restricted types of nets, admitting to deterministic solutions [23]. For more complex nets, it appears that the graph search problem may be disguised in various ways, but ultimately not eliminated. 4.3 Specification of user tasks The discussion so far has had a UIMS dialogue flavour although, as indicated previously, principles of dynamics modelling are of more general relevance. The TA field exhibits less formal diversity, partly because of an entrenched view that TA should involve task decomposition and sequence description, e.g. [28]. This approach has the unfortunate effect of resulting in a comparatively static task network, which has negative implications for the sophistication of any user-computer dialogues, advice-giving systems, etc., which might be derived from that network. This restricted view of what constitutes ‘task analysis’ also tends to neglect that, firstly, TA could involve knowledge acquisition and, secondly, that high-level cognitive simulations (i.e. those unconcerned with the micro-architecture of cognition) typically involve some task representation which is necessarily executable. If a broader focus is adopted, then many expert systems may also justifiably be regarded as executable TA, typically employing a rule-based formalism. Isolated examples of more declarative or constraint-oriented approaches to TA exist. One of the original examples of a cognitive simulator, GPS [29], also happens to be one of the original examples of a planner, with a more contemporary incarnation in Ref. [30]. 
ETKS [31] employs a formalism based upon task actions, preconditions and effects, but, oddly, also employs a hard-coded task network. PNs are also claimed to have been employed towards TA [32] (although the task appears to have no cognitive component). One research issue which arises at this point is the readiness with which the ‘physics’ of user tasks, particularly cognitive tasks, may be described in the model-based tradition. As possible evidence of difficulty, some models which are said to derive from either a cognitive simulation or task analytic perspective in practice are barely distinguishable from lower-level device models, e.g. [30,32]. On the occasions when this anomaly is acknowledged, the usual justification is that experienced users are expected to possess faithful mental models of cause-and-effect within the application or device with which they are interacting. This lack of discrimination between user and application models is undesirable in those cases where TA is being used to enhance some application. Referring back to the example goal which has been used throughout this paper, GIS users typically do not wish to display maps, etc., for idle reasons. Instead, they may have higher-level goals, such as planning routes, or deciding upon regional zoning policies. The existing planner cannot support those goals directly because the ‘awareness’ of the application is limited to files, maps, legends, etc. If it is wished to provide support for higher-level goals like route planning, then the application needs to be augmented so that, firstly, it contains higher-level data types such as routes and, secondly, provides higher-level commands (or methods, in an object-oriented application) such as ‘compare routes’ which operate on those data types. This approach requires that the user’s conceptual world may be modelled independently of the application’s world. 
5 Conclusion This paper has demonstrated that contemporary planners are sufficiently expressive that it is feasible to build intelligent interfaces which support some significant user tasks within a GIS domain. A broad view of these developments suggests that more is involved than just the provision of intelligence: paradigms of user interaction may be enabled to evolve from an imperative towards a more declarative style. The advent of interactive planners raises design issues of goal description techniques, and some alternatives have been examined in this paper. It was found that the user interface to planners cannot be constructed in a methodical fashion without access to an explicit data model of the domain; something lacking in existing planners. The performance of contemporary planners has been shown to be encouraging for these to mediate the user-application interaction in a UIMS fashion, although further research is required into both performance enhancement and interactive facilities. The advent of interactive planners raises concerns about an imbalance in conventional application command sets; between commands for effecting state changes, and those for verifying current state. Constraint satisfaction techniques have been proposed as a general approach for solving the problem of inflexible system dynamics, and planners have been shown to support that approach. Planning representations have been analysed in relation to HCI specification practices, with the conclusion that many model-based formalisms could usefully exploit either the expressiveness of planners, or the dynamic run-time control which planning algorithms provide.
References
[1] E. Pednault, Synthesizing plans that contain actions with context-dependent effects, Computational Intelligence 4 (4) (1988) 356–372.
[2] D.S. Weld, An introduction to least commitment planning, AI Magazine 15 (4) (1994) 27–61.
[3] O. Etzioni et al., OS agents: using AI techniques in the operating system environment, University of Washington, Seattle, WA, Technical Report 93-04-04, 3 August 1994 (currently available at ftp.cs.washington.edu:/pub/etzioni/os-agents.ps.Z).
[4] S.A. Chien, Using AI planning techniques to automatically generate image processing procedures, in: K. Hammond (Ed.), Proceedings of the Second International Conference on Artificial Intelligence Planning Systems, AIPS-2, Chicago, IL, AAAI Press, Menlo Park, CA, 1994, p. 219.
[5] C.A. Knoblock, Planning, executing, sensing and replanning for information gathering, in: Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI-95), Montreal, Morgan Kaufmann, Los Altos, CA, 1995, p. 1686.
[6] G. Fischer, The importance of models in making complex systems comprehensible, in: M.J. Tauber, D. Ackermann (Eds.), Mental Models and Human–Computer Interaction 2, North-Holland, Oxford, 1991.
[7] US Army Construction Engineering Research Laboratory-CERL, Grass 4.1, 1993.
[8] M.J. Egenhofer, J.R. Richards, Exploratory access to geographic data based on the map-overlay metaphor, Journal of Visual Languages and Computing 4 (1993) 105–125.
[9] J.D. Tenenberg, Abstraction in planning, in: J.F. Allen, H.A. Kautz, R.N. Pelavin, J.D. Tenenberg, Reasoning About Plans, Morgan Kaufmann, San Mateo, 1991, pp. 213–283.
[10] O. Etzioni, D. Weld, A softbot-based interface to the Internet, Communications of the ACM 37 (7) (1994) 72–76.
[11] H. Senay et al., Planning for automatic help generation, in: G. Cockton et al. (Eds.), Engineering for Human–Computer Interaction (EHCI ’89), Elsevier, Amsterdam, 1990, pp. 293–311.
[12] J.S. Anderson, A.M. Farley, Plan abstraction based on operator generalization, in: Proceedings of Seventh National Conference on Artificial Intelligence (AAAI ’88), St. Paul, MN, Vol. 2, Morgan Kaufmann, Palo Alto, CA, 1988, p. 100.
[13] M. Kramer, C. Unger, A generalizing operator abstraction, in: C. Bäckström, E. Sandewall (Eds.), Current Trends in AI Planning, IOS Press, Amsterdam, 1994, pp. 185–198.
[14] J.R. Dance et al., The run-time structure of UIMS-supported applications, Computer Graphics 21 (2) (1987) 97–101.
[15] T. Murata, P.C. Nelson, A predicate-transition net model for multiple agent planning, Information Sciences 57–58 (1991) 361–384.
[16] R. Took, Putting design into practice: formal specification and the user interface, in: M. Harrison, H. Thimbleby (Eds.), Formal Methods in Human–Computer Interaction, Cambridge University Press, Cambridge, 1990, pp. 63–96.
[17] G. Fitzpatrick, J. Welsh, Process support: inflexible imposition or chaotic composition, in: S. Howard, Y.K. Leung (Eds.), Proceedings of OZCHI 94, Melbourne, November 1994, CHISIG of Ergonomics Society of Australia, 1994, pp. 147–152.
[18] C.V. Copas, E.A. Edmonds, Executable task analysis: integration issues, in: G. Cockton, S.W. Draper, G.R.S. Weir (Eds.), People and Computers IX (HCI ’94), Cambridge University Press, Cambridge, 1994, pp. 339–352.
[19] M. Green, A survey of three dialog models, ACM Transactions on Graphics July (1986) 244–275.
[20] D.R. Olsen, Propositional production systems for dialog description, in: J. Carrasco, J. Whiteside (Eds.), Proceedings of the Conference on Human Factors in Computing Systems (CHI ’90), Seattle, WA, ACM Press, New York, 1990, p. 57.
[21] P.N. Sukaviriya et al., A second generation user interface design environment: the model and the runtime architecture, in: S. Ashlund, K. Mullet, A. Henderson, E. Hollnagel, E. White et al. (Eds.), Proceedings of Interchi ’93, Amsterdam, ACM Press, New York, 1993, p. 375.
[22] G. Cockton, Interaction ergonomics, control and separation: open problems in user interface management, Information and Software Technology 29 (4) (1987) 176–191.
[23] T. Murata, Petri nets: properties, analysis and applications, Proceedings of the IEEE 77 (4) (1989) 541–580.
[24] P.A. Palanque, R. Bastide, Petri Net based design of user-driven interfaces using the interactive cooperative objects formalism, in: Proceedings of the First Eurographics Conference on Design, Specification and Verification of Interactive Systems (DSV-IS ’94), Bocca di Magra, Italy, Springer, Berlin, 1994.
[25] P.A. Palanque, R. Bastide, Formal specification and verification of CSCW using the interactive cooperative object formalism, in: M.A.R. Kirby, A.J. Dix, J.E. Finlay (Eds.), People and Computers X (HCI ’95), Cambridge University Press, Cambridge, 1995, pp. 213–232.
[26] D. Zhang, ROPES: a tool for generating robot plans, in: Proceedings of 16th Annual Conference of IEEE Industrial Electronics Society (IECON ’90), Vol. 1 (Pacific Grove, CA, November 1990), IEEE, Los Alamitos, CA, 1990, pp. 210–215.
[27] C. Anglano, L. Portinale, B-W analysis: a backward reachability analysis for diagnostic problem solving suitable to parallel implementation, in: R. Valette (Ed.), Proceedings of the 15th International Conference on Application and Theory of Petri Nets, Zaragoza, Spain, Springer, Berlin, 1994, p. 39.
[28] H.R. Hartson et al., The user action notation: a user-oriented representation for direct manipulation interfaces, ACM Transactions on Information Systems 8 (3) (1990) 181–203.
[29] A. Newell, H.A. Simon, Human Problem Solving, Prentice-Hall, Englewood-Cliffs, 1972.
[30] A. Blandford, R.M. Young, Developing runnable user models: separating the problem solving techniques from the domain knowledge, in: J.L. Alty, D. Diaper, S. Guest (Eds.), People and Computers VIII (HCI ’93), Cambridge University Press, Cambridge, 1993, pp. 111–121.
[31] J. Borkoles, P. Johnson, ETKS: generative task modelling in user interface design, in: B.D. Shriver (Ed.), Proceedings of Hawaii International Conference on System Sciences, Kailua-Kona, Vol. 2, IEEE Computer Society Press, New York, 1992, p. 699.
[32] P.A. Palanque et al., Validating interactive system design through the verification of formal task and system models, in: L. Bass, C. Unger et al. (Eds.), Proceedings of IFIP Workshop on Engineering for Human–Computer Interaction (EHCI ’95), Grand Targhee, WY, Chapman and Hall, London, 1995.
© 2000 Elsevier B.V. All rights reserved.
TI - Intelligent interfaces through interactive planners
JF - Interacting with Computers
DO - 10.1016/S0953-5438(99)00007-7
DA - 2000-07-01
UR - https://www.deepdyve.com/lp/oxford-university-press/intelligent-interfaces-through-interactive-planners-0HRsVd2Odi
SP - 545
EP - 564
VL - 12
IS - 6
DP - DeepDyve
ER -