Mar 02 2005
Natural Interfaces: human centric interfaces for the post-mouse era (UK)

In 1988 Apple Computer - famous for its Macintosh - produced some video scenarios to show how future computers would be able to understand hand gestures, read text, and respond to voice commands. None of these computers had any trace of a mouse!
(short-talk @ TorVergata University 12/10/04 Rome)
Translated by Riccardo Cambiassi
Introduction
In the video scenarios that Apple produced between 1987 and 1989, one thing struck me the most: the company that taught the world what a mouse is thought that the same mouse would have no place in future computers.
So, the Knowledge Navigator - probably the first tablet PC in history - was switched on by a gesture (by physically opening it), was able to understand and answer through a voice interface, and was equipped with a touch screen.

Just after that, in the Future-Shock video-scenario, the computer was a simple transparent screen (somewhat similar to the latest iMac designed by Ive) with gesture and voice recognition.

In this mouseless revolution, the human user at last communicates in a natural way and not through peripherals and input devices built by/for computers. This total change of interaction metaphor would be useless if it were not also an enabler, “making it possible to do different things than before”.
Here’s where the concept of the digital assistant - a kind of secretary to whom we delegate the management of all our digital data - takes a central position in Apple’s vision.
We have been waiting ever since.
Video-scenarios on the web (they are very slow and only available for download):
- Knowledge Navigator (14.1 MB)
- Future Shock (45 MB)
The first quality of an assistant is to SEARCH
Nowadays, on the web, there is only one true King, and this is GOOGLE.
This was somewhat foreseeable, since fast access to and organization of knowledge - the first step toward the emergence of “intelligent” behaviour - depends on our ability to retrieve information.
The ability to SEARCH is therefore the first stage of the problem of handling knowledge in the web era. The point is that the input devices and the way we are used to interacting with computers are neither voice nor gesture recognition, but mouse and windows.
Well, managing the growing complexity of the information stored in our PCs and on the Net with such primitive tools is becoming a PROBLEM, so embedding a search engine into existing operating systems will be the next challenge (once again led by Apple).
SpotLight
The new system-wide search tool featured in Tiger (Mac OS X 10.4) helps users find everything they are looking for on their Mac. SpotLight can find e-mail messages, calendars and contacts, data, movies, images… basically every kind of document.
SpotLight’s results are shown in intuitive categories that help the user browse, choose and eventually click on the desired information.

While looking at screenshots and presentations and testing SpotLight, I had mixed feelings. On one hand, I said: “at last!”. At last I’ll be able to ignore WHERE I save my files and WHAT I call them (a computer-centric vision) and just worry about their content or when I created them (a slightly more human-centric vision).
On the other hand I also thought: “How complicated!”.
Yes, because SpotLight, though simple to use, still relies heavily on menus, search boxes, pop-ups, etc.: all those components of the interface that I define as “low-level interactors”, and that are thus more complex.
So, from my perspective, SpotLight is a blessing, and its limit is actually the same limit of all graphical interfaces based on the desktop metaphor.
- SpotLight on the Apple website.
Note: Maybe you think that desktop search could be done in other ways and that - maybe - it’s just SpotLight that has not been properly designed. Well, try to find out IF and HOW the successor to Windows XP will try to solve this problem, and you’ll find that SpotLight is already better than what Redmond promises.
On the other hand, the ultimate solution could come from a company that doesn’t know much about operating systems, but knows more than anybody else about search engines: Google.
Google DESKTOP SEARCH is software that lets users bring to their PC the technology Google uses to search websites.

Google Desktop Search lets users search content created with mainstream applications (for now ONLY Microsoft ones) and puts Google in direct competition with Microsoft. At the same time, Bill Gates & co. are trying to undermine Google’s supremacy in the search engine business…
The search is presented through a web interface in the default system browser. The Google server uses the local address 127.0.0.1 and port 4664 to do its search business on the machine. From Microsoft’s point of view, it is a fearsome parasite lurking in the very heart of its operating system. …thus the war that could lead to the first digital assistant has officially begun.
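To make the idea of a search service bound to a local address more concrete, here is a minimal sketch in Python. It is not Google's implementation: the in-memory `LOCAL_INDEX`, the `search` function, the `q` query parameter and the `make_server` helper are all assumptions made up for illustration. Only the loopback address 127.0.0.1 and port 4664 come from the text above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical in-memory "index" standing in for documents found on the machine.
LOCAL_INDEX = {
    "report.doc": "quarterly sales report",
    "notes.txt": "meeting notes about natural interfaces",
}


def search(query):
    """Return the names of indexed files whose text contains the query."""
    q = query.lower()
    return sorted(name for name, text in LOCAL_INDEX.items() if q in text.lower())


class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read the (assumed) "q" parameter, e.g. /search?q=notes
        params = parse_qs(urlparse(self.path).query)
        query = params.get("q", [""])[0]
        body = "\n".join(search(query)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=4664):
    # Binding to 127.0.0.1 keeps the service reachable only from this machine.
    return HTTPServer(("127.0.0.1", port), SearchHandler)

# To try it: make_server().serve_forever(), then open
# http://127.0.0.1:4664/search?q=notes in the default browser.
```

The point of the design is that the browser becomes the interface, while the data never leaves the machine because the server listens only on the loopback address.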
Despite the remarkable promise of SpotLight, Google Desktop Search, etc. (also in terms of ease of use)… I do not believe that this was the “future” as imagined by the PARC researchers when they invented graphical interfaces, or by the Macintosh team that perfected them and made them useful to people. This stuff is still too complex. It runs up against the inherent limits of the desktop metaphor and is held captive by a jailer input device called the MOUSE!
The three pillars of interfaces & natural interfaces
The thesis of this section is the following: the mouse not only determines the relationship between man and machine, but heavily influences the application level and - more generally - the development of computer science itself. At the same time, starting from the public interfaces built by Italian engineer Alessandro Valli, I’ll show how it is possible to forget about the mouse and live a happy life.
A personal (and thus debatable) re-reading of the Human Computer Interaction literature convinced me that an interface is made up of “n” different factors, determined by the “machine”, the person, the context, the role, etc.: the holistic approach of Interaction Design. …
But if I had to talk about the interface in terms of what I design every day, then I would define the interface (as suggested by this article by the excellent Paolo Salone) as a system based on three main pillars:
- Input Device (or the tool through which our commands intended for the machine are practically mediated)
- Interaction Technique (or those basic operating procedures that let me - through the input device - make specific operations)
- Task (the ultimate goal of the action that the user can reach through the use of the machine)
First of all, a complex task is determined by the sum of the different actions made possible by the available interaction techniques, and since these depend directly on the input device, the device is the measure - for better and for worse - of what it is possible to achieve with a given technology. And actually, beyond any academic conjecture, what people perceive of any given computer-related technology is - before anything else - its interface.
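The dependency between the three pillars can be sketched in a few lines of Python. This is only an illustrative model under my own assumptions: the class names, the device capabilities and the example tasks are hypothetical, not taken from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputDevice:
    name: str
    techniques: frozenset  # interaction techniques the device makes possible


@dataclass(frozen=True)
class Task:
    name: str
    required: frozenset  # interaction techniques the task is built from


def achievable(device, tasks):
    """A task is reachable only if the device supports every technique it needs."""
    return [t.name for t in tasks if t.required <= device.techniques]


# Hypothetical devices and tasks, chosen only to illustrate the dependency.
mouse = InputDevice("mouse", frozenset({"point", "click", "drag"}))
gaze_voice = InputDevice("eye tracker + voice", frozenset({"look", "speak"}))

tasks = [
    Task("open a file", frozenset({"point", "click"})),
    Task("ask for a document", frozenset({"look", "speak"})),
]
```

Calling `achievable(mouse, tasks)` and `achievable(gaze_voice, tasks)` returns different lists: changing the input device changes which tasks are possible at all.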

Can we say that the mouse is the hallmark of the “present day” computer, just as natural interaction will be for future computers?
Well, this is more or less my thesis. Can you imagine Spock and Kirk playing with a mouse while travelling on the Enterprise? And HAL with a mouse? Blade Runner with a mouse? And what about the Millennium Falcon? Let’s face it: there is no model of the future into which we’d like to bring the mouse with us. …maybe it’s a matter for shrinks, but that’s how it is.
a human centric and usable technology. We got used to the mouse as time went by, but the gestures and the eye-hand coordination that it forces us to develop cannot be called “natural” (to the point that, for older generations, the real difficulty in using a PC comes from the lack of proper coordination). The “good” side of the mouse is that it is a standard.
Its invaluable contribution has been to carry mankind toward the digital age. But its work will eventually end where natural interfaces begin to work properly. So it’s time to start showing some real alternatives.
The term “natural interaction” has been present on the Net for a while, but it came to my attention mainly thanks to one specific man: Alessandro Valli. The team with which Alessandro works in Florence has often tackled the problem of public interfaces built in peculiar locations (such as museums), with the need to allow simultaneous, multiple access to the information offered by the system. The solutions applied rely on recognizing the position of the people who want to interact with the system, or their gestures. Those who encounter this kind of interface DO NOT need any training, since the system itself offers hints and - while other people are using it - triggers those emulation mechanisms so useful in learning.

These systems not only “humanize” the type of interaction needed from people to enjoy the information on offer, but humanize the systems themselves, often giving them both a more human-like appearance and behaviour. See some of the works of Alessandro Valli, then read this old article of mine: KILLER WIRELESS MUSEUM. And if you are thinking that only some contexts are unsuitable for traditional PCs and that - after all - there are no alternatives… well, keep on reading this article.
Eye commanded computers
The first thing that people had to be taught when the mouse was introduced was the concept of SELECTING an item or menu entry to interact with. This selection in advance was translated into an interaction technique called POINT and CLICK. That is: a specific item is selected by moving the POINTER over the command and, for example, highlighting it with a “simple” single CLICK (of the mouse button).
Well, the natural interfaces I’ll show in this section are based on the separation of point and click, and on the possibility of highlighting/preselecting an item present in the GUI by simply looking at it. Afterwards it is possible to trigger a specific action through a CLICK that can happen in a number of ways, such as a voice command, a special external button, a key on a conventional keyboard, etc.
If you think this is difficult, believe me when I say that it is less difficult than learning how to coordinate eye and hand to move the mouse pointer on the screen. In fact, wherever on the interface you want to go with the mouse, you first have to look at it.
Well, with these new interfaces “to look at it” is the same as “reach it and preselect it”. You just have to choose whether to “click” it or not.
As we already know from the “three pillars of interfaces”, changing the input device influences every aspect of the interface. For example, in all the prototypes and systems that we build to be eye-controlled, there is no pointer at all. No little arrows nor other graphical mediations. You look at something and it reacts!
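The separation of POINT and CLICK described above can be sketched as follows. This is a minimal Python illustration under my own assumptions, not the code of our prototypes: the `Item` and `GazeInterface` classes, the rectangular hit test and the screen coordinates are all hypothetical. The gaze coordinates are assumed to arrive from an eye tracker, and the trigger from any external source (key, button, voice).

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, gx, gy):
        """True if the gaze point falls inside this item's rectangle."""
        return (self.x <= gx < self.x + self.width
                and self.y <= gy < self.y + self.height)


class GazeInterface:
    def __init__(self, items):
        self.items = items
        self.preselected = None

    def on_gaze(self, gx, gy):
        """POINT: looking at an item preselects it; no pointer is drawn."""
        self.preselected = next(
            (item for item in self.items if item.contains(gx, gy)), None
        )
        return self.preselected

    def on_trigger(self):
        """CLICK: an external trigger activates whatever is preselected."""
        if self.preselected is None:
            return None
        return f"activated {self.preselected.name}"
```

A short usage example, with made-up items for a home-control screen:

```python
ui = GazeInterface([Item("lights", 0, 0, 100, 50), Item("heating", 0, 60, 100, 50)])
ui.on_gaze(10, 70)   # the user looks at "heating": it becomes preselected
ui.on_trigger()      # a key press or voice command then activates it
```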

IMAGE ABOVE - On the left: the joint use of a keyboard key and the gaze recreates the classic POINT and CLICK.
In the centre: the system monitors the eyes (if they are in its field of view) and understands WHAT the person is looking at on the screen, giving immediate feedback/preselection for every item that has been looked at.
On the right: …I’m testing the system keeping my hands under my jaw.
HERE BELOW: one of the first experimental interfaces for managing a whole house.
This project was built for the Fraunhofer Institute and aimed at people with minor movement disabilities. Thanks to later (independent) projects we were able to prototype highly usable systems for people with serious movement disabilities. Write to me for more info.

For now, these interfaces are based on quite expensive hardware: a Tobii Eye-Tracker. This DOES NOT allow this solution to scale immediately to the consumer market, but I think it’s extremely important, both for vertical markets and - more generally - to give a tangible push toward the research and creation of new, more natural interaction metaphors.
CONCLUSION: But who needs a new computer?
The never-ending change in ICT is bound to a game of opposing forces.
On one side are the HW/SW manufacturers, who HAVE TO sell newer and newer things, but who CANNOT take the risk of changing too much what has already been a success. In this world, marketing (or at least the B-grade kind) chooses the deadline first and the content afterwards. …Debugging and support will be done directly by the customers.
On the other side there are people. Some of them no longer have any illusions about constantly following “innovation”. Others are still fully seduced.
So, we find ourselves in a world where we can have optical fibre at home and - to write a letter - still use the “ever-present” WORD 11 (which, out of shame, has been renamed WORD 2004). That is - actually - the result of a progressive deterioration of WORD 4. And we write our letters on a QWERTY keyboard, whose layout was invented to prevent the type bars of old typewriters from jamming when typing too fast.
hit a dead end.
From my point of view, today we have the computing power and the technologies needed to start re-thinking the future of computers. Why not, maybe starting from those video-scenarios and digital assistants conceived almost twenty years ago by the geniuses at Apple Computer.
…of course, until some time ago, somebody might have made me think that “technology is not yet ready”, but now that is no longer true.
I’ve shown some tangible examples, and I can add more. A crew of eight people working with me at SR LABS in Milan took a simple desktop computer bought at CDC and integrated a Tobii eye tracker and a voice recognition system that you can get anywhere. Now that same computer responds to voice commands (with a predefined dictionary), and associates these commands with the VISUAL SELECTION that the user is making at that moment. Our system knows if somebody is in front of it, and reacts accordingly to actions indicated by the eyes and voice (e.g. it is able to do a voice-controlled copy/paste, where WHAT to copy and WHERE to paste it are indicated by the eyes).
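The way a spoken command is paired with the current visual selection can be sketched like this. It is a simplified Python illustration, not the SR LABS system: the `MultimodalClipboard` class, its two-word dictionary and the field names in the usage example are assumptions of mine. The recognized word and the name of the field being looked at are assumed to come from a voice recognizer and an eye tracker.

```python
class MultimodalClipboard:
    """Pairs each spoken command with whatever the user is looking at."""

    COMMANDS = {"copy", "paste"}  # the predefined dictionary (assumed)

    def __init__(self, fields):
        self.fields = dict(fields)  # field name -> text content
        self.clipboard = None

    def on_voice(self, command, gazed_field):
        """Apply a spoken command to the field currently under the user's gaze."""
        if command not in self.COMMANDS or gazed_field not in self.fields:
            return False  # unknown word, or the gaze is not on a known field
        if command == "copy":
            self.clipboard = self.fields[gazed_field]  # the eyes say WHAT to copy
        elif self.clipboard is not None:
            self.fields[gazed_field] = self.clipboard  # the eyes say WHERE to paste
        else:
            return False  # "paste" with nothing copied yet
        return True
```

For example, with two hypothetical fields:

```python
desk = MultimodalClipboard({"address": "Via Roma 1", "letter": ""})
desk.on_voice("copy", "address")   # said while looking at the address
desk.on_voice("paste", "letter")   # said while looking at the letter
```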
to create all this, what could the mythical California-based startups, or the 3000 researchers at Microsoft, do if only they worked in the right direction?
Which direction is that? Obviously the one leading to human-centric technologies and natural interfaces. The IT with a Human Face whose muse was the late Michael Dertouzos (here are some articles in Business 2.0), formerly a director at MIT and author of the book “The Unfinished Revolution”.
I know that in writing this I am jumping over a number of logical steps, and I don’t claim to be clear, but I am starting to think that the ultimate computer WILL NOT be a computer!
Let me add just one more note: all the interfaces and scenarios I’ve shown are MADE IN ITALY. Someone in high places, please wake up! The innovation trains keep on leaving.
- Leeander


Leandro Agrò - 10+ years of Design & Management
(short