J2B2 - Human and Robot Interaction


structuring of the software

We develop control software for the castor-wheeled differential drive robot J2B2. Our emphasis is on human and robot interaction in the common areas of the Automation Technology Laboratory of the Helsinki University of Technology.

The main behaviors of the robot are: following a person, autonomous navigation to any point in the laboratory, and mapping of colored objects on the floor. To switch between the states, a person presses the bumper of J2B2 on one of the four sides.

The robot has several means of feedback: appropriate lines are synthesized as speech at a pleasant rate. Apart from driving, the robot orients the pair of cameras toward the source of attention. The camera behaviour makes the robot appear cute and friendly.

The highlights of the implementation are localization, routing, and tracking of motion using the laser range finder. All algorithms run in real time, which is appropriate in a dynamic environment. We lay out the code structure in the graphic to the right. The number in brackets corresponds to the lines of code associated with the respective feature.

None of the laws of robotics by Isaac Asimov is rigorously obeyed by J2B2. The robot may injure infants and, through inaction, may allow a human being to come to harm.

J2B2 Encore - source code and resources (C using SDL) * j2b2_encore.zip 350 kB
Gestures Generator (MATLAB) orth_gestures.m 3 kB
* thanks to Fee Heitland and Matthias Elsdörfer for writing the lyrics of J2B2.
** dedicated to Andrew Ladd. j2b2GIMICtrl by Antti Maula, and SDL_gfx by Andreas Schiffler. The Festival Speech Synthesis System takes care of the audio feedback. Software developed for a 3 GHz Pentium 4.

The project supersedes the deliverables of the Field and Service Robotics Competition. Many thanks to Jari Saarinen, Antti Maula and Tapio Leppänen for maintaining the robot.

One way is to make it so simple
that there are obviously no deficiencies,
and the other way is to make it so complicated
that there are no obvious deficiencies.
Charles Antony Richard Hoare

Perception


J2B2 in the hallway of the laboratory

tracking of a walking person

utilization of the cameras

calibration of palette

camera images projected on ground

alphabet of nearly orthogonal gestures

recording a gesture

histogram equalization

Resampling of laser range finder data

The laser range finder SICK Laser Measurement Sensor 200 provides a distance profile of the reflective geometry in a 180° field of view at a refresh rate of 10 Hz. The output is accurate to about 1 cm, unless the obstacles are glass, or certain cloth that is too close to the sensor.

The following criterion allows us to assume that there is geometry between two consecutive sample points

Then, we resample each connected component using linear interpolation.
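The two steps can be illustrated with a minimal sketch. The connectivity criterion itself is not reproduced in the text above, so the sketch assumes a simple stand-in: two consecutive samples belong to the same component if they are closer than a hypothetical threshold `maxGap`. The names `Point`, `splitComponents`, and `resample` are illustrative and not taken from the J2B2 sources.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { float x, y; };

// Split a scan (points ordered by beam angle) into connected components.
// ASSUMPTION: consecutive points closer than maxGap are considered joined;
// this is a stand-in for the criterion used on J2B2.
std::vector<std::vector<Point>> splitComponents(const std::vector<Point>& scan,
                                                float maxGap) {
  assert(maxGap > 0.0f);
  std::vector<std::vector<Point>> parts;
  for (std::size_t i = 0; i < scan.size(); ++i) {
    bool joined = false;
    if (i > 0) {
      float d = std::hypot(scan[i].x - scan[i - 1].x, scan[i].y - scan[i - 1].y);
      joined = d < maxGap;
    }
    if (!joined) parts.emplace_back();  // start a new component
    parts.back().push_back(scan[i]);
  }
  return parts;
}

// Resample one component at a fixed spacing along its length,
// interpolating linearly between the original samples.
std::vector<Point> resample(const std::vector<Point>& line, float step) {
  assert(step > 0.0f);
  std::vector<Point> out;
  if (line.empty()) return out;
  out.push_back(line.front());
  float carry = 0.0f;  // path length travelled since the last emitted sample
  for (std::size_t i = 1; i < line.size(); ++i) {
    float dx = line[i].x - line[i - 1].x;
    float dy = line[i].y - line[i - 1].y;
    float len = std::hypot(dx, dy);
    if (len <= 0.0f) continue;
    float t = step - carry;  // offset of the next sample on this segment
    while (t <= len) {
      out.push_back({line[i - 1].x + dx * t / len, line[i - 1].y + dy * t / len});
      t += step;
    }
    carry = len - (t - step);
  }
  return out;
}
```

Calling `resample` on every component returned by `splitComponents` yields points at an even spacing, independent of the distance between obstacle and sensor.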

The resampled profile of the laser range finder is suitable for entering geometry into the map.

Tracking of humans

We wish to make J2B2 interact with humans. Therefore, an important task is to detect humans and to track their individual movements.

We detect motion by analyzing a sequence of the most recent consecutive profiles of the laser range finder. Motion has occurred when

The method accounts for the sampling rate of the range profile, the circumference and the maximum walking velocity of a human.

For this task, the laser range finder is superior to the cameras: the camera images are blurred when the robot itself is in motion. Besides, the field of view of the cameras is roughly 40 degrees, and any reorientation of the pan-tilt unit also causes blur for a short period.

An issue aside: When a person approaches the robot on a straight line, generally, the robot cannot classify the terrain behind the human as free or occupied. However, the software utilizes a global map to copy free space into these obscured spots.

Marker color detection

The cameras installed on J2B2 represent the robot's perspective. The maximum rate to acquire the images is 8 Hz.

All image processing is based on one efficient method: color identification according to an adaptive palette stored in a hashtable, and grouping of pixel areas using the disjoint-set data structure. However, the interpretation depends on the tilt of the camera:

We achieve object recognition by color matching. This requires that the objects we wish to detect have a color distinct from all other entities in sight.
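A minimal sketch of the two ingredients, palette lookup in a hashtable and grouping with a disjoint-set structure, might look as follows. It is not the J2B2 implementation: the palette here is static rather than adaptive, the quantization to 4 bits per channel is an assumption, and the names `colorKey`, `DisjointSet`, and `countColorRegions` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <unordered_map>
#include <vector>

// Disjoint-set (union-find) with path halving.
struct DisjointSet {
  std::vector<int> parent;
  explicit DisjointSet(int n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0);
  }
  int find(int a) {
    while (parent[a] != a) {
      parent[a] = parent[parent[a]];
      a = parent[a];
    }
    return a;
  }
  void unite(int a, int b) { parent[find(a)] = find(b); }
};

// Quantize an RGB triple to a hash key.
// ASSUMPTION: 4 bits per channel; the real palette may quantize differently.
inline std::uint32_t colorKey(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
  return (std::uint32_t(r >> 4) << 8) | (std::uint32_t(g >> 4) << 4) |
         std::uint32_t(b >> 4);
}

// Classify every pixel by palette lookup (-1 means "not in palette"), join
// 4-connected pixels of the same class, and count the resulting regions.
int countColorRegions(const std::vector<std::uint8_t>& rgb, int w, int h,
                      const std::unordered_map<std::uint32_t, int>& palette) {
  assert(w > 0 && h > 0 && rgb.size() == std::size_t(w) * h * 3);
  std::vector<int> cls(w * h, -1);
  for (int i = 0; i < w * h; ++i) {
    auto it = palette.find(colorKey(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2]));
    if (it != palette.end()) cls[i] = it->second;
  }
  DisjointSet ds(w * h);
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x) {
      int i = x + y * w;
      if (cls[i] < 0) continue;
      if (x + 1 < w && cls[i + 1] == cls[i]) ds.unite(i, i + 1);
      if (y + 1 < h && cls[i + w] == cls[i]) ds.unite(i, i + w);
    }
  int regions = 0;
  for (int i = 0; i < w * h; ++i)
    if (cls[i] >= 0 && ds.find(i) == i) ++regions;  // one root per region
  return regions;
}
```

Both steps run in nearly linear time in the number of pixels, which is consistent with processing every camera frame.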

The palette is calibrated in an interactive procedure demonstrated in the video. The picture below shows two example palettes.

 

Gesture recognition

The term gesture refers to a sequence of camera-recorded points in image coordinates over a short period of time. The robot is aware of an alphabet of gestures, to which it matches any recorded gesture. The classification should be invariant to overall scale and speed.

We are interested in a set of curves, each of a prescribed length, that are arc-length parametrized and have pairwise orthogonal coordinate functions.

We are not aware of closed-form solutions to the problem. Instead, we grow an alphabet of gestures by randomly searching a space of cubic spline curves with a fixed number of control points. We arc-length parametrize the curves and align them around the origin, and then check for orthogonality.

J2B2 distinguishes the four gestures above (depicted from human perspective).
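To illustrate how arc-length parametrization yields invariance to scale and speed, here is a minimal sketch. It assumes that a gesture is a polyline of points, that the inner product of two gestures is the sum of the products of their x and y coordinate functions, and that classification picks the alphabet entry with the largest inner product. These choices, and the names `normalizeGesture`, `correlation`, and `classify`, are assumptions rather than the method of the J2B2 sources.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// Resample a polyline to n points equally spaced in arc length (removes the
// drawing speed), move the centroid to the origin, and scale to unit norm
// (removes position and overall scale).
std::vector<Pt> normalizeGesture(const std::vector<Pt>& raw, int n) {
  assert(raw.size() >= 2 && n >= 2);
  std::vector<double> cum(raw.size(), 0.0);  // cumulative arc length
  for (std::size_t i = 1; i < raw.size(); ++i)
    cum[i] = cum[i - 1] +
             std::hypot(raw[i].x - raw[i - 1].x, raw[i].y - raw[i - 1].y);
  const double total = cum.back();
  assert(total > 0.0);
  std::vector<Pt> out(n);
  std::size_t seg = 1;
  for (int k = 0; k < n; ++k) {
    double s = total * k / (n - 1);
    while (seg + 1 < raw.size() && cum[seg] < s) ++seg;
    double span = cum[seg] - cum[seg - 1];
    double t = span > 0.0 ? (s - cum[seg - 1]) / span : 0.0;
    out[k] = {raw[seg - 1].x + t * (raw[seg].x - raw[seg - 1].x),
              raw[seg - 1].y + t * (raw[seg].y - raw[seg - 1].y)};
  }
  double cx = 0.0, cy = 0.0;
  for (const Pt& p : out) { cx += p.x; cy += p.y; }
  cx /= n;
  cy /= n;
  double norm = 0.0;
  for (Pt& p : out) {
    p.x -= cx;
    p.y -= cy;
    norm += p.x * p.x + p.y * p.y;
  }
  norm = std::sqrt(norm);
  assert(norm > 0.0);
  for (Pt& p : out) { p.x /= norm; p.y /= norm; }
  return out;
}

// Discrete inner product of two normalized gestures: 1 for identical shapes,
// close to 0 for (nearly) orthogonal ones.
double correlation(const std::vector<Pt>& a, const std::vector<Pt>& b) {
  assert(a.size() == b.size());
  double s = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) s += a[i].x * b[i].x + a[i].y * b[i].y;
  return s;
}

// Index of the alphabet entry that correlates best with the recorded gesture.
std::size_t classify(const std::vector<Pt>& gesture,
                     const std::vector<std::vector<Pt>>& alphabet) {
  assert(!alphabet.empty());
  std::size_t best = 0;
  double bestScore = correlation(gesture, alphabet[0]);
  for (std::size_t i = 1; i < alphabet.size(); ++i) {
    double score = correlation(gesture, alphabet[i]);
    if (score > bestScore) { bestScore = score; best = i; }
  }
  return best;
}
```

With an orthogonal alphabet, a recorded gesture that resembles one letter has a small inner product with all others, which makes the maximum easy to identify.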

Histogram equalization

In dim ambient light, the camera images are hard for the human eye to interpret. Therefore, we equalize the histogram of the camera image. The red, green, and blue values are accumulated into one common distribution. Depending on the monitor, the resulting image typically is an enhancement to the user interface. On the other hand, the new image does not affect the decisions of the robot.
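A minimal sketch of histogram equalization with one common distribution for all three channels is given below. It assumes an interleaved 8-bit RGB buffer and uses the textbook mapping via the cumulative distribution; the function name `equalizeRGB` is hypothetical and the rounding details may differ from the original code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Equalize an interleaved 8-bit RGB image in place. The red, green, and blue
// values share one histogram, so all channels receive the same mapping.
void equalizeRGB(std::vector<std::uint8_t>& rgb) {
  assert(rgb.size() % 3 == 0);
  if (rgb.empty()) return;
  std::array<std::size_t, 256> hist{};
  for (std::uint8_t v : rgb) ++hist[v];

  const std::size_t total = rgb.size();
  std::size_t cdfMin = 0;  // cumulative count at the darkest value present
  for (int v = 0; v < 256; ++v)
    if (hist[v] > 0) { cdfMin = hist[v]; break; }
  if (cdfMin == total) return;  // only one intensity: nothing to stretch

  // Lookup table: map the cumulative distribution linearly onto 0..255.
  std::array<std::uint8_t, 256> lut{};
  std::size_t cum = 0;
  for (int v = 0; v < 256; ++v) {
    cum += hist[v];
    if (cum >= cdfMin)
      lut[v] = static_cast<std::uint8_t>(
          ((cum - cdfMin) * 255 + (total - cdfMin) / 2) / (total - cdfMin));
  }
  for (std::uint8_t& v : rgb) v = lut[v];
}
```

Sharing one lookup table across the channels stretches the contrast without applying different curves to red, green, and blue.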

For a successful technology,
Reality must take precedence over public relations,
for Nature cannot be fooled.
Richard Phillips Feynman

Conception


laboratory map, 2cm resolution

8cm resolution with routing tree

global routing

local routing along global path

visualization of conception of robot

path using safeguards

Prerecorded map of laboratory

J2B2's world is the Automation Technology Laboratory. That is a hallway with many offices, a kitchen, and connecting doors that lead to other universes. A high resolution prerecorded map provides a lot of solutions:

Localization and mapping

Using the fresh data of the laser range finder, we determine the change in position of the robot in a multi-resolution fashion. We are looking for the best fit of the distance profile to the previous map of obstacles.

Initially, we displace the robot corresponding to the previous angular and translational velocity. Relative to the expected position, we perform an iterative fitting method with decreasing step sizes:

The step sizes decrease by a factor of 0.6. On each level, we repeat the search until we reach a local minimum.
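The search scheme can be sketched as follows. The individual fitting steps are not reproduced in the text above, so this sketch assumes a plain coordinate search: on each level, the pose is moved by plus or minus one step in x, y, or heading as long as the cost improves, and then the steps shrink by the factor 0.6. The cost function, the initial step sizes, the number of levels, and the name `refinePose` are assumptions for illustration.

```cpp
#include <cassert>
#include <functional>

struct Pose { double x, y, theta; };

// Refine a pose estimate by local search with shrinking step sizes.
// cost(pose) is expected to measure the misfit between the current laser
// profile, placed at that pose, and the map of obstacles (lower is better).
Pose refinePose(Pose start, const std::function<double(const Pose&)>& cost,
                double stepXY, double stepTheta, int levels) {
  assert(stepXY > 0.0 && stepTheta > 0.0 && levels > 0);
  Pose best = start;  // e.g. the pose predicted from the previous velocities
  double bestCost = cost(best);
  for (int level = 0; level < levels; ++level) {
    bool improved = true;
    while (improved) {  // repeat until a local minimum is reached on this level
      improved = false;
      const Pose candidates[6] = {
          {best.x + stepXY, best.y, best.theta},
          {best.x - stepXY, best.y, best.theta},
          {best.x, best.y + stepXY, best.theta},
          {best.x, best.y - stepXY, best.theta},
          {best.x, best.y, best.theta + stepTheta},
          {best.x, best.y, best.theta - stepTheta}};
      Pose next = best;
      double nextCost = bestCost;
      for (const Pose& c : candidates) {
        double v = cost(c);
        if (v < nextCost) { next = c; nextCost = v; }
      }
      if (nextCost < bestCost) {
        best = next;
        bestCost = nextCost;
        improved = true;
      }
    }
    stepXY *= 0.6;  // factor taken from the text
    stepTheta *= 0.6;
  }
  return best;
}
```

Starting with coarse steps tolerates a poor prediction, while the later fine steps determine the final accuracy.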

Path planning in the laboratory

A typical assignment of the robot is to navigate to another location in the laboratory that might be up to 40 meters away. For instance, we ask J2B2 to move from the kitchen to the student office of Space Science and Technology.

The software utilizes a predefined graphical routing tree to lay out a path between two coordinates in the laboratory. The global path provides beacons for the local routing and path planning. The routing tree is shown to the right.

Local routing and path planning

The local routing algorithm computes shortest paths from a single target location to all other points in the vicinity that are physically reachable by the robot. The terrain is classified as either accessible or inaccessible depending on the distance to the nearest obstacle. The safety margin is 30 cm, while J2B2 has a radius of roughly 24 cm. Clearing the footprint of the robot prevents J2B2 from accidentally getting stuck inside the margin region.

Within the area of accessible terrain, routing reduces to a flood-fill algorithm. To approximate the true distance, we breadth-first search the 16 nearest neighbors. The algorithm is efficient, typically 32 ms, which is relevant when obstacles move.
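A minimal sketch of such a flood fill is shown below. It assumes a row-major grid of accessible cells and propagates distances from the target with a FIFO queue, relaxing the 16 nearest neighbors (the 8 adjacent cells plus the 8 knight-move cells) with their Euclidean step lengths. The name `routeDistances` and the queue-based relaxation are assumptions, not the original implementation. The long steps do not check the cells they pass over, which the sketch accepts on the assumption that the safety margin keeps inaccessible regions thicker than one cell.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <queue>
#include <vector>

// Distance of every accessible cell to the target cell (tx, ty).
// Unreachable cells keep the value infinity.
std::vector<float> routeDistances(const std::vector<char>& accessible,
                                  int w, int h, int tx, int ty) {
  assert(w > 0 && h > 0 && accessible.size() == std::size_t(w) * h);
  assert(0 <= tx && tx < w && 0 <= ty && ty < h);
  static const int dx[16] = {1, -1, 0, 0, 1, 1, -1, -1, 2, 2, -2, -2, 1, 1, -1, -1};
  static const int dy[16] = {0, 0, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 2, -2, 2, -2};
  const float inf = std::numeric_limits<float>::infinity();
  std::vector<float> dist(std::size_t(w) * h, inf);
  if (!accessible[tx + ty * w]) return dist;

  std::queue<int> open;
  dist[tx + ty * w] = 0.0f;
  open.push(tx + ty * w);
  while (!open.empty()) {
    int i = open.front();
    open.pop();
    int x = i % w, y = i / w;
    for (int k = 0; k < 16; ++k) {
      int nx = x + dx[k], ny = y + dy[k];
      if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;
      int j = nx + ny * w;
      if (!accessible[j]) continue;
      float d = dist[i] + std::hypot(float(dx[k]), float(dy[k]));
      if (d < dist[j]) {  // shorter route found: update and revisit
        dist[j] = d;
        open.push(j);
      }
    }
  }
  return dist;
}
```

A path from any cell to the target then follows by repeatedly stepping to the neighbor with the smallest distance value.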

The routing yields a path from the robot to the target, if the two coordinates are located in the same connected component of accessible terrain. The last point on the path to the target that is accessible by driving the robot on a straight line defines the compass direction. A raytracing procedure determines whether a point is accessible. The robot aligns and moves according to the compass.
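The selection of the compass point can be sketched as follows. The sketch assumes that the path is a list of grid cells starting at the robot, and that the ray-tracing test samples the straight segment at half-cell steps; the names `Cell`, `straightLineFree`, and `compassIndex` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

struct Cell { int x, y; };

// Trace the segment from a to b and report whether every visited cell is
// accessible. ASSUMPTION: sampling at half-cell steps is dense enough.
bool straightLineFree(const std::vector<char>& accessible, int w, int h,
                      Cell a, Cell b) {
  assert(w > 0 && h > 0 && accessible.size() == std::size_t(w) * h);
  int steps = 2 * std::max(std::abs(b.x - a.x), std::abs(b.y - a.y));
  for (int s = 0; s <= steps; ++s) {
    float t = steps > 0 ? float(s) / steps : 0.0f;
    int x = int(std::lround(a.x + t * (b.x - a.x)));
    int y = int(std::lround(a.y + t * (b.y - a.y)));
    if (x < 0 || x >= w || y < 0 || y >= h) return false;
    if (!accessible[x + y * w]) return false;
  }
  return true;
}

// Index of the last path point that the robot, located at path[0], can reach
// by driving on a straight line. That point defines the compass direction.
std::size_t compassIndex(const std::vector<char>& accessible, int w, int h,
                         const std::vector<Cell>& path) {
  assert(!path.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < path.size(); ++i)
    if (straightLineFree(accessible, w, h, path[0], path[i])) best = i;
  return best;
}
```

Aiming at the farthest directly reachable point smooths the grid path, since the robot cuts corners wherever the accessible terrain permits.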

Safeguarding through doors

The margin of the routing and the speed of the control to pursue the compass direction are chosen so that J2B2 safely passes through narrow passages without coming into contact with obstacles.

Nevertheless, static narrow passages such as entries to rooms (open doors) are artificially narrowed in the laboratory map. J2B2 centers when driving through a door. The purpose is to increase safety, and to impress the spectator.

Sloppy coding can not always be overcome by ever faster hardware.
At some point, every programmer must take on
the obligation to write quality code.
Bob Zale

Connection


pursue the direction of drag

guidance in the laboratory

segmentation of the local coordinates

affection towards human

following human behind obstacles

Pursue the direction of drag

The software considers the robot as stopped if

If the location of J2B2 is altered although the robot is stopped, the software considers the robot as being dragged.

An option to make J2B2 pursue a certain direction is to displace the stopped robot in the desired direction by at least 15 cm. After the rear bumper is pushed once, J2B2 continues in the direction of drag with an allowed deviation of 30 degrees to circumvent obstacles that it would otherwise touch. J2B2 terminates the pursuit if

Guidance and roundtrip

Using the routing tree, the software lays out beacons between the current position and any destination in the laboratory. We have defined four places of interest: the student office, the central place of the laboratory, the kitchen, and the electronics room. A target is selected by pressing the bumper of one of the four sides. The aim of J2B2 is to reach the destination. The robot gives up if the bumper is pressed, or the path planning fails for a certain period of time.


The roundtrip state arranges the four landmarks in a loop. In addition, we demand that J2B2 returns to the starting point if possible. The wireless network does not cover the area beyond the four landmarks. Thus, we consider the roundtrip to explore all of the terrain accessible to the robot.

Affection towards human

J2B2 follows, backs up from, and realigns to a human. The action is chosen based on the position of the human with respect to the robot; see the adjacent figure.

When the human is perceived in the corridors a, the cameras are directed to the ground and the robot will turn towards the human until it faces the human.

When the human approaches the robot as close as b, the robot will back up as far as the map allows until the human is in c again.

When the location of the human is in corridor c, the cameras are directed towards the human. The pan and tilt are adjusted at a resolution of about 4 degrees.

When the distance between robot and human is above a certain threshold, as is the case in d, the robot plans a path to the location of the human and drives until the human is located in c again.

In any case, if the human is obscured by static geometry, for instance, when walking into an alley, the robot plans a path to the last point of perception and drives in the vicinity of this point.

Urge to recharge

Based on the operation time and the accumulated drive commands, our software estimates the state of the batteries. The batteries discharge in 20 to 40 minutes. When the batteries are low, the robot drives back to the room with the power supply and signals for help. After charging for about 20 minutes, the robot indicates readiness yet again.

We organize these states in a state machine. For simplicity, the states are arranged around an idle mode. To select a state, typically, the human presses one of the four bumper sides.

Netting

Murphy's law applies: In one instance, the robot was driving down the hallway at about 0.2 m/s. A computer administrator terminated the server program that hosts the control of J2B2. He did not check whether J2B2 was in operation at that time. The precaution to stop the wheels after a certain communication timeout is not implemented on the robot but in the server program, so the robot kept driving. The emergency-off button did not interrupt the supply voltage to the motors due to an earlier modification of the electronics. Finally, the main power switch stopped J2B2.

Man becomes man only by his intelligence,
but he is man only by his heart.
Henri Frederic Amiel

Final remarks

Recycling

The localization, mapping, and routing algorithms are particularly reliable and efficient. Loading a prerecorded map is optional. The cameras are calibrated to map objects located on the floor into the coordinate system of the robot. This is a convenient starting point for adapting the source code to other projects that focus on high-level action.

Jari Saarinen uses the routing algorithm in a separate code base.

Félix Cabrera Garcia adapts the code for the practical work of his Master's thesis.

Tran Duy Vu Bui installs an extra camera on J2B2 and, in several test runs, compares his visual odometry implementation to the localization method of our code.

There are plans to reuse the camera image-to-floor mapping.

The essence of science is cumulative.
Sharing is caring.

Code snippet

To give an impression, we quote the implementation of the affection state. When J2B2 is in affection mode, the code excerpt is executed about 6 times per second.

case pSTATE_AFFECTION: {                                          // SPEED, PAN, TILT
  if (pRegister==0) {
    pSTATE_TRY(20000);
    jTog[jTOG_BUMPER_COUNTING]=false;                             // when bumper hit, exit affection state
    jTog[jTOG_READY_FOR_MOVER]=true;                              // declare pMover if no pMover and some human available
    pRegister++;
  }
  jSetSpeed*=0;                                                   // don't move unless there is a reason
  matrix<float> rob(3,1);
    rob.data[0]=cPos.data[6];
    rob.data[1]=cPos.data[7];
    rob.data[2]=1;
  cTarget=rob;
  if (pMover) {                                                   // BEGIN: robot has humanoid master
    pSTATE_TRY(20000);                                            // grant state 20 
    matrix<float> mov(3,1);
      mov.data[0]=pMover->pos.data[6]-pPos.data[6];
      mov.data[1]=pMover->pos.data[7]-pPos.data[7];
      mov.data[2]=1;
    float dst=(rob-mov).norm(2);
    matrix<float> pos=cInv*mov;
    float hum=atan2(-pos.data[0],pos.data[1]);                    // angle for alignment (with respect to front, left is positive)
    if (!pMOVER_VISUAL||pDIST_FOLLOW<dst) {                       // BEGIN: robot does not have visual to human, or human is far away
      if (jTog[jTOG_PATH_PLANNED]) {                              // previous target was accessible by robot
        int pix,piy;
        matrix<float> yme;                                        
        while (pMover->track.token(yme)) {                        // make latest available beacon of mover the new target
          pix=(int)(jMAP_X+(yme.data[7]-pPos.data[7])*jGND2MAP);
          piy=(int)(jMAP_Y+(yme.data[6]-pPos.data[6])*jGND2MAP);
          if (0<pix&&pix<jMAP_W&&0<piy&&piy<jMAP_H)
            if (cRouteDist[pix+piy*jMAP_W]<cTERRAIN) {
              cTarget.data[0]=yme.data[6]-pPos.data[6];
              cTarget.data[1]=yme.data[7]-pPos.data[7];
              break;
            }
        }
        pMover->track.next[2]=&pMover->track;                     // reset track tokenizer          
        if (0.4<cPlanDist) {                                      // follow path until default proximity  
          matrix<float> com=cInv*cCompass;
          float ang=atan2(-com.data[0],com.data[1]);              // angle for driving (with respect to front, left is positive)
          jSetSpeed.data[0]=jSPEED0_SLW*MAX(0.0f,1.0f-pGainFwd*ang*ang);
          jSetSpeed.data[1]=pGainInt*ang-pGainPos*jVel.data[0];
          if (!pMOVER_VISUAL)                                     // mover not visible while approaching
            pMover->stamp=jTic[jTOC_LAST_SENSORS]-pMOVES_GRAVEYARD+1000; // artificially keep mover alive
        }
      }
      if (pPTU_GRACEFUL) {
        jSetPTU.data[0]=-hum*1.2f;                                // look into target direction, MAGIC CONST
        if (pMOVER_VISUAL)                                        
          jSetPTU.data[1]=jRad2Til(atan(0.60f/MAX(0.30f,dst)));   // tilt cameras to look at mover
        else
          jSetPTU.data[1]=jPTUTIL_ABASHED;                        // indicate that human not visible
      }
    } else {                                                      // ELSE: human is visual and closer than pDIST_FOLLOW
      if (jTic[jTOC_STOP_ALIGNING]<jTic[jTOC_LAST_SENSORS]) {     // BEGIN: human is in vicinity of robot
        if (fabs(hum)<pDIST_WINDOW) {                             // human is in 35 field of view of robot
          matrix<float> prt=pLookInto(20,-cPos.data[3],-cPos.data[4]);
          if (dst<=pDIST_BACKUP&&0.55f<prt.norm(2)) {
            matrix<float> ret=cInv*prt;
            float ang=atan2( ret.data[0],-ret.data[1]);           // angle with respect to front, left is positive
            jSetSpeed.data[0]=-jSPEED0_SLW*MAX(0.0f,1.0f-pGainFwd*ang*ang);
            jSetSpeed.data[1]= pGainInt*ang-pGainPos*jVel.data[0];
            if (pPTU_GRANDMOTHER)
              if (ang<0)
                jSetPTU.data[0]=jPTUPAN_MIN;
              else
                jSetPTU.data[0]=jPTUPAN_MAX;
            jSetPTU.data[1]=jPTUTIL_ABASHED;                  
          } else {
            if (pPTU_GRACEFUL) {
              jSetPTU.data[0]=-hum*1.2f;                          // MAGIC CONST
              jSetPTU.data[1]=jRad2Til(atan(0.60f/MAX(0.30f,dst)));
            }
          }
        } else 
          jTic[jTOC_STOP_ALIGNING]=jTic[jTOC_LAST_SENSORS]+3000;              
      } else {                                                    // align mode
        jSetSpeed.data[1]=pGainAng*hum-pGainOme*jVel.data[0];
        if (pPTU_GRACEFUL) {
          jSetPTU.data[0]=-hum*0.6f;
          jSetPTU.data[1]=jPTUTIL_ABASHED;
        }
        if (20*jDEG2RAD<fabs(hum)||.1f<fabs(jVel.data[0]))
          jTic[jTOC_STOP_ALIGNING]=jTic[jTOC_LAST_SENSORS]+1500;
      }                                                           // END: human is in vicinity of robot
    }
  } else {                                                        // ELSE: robot does not have mover
    jSpeaks(jSPEAK_IDLE);
    jSetPTU.data[1]=jPTUTIL_PROFILE;                              // keep cameras low and take pictures of GND
  }
  if (pSTATE_GIVEUP||jBumper) {                                   // graceful exit
    jSpeaks(jSPEAK_FINISH);
    jSetSpeed*=0;
    jSetPTU.data[0]=0;
    jSetPTU.data[1]=jPTUTIL_NOOP;
    jTog[jTOG_READY_FOR_MOVER]=false;
    next=true;
  }
  break;
}
One notices that the whole play is played on close intervals (primas, secundas).
It is difficult to play such a composition because to keep the rythm
the player has to change fingers very quickly on very limited space.
The overall idea: Bach is basically saying "keep doing it, till she comes to major"
Michal Hoc