ARTIFICIAL PASSENGER


The AP is an artificial intelligence–based companion that will be resident in
 software and chips embedded in the automobile dashboard. The heart of the system is a
 conversation planner that holds a profile of you, including details of your interests and
 profession.

           A microphone picks up your answer and breaks it down into separate words with
 speech-recognition software. A camera built into the dashboard also tracks your lip
 movements to improve the accuracy of the speech recognition. A voice analyzer then
 looks for signs of tiredness by checking to see if the answer matches your profile. Slow
 responses and a lack of intonation are signs of fatigue.

          This research suggests that we can make predictions about various aspects of driver
 performance based on what we glean from the movements of a driver’s eyes and that a
 system can eventually be developed to capture this data and use it to alert people when
 their driving has become significantly impaired by fatigue.                                   

                             The natural dialog car system analyzes the content of a driver's answer
 together with his voice patterns to determine whether he is alert while driving. The
 system warns the driver or changes the topic of conversation if it determines that the
 driver is about to fall asleep. The system may also detect whether a driver is affected by
 alcohol or drugs.
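
This decision logic can be pictured with a small Python sketch. The feature names, weights, and thresholds below are illustrative assumptions, not the parameters of the actual system, which derives its cues from the speech recognizer, the voice analyzer, and the driver profile.

# Minimal sketch of the conversation planner's alertness decision.
# All feature names and thresholds are illustrative assumptions.

def fatigue_score(response_delay_s, pitch_variance, matches_profile):
    """Combine voice cues into a rough fatigue score between 0 and 1."""
    score = 0.0
    if response_delay_s > 2.0:      # slow response suggests fatigue
        score += 0.4
    if pitch_variance < 10.0:       # flat intonation suggests fatigue
        score += 0.4
    if not matches_profile:         # answer inconsistent with the driver profile
        score += 0.2
    return score

def plan_next_action(score):
    """Warn the driver or change the topic, as the dialog system does."""
    if score >= 0.8:
        return "sound_warning"
    if score >= 0.4:
        return "change_topic"
    return "continue_dialog"

s = fatigue_score(response_delay_s=2.5, pitch_variance=6.0, matches_profile=True)
print(s, plan_next_action(s))   # e.g. 0.8 sound_warning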

2.1  What is an artificial passenger?

·         Natural language e-companion.

·         Sleep preventive device in cars to overcome drowsiness.

·         Life safety system.

2.2  What does it do?

·         Detects alarm conditions through sensors.

·         Broadcasts pre-stored voice messages over the speakers.

·         Captures images of the driver.

3.1  Field of invention

                                 The present invention relates to a system and method for determining three-
dimensional head pose, eye gaze direction, eye closure amount, blink detection and flexible
feature detection on the human face using image analysis from multiple video sources.
Additionally, the invention relates to systems and methods that make decisions using
passive video analysis of a human head and face. These methods can be used in areas
of application such as human-performance measurement, operator monitoring and
interactive multimedia.



3.2  Background of the invention

             Early techniques for determining head-pose used devices that were fixed to the
head of the subject to be tracked. For example, reflective devices were attached to the
subject's head and, using a light source to illuminate the reflectors, the reflector locations
were determined. As such reflective devices are more easily tracked than the head itself,
the problem of tracking head-pose was simplified greatly.

             Virtual-reality headsets are another example of the subject wearing a device for
the purpose of head-pose tracking. These devices typically rely on a directional antenna
and radio-frequency sources, or on directional magnetic measurement, to determine
head-pose.

             Wearing a device of any sort is clearly a disadvantage, as the user's competence
and willingness to wear the device then directly affect the reliability of the system.
Devices are generally intrusive and will affect a user's behaviour, preventing natural
motion or operation.

             Structured light techniques that project patterns of light onto the face in order to
determine head-pose are also known. The light patterns are structured to facilitate the
recovery of 3D information using simple image processing. However, the technique is
prone to error in conditions of lighting variation and is therefore unsuitable for use under
natural lighting conditions.



3.3  Examples of systems that use this style of technique

                                   Examples of systems that use this style of technique can be seen in "A Robust
Model-Based Approach for 3D Head Tracking in Video Sequences" by Marius Malciu and
Francoise Preteux, and "Robust 3D Head Tracking Under Partial Occlusion" by Ye Zhang
and Chandra Kambhamettu, both from the Conference on Automatic Face and Gesture
Recognition 2000, Grenoble, France.




4.1  Devices that are used in AP 

The main devices that are used in this artificial passenger are:

1)      Eye tracker.

2)      Voice recognizer or speech recognizer.

4.2  How does eye tracking work?

              Collecting eye movement data requires hardware and software specifically designed to
 perform this function. Eye-tracking hardware is either mounted on a user's head or mounted
 remotely. Both systems measure the corneal reflection of an infrared light emitting diode (LED),
 which illuminates and generates a reflection off the surface of the eye. This action causes the
 pupil to appear as a bright disk in contrast to the surrounding iris and creates a small glint
 underneath the pupil. It is this glint that head-mounted and remote systems use for calibration
 and tracking.
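
As a rough illustration of how the bright pupil disk and the glint beneath it might be isolated from an infrared frame, the following Python sketch uses simple thresholding; the threshold values, window size, and array shapes are assumptions, not the behaviour of any particular tracker.

import numpy as np

def find_pupil_and_glint(ir_frame, pupil_thresh=200, glint_thresh=250):
    """Locate the bright pupil disk and the glint below it in an IR frame.

    ir_frame: 2-D numpy array of 8-bit grayscale intensities.
    The thresholds are illustrative; real trackers calibrate them per user.
    """
    pupil_mask = ir_frame >= pupil_thresh              # bright disk from the IR reflection
    if not pupil_mask.any():
        return None, None
    ys, xs = np.nonzero(pupil_mask)
    pupil_center = (int(ys.mean()), int(xs.mean()))    # centroid of the bright disk

    # Search a small window below the pupil centre for the corneal glint.
    y0 = pupil_center[0]
    window = ir_frame[y0:y0 + 20, :]
    gy, gx = np.unravel_index(int(np.argmax(window)), window.shape)
    glint = (y0 + gy, int(gx)) if window[gy, gx] >= glint_thresh else None
    return pupil_center, glint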

4.2.1 Hardware: Head-mounted and remote systems

             The difference between the head-mounted and remote eye systems is how the eye
 tracker collects eye movement data. Head-mounted systems, since they are fixed on a user's
 head and therefore allow for head movement, use multiple data points to record eye movement.
 To differentiate eye movement from head movement, these systems measure the pupil glint from
 multiple angles. Since the unit is attached to the head, a person can move about when operating
 a car or flying a plane, for example.
                     For instance, human factors researchers have used head-mounted eye-tracking
 systems to study pilots' eye movements as they used cockpit controls and instruments to land
 airplanes (Fitts, Jones, and Milton 1950). These findings led to cockpit redesigns that improved
 usability and significantly reduced the likelihood of incidents caused by human error. More
 recently, head-mounted eye-tracking systems have been used by technical communicators to
 study the visual relationship between personal digital assistant (PDA) screen layout and eye
 movement.
                       Remote systems, by contrast, measure the orientation of the eye relative to a fixed
 unit such as a camera mounted underneath a computer monitor. Because remote units do not
 measure the pupil glint from multiple angles, a person's head must remain almost motionless
 during task performance. Although head restriction may seem like a significant hurdle to
 overcome, Jacob and Karn (2003) attribute the popularity of remote systems in usability to their
 relatively low cost and high durability compared with head-mounted systems.
                       Since remote systems are usually fixed to a computer screen, they are often used
 for studying onscreen eye motion. For example, cognitive psychologists have used remote eye-
tracking systems to study the relationship between cognitive scanning styles and search
 strategies (Crosby and Peterson 1991). Such eye-tracking studies have been used to develop
 and test existing visual search cognitive models. More recently, human-computer interaction
 (HCI) researchers have used remote systems to study computer and Web interface usability.
                         Through recent advances in remote eye-tracking equipment, a range of head
 movement can now be accommodated. For instance, eye-tracking hardware manufacturer Tobii
 Technology now offers a remote system that uses several smaller fixed sensors placed in the
 computer monitor frame so that the glint underneath the pupil is measured from multiple angles.
 This advance will eliminate the need for participants in eye-tracking studies to remain perfectly
 still during testing, making it possible for longer studies to be conducted using remote systems.

4.2.2 Software: Data collection, analysis, and representation

                    Data collection and analysis are handled by eye-tracking software. Although some
 software packages are more sophisticated than others, all share common features. Software catalogs eye-
tracking data in one of two ways. In the first, data are stored in video format. ERICA's Eye
 Gaze[TM] software, for instance, uses a small red x to represent eye movement that is useful for
 observing such movement in relation to external factors such as user verbalizations. In the other,
 data are stored as a series of x/y coordinates related to specific grid points on the computer
 screen.
                 Data can be organized in various ways, by task or participant, for example, and
 broken down into fixations and saccades that can be visually represented onscreen. Fixations,
 which typically last between 250 and 500 milliseconds, occur when the eye is focused on a
 particular point on a screen. Fixations are most commonly measured according to duration and
 frequency. If, for instance, a banner ad on a Web page receives lengthy and numerous fixations,
 it is reasonable to conclude that the ad is successful in attracting attention. Saccades, which
 usually last between 25 and 100 milliseconds, move the eye from one fixation to the next
 fixation. When saccades and fixations are sequentially organized, they produce scanpaths. If, for
 example, a company would like to know why people are not clicking on an important link in what
 the company feels is a prominent part of the page, a scanpath analysis would show how people
 visually progress through the page. In this case, such an analysis might show that the link is
 poorly placed because it is located on a part of the screen that does not receive much eye traffic.
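
A simple way to recover fixations and scanpaths from raw gaze samples is a velocity threshold: samples that move slowly belong to a fixation, while fast jumps are saccades. The Python sketch below assumes samples of (time in milliseconds, x, y); the velocity threshold and minimum fixation duration are illustrative values, not standards.

import math

def extract_fixations(samples, vel_thresh=100.0, min_fix_ms=100):
    """Group gaze samples into fixations using a velocity threshold.

    samples: list of (t_ms, x, y) gaze points in screen pixels.
    Returns fixations as (start_ms, end_ms, mean_x, mean_y); the fixation
    sequence, read in order, is the scanpath.
    """
    fixations, current = [], []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = (t1 - t0) / 1000.0 or 1e-6                 # seconds between samples
        speed = math.hypot(x1 - x0, y1 - y0) / dt       # pixels per second
        if speed < vel_thresh:                          # still within a fixation
            current.append((t1, x1, y1))
        else:                                           # a saccade ends the fixation
            if current and current[-1][0] - current[0][0] >= min_fix_ms:
                xs = [p[1] for p in current]
                ys = [p[2] for p in current]
                fixations.append((current[0][0], current[-1][0],
                                  sum(xs) / len(xs), sum(ys) / len(ys)))
            current = []
    return fixations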


5.1  Algorithm for monitoring head/eye motion for driver
alertness with one camera


                        Visual methods and systems are described for detecting the alertness and
 vigilance of persons under conditions of fatigue, lack of sleep, and exposure to mind-
 altering substances such as alcohol and drugs. In particular, the invention can have
 particular applications for truck drivers, bus drivers, train operators, pilots, watercraft
 controllers and stationary heavy equipment operators, and students and employees,
 during either daytime or nighttime conditions. The invention robustly tracks a person's
 head and facial features with a single on-board camera in a fully automatic system
 that can initialize automatically, can reinitialize when it needs to, and provides
 outputs in real time. The system can classify rotation in all viewing directions, detect
 eye/mouth occlusion, detect eye blinking, and recover the 3-D gaze of the eyes. In
 addition, the system is able to track through occlusions such as eye blinking and also
 through occlusions such as head rotation. Outputs can be visual and sound alarms to the
 driver directly. Additional outputs can slow down the vehicle and/or cause the vehicle to
 come to a full stop. Further outputs can send data on driver, operator, student and
 employee vigilance to remote locales as needed for alarms and for initiating other actions.

5.2  REPRESENTATIVE IMAGE:
                 This invention relates to visual monitoring systems, and in particular to
 systems and methods for using digital cameras that monitor head motion and eye motion
 with computer vision algorithms, for monitoring the alertness and vigilance of drivers
 of vehicles, trucks, buses, planes, trains and boats, and operators of movable and
 stationary heavy equipment, against driver fatigue and driver loss of sleep,
 and effects from alcohol and drugs, as well as for monitoring students and employees
 during educational, training and workstation activities. The system includes:

 (a) a single camera within a vehicle aimed at a head region of a driver;

 (b) means for simultaneously monitoring head rotation, yawning and full eye occlusion of
 the driver with said camera, the head rotation including nodding up and down and
 moving left to right, and the full eye occlusion including eye blinking and complete eye
 closure, the monitoring means including means for determining the left to right rotation
 and the up and down nodding from examining approximately 10 frames out of
 approximately 20 frames; and

 (c) alarm means for activating an alarm in real time when a threshold condition in the
 monitoring means has been reached, whereby the driver is alerted into driver vigilance.

                The monitoring means includes: means for determining the gaze direction of the
 driver; a detected condition selected from at least one of lack of sleep of the driver,
 driver fatigue, and alcohol or drug effects on the driver; initializing means to find the
 face of the driver; grabbing means to grab a frame; tracking means to track the head of the
 driver; measuring means to measure rotation and nodding of the driver; detecting means
 to detect eye blinking and eye closures of the driver; and yawning means to detect yawning
 of the driver.


5.3  Method of detecting driver vigilance comprises the following steps

1)  Aiming a single camera at the head of a driver of a vehicle, and detecting the
    frequency of up and down nodding and left to right rotations of the head within
    a selected time period of the driver with the camera;

2)  Determining the frequency of eye blinkings and eye closings of the driver within
    the selected time period with the camera;

3)  Determining the left to right head rotations and the up and down head nodding
    from examining approximately 10 frames out of approximately 20 frames;

4)  Determining the frequency of yawning of the driver within the selected time
    period with the camera;

5)  Generating an alarm signal in real time if the frequency of the up and down
    nodding, the left to right rotations, the eye blinkings, the eye closings, or the
    yawning exceeds a selected threshold value (a sketch of this check follows the list).
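
Step 5 reduces to comparing per-window event counts against thresholds. The Python sketch below illustrates that check; the threshold values are assumptions chosen only for the example.

# Hypothetical sketch of step 5: raise the alarm when any per-window event
# frequency exceeds its threshold.  The limits below are illustrative only.

THRESHOLDS = {          # allowed events per selected time period
    "nods": 3,
    "rotations": 4,
    "blinks": 12,
    "closures": 2,
    "yawns": 2,
}

def check_vigilance(counts):
    """counts: dict mapping event name to the number observed in the window."""
    exceeded = [name for name, limit in THRESHOLDS.items()
                if counts.get(name, 0) > limit]
    return ("ALARM", exceeded) if exceeded else ("OK", [])

print(check_vigilance({"nods": 5, "blinks": 8, "yawns": 3}))
# ('ALARM', ['nods', 'yawns'])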

Detailed description of preferred embodiments

            Before explaining the disclosed embodiment of the present invention in detail, it is
 to be understood that the invention is not limited in its application to the details of the
 particular arrangement shown, since the invention is capable of other embodiments. Also,
 the terminology used herein is for the purpose of description and not of limitation.

               The novel invention can analyze video sequences of a driver to determine
 when the driver is not paying adequate attention to the road. The invention collects data
 with a single camera that can be placed on the car dashboard. The system can
 focus on rotation of the head and eye blinking, two important cues for determining driver
 alertness, to make a determination of the driver's vigilance level. Our head tracker consists
 of tracking the lip corners, eye centers, and side of the face. Automatic initialization of
 all features is achieved using color predicates and a connected components algorithm. A
 connected component algorithm is one in which every element in the component has a
 given property. Each element in the component is adjacent to another element either by
 being to the left, right, above, or below. Other types of connectivity can also be allowed.
 An example of a connected component algorithm follows: If we are given various land
 masses, then one could say that each land mass is a connected component because the
 water separates the land masses. However, if a bridge was built between two land masses
 then the bridge would connect them into one land mass. So a connected component is
 one in which every element in the component is accessible from any other element in the
 component.
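
The land-mass description above corresponds to standard connected-component labelling. The following Python sketch labels 4-connected components of a binary grid; it is a generic illustration of the idea, not the invention's own implementation.

from collections import deque

def connected_components(grid):
    """Label 4-connected components of truthy cells in a 2-D grid."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and labels[r][c] == 0:
                count += 1                      # start a new component ("land mass")
                queue = deque([(r, c)])
                labels[r][c] = count
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

# Two "land masses" separated by water (zeros) give two components.
grid = [[1, 1, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 0, 1]]
print(connected_components(grid)[1])   # 2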
           For the invention, occlusion of the eyes and mouth often occurs when
 the head rotates or the eyes close, so our system tracks through such occlusion and can
 automatically reinitialize when it mis-tracks. Also, the system performs blink detection
 and determines the 3-D direction of gaze. These are necessary components for monitoring
 driver alertness.
          The novel method and system present a robust tracking method for the face, and in
 particular the lips, that can track through local lip motion such as yawning or opening of
 the mouth.
              A general overview of the novel method and system for daytime conditions is
 given below, and can include the following steps:
1. Automatically initialize the lips and eyes using color predicates and connected
 components.

2. Track the lip corners using the dark line between the lips and the color predicate, even
 through large mouth movement like yawning.

3. Track the eyes using affine motion and color predicates.

4. Construct a bounding box of the head.

5. Determine rotation using distances between the eye and lip feature points and the sides
 of the face.

6. Determine eye blinking and eye closing using the number and intensity of pixels in the
 eye region.

7. Determine the driver vigilance level using all acquired information.

The above steps can be modified for night time conditions.
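
For orientation only, the seven daytime steps can be pictured as one processing loop. In the Python sketch below every helper is a trivial placeholder so that the skeleton runs; the real steps use the color predicates, affine tracking, and pixel-intensity tests described in the following sections.

# Skeleton of the daytime steps above.  Every helper is a placeholder.

def initialize_features(frame):              # step 1: color predicates + components
    return {"lips": (0, 0), "eyes": ((0, 0), (0, 0))}

def track_lips(frame, state):                # step 2: dark line between the lips
    return state["lips"]

def track_eyes(frame, state):                # step 3: affine motion + color predicates
    return state["eyes"]

def head_box(lips, eyes):                    # step 4: bounding box of the head
    return (0, 0, 100, 100)

def rotation(lips, eyes, box):               # step 5: feature-to-side distances
    return 0.0

def eye_occlusion(frame, eyes):              # step 6: pixel counts in the eye region
    return False, False                      # (blinking, closed)

def vigilance(rot, blink, closed):           # step 7: fuse all acquired information
    return 1.0 - 0.5 * abs(rot) - (0.3 if blink else 0.0) - (0.5 if closed else 0.0)

def process(frames):
    state = initialize_features(frames[0])
    for frame in frames[1:]:
        lips, eyes = track_lips(frame, state), track_eyes(frame, state)
        rot = rotation(lips, eyes, head_box(lips, eyes))
        blink, closed = eye_occlusion(frame, eyes)
        if vigilance(rot, blink, closed) < 0.5:   # illustrative threshold
            print("alert the driver")

process([None, None, None])    # dummy "frames" just to exercise the loop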
          The novel invention can provide quick, substantially real-time monitoring responses.
 For example, driver vigilance can be determined within as few as approximately 20
 frames, which would be within approximately ⅔ of a second under some
 conditions (when the camera is taking pictures at a rate of approximately 30 frames per
 second). Prior art systems usually require a substantial amount of time, such as at least
 400 frames, which can take in excess of 20 seconds if the camera is taking pictures at
 approximately 30 frames per second. Thus, the invention is vastly superior to prior art
 systems.
            The video sequences throughout the invention were acquired using a video
 camera placed on a car dashboard. The system runs on an UltraSparc using 320×240 size
 images with 30 fps video.


                     The system will first determine day or night status. It is nighttime if, for
 example, a camera clock time period is set to be between 18:00 and 07:00 hours.
 Alternatively, day or night status can be checked by wiring the system to the headlight
 controls of the vehicle to see whether the driver has his nighttime driving headlights on.
 Additionally, night status can be set if the intensity of the image is below a threshold, in
 which case it must be dark. For example, if the intensity of the image (intensity is defined
 in many ways; one such way is the average of all RGB (Red, Green, Blue) values) is
 below approximately 40, then the nighttime method could be used. The possible range of
 values for the average RGB value is 0 to approximately 255, with the units being
 arbitrarily selected for the scale.
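
The three cues can be combined in a few lines of Python. In the sketch below the 18:00 to 07:00 window and the intensity threshold of 40 come from the text; treating the cues as a simple OR, and the function and parameter names, are assumptions.

import datetime
import numpy as np

def is_nighttime(frame_rgb=None, headlights_on=False, now=None):
    """Decide day or night status from any of the three cues described above."""
    now = now or datetime.datetime.now()
    if now.hour >= 18 or now.hour < 7:            # camera clock window
        return True
    if headlights_on:                             # wired to the headlight controls
        return True
    if frame_rgb is not None:
        mean_intensity = float(np.asarray(frame_rgb).mean())   # average of all RGB values
        if mean_intensity < 40:                   # below the threshold, so it must be dark
            return True
    return False

# A uniformly dark 320x240 frame at noon is still classified as nighttime.
dark = np.full((240, 320, 3), 25, dtype=np.uint8)
print(is_nighttime(frame_rgb=dark, now=datetime.datetime(2024, 1, 1, 12, 0)))   # True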

                         
                           If daytime is determined, then the left-hand side of the flow chart is
 followed: first, initialization is performed to find the face. A frame is grabbed from the
 video output. Tracking of the feature points is performed in steps. Measurements of the
 rotation and orientation of the face are made. Eye occlusion such as blinking and eye
 closure is examined. It is determined whether yawning occurs. The rotation, eye
 occlusion and yawning information is used to measure the driver's vigilance.


                             If nighttime is determined, then the right-hand series of flow chart steps
 occurs, by first initializing to find the face. Next, a frame is grabbed from the video
 output. Tracking of the lip corners and eye pupils is performed. The rotation and
 orientation of the face are measured. The feature points are corrected if necessary. Eye
 occlusion such as blinking and eye closure is examined. It is determined whether yawning
 is occurring. The rotation, eye occlusion and yawning information is used to measure the
 driver's vigilance.

DAYTIME CONDITIONS

                  For the daytime scenario, initialization is performed to find the face feature
 points. A frame is taken from a video stream of frames. Tracking is then done in stages.
 Lip tracking is done. There are multiple stages in the eye tracker; Stage 1 and Stage 2
 operate independently. A bounding box around the face is constructed and then the
 facial orientation can be computed. Eye occlusion is determined. Yawning is detected.
 The rotation, eye occlusion, and yawning information is fused to determine the vigilance
 level of the operator. This is repeated, which allows the method and system to grab
 another frame from the video stream of frames and continue again.

                       The system initializes itself. The lip and eye colors (RGB: Red, Green, Blue)
 are marked in the image offline. The colors in the image are marked to be recognized by
 the system. Marking the lip pixels in the image is important. All other pixel values in the
 image are considered unimportant. Each pixel has a Red (R), Green (G), and Blue (B)
 component. For a pixel that is marked as important, go to this location in the RGB array,
 indexing on the R, G, B components. This array location can be incremented by equation (1):

exp(−1.0*( j*j+k*k+i*i )/(2*sigma*sigma)). (1)

where: sigma is approximately 2;
j refers to the component in the y direction and can go from approximately −2 to approximately 2;
k refers to the component in the z direction and can go from approximately −2 to approximately 2;
i refers to the component in the x direction and can go from approximately −2 to approximately 2.
          Thus simply increment values in the x, y, and z directions from approximately −2 to
 approximately +2 pixels, using the above function. As an example running through
 equation (1), given that sigma is 2, let i=0, j=1, and k=−1; then the function evaluates to
 exp(−1.0*(1+1+0)/(2*2*2)) = exp(−1*2/8) = 0.77880, where exp is the standard
 exponential function (e^x).

Equation (1) is run through for every pixel that is marked as important. If a color, or pixel
 value, is marked as important multiple times, its new value can be added to the current
 value. Pixel values that are marked as unimportant can decrease the value of the RGB
 indexed location via equation (2) as follows:


exp(−1.0*( j*j+k*k+i*i )/(2*(sigma−1)*(sigma−1))). (2)



where: sigma is approximately 2;

j refers to the component in the y direction and can go from approximately −2 to approximately 2;
k refers to the component in the z direction and can go from approximately −2 to approximately 2;
i refers to the component in the x direction and can go from approximately −2 to approximately 2.
          Thus simply decrement values in the x, y, and z directions from approximately −2 to
 approximately +2 pixels, using the above function. As an example running through
 equation (2), given that sigma is 2, let i=0, j=1, and k=−1; then the function evaluates to
 exp(−1.0*(1+1+0)/(2*1*1)) = exp(−1*2/2) = 0.36788, where exp is the standard
 exponential function (e^x).
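
The two update rules can be checked numerically. The Python sketch below reproduces the worked examples for equations (1) and (2) and shows one way the ±2 neighbourhood updates might be applied, using a dictionary in place of the full 256×256×256 RGB array; the helper names are assumptions.

import math

SIGMA = 2

def inc_weight(i, j, k):
    """Equation (1): weight added around an important (lip or skin) color."""
    return math.exp(-1.0 * (j*j + k*k + i*i) / (2 * SIGMA * SIGMA))

def dec_weight(i, j, k):
    """Equation (2): weight subtracted around an unimportant color."""
    return math.exp(-1.0 * (j*j + k*k + i*i) / (2 * (SIGMA - 1) * (SIGMA - 1)))

# The worked examples from the text, with i=0, j=1, k=-1:
print(round(inc_weight(0, 1, -1), 5))   # 0.7788  = exp(-2/8)
print(round(dec_weight(0, 1, -1), 5))   # 0.36788 = exp(-2/2)

def update_predicate(predicate, r, g, b, important=True):
    """Spread a marked pixel's weight over the +/-2 neighborhood of its color."""
    for i in range(-2, 3):
        for j in range(-2, 3):
            for k in range(-2, 3):
                key = (r + i, g + j, b + k)
                w = inc_weight(i, j, k) if important else -dec_weight(i, j, k)
                predicate[key] = predicate.get(key, 0.0) + w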

           The values in the array which are above a threshold are marked as being one of the
 specified colors. The values in the array below the threshold are marked as not being of
 the specified color. An RGB(RED, GREEN BLUE) array of the lip colors is generated,
 and the endpoints of the biggest lip colored component are selected as the mouth corners.

              The driver's skin is marked as important. All other pixel values in the image are
 considered unimportant. Each pixel has an R, G, B component. So for a pixel that is
 marked as important, go to this location in the RGB array, indexing on the R, G, B
 components. Increment this array location by equation (1), given and explained above and
 repeated here for convenience: exp(−1.0*(j*j+k*k+i*i)/(2*sigma*sigma)), with sigma equal
 to 2. Increment values in the x, y, and z directions from approximately −2 to
 approximately +2, using equation (1). Do this for every pixel that is marked as important.
 If a color, or pixel value, is marked as important multiple times, its new value is added to
 the current value.

            Pixel values that are marked as unimportant decrease the value of the RGB
 indexed location via equation (2), given and explained above and repeated here for
 convenience: exp(−1.0*(j*j+k*k+i*i)/(2*(sigma−1)*(sigma−1))). The values in the array
 which are above a threshold are marked as being one of the specified colors. Another
 RGB array is generated of the skin colors, and the largest non-skin components above the
 lips are marked as the eyes. The program then starts looking above the lips in a vertical
 manner until it finds two non-skin regions, which are between approximately 15 and
 approximately 800 pixels in area. The marking of pixels can occur automatically by
 considering the common color of various skin/lip tones.
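
A possible rendering of that search in Python is shown below, using SciPy's connected-component labelling on the non-skin mask above the lip row and keeping blobs whose area falls in the 15 to 800 pixel range from the text; the function name, connectivity, and the rule for choosing two candidates are assumptions.

import numpy as np
from scipy import ndimage

def find_eyes_above_lips(skin_mask, lip_row, min_area=15, max_area=800):
    """Look above the lips for two non-skin blobs of plausible eye size.

    skin_mask: boolean array, True where a pixel matched the skin color predicate.
    lip_row:   image row of the detected mouth corners.
    """
    non_skin = ~skin_mask[:lip_row, :]            # region above the lips, not skin colored
    labels, n = ndimage.label(non_skin)           # connected components of non-skin pixels
    candidates = []
    for lbl in range(1, n + 1):
        ys, xs = np.nonzero(labels == lbl)
        if min_area <= ys.size <= max_area:       # plausible eye-sized region
            candidates.append((float(ys.mean()), float(xs.mean()), int(ys.size)))
    candidates.sort(key=lambda c: -c[0])          # closest to the lips first
    return candidates[:2]                         # the two regions marked as the eyes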

NIGHTTIME CONDITIONS
     If it is nighttime, perform the following steps. To determine if it is night, any of the
 three conditions can occur: the camera clock is between 18:00 and 07:00 hours, the driver
 has his nighttime driving headlights on, or the intensity of the image is below a threshold;
 in any of these cases it must be dark, so the nighttime algorithm steps are used.
                 The invention initializes the eyes by finding bright spots with dark regions
 around them. In the first two frames the system finds the brightest pixels with dark regions
 around them. These points are marked as the eye centers. In subsequent frames these
 brightest regions are referred to as the eye bright tracker estimate. If these estimates are
 too far from the previous values, the old values are retained as the new eye location
 estimates. The next frame is then grabbed.
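
The night-time initialization, finding bright spots with dark surroundings, could look roughly like the following Python sketch; the surround size, darkness threshold, and the limit on examined pixels are assumptions for illustration.

import numpy as np

def find_eye_bright_spots(gray, surround=6, dark_thresh=60, n_spots=2):
    """Find bright pixels whose surrounding neighbourhood is dark.

    gray: 2-D array of grayscale intensities from the night-time camera.
    """
    spots = []
    order = np.argsort(gray, axis=None)[::-1]          # brightest pixels first
    for idx in order[:500]:                            # examine only the top candidates
        y, x = np.unravel_index(int(idx), gray.shape)
        y0, y1 = max(0, y - surround), min(gray.shape[0], y + surround + 1)
        x0, x1 = max(0, x - surround), min(gray.shape[1], x + surround + 1)
        patch = gray[y0:y1, x0:x1].astype(float)
        ring_mean = (patch.sum() - float(gray[y, x])) / (patch.size - 1)
        if ring_mean < dark_thresh:                    # bright spot with dark around it
            spots.append((int(y), int(x)))
        if len(spots) == n_spots:
            break
    return spots    # marked as the eye centers in the first frames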

                  The system runs two independent subsystems. Starting with the left subsystem,
 first the dark pixel is located and tested to see if it is close enough to the previous eye
 location. If these estimates are too far from the previous values, the system retains the
 old values as the new eye location estimates. If the new estimates are close to the
 previous values, then these new estimates are kept.

                       
             The second subsystem finds the image transform. This stage tries to find a
 common function between two images in which the camera moved some amount. This
 function would transform all the pixels in one image to the corresponding points in
 another image. This function is called an affine function. It has six parameters, and it is a
 motion estimation equation.
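
As a concrete picture of such a six-parameter affine function, the Python sketch below fits it to point correspondences by least squares; the real system estimates it as a motion equation over image intensities, so this is only an illustration.

import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Least-squares fit of the six-parameter affine transform.

    Model: x' = a*x + b*y + c,  y' = d*x + e*y + f.
    src_pts, dst_pts: matching (x, y) points from two frames.
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    A = np.hstack([src, np.ones((len(src), 1))])        # rows of [x, y, 1]
    px, *_ = np.linalg.lstsq(A, dst[:, 0], rcond=None)  # solves for (a, b, c)
    py, *_ = np.linalg.lstsq(A, dst[:, 1], rcond=None)  # solves for (d, e, f)
    return np.vstack([px, py])                          # 2x3 parameter matrix

# A pure 5-pixel shift to the right is recovered as c = 5 with a = e = 1.
src = [(0, 0), (10, 0), (0, 10), (10, 10)]
dst = [(5, 0), (15, 0), (5, 10), (15, 10)]
print(np.round(estimate_affine(src, dst), 2))   # approximately [[1, 0, 5], [0, 1, 0]]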

Other applications of the same method:

1) Cabins in airplanes.

2) Water craft such as boats.

3) Trains and subways.

Summary of the invention

          
                    A primary objective of the invention is to provide a system and method for
 monitoring driver alertness with a single camera focused on the face of the driver to
 monitor for conditions of driver fatigue and lack of sleep.

                    A secondary objective of the invention is to provide a system and method for
 monitoring driver alertness which operates in real time, quickly enough to
 avert an accident.

                  A third objective of the invention is to provide a system and method for
 monitoring driver alertness that uses computer vision to monitor both the eyes and the
 rotation of the driver's head through video sequences.

BIBLIOGRAPHY



[2] Mark Roth, Pittsburgh Post-Gazette.

[3] SAE Technical Paper Series, #942321, Estimate of Driver's Alertness Level Using
     Fuzzy Method.

[4] Crosby and Peterson, 1991.

[5] New Scientist.
