International Journal of Scientific Research in Information Systems and Engineering, Volume 1, Issue 2, December 2015
ISSN 2380-8128
Computer Game Controlled by Eye Movements
Yusuf Sait Erdem1, Ibrahim Furkan Ince2*, Huseyin Kusetogullari3, Md. Haidar Sharif4
Department of Computer Engineering
Gediz University
Seyrek, Izmir 35665 – TURKEY
E-mail1: yusuf.erdem2011@ogr.gediz.edu.tr
E-mail2: furkan.ince@gediz.edu.tr
E-mail3: huseyin.kusetogullari@gediz.edu.tr
E-mail4: md-haidar.sharif@gediz.edu.tr
*Corresponding author: Ibrahim Furkan Ince
Abstract— Some years ago, people played video games merely for fun. Nowadays, video games are connected to education, medicine, and research. In this paper, we present a computer game that takes input from a video camera by detecting the user's looking direction as well as eye gestures. Since webcams are easily accessible, we have chosen the webcam as the video input device. We use eye movements as a human-computer interaction (HCI) tool that can replace the mouse. In general, this provides much easier and faster interaction with the computer for everyone, especially elders, children, and disabled people who cannot use a mouse. We also show that eye tracking and eye gaze estimation can be an easy and fast way to achieve HCI. Experimental results show the potential uses of our proposed game.
Keywords — Eye Tracking, 2D Eye Gaze Estimation, Computer Games Controlled by Eye Movements, Human-Computer Interaction.
1 INTRODUCTION
Besides providing pleasure, video games nowadays play a vital role in education, medicine, and research. For example, video games take an important part in the development of children. Some 15 years ago, every parent tried to educate their children outside school by teaching them or buying books for them, but technology and video games now make this much easier. By playing games, children can learn languages and differentiate shapes and colors from childhood, without being forced and while having fun [1].
There has been rapid development in HCI technologies for controlling video games. In earlier days, big companies found very different ways to obtain user input for better game interaction. There are three popular approaches for this purpose: (i) analyzing data from accelerometer sensors in hardware; (ii) speech recognition; and (iii) analyzing optical input. Nevertheless, most of these approaches require specific hardware, and some of them require extra physical effort that users are not used to. Gamers would feel more comfortable if better tools for gaming existed.
Eye gaze estimation is one of the most important HCI tools under development as an alternative to obtaining user input only by keyboard and mouse [2, 3]. We aim to show that eye movements can be used as an HCI tool, which may replace the mouse even in daily life without extensive hardware. There are many algorithms for eye gaze tracking and eye gesture detection, but many of them are not effectively used in our daily lives. Our aim is to design software that optimizes computer vision techniques for gaze tracking and eye gesture detection in a computer game. Consequently, these techniques could be used in daily life more often by providing much easier and faster interaction with the computer for everyone, including the elderly, children, and disabled people who are not able to use a mouse easily.
In this paper, we present a computer game that can be controlled by eye gaze estimation and eye gestures. Our proposed system is fast enough to run on an average PC and works with a low-resolution webcam video stream. Eye gaze and gestures are detected from the visual input of the user. Detecting the head and pupil positions of the user is important for estimating the eye gaze. For this purpose, the eye area and the pupil position of the user are detected; the eye area provides a rough position of the user's head. The cursor on the screen is moved according to the relative position of the pupil within the user's eye area. For detecting the pupil position, a low-cost pupil detection algorithm [4] has been adopted.
The remainder of this paper is organized as follows: Section 2 reviews state-of-the-art methods; Section 3 describes our proposed framework in detail; Section 4 reports experimental results; finally, Section 5 concludes the work with a few directions for further study.
2 STATE-OF-THE-ART METHODS
Accuracy, tracking frequency, and hardware dependencies are important criteria for eye gaze tracking systems. Since accuracy determines the feasibility of selecting targets, e.g., images and buttons, it is used for benchmarking eye gaze detection systems. In addition, the speed of the system is another important measure [5]. There are miscellaneous techniques described in the literature for eye gaze estimation; nevertheless, most of them require additional special hardware or manual initialization of the pupils. For instance, Hansen et al. [6] modeled the iris as an ellipse, but that technique requires high-quality images taken very close to the user's eyes. The systems described by Noureddin [7] and Park [8] require visuals from two cameras at different angles. As we are interested in developing a game that runs on easily accessible hardware, we have focused on algorithms that only need video input, without special lighting tools.
IJSRISE © 2015
http://www.ijsrise.com
The snake algorithm proposed by Kass et al. [9] is also used for pupil detection, but its accuracy is very low and requires improvement. Williams [10] proposed an improvement that minimizes the energy functional of the algorithm using a greedy technique. In addition, Choi [11] suggested an improved method for segmentation. Abe et al. [12] used multiple snakes to increase the completeness of the method; however, none of the snake variants is efficient for detecting only pupils, as these methods do not specifically aim to find circles. Ince et al. [13] introduced an algorithm for fast and sub-pixel-precise detection of eye blobs for extracting eye features. A low-cost system for two-dimensional eye gaze estimation with low-resolution webcam images was presented by Ince et al. [4].
A light-reflection-based method was proposed by Yoo et al. [14]. It requires four light-emitting diode (LED) sources placed around the screen, one LED source on the camera, and two cameras: one for a wide view of the user to detect the face, and one zoomed onto one pupil. The corneal reflections of the five LED sources on one of the user's pupils are detected in the camera input, and the eye gaze is estimated by a projection calculation over the five reflection points. This technique has good accuracy, allows large head movements, and does not require additional head-pose detection. In spite of this, it needs extra hardware, such as five LED sources and cameras of higher quality than average webcams.
Sewell et al. [15] proposed a method that requires no additional hardware except an average webcam for eye gaze estimation on an average PC. Haar-like object detection is applied to the video input to detect the user's face and eye areas. Artificial neural network techniques are then applied to one detected eye area to locate the iris; iris detection is applied to only one eye rather than both, for simplicity. By projecting the iris location and the head location detected by Haar-like object detection, the eye gaze is estimated as x and y coordinates on the screen, and the cursor is placed accordingly. Nevertheless, this method suffers from jumpy cursor movements because of noisy input from an average webcam.
3 IMPLEMENTATION STEPS
3.1 Frontal Eye Areas Detection
A visual stream of the user, whose entire face can be seen clearly by the camera, is captured. A tolerably plain background is required for robust detection of the frontal eye area. The frontal eye area is detected using Haar-like object detectors, which were proposed by Viola et al. [16], extended by Lienhart et al. [17], and are available in the OpenCV library [18]. By applying this method to the input frame (e.g., Figure 1), the search area for the pupils is reduced, and the user's head position is calculated so that the pupil position can be computed relative to it.
Fig. 1. Sample Snapshot for Reduction of Search Area
This information is used in the estimation of eye gaze. A frontal-eye-area cascade is used by the Haar-like object detector, instead of detecting each eye one by one, to reduce false pupil detections and increase speed. The found area is divided in half lengthwise to apply pupil detection to the left and right eyes separately. Since the frontal-eye-area detection process only returns horizontal results, head rotations cannot be detected by this method, and the eye area cannot be detected at all for extensively rotated heads. Consequently, the user is asked to hold his/her head fairly straight. As gamers usually play computer games in such a position, this does not cause an important issue.
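As a minimal sketch (assuming the detector has already returned the bounding rectangle of the frontal eye area, e.g., from OpenCV's Haar-cascade `detectMultiScale`; the function name here is illustrative, not from our implementation), the lengthwise split into left- and right-eye search regions can be expressed as:

```python
def split_eye_area(x, y, w, h):
    """Split the detected frontal-eye-area rectangle (x, y, w, h)
    into equal left- and right-eye search regions."""
    half = w // 2
    left_eye = (x, y, half, h)              # left half of the area
    right_eye = (x + half, y, w - half, h)  # right half of the area
    return left_eye, right_eye
```

Pupil detection is then run on each half independently.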
3.2 Pupil Detection
An elliptic model with varying radius is traversed over the two halves of the search area (the frontal eye area). Since this method requires the radius of the elliptic model as input, a range of radii has to be determined for the search; this range is derived from the height of the search area. To traverse the rectangular search area (e.g., Figure 2), a rectangular iteration method is proposed: it starts from a given point and continues with the closest neighbor according to the Manhattan distance measure [19], which is selected for its simplicity. After the first pupil has been detected, the starting point of the search in subsequent video frames is set to the center of the previously detected pupil, which increases speed and accuracy: after the first iteration, candidate pupils closer to the starting point have a greater chance of being the correct detection.
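The rectangular iteration order can be sketched as follows (a minimal sketch: enumerating all pixels and sorting by city-block distance yields the same visiting order as the neighbor-by-neighbor traversal described above, though the paper's implementation iterates incrementally):

```python
def traversal_order(width, height, start_x, start_y):
    """Visit every pixel of a width x height search area in order of
    increasing Manhattan (city-block) distance from the start point,
    so candidates near the previously detected pupil are tried first."""
    pixels = [(x, y) for y in range(height) for x in range(width)]
    # Manhattan distance |dx| + |dy| to the starting point.
    pixels.sort(key=lambda p: abs(p[0] - start_x) + abs(p[1] - start_y))
    return pixels
```

For example, `traversal_order(5, 5, 2, 2)` starts at (2, 2) and expands outward ring by ring.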
Fig. 2. Rectangular Iteration Method

At each step of traversing the search area, the current pixel is taken as the center of the ellipse, and the coordinates of a number of points are calculated according to the equation x² + y² = r², where x and y are coordinates relative to the current pixel and r is the current radius of the iteration. Each calculated point on the edge of the ellipse is checked against its outer neighbor and its next neighbor on the ellipse. The distance from each edge point to its outer and next neighbor pixels is calculated with the Euclidean distance √((r₁ − r₂)² + (g₁ − g₂)² + (b₁ − b₂)²), where r₁, g₁, b₁ are the red, green, and blue values of the edge pixel, and r₂, g₂, b₂ are the red, green, and blue values of the neighbor pixel. If the distance to the outer pixel is higher than the edge threshold and the distance to the next neighbor is smaller than the line threshold, that pixel is considered to be on the arc. Figure 3 depicts the model of the elliptic hollow kernel.

Fig. 3. Window-based Elliptic Hollow Kernel [4]

After each pixel on the edge of the ellipse has been evaluated, consecutive arc points are clustered. The two longest clusters are considered the true arcs of the ellipse, and the rest are ignored. The total number of edge points on these arcs is divided by the total number of points on the ellipse to obtain the percentage of completeness. Then, the average Euclidean distance of all inner pixels' RGB values to black (Red: 0, Green: 0, Blue: 0) is calculated to find the average intensity (1/N) Σ_{i=0}^{N−1} x_i, where N is the number of pixels and x_i is the intensity of the current pixel inside the elliptic hollow kernel. Figure 4 depicts a sample snapshot of the pixels that fit the elliptic hollow kernel.

Fig. 4. Sample Snapshot for the Pixels Fit to the Elliptic Kernel

A second iteration through the inner pixels is performed to find the variance of the intensity values, (1/N) Σ_{i=0}^{N−1} (x_i − avg(X))², where x_i is the intensity of the current pixel, avg(X) is the average intensity, and N is the total number of pixels in the elliptic kernel area. This value is divided by the average intensity to calculate the rate of variance, and the fitness of the ellipse is determined by the smallness of this rate. Thresholding on this rate is applied to the candidate pupils, which reduces the false detections produced by the elliptic edge detection. This feature also reduces the effect of corneal reflection: the pupil is usually dark, yet corneal reflection sometimes makes it brighter. A rate of variance of 0 means a completely black filled ellipse. Among the remaining pupil candidates, the ellipse with the highest completeness is returned as the pupil.

Fig. 5. Sample Snapshot for Pupils Detection

Figure 5 demonstrates the pupils detected by the low-cost iterative pupil detection with elliptic hollow kernel algorithm [4].
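The darkness/uniformity score used to filter candidate pupils (the average intensity and the rate of variance above) can be sketched as follows; the paper measures intensity as the RGB distance to black, whereas this sketch assumes the intensities have already been computed:

```python
def rate_of_variance(intensities):
    """Variance of the inner-pixel intensities divided by their mean.
    A uniformly dark blob such as a pupil yields a low rate;
    0 means a completely uniform fill."""
    n = len(intensities)
    avg = sum(intensities) / n
    variance = sum((x - avg) ** 2 for x in intensities) / n
    return variance / avg if avg > 0 else 0.0
```

Candidates whose rate exceeds a threshold are rejected; of those remaining, the candidate with the highest arc completeness is kept.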
IJSRISE © 2015
http://www.ijsrise.com
International Journal of S cientific Research in Information S ystems and Engineering Volume 2, Issue 1, October -2015
ISSN 2380-8128
3.3 Eye Gaze Estimation
For eye gaze estimation, the head direction and pupil position are needed [4]. In this paper, the user's head pose is assumed to face straight toward the video input device, and the head location is determined by the detection of the frontal area of both eyes. With the eye area as reference, the eye gaze is estimated from the position of one pupil; consequently, only one pupil of the user is detected, which increases the system's performance. To calibrate the eye gaze direction, the user is asked to look at the middle of the screen for a while when the system starts, and the user's current pupil position is saved as the middle position. The pupil position is calculated relative to the frontal eye area, because small head movements or changes in the distance to the camera may change the position and size of the frontal eye area; thus, even under head movements, the pupil position is calculated correctly with respect to the user's head position.
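As a minimal sketch (with an illustrative function name, not the authors' exact code), expressing the pupil center relative to the detected eye area makes the coordinates invariant to small head movements and distance changes:

```python
def relative_pupil_position(pupil_x, pupil_y, area_x, area_y, area_w, area_h):
    """Normalize the pupil center by the frontal eye area's origin and
    size, so the result is independent of where the head sits in the
    frame and how large the eye area appears."""
    return ((pupil_x - area_x) / area_w,
            (pupil_y - area_y) / area_h)
```

A pupil in the middle of the eye area always maps to (0.5, 0.5), regardless of the area's absolute position or scale.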
After the center location is set, the game starts. In subsequent frames, the pupil position is compared with the center location, and the eye gaze direction is found. To increase the robustness of the system, the directions are grouped into 8, as shown in Figure 6. The cursor on the screen is moved at a constant speed in the specified direction, or stops if the pupil is close enough to the center. Figure 7 demonstrates the model of the shooting game controlled by eye gaze detection, where the cursor direction indicates the movement direction of the cursor, together with 8 examples of pupil positions and the corresponding cursor directions. A shooting action also needs to be obtained from the user, in addition to the direction input; for this, the user blinks. A blink is detected if there is no pupil detection in the frontal eye area for longer than a threshold time.
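The grouping of pupil offsets into 8+1 directions can be sketched as follows (a minimal sketch: the dead-zone radius is an assumed parameter, and screen coordinates with y growing downward are assumed):

```python
import math

def gaze_direction(dx, dy, dead_zone=5):
    """Quantize the pupil offset (dx, dy) from the calibrated center
    into one of 8 compass directions, or 'center' inside a dead zone
    where the cursor stops."""
    if abs(dx) <= dead_zone and abs(dy) <= dead_zone:
        return "center"
    # Angle in [0, 360); y grows downward, so 90 degrees points south.
    angle = math.degrees(math.atan2(dy, dx)) % 360
    names = ["E", "SE", "S", "SW", "W", "NW", "N", "NE"]
    # Each direction owns a 45-degree sector centered on its axis.
    return names[int((angle + 22.5) // 45) % 8]
```

The cursor is then moved one constant-speed step in the returned direction each frame.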
4 EXPERIMENTAL RESULTS
4.1 Experimental Setup
We have tested our game using a standard laptop with its built-in camera. Figure 8 depicts a screenshot from the shooting game. In the game, randomly placed circular target-board images appear on the screen. The user moves the cursor image on the screen with his/her eye gaze direction and blinks strongly to apply the shooting action (e.g., Figure 8); the target under the cursor, if there is any, disappears. The number of targets changes according to the difficulty level. After the given time for the stage expires, or every target has been hit by the player, a new stage is loaded. After all stages are completed, the final score is calculated and displayed on the screen.
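The blink-to-shoot trigger (a blink is reported when no pupil is detected for longer than a threshold time) can be sketched as a per-frame counter; the frame threshold is an assumed value, not from the paper:

```python
def make_blink_detector(threshold_frames=10):
    """Return a per-frame update function that reports a blink once the
    pupil has been undetected for threshold_frames consecutive frames."""
    missing = 0

    def update(pupil_found):
        nonlocal missing
        if pupil_found:
            missing = 0        # eye is open again; reset the counter
            return False
        missing += 1
        # Fire exactly once, when the threshold is first reached.
        return missing == threshold_frames

    return update
```

Calling the returned function once per video frame yields True exactly once per sufficiently long blink.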
Fig. 8. Sample Snapshot of Designed Shooting Game
The designed shooting game framework is employed as the reference testing module for the proposed 2D gaze estimation algorithm with no calibration. The size of the shooting targets determines the precision level. Using their eye movements, users are asked to shoot each selected target within a given duration. In our experimental setup, targets with a diameter of 50 pixels are distributed at random positions on the screen, and the shooting duration for each target item is 5 seconds. The system was tested on 10 users who were trained in the system's usage before the actual experiment. The ratio of successful shots to overall shots was observed to be 90%, which we take as the measure of accuracy. As future work, we plan to add a calibration module to the proposed system for performance enhancement.
Fig. 6. The Model of 8 Number of Directions
Fig. 7. Framework for Controlling the Cursor Movements
Note that, in the game, there is no traditional eye gaze estimation; instead, the eye gaze direction is estimated over 8+1 directions.
4.2 Findings
We have implemented our proposed game algorithm using a small-scale setup. Since eye movements are used as the human-computer interaction tool, we do not need any kind of mouse. The proposed game is suitable not only for children but also for the elderly and disabled people who cannot, or are reluctant to, use a mouse. The outcome of our system shows its potential use in daily-life applications. Our proposed system is fast enough to run on an average personal computer or laptop and works with a low-resolution webcam video stream.
4.3 Shortcomings
The proposed system was tested by detecting the head and pupil positions of a user, where the eye area provided the rough position of the user's head. There are still some errors in distinguishing adjacent directions of the user's eye gaze, and the accuracy should be increased to improve playability. By using only 8+1 eye gaze directions, however, false cursor movements are largely reduced. Nevertheless, the head position was not estimated accurately, and consequently the shooting game suffers from this shortcoming.
5 CONCLUSION
We have presented a low-cost computer game that uses the user's looking direction and eye gestures, requiring only an easily accessible webcam. With the low-cost iterative pupil detection using the elliptic hollow kernel algorithm and the eye gaze direction estimation technique, the shooting game runs on an average personal computer with an average webcam, and the system is reasonably robust to environmental changes such as camera tilt and illumination changes. Experimental results demonstrated the potential uses of our proposed game. The accurate detection of the head position will be studied further.
ACKNOWLEDGMENT
This paper was extracted from an undergraduate course entitled "COM497 - Senior Design Project", held in the 2014-2015 academic period at Gediz University, Izmir, Turkey. A previous version of this study was presented at the "Third International Symposium on Engineering, Artificial Intelligence and Applications (ISEAIA 2015)", held 4-6 November 2015 at Girne American University, North Cyprus. This is the extended full version of the paper already published in the conference proceedings and selected for publication in this journal by the conference committee.
REFERENCES
[1] Moursund, D., "Introduction to Using Games in Education: A Guide for Teachers and Parents", Second Edition, 2011.
[2] Zhai, S., "What's in the eyes for attentive input", Communications of the ACM, vol. 46, no. 3, pp. 34-39, 2003.
[3] Kumar, M., Klingner, J., Puranik, R., Winograd, T., Paepcke, A., "Improving the accuracy of gaze input for interaction", In Proceedings of the 2008 Symposium on Eye Tracking Research and Applications (ETRA'08), pp. 65-68, 2008.
[4] Ince, I., Kim, J., "A 2D eye gaze estimation system with low-resolution webcam images", EURASIP Journal on Advances in Signal Processing, 2011.
[5] Bulling, A., Gellersen, H., "Toward mobile eye-based human-computer interaction", IEEE Pervasive Computing, vol. 9, no. 4, pp. 8-12, 2010.
[6] Hansen, D.W., Pece, A.E.C., "Eye tracking in the wild", Computer Vision and Image Understanding, vol. 98, no. 1, pp. 155-181, 2005.
[7] Noureddin, B., Lawrence, P.D., Man, C.F., "A non-contact device for tracking gaze in a human computer interface", Computer Vision and Image Understanding, vol. 98, no. 1, pp. 52-82, 2005.
[8] Park, K., Chang, J., Whang, M., Lim, J., Rhee, D.W., Park, H., Cho, Y., "Practical gaze point detecting system", Lecture Notes in Computer Science (LNCS), vol. 3175, pp. 512-519, 2004.
[9] Kass, M., Witkin, A., Terzopoulos, D., "Snakes: Active contour models", International Journal of Computer Vision, vol. 1, no. 4, pp. 321-331, 1988.
[10] Williams, D.J., Shah, M., "A fast algorithm for active contours and curvature estimation", CVGIP: Image Understanding, vol. 55, no. 1, pp. 14-26, 1992.
[11] Choi, W.P., Lam, K.M., Siu, W.C., "An adaptive active contour model for highly irregular boundaries", Pattern Recognition, vol. 34, no. 2, pp. 323-331, 2001.
[12] Abe, T., Matsuzawa, Y., "A region extraction method using multiple active contour models", In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 64-69, 2000.
[13] Ince, I., Yang, T.C., "A new low-cost eye tracking and blink detection approach: Extracting eye features with blob extraction", Lecture Notes in Computer Science (LNCS), vol. 5754, pp. 526-533, 2009.
[14] Yoo, D.H., Chung, M.J., Ju, D.B., Choi, I.H., "Non-intrusive eye gaze estimation using a projective invariant under head movement", In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA'06), pp. 3443-3448, 2006.
[15] Sewell, W., Komogortsev, O., "Real-time eye gaze tracking with an unmodified commodity webcam employing a neural network", In Extended Abstracts on Human Factors in Computing Systems (CHI'10), pp. 3739-3744, 2010.
[16] Viola, P., Jones, M.J., "Robust real-time face detection", International Journal of Computer Vision, vol. 57, no. 2, pp. 137-154, 2004.
[17] Lienhart, R., Maydt, J., "An extended set of Haar-like features for rapid object detection", In Proceedings of the International Conference on Image Processing, vol. 1, pp. I-900-I-903, 2002.
[18] "OpenCV: Open Source Computer Vision", Opencv.org, 2015.
[19] Black, P.E., "Manhattan distance", in Dictionary of Algorithms and Data Structures, http://www.nist.gov/dads/HTML/manhattanDistance.html, 2015.