In recent years, human activity recognition in intelligent visual surveillance has drawn much attention in the field of video analysis because of growing demand from applications such as security and surveillance, sports and gaming, healthcare, person identification at a distance, crowd behavior analysis, suspicious event detection and alarming in public and private places, and traffic management. Generally speaking, human activity recognition in visual surveillance is divided into the following stages: object (human or vehicle) segmentation, feature extraction, object classification, object tracking, and activity recognition. Developing a robust object segmentation method is the prime objective for any visual surveillance system, because segmentation detects the regions corresponding to static or moving humans or vehicles, and every later stage operates on those regions. In this paper, we provide a comprehensive survey of recent developments in object segmentation algorithms, especially for humans, in the context of human activity recognition in visual surveillance. We also discuss the strengths and weaknesses of the algorithms and the complexities of activity understanding, and we identify possible future research challenges.
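To make the segmentation stage concrete, the following is a minimal sketch of one common approach, simple background differencing with a fixed threshold. It is an illustrative assumption rather than a method proposed in this paper: the function names `segment_moving_pixels` and `bounding_box`, the threshold of 30, and the toy 5x6 frame are all hypothetical.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale image as rows of 0-255 intensities


def segment_moving_pixels(background: Frame, frame: Frame,
                          threshold: int = 30) -> List[List[bool]]:
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def bounding_box(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the foreground pixels, or None if there are none."""
    coords = [(r, c) for r, row in enumerate(mask) for c, on in enumerate(row) if on]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


# Toy scene (assumed data): a uniform dark background with one bright 3x2 object.
background = [[10] * 6 for _ in range(5)]
frame = [row[:] for row in background]
for r in range(1, 4):
    for c in range(2, 4):
        frame[r][c] = 200

mask = segment_moving_pixels(background, frame)
print(bounding_box(mask))  # (1, 2, 3, 3)
```

The resulting mask and bounding box are the kind of region description that the later stages (feature extraction, classification, tracking) would consume. A fixed threshold against a static background is the simplest possible choice and would likely need to be replaced by an adaptive background model in scenes with lighting changes or camera motion.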
INTERNATIONAL YOUTH SCIENCE FORUM “LITTERIS ET ARTIBUS”, 24–26 NOVEMBER 2016, LVIV, UKRAINE