Melbourne-China Big Data Research Network

Main coordinator: Professor Rui Zhang

The First Workshop of Melbourne-China Big Data Research Network

Date: 14 Dec 2014
Venue: Room 1402, Science & Technology Building, Shenzhen University, 3688 Nanhai Ave., Nanshan District, Shenzhen, China

Workshop Program

Schedule | Talk Title | Presenter | Affiliation
14:00-14:15 | Workshop Introduction | Prof. Rui Zhang | University of Melbourne
14:15-14:30 | Welcome from Shenzhen University | Prof. Qingquan Li | Shenzhen University
14:30-15:00 | Replica Consistency Issues in Wide Column Store | Prof. Jianmin Wang | Tsinghua University
15:00-15:30 | Recommending Products on Microblogging | Assistant Prof. Wayne Xin Zhao | Renmin University of China
15:30-15:45 | Break
15:45-16:15 | Exploring the Spectrum of Urban Lifestyles | Dr. Nicholas (Jing) Yuan | Microsoft Research Asia
16:15-16:45 | Spatial Community Detection | Associate Prof. Yang Yue | Shenzhen University
16:45-17:05 | Large Scale Retinal and Brain MRI Analysis for Early Detection of Cardiovascular Diseases | Prof. Rao Kotagiri | University of Melbourne
17:05-17:25 | Selected Topics in Spatial and Temporal Data Analytics Research at University of Melbourne | Prof. Rui Zhang | University of Melbourne

Talk 1: Replica Consistency Issues in Wide Column Store

Wide column stores such as Bigtable, Cassandra, and HBase are required to guarantee data reliability, fault tolerance, and accessibility for users. The most common solution is to store multiple copies of the same data on different storage devices; these copies are called data replicas. In this talk, we take Cassandra as an example to analyze the replica consistency mechanism in wide column stores. We model the write propagation process among replicas with a Petri net. Two types of queueing structures, sending queues and mutation queues, are identified as the determinants of replica consistency. Empirical studies have been carried out to verify the proposed consistency model.

Presenter: Jianmin Wang

Bio: Jianmin Wang is a Professor at the School of Software, Tsinghua University, China. His research area is data management and information systems. He has worked on unstructured data management, workflow and BPM technology, and enterprise information systems. His current research focuses on concurrent process modelling in NoSQL systems and process mining from large-scale event logs.

Other attendee(s) of the group: Xiangdong Huang (PhD student)

Talk 2: Recommending Products on Microblogging

Recent years have witnessed the great success of e-commerce companies such as Amazon and eBay. A technique widely adopted by these companies is the product recommendation system, which improves user experience and enhances sales and has attracted much attention from research communities. However, most existing studies focus on constructing solutions within online e-commerce websites, which are largely limited by the information availability and design flexibility of the website. In this talk, we present the development of METIS, a novel product recommender system based on the microblogging service. The main characteristics of the METIS system that distinguish it from traditional patent product recommenders are as follows: 1) We adopt the microblogging service as the platform to capture commercial intents and recommend products, which makes it possible to capture users’ buying desires as soon as they are expressed and to utilize users’ profile data. 2) We propose a demographic-based recommendation algorithm, which follows the flexible learning-to-rank framework by deriving features from users’ profiles and product demographics. We present a novel method for extracting product demographics by leveraging knowledge from online social media. 3) We propose an effective method to collect training data for learning-to-rank algorithms. We evaluate the proposed methods on real microblog and e-commerce datasets. The experimental results show that the proposed techniques outperform the corresponding baseline methods.

Presenter: Wayne Xin Zhao

Bio: Wayne Xin Zhao is currently an Assistant Professor at the School of Information, Renmin University of China. He received his Ph.D. degree from Peking University in 2014. His research interests are web text mining and natural language processing. He was one of ten recipients of the MSRA PhD Fellowship in 2012. He has published about 20 refereed papers in international conferences and journals such as ACL, EMNLP, COLING, ECIR, CIKM, SIGIR, SIGKDD, ACM TIST and IEEE TKDE. His current research focus is improving e-commerce services with social data.

Other attendee(s) of the group: Prof. Jirong Wen

Talk 3: LifeSpec: Exploring the Spectrum of Urban Lifestyles

An incisive understanding of human lifestyles is not only essential to many scientific disciplines, but also has a profound business impact for targeted marketing. In this talk, we present LifeSpec, a computational framework for exploring and hierarchically categorizing urban lifestyles. Specifically, we have developed an algorithm to connect multiple social network accounts of millions of individuals and collect their publicly available heterogeneous behavioral data as well as social links. In addition, a nonparametric Bayesian approach is developed to model the lifestyle spectrum of a group of individuals. To demonstrate the effectiveness of LifeSpec, we conducted extensive experiments and case studies, with a large dataset we collected covering 1 million individuals from 493 cities. Our results suggest that LifeSpec offers a powerful paradigm for 1) revealing an individual’s lifestyle from multiple dimensions, and 2) uncovering lifestyle commonalities and variations of a group with various demographic attributes, such as vocation, education, gender, sexual orientation, and place of residence. The proposed method provides emerging implications for personalized recommendation and targeted advertising.

Presenter: Nicholas (Jing) Yuan

Bio: Nicholas (Jing) Yuan is currently an associate researcher at Microsoft Research Asia. He received a Ph.D. degree in Computer Science from the School of Computer Science and Technology in 2012, and a B.S. degree in Mathematics from the School of the Gifted Young in 2007, both at the University of Science and Technology of China. His current research interests include behavioral data mining, spatial-temporal data mining and computational social science. During the past few years, Nicholas has published a series of papers in top-tier conferences and journals. His work has been featured many times by influential media such as MIT Technology Review. He has been honored with the Microsoft Fellowship (2011), the Best Paper Award of the IEEE International Conference on Data Mining (2013), the Best Paper Runner-up Award of ACM SIGSPATIAL (2010), and the Distinguished Doctoral Dissertation Award of the Chinese Academy of Sciences (2013).

Other attendee(s) of the group: Dr. Xing Xie

Talk 4: Spatial Community Detection

This talk summarizes part of our work supported by the CCF-Tencent Open Fund, which focuses on detecting the spatial boundaries of social network communities. The aim of the work is to quantify the influence of spatial distance on online social networks. Data from both Tencent QQ (a strong-tie social network) and Sina Weibo (a weak-tie social network) were examined. The results confirmed our understanding that spatial distance has less impact on the structure of a weak-tie social network than on that of a strong-tie social network. However, we found that the spatial patterns of QQ friend lists and QQ contact friends differ: QQ friend lists are strongly affected by administrative divisions, while the spatial boundary of QQ contact friends shows a strong distance-decay effect. The findings make a significant contribution to friend recommendation and POI recommendation.

Presenter: Yang Yue

Bio: Dr. Yang Yue is currently an Associate Professor with the Department of Transportation Engineering and the Shenzhen Key Laboratory of Spatial Smart Sensing and Services, Shenzhen University. Before moving to Shenzhen University, she was with the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, for about 5 years as an associate professor. She received the B.Eng. and M.Eng. degrees in Surveying Engineering and GIS (Geographical Information System), respectively, from Wuhan Technical University of Surveying and Mapping, and the Ph.D. degree in Urban Transport from the University of Hong Kong. Her research interests include trajectory-based behavior analysis, traffic analysis, and GIS-T.

Other attendee(s) of the group: Prof. Qingquan Li

Talk 5: Large Scale Retinal and Brain MRI Analysis for Early Detection of Cardiovascular Diseases

Recent studies show that cerebral White Matter Lesion (WML) is related to cerebrovascular diseases, cardiovascular diseases (CVD), dementia and psychiatric disorders. There is also evidence that pathologies in the retina are closely related to lesions in brain MRI. The main goal is to develop a system that can be used for large-scale screening for early detection of CVD. Manual segmentation of WML is not appropriate for long-term longitudinal studies because it is time consuming and shows high intra- and inter-rater variability. In this work, a fully automated segmentation method is used to segment WML from brain Magnetic Resonance Imaging (MRI). The segmentation method uses a combination of a global neighbourhood given contrast feature-based Random Forest (RF) classifier and a Markov Random Field (MRF) to segment WML. To remove false positive lesions, we use a rule-based morphological post-processing operation. Quantitative evaluation of the proposed method was performed on 24 subjects of the ENVIS-ion study. The segmentation results were validated against manual segmentation performed by an experienced radiologist and were compared to a recently published WML segmentation method. The method achieves a Dice similarity index of 0.75 for high lesion load, 0.71 for medium lesion load and 0.60 for low lesion load.
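
For readers unfamiliar with the metric reported above, the Dice similarity index between an automatic and a manual segmentation is twice the size of their overlap divided by the sum of their sizes. The following is a minimal sketch only: the function name `dice_index` and the toy masks are illustrative assumptions, not the presenters' code or data, and real MRI masks would be 3-D volumes rather than short lists.

```python
def dice_index(predicted, reference):
    """Dice similarity between two binary masks given as flat 0/1 sequences."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same length")
    # Voxels marked as lesion in both masks.
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    # Total number of lesion voxels across the two masks.
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return 2.0 * overlap / total


# Hypothetical toy masks: 1 = lesion voxel, 0 = background.
automatic = [0, 1, 1, 1, 0, 0, 1, 0]
manual = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_index(automatic, manual))  # 3 shared voxels, 4 + 4 marked: 0.75
```

A value of 1 means the two segmentations agree exactly and 0 means they share no lesion voxels.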

Presenter: Ramamohanarao (Rao) Kotagiri

Bio: Professor Rao Kotagiri received his PhD from Monash University. He was awarded the Alexander von Humboldt Fellowship in 1983. He has been at the University of Melbourne since 1980 and was appointed a professor in computer science in 1989. Rao has held several senior positions, including Head of Computer Science and Software Engineering, Head of the School of Electrical Engineering and Computer Science at the University of Melbourne, and Research Director for the Cooperative Research Centre for Intelligent Decision Systems. He has served or is serving on the Editorial Boards of the Computer Journal, Universal Computer Science, TKDE, VLDBJ and the International Journal on Data Privacy. He was the program Co-Chair for VLDB, PAKDD, DASFAA and DOOD. He is a steering committee member of ICDM and PAKDD. He received the Distinguished Contribution Award for Data Mining from PAKDD. Rao is a Fellow of the Institute of Engineers Australia, a Fellow of the Australian Academy of Technological Sciences and Engineering and a Fellow of the Australian Academy of Science. He was awarded the Distinguished Contribution Award in 2009 by the Computing Research and Education Association of Australasia. He has published more than 350 articles and supervised 48 PhD completions. He was the chair of ICDE 2013 and a co-chair of SIGMOD 2014.

Talk 6: Selected Topics in Spatial and Temporal Data Analytics Research at University of Melbourne

I will cover a few selected topics from the spatial and temporal data analytics research being conducted at the University of Melbourne and highlight the big data challenges in this area. The topics include solving the data sparsity problem in destination prediction from GPS trajectories, computing earth mover’s distance similarity joins using MapReduce, and an easily implementable spatial index.

Presenter: Rui Zhang

Bio: Rui Zhang is a Professor and Reader at the University of Melbourne and Assistant Dean (Collaboration) of the Melbourne School of Engineering. He was awarded a Future Fellowship by the Australian Research Council in 2012. He obtained his Bachelor's degree from Tsinghua University in 2001 and his PhD from the National University of Singapore in 2006. He has been a visiting scholar at AT&T Labs-Research and Microsoft Research, and is now a regular visiting researcher at Microsoft Research Asia in Beijing. He has authored 70 publications in prestigious conferences and journals. His research interest is spatial and temporal data analytics, as well as general database and mining techniques including indexing, moving object management, data streams and sequence databases. He regularly serves as a PC member of top conferences in data management and mining such as SIGMOD, VLDB, ICDE and KDD. He is an associate editor of Distributed and Parallel Databases.

Other attendee(s) of the group: Dr. Jianzhong Qi, Zeyi Wen (PhD student)

