LSGDA 2020

The 2nd International Workshop on Large Scale Graph Data Analytics

Aim and Scope. Application domains such as social networks, communication networks, collaboration networks, biological networks, transportation networks, and knowledge networks naturally generate large scale graph data to capture the connectedness among entities. Driven by these applications, there is an increasing demand for novel graph analytics models and for scalable graph analytics techniques and systems. The 2nd International Workshop on Large Scale Graph Data Analytics (LSGDA 2020) aims to provide a forum for researchers from academia and industry to exchange ideas, techniques and application scenarios in large scale graph data analytics, as well as to discuss open challenges and identify new research directions in the area. Besides regular research papers, we also welcome vision papers, demonstration papers and industry showcase papers from various applications.

The workshop will be of interest to researchers developing techniques for large scale graph data analytics in various application domains. The intended audience includes researchers from both academia and industry who are interested in exploiting the value of large scale graph data.

Publication. Accepted papers will be published in the Springer proceedings. Top-quality accepted papers will be recommended for publication in the WWW Journal.

Workshop Program in VLDB2020 Conference Page

Research Interest

Topics of interest include, but are not limited to:

  • Graph data model, storage, indexing and query processing techniques
  • Graph mining techniques
  • Techniques for distributed graph analytics
  • Graph visualization techniques and system interfaces
  • Dynamic and streaming graph data analytics
  • Spatial-temporal graph analytics
  • AI techniques for graphs
  • Machine learning techniques for graphs
  • Graph analytics in various application domains such as social networks, multimedia, semantic web, biological data, business processes, transport data, etc.
  • Vision papers that survey the area of graph data analytics and describe future research directions

Submission Guidelines

The proceedings of the workshops will be published jointly with the conference proceedings. We welcome research papers (full or short), vision papers, demo papers and industry papers showcasing graph analytics in real applications.

The following page limits include the bibliography and any appendices:

  • Full papers and vision papers should be a maximum of 14 pages in length.
  • Short papers, demo papers and industry papers should be a maximum of 6 pages in length.
  • Please format your paper using the Springer LNCS template.

Please upload your submission to the LSGDA 2020 Research Track through the CMT system at:

Copyright form can be found at:

Important Dates

Paper submission: June 10, 2020 (extended from April 26, 2020)

Paper notification: June 28, 2020 (extended from June 7, 2020)

Camera ready deadline: July 27, 2020

Workshop organizers:

General Chair: Xuemin Lin, University of New South Wales, Australia

PC Co-chairs:

  • Lu Qin, University of Technology Sydney, Australia
  • Wenjie Zhang, University of New South Wales, Australia
  • Ying Zhang, University of Technology Sydney, Australia
Publicity Chair: Kai Wang, University of New South Wales, Australia
Publication Chair: You Peng, University of New South Wales, Australia
Web Chair: Dong Wen, University of Technology Sydney, Australia

Members of the program committee:

  • Anil Pacaci, University of Waterloo, Canada
  • Bolin Ding, Data Analytics and Intelligence Lab, Alibaba Group, USA
  • Chuan Xiao, Nagoya University, Japan
  • Chunbin Lin, Amazon Web Services, USA
  • Dawei Cheng, Shanghai Jiao Tong University, China
  • Donatella Firmani, Roma Tre University, Italy
  • Huasong Shan, JD.COM, USA
  • Jiafeng Hu, Google, China
  • Jianye Yang, Hunan University, China
  • Lijun Chang, University of Sydney, Australia
  • Matteo Lissandrini, Aalborg University, Denmark
  • Rong-Hua Li, Beijing Institute of Technology, China
  • Sergey Pupyrev, Facebook, USA
  • Stefano Leucci, University of L'Aquila, Italy
  • Verena Kantere, National Technical University of Athens, Greece
  • Vijil Chenthamarakshan, IBM AI Research, USA
  • Weiren Yu, University of Warwick, UK
  • Xiang Zhao, National University of Defense Technology, China
  • Xin Cao, University of New South Wales, Australia
  • Yuanyuan Zhu, Wuhan University, China
  • Zhaonian Zou, Harbin Institute of Technology, China

Workshop Program (UTC time)





Keynote I

Chengfei Liu (Swinburne University of Technology)


Keynote II

Da Yan (University of Alabama at Birmingham)


Keynote III

Weiren Yu (University of Warwick)


Virtual Coffee Break

Scalable In-Memory Graph Pattern Matching on Symmetric Multiprocessor Systems
Alexander Krause (TU Dresden), Dirk Habich (TU Dresden), Wolfgang Lehner (TU Dresden)

A Graph-based Approach towards Risk Alerting for COVID-19 Spread
Aibo Guo (National University of Defense Technology), Qianzhen Zhang (National University of Defense Technology), Xiang Zhao (National University of Defense Technology)

Distributed Graph Analytics with Datalog Queries in Flink
Muhammad Imran (Technische Universität Berlin), Gábor E. Gévay (Technische Universität Berlin), Volker Markl (Technische Universität Berlin)

Explaining Results of Path Queries on Graphs: Single-Path Results for Context-Free Path Queries
Jelle Hellings (University of California, Davis)

Attribute Diversified Community Search

Chengfei Liu
Swinburne University of Technology
Co-authors: Lu Chen, Rui Zhou, Afzal Azeem Chowdhary


Discovering communities that naturally exist as groups of well-connected users is one of the most important tasks in network data analytics and has many real-world applications. In recent years, community search in attributed graphs has begun to attract attention; it aims to find communities that are cohesive in both structure and attributes. By contrast, searching for a community that is structurally cohesive but attribute-diversified, termed attribute diversified community search, is still at a preliminary stage. In this paper, we introduce our recent efforts on discovering attribute diversified communities. The need for attribute diversification when modelling a community differs considerably across applications. We introduce three attribute diversified community models in which attribute diversification plays a different role: as an objective, as a query requirement, and as a constraint. We also discuss major techniques for speeding up attribute diversified community search.

Parallel Mining of Frequent Subtree Patterns

Da Yan
University of Alabama at Birmingham
Co-authors: Wenwen Qu, Guimu Guo, Xiaoling Wang, Lei Zou, Yang Zhou


Mining frequent subtree patterns in a tree database (or forest) is useful in domains such as bioinformatics and the mining of semistructured data. We consider the problem of mining embedded subtrees in a database of rooted, labeled, and ordered trees. We compare two existing serial mining algorithms, PrefixTreeSpan and TreeMiner, and adapt them for parallel execution using PrefixFPM, our general-purpose framework for frequent pattern mining that is designed to effectively utilize the CPU cores in a multicore machine. Our experiments show that TreeMiner is faster than its successor PrefixTreeSpan when a limited number of CPU cores is used, as its total mining workload is smaller; however, PrefixTreeSpan has a much higher speedup ratio and can beat TreeMiner when given enough CPU cores.

An Axiomatic Role Similarity Measure Based on Graph Topology

Weiren Yu
University of Warwick
Co-authors: Sima Iranmanesh, Aparajita Haldar, Maoyin Zhang, Hakan Ferhatosmanoglu


RoleSim and SimRank are popular graph-theoretic similarity measures with many applications in, e.g., web search, collaborative filtering, and sociometry. While RoleSim addresses the automorphic (role) equivalence of pairwise similarity, which SimRank lacks, it ignores the neighboring similarity information outside the automorphically equivalent set. Consequently, two pairs of nodes that are not automorphically equivalent by nature cannot be well distinguished by RoleSim if the averages of their neighboring similarities over the automorphically equivalent set are the same.
To alleviate this problem: 1) We propose a novel similarity model, namely RoleSim*, which accurately evaluates pairwise role similarities in a more comprehensive manner. RoleSim* not only guarantees the automorphic equivalence that SimRank lacks, but also takes into account the neighboring similarity information outside the automorphically equivalent sets that is overlooked by RoleSim. 2) We prove the existence and uniqueness of the RoleSim* solution, and show its three axiomatic properties (i.e., symmetry, boundedness, and non-increasing monotonicity). 3) We provide a concise bound for iteratively computing the RoleSim* formula, and estimate the number of iterations required to attain a desired accuracy. 4) We induce a distance metric based on RoleSim* similarity, and show that the RoleSim* metric fulfills the triangle inequality, which implies the sum-transitivity of its similarity scores. Our experimental results on real and synthetic datasets demonstrate that RoleSim* achieves higher accuracy than its competitors while retaining computational complexity bounds comparable to those of RoleSim.