The 2nd International Workshop on Large Scale Graph Data Analytics
Aim and Scope. Various application domains such as social networks, communication networks, collaboration networks, biological networks, transportation networks, and knowledge networks naturally generate large scale graph data that capture the connectedness among entities. Driven by these applications, there is an increasing demand for novel graph analytics models and for scalable graph analytics techniques and systems. The 2nd International Workshop on Large Scale Graph Data Analytics (LSGDA 2020) aims to provide a forum for researchers from academia and industry to exchange ideas, techniques and application scenarios in large scale graph data analytics, as well as to discuss open challenges and identify new research directions in the area. Besides regular research papers, we also welcome vision papers, demonstration papers and industry showcase papers from various applications.
The workshop will be of interest to researchers developing techniques for large scale graph data analytics in various application domains. The intended audience includes researchers from both academia and industry who are interested in exploiting the value of large scale graph data.
Publication. Accepted papers will be published in the Springer proceedings. Top-quality accepted papers will be recommended for publication in the WWW Journal.
Topics of interest include, but are not limited to:
The proceedings of the workshops will be published jointly with the conference proceedings. We welcome research papers (full or short), vision papers, demo papers and industry papers showcasing graph analytics in real applications.
Including the bibliography and any possible appendices,
Please upload your submission to the LSGDA 2020 Research Track through the CMT system at: https://cmt3.research.microsoft.com/LSGDA2020
Copyright form can be found at: https://lsgda.github.io/2020/docs/Copyright_Form_LSGDA_SFDI_2020.pdf
Paper submission: June 10, 2020 (extended from April 26, 2020)
Paper notification: June 28, 2020 (extended from June 7, 2020)
Camera ready deadline: July 27, 2020
Members of the workshop organizers:
General Chair: Xuemin Lin, University of New South Wales, Australia
PC Co-chairs:
Members of the program committee:
VIRTUAL CONFERENCE ROOM (ZOOM OR OTHER) SLACK CHANNEL
08:00-08:05 | Welcome |
08:05-08:45 | Keynote I: Chengfei Liu (Swinburne University of Technology) |
08:45-09:25 | Keynote II: Da Yan (University of Alabama at Birmingham) |
09:25-10:05 | Keynote III: Weiren Yu (University of Warwick) |
10:05-10:20 | Virtual Coffee Break |
10:20-10:45 | Scalable In-Memory Graph Pattern Matching on Symmetric Multiprocessor Systems. Alexander Krause (TU Dresden), Dirk Habich (TU Dresden), Wolfgang Lehner (TU Dresden) |
10:45-11:10 | A Graph-based Approach towards Risk Alerting for COVID-19 Spread. Aibo Guo (National University of Defense Technology), Qianzhen Zhang (National University of Defense Technology), Xiang Zhao (National University of Defense Technology) |
11:10-11:35 | Distributed Graph Analytics with Datalog Queries in Flink. Muhammad Imran (Technische Universität Berlin), Gábor E. Gévay (Technische Universität Berlin), Volker Markl (Technische Universität Berlin) |
11:35-12:00 | Explaining Results of Path Queries on Graphs: Single-Path Results for Context-Free Path Queries. Jelle Hellings (University of California Davis) |
Abstract
Discovering communities that naturally exist as groups of well-connected users is one of the most important tasks in network data analytics and has many real applications. In recent years, community search in attributed graphs has begun to attract attention; it aims to find communities that are cohesive in both structure and attributes. However, searching for a community that is structure cohesive but attribute diversified, denoted as attribute diversified community search, is still at a preliminary stage. In this talk, we introduce our recent efforts on discovering attribute diversified communities. Different applications have quite different needs for attribute diversification when modelling a community. We introduce three attribute diversified community models in which attribute diversification plays a different role in each: as the objective, as a query requirement, and as a constraint. We also discuss major techniques for speeding up attribute diversified community search.
Abstract
Mining frequent subtree patterns in a tree database (or forest) is useful in domains such as bioinformatics and the mining of semistructured data. We consider the problem of mining embedded subtrees in a database of rooted, labeled, and ordered trees. We compare two existing serial mining algorithms, PrefixTreeSpan and TreeMiner, and adapt them for parallel execution using PrefixFPM, our general-purpose framework for frequent pattern mining that is designed to effectively utilize the CPU cores in a multicore machine. Our experiments show that TreeMiner is faster than its successor PrefixTreeSpan when a limited number of CPU cores are used, as its total mining workload is smaller; however, PrefixTreeSpan has a much higher speedup ratio and can beat TreeMiner when given enough CPU cores.
Abstract
RoleSim and SimRank are popular graph-theoretic similarity measures with many applications in, e.g., web search, collaborative filtering, and sociometry. While RoleSim addresses the automorphic (role) equivalence of pairwise similarity, which SimRank lacks, it ignores the neighboring similarity information outside the automorphically equivalent set. Consequently, two pairs of nodes that are not automorphically equivalent by nature cannot be well distinguished by RoleSim if the averages of their neighboring similarities over the automorphically equivalent set are the same.
To alleviate this problem: 1) We propose a novel similarity model, namely RoleSim*, which accurately evaluates pairwise role similarities in a more comprehensive manner. RoleSim* not only guarantees the automorphic equivalence that SimRank lacks, but also takes into account the neighboring similarity information outside the automorphically equivalent sets that is overlooked by RoleSim. 2) We prove the existence and uniqueness of the RoleSim* solution, and show its three axiomatic properties (i.e., symmetry, boundedness, and non-increasing monotonicity). 3) We provide a concise bound for iteratively computing the RoleSim* formula, and estimate the number of iterations required to attain a desired accuracy. 4) We induce a distance metric based on RoleSim* similarity, and show that the RoleSim* metric fulfills the triangular inequality, which implies the sum-transitivity of its similarity scores. Our experimental results on real and synthetic datasets demonstrate that RoleSim* achieves higher accuracy than its competitors while retaining computational complexity bounds comparable to those of RoleSim.