We conducted experiments on a publicly accessible dataset⁶, the Meetup dataset [Pham et al. 2015], which contains user profiles and event participation records from the state of California (CA) and New York City (NYC). To filter out outliers and boost community detection performance, we retained only users who had at least five interest tags and had attended more than five events. We also removed events with fewer than five participants. This left 5,904 users in CA and 6,440 users in NYC. The details are summarized in Table I. Each user's online representation is characterized by his or her interest tags, and the offline representation by physical event participation records. Namely, all interest tags and event attendance records were transformed into feature vectors that stand for the users' online and offline representations, respectively. We then constructed the dual networks, in which vertices represent users and edges stand for their online and offline pairwise similarities, i.e., relationships, computed from the corresponding representations.
The dataset is released at www.ntu.edu.sg/home/gaocong/datacode.htm.
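The preprocessing described above can be illustrated with a minimal sketch. It is not the implementation used in our experiments, and several details are assumptions made for illustration only: the function names (`filter_records`, `binary_matrix`, `cosine_similarity`) are hypothetical, the feature vectors are assumed to be binary indicator vectors, the pairwise similarity is assumed to be cosine similarity, event participants are assumed to be counted after user filtering in a single pass, and self-similarities are zeroed out so that the networks have no self-loops.

```python
import numpy as np


def filter_records(user_tags, user_events,
                   min_tags=5, min_events=5, min_participants=5):
    """Filter users and events.

    user_tags:   dict mapping user -> set of interest tags
    user_events: dict mapping user -> set of attended events
    """
    # Keep users with at least `min_tags` tags and MORE than `min_events` events.
    kept = {u for u in user_tags
            if len(user_tags[u]) >= min_tags
            and len(user_events.get(u, ())) > min_events}

    # Count participants per event among retained users
    # (assumption: counting happens after user filtering, single pass).
    counts = {}
    for u in kept:
        for e in user_events[u]:
            counts[e] = counts.get(e, 0) + 1

    # Drop events with fewer than `min_participants` participants.
    valid_events = {e for e, c in counts.items() if c >= min_participants}

    tags = {u: set(user_tags[u]) for u in kept}
    events = {u: set(user_events[u]) & valid_events for u in kept}
    return tags, events


def binary_matrix(items_by_user, users):
    """Encode each user's items as a binary indicator vector (assumed encoding)."""
    vocab = sorted({item for u in users for item in items_by_user[u]})
    index = {item: j for j, item in enumerate(vocab)}
    X = np.zeros((len(users), len(vocab)))
    for row, u in enumerate(users):
        for item in items_by_user[u]:
            X[row, index[item]] = 1.0
    return X


def cosine_similarity(X):
    """Pairwise cosine similarity between rows (assumed similarity measure)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # avoid division by zero for empty rows
    Xn = X / norms
    S = Xn @ Xn.T
    np.fill_diagonal(S, 0.0)         # assumption: no self-loops
    return S
```

Under these assumptions, the online network would be obtained as `cosine_similarity(binary_matrix(tags, users))` and the offline network as `cosine_similarity(binary_matrix(events, users))`, where `users = sorted(tags)` fixes a common vertex ordering for both networks.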