Three papers accepted in ACM WWW 2022


1
Title: “I Have No Text in My Post”: Using Visual Hints to Model User Emotions in Social Media
Author: Junho Song, Kyungsik Han, and Sang-Wook Kim
Abstract
As an emotion plays an important role in people’s everyday lives and is often mirrored in their social media use, extensive research has been conducted to characterize and model emotions from social media data. However, prior research has not sufficiently considered trends of social media use-the increasing use of images and the decreasing use of text-nor identified the features of images in social media that are likely to be different from those in non-social media. Our study aims to fill this gap by (1) considering the notion of visual hints that depict contextual information of images, (2) presenting their characteristics in positive or negative emotions, and (3) demonstrating their effectiveness in emotion prediction modeling through an in-depth analysis of their relationship with the text in the same posts. The results of our experiments showed that our visual hint-based model achieved 20% improvement in emotion prediction, compared with the baseline. In particular, the performance of our model was comparable with that of the text-based model, highlighting not only a strong relationship between visual hints of the image and emotion, but also the potential of using only images for emotion prediction which well reflects current and future trends of social media use. 2
Title: Not All Layer Are Equal: A Layer-Wise Adaptive Approach Toward Large-Scale DNN Training
Author: Yunyong Ko, Dongwon Lee, and Sang-Wook Kim
Abstract
A large-batch training with data parallelism is a widely adopted approach to efficiently train a large deep neural network (DNN) model. Large-batch training, however, often suffers from the problem of the model quality degradation because of its fewer iterations. To alleviate this problem, in general, learning rate (lr) scaling methods have been applied, which increases the learning rate to make an update larger at each iteration. Unfortunately, however, we observe that large-batch training with state-of-the-art lr scaling methods still often degrade the model quality when a batch size crosses a specific limit, rendering such lr methods less useful. To this phenomenon, we hypothesize that existing lr scaling methods overlook the subtle but important differences across “layers” in training, which results in the degradation of the overall model quality. From this hypothesis, we propose a novel approach (LENA) toward the learning rate scaling for large-scale DNN training, employing: (1) a layer-wise adaptive lr scaling to adjust lr for each layer individually, and (2) a layer-wise state-aware warm-up to track the state of the training for each layer and finish its warm-up automatically. The comprehensive evaluation with variations of batch sizes demonstrates that LENA achieves the target accuracy (i.e., the accuracy of single-worker training): (1) within the fewest iterations across different batch sizes (up to 45.2% fewer iterations and 44.7% shorter time than the existing state-of-the-art method), and (2) for training very large-batch sizes, surpassing the limits of all baselines. 3
Title: GraphReformCD: Graph Reformulation for Effective Community Detection in Real-World Graphs
Author: Jiwon Hong, Dong-hyuk Seo, Jeewon Ahn, and Sang-Wook Kim
Abstract
Community detection, one of the most important tools for graph analysis, finds groups of strongly connected nodes in a graph. However, community detection may suffer from misleading information in a graph, such as a nontrivial number of inter-community edges or an insufficient number of intra-community edges. In this paper, we propose GraphReformCD that reformulates a given graph into a new graph in such a way that community detection can be conducted more accurately. For the reformulation, it builds a k-nearest neighbor graph that gives a node k opportunities to connect itself to those nodes that are likely to belong to the same community together with the node. To find the nodes that belong to the same community, it employs the structural similarities such as Jaccard index and SimRank. To validate the effectiveness of our GraphReformCD, we perform extensive experiments with six real-world and four synthetic graphs. The results show that our GraphReformCD enables state-of-the-art methods to improve their accuracy significantly up to 40.6% in community detection.

업데이트: