Paper Group ANR 1742
A Simple but Effective BERT Model for Dialog State Tracking on Resource-Limited Systems
Title | A Simple but Effective BERT Model for Dialog State Tracking on Resource-Limited Systems |
Authors | Tuan Manh Lai, Quan Hung Tran, Trung Bui, Daisuke Kihara |
Abstract | In a task-oriented dialog system, the goal of dialog state tracking (DST) is to monitor the state of the conversation from the dialog history. Recently, many deep learning based methods have been proposed for the task. Despite their impressive performance, current neural architectures for DST are typically heavily-engineered and conceptually complex, making it difficult to implement, debug, and maintain them in a production setting. In this work, we propose a simple but effective DST model based on BERT. In addition to its simplicity, our approach also has a number of other advantages: (a) the number of parameters does not grow with the ontology size (b) the model can operate in situations where the domain ontology may change dynamically. Experimental results demonstrate that our BERT-based model outperforms previous methods by a large margin, achieving new state-of-the-art results on the standard WoZ 2.0 dataset. Finally, to make the model small and fast enough for resource-restricted systems, we apply the knowledge distillation method to compress our model. The final compressed model achieves comparable results with the original model while being 8x smaller and 7x faster. |
Tasks | |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12995v3 |
https://arxiv.org/pdf/1910.12995v3.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-but-effective-bert-model-for-dialog |
Repo | |
Framework | |
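The abstract above says the model is compressed with knowledge distillation but does not give the loss. As a rough illustration only, the sketch below shows the common soft-target formulation, in which a student is trained to match the teacher's temperature-softened output distribution. The function names, the temperature value, and the use of KL divergence are assumptions, not details taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperatures give softer distributions."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by T^2 (assumed form)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    return kl * temperature ** 2
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge. In practice it is usually mixed with the ordinary supervised loss on the labels.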
Using BERT for Word Sense Disambiguation
Title | Using BERT for Word Sense Disambiguation |
Authors | Jiaju Du, Fanchao Qi, Maosong Sun |
Abstract | Word Sense Disambiguation (WSD), which aims to identify the correct sense of a given polyseme, is a long-standing problem in NLP. In this paper, we propose to use BERT to extract better polyseme representations for WSD and explore several ways of combining BERT and the classifier. We also utilize sense definitions to train a unified classifier for all words, which enables the model to disambiguate unseen polysemes. Experiments show that our model achieves the state-of-the-art results on the standard English All-word WSD evaluation. |
Tasks | Word Sense Disambiguation |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08358v1 |
https://arxiv.org/pdf/1909.08358v1.pdf | |
PWC | https://paperswithcode.com/paper/using-bert-for-word-sense-disambiguation |
Repo | |
Framework | |
HeteSpaceyWalk: A Heterogeneous Spacey Random Walk for Heterogeneous Information Network Embedding
Title | HeteSpaceyWalk: A Heterogeneous Spacey Random Walk for Heterogeneous Information Network Embedding |
Authors | Yu He, Yangqiu Song, Jianxin Li, Cheng Ji, Jian Peng, Hao Peng |
Abstract | Heterogeneous information network (HIN) embedding has attracted increasing interest recently. However, current random-walk based HIN embedding methods have paid little attention to the higher-order Markov chain nature of meta-path guided random walks, especially to the stationarity issue. In this paper, we systematically formalize the meta-path guided random walk as a higher-order Markov chain process, and present a heterogeneous personalized spacey random walk to efficiently and effectively attain the expected stationary distribution among nodes. Then we propose a generalized scalable framework to leverage the heterogeneous personalized spacey random walk to learn embeddings for multiple types of nodes in an HIN guided by a meta-path, a meta-graph, and a meta-schema respectively. We conduct extensive experiments on several heterogeneous networks and demonstrate that our methods substantially outperform the existing state-of-the-art network embedding algorithms. |
Tasks | Network Embedding |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.03228v1 |
https://arxiv.org/pdf/1909.03228v1.pdf | |
PWC | https://paperswithcode.com/paper/hetespaceywalk-a-heterogeneous-spacey-random |
Repo | |
Framework | |
Learning to Modulate for Non-coherent MIMO
Title | Learning to Modulate for Non-coherent MIMO |
Authors | Ye Wang, Toshiaki Koike-Akino |
Abstract | The deep learning trend has recently impacted a variety of fields, including communication systems, where various approaches have explored the application of neural networks in place of traditional designs. Neural networks flexibly allow for data/simulation-driven optimization, but are often employed as black boxes detached from direct application of domain knowledge. Our work considers learning-based approaches addressing modulation and signal detection design for the non-coherent MIMO channel. We demonstrate that simulation-driven optimization can be performed while entirely avoiding neural networks, yet still perform comparably. Additionally, we show the feasibility of MIMO communications over extremely short coherence windows (i.e., channel coefficient stability period), with as few as two time slots. |
Tasks | |
Published | 2019-03-09 |
URL | http://arxiv.org/abs/1903.03711v1 |
http://arxiv.org/pdf/1903.03711v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-modulate-for-non-coherent-mimo |
Repo | |
Framework | |
PoshakNet: Framework for matching dresses from real-life photos using GAN and Siamese Network
Title | PoshakNet: Framework for matching dresses from real-life photos using GAN and Siamese Network |
Authors | Abhigyan Khaund, Daksh Thapar, Aditya Nigam |
Abstract | Online garment shopping has gained many customers in recent years. Describing a dress using keywords does not always yield the proper results, which in turn leads to customer dissatisfaction. A visual search based system would be enormously beneficial to the industry. Hence, we propose a framework that can retrieve clothes similar to those found in an image. The first task is to extract the garment from the input image (street photo). There are various challenges for that, including pose, illumination, and background clutter. We use a Generative Adversarial Network for the task of retrieving the garment that the person in the image was wearing. It has been shown that a GAN can retrieve the garment very efficiently despite the challenges of street photos. Finally, a Siamese-based matching system takes the retrieved garment image and matches it with the clothes in the dataset, giving us the top k matches. We take a pre-trained inception-ResNet v1 module as a Siamese network (trained using triplet loss for face detection) and fine-tune it on the shopping dataset using center loss. The dataset has been collected in-house. For training the GAN, we use the LookBook dataset, which is publicly available. |
Tasks | Face Detection |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04237v1 |
https://arxiv.org/pdf/1911.04237v1.pdf | |
PWC | https://paperswithcode.com/paper/poshaknet-framework-for-matching-dresses-from |
Repo | |
Framework | |
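The matching stage described in the abstract above (an embedding network pre-trained with triplet loss, then nearest-neighbour lookup for the top k matches) can be sketched as follows. This is a minimal illustration, assuming embeddings are plain lists of floats compared by Euclidean distance. The function names, the margin value, and the catalog layout are hypothetical, and the center-loss fine-tuning step is not shown.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: pull the positive closer than the negative by a margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def top_k_matches(query, catalog, k=3):
    """Return the names of the k catalog items whose embeddings are nearest the query."""
    ranked = sorted(catalog.items(), key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:k]]

# Toy catalog of garment embeddings (made-up values for illustration).
catalog = {"dress_a": [0.0, 0.0], "dress_b": [1.0, 0.0], "dress_c": [5.0, 5.0]}
matches = top_k_matches([0.1, 0.0], catalog, k=2)
```

Some formulations of triplet loss use squared distances instead. The plain distance is used here only to keep the sketch short.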
Face Detection on Surveillance Images
Title | Face Detection on Surveillance Images |
Authors | Mohammad Iqbal Nouyed, Guodong Guo |
Abstract | In the last few decades, a lot of progress has been made in the field of face detection. Various face detection methods have been proposed by numerous researchers working in this area. The two well-known benchmarking platforms, the FDDB and WIDER face detection benchmarks, provide quite challenging scenarios to assess the efficacy of the detection methods. These benchmarking data sets are mostly created using images from the public network, i.e., the Internet. A recent face detection and open-set recognition challenge has shown that those same face detection algorithms produce many false alarms for images taken in surveillance scenarios. This shows the difficult nature of the surveillance environment. Our proposed body pose based face detection method was one of the top performers in this competition. In this paper, we perform a comparative performance analysis of some of the well-known face detection methods, including the few used in that competition, and compare them to our proposed body pose based face detection method. Experimental results show that our proposed method, which leverages body information to detect faces, is the most realistic approach in terms of accuracy, false alarms and average detection time when the surveillance scenario is under consideration. |
Tasks | Face Detection, Open Set Learning |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.11121v1 |
https://arxiv.org/pdf/1910.11121v1.pdf | |
PWC | https://paperswithcode.com/paper/face-detection-on-surveillance-images |
Repo | |
Framework | |
Adversarial Examples Are a Natural Consequence of Test Error in Noise
Title | Adversarial Examples Are a Natural Consequence of Test Error in Noise |
Authors | Nic Ford, Justin Gilmer, Nicolas Carlini, Dogus Cubuk |
Abstract | Over the last few years, the phenomenon of adversarial examples — maliciously constructed inputs that fool trained machine learning models — has captured the attention of the research community, especially when the adversary is restricted to small modifications of a correctly handled input. Less surprisingly, image classifiers also lack human-level performance on randomly corrupted images, such as images with additive Gaussian noise. In this paper we provide both empirical and theoretical evidence that these are two manifestations of the same underlying phenomenon, establishing close connections between the adversarial robustness and corruption robustness research programs. This suggests that improving adversarial robustness should go hand in hand with improving performance in the presence of more general and realistic image corruptions. Based on our results we recommend that future adversarial defenses consider evaluating the robustness of their methods to distributional shift with benchmarks such as Imagenet-C. |
Tasks | |
Published | 2019-01-29 |
URL | http://arxiv.org/abs/1901.10513v1 |
http://arxiv.org/pdf/1901.10513v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-examples-are-a-natural |
Repo | |
Framework | |
Enhancing self-supervised monocular depth estimation with traditional visual odometry
Title | Enhancing self-supervised monocular depth estimation with traditional visual odometry |
Authors | Lorenzo Andraghetti, Panteleimon Myriokefalitakis, Pier Luigi Dovesi, Belen Luque, Matteo Poggi, Alessandro Pieropan, Stefano Mattoccia |
Abstract | Estimating depth from a single image represents an attractive alternative to more traditional approaches leveraging multiple cameras. In this field, deep learning has yielded outstanding results at the cost of needing large amounts of data labeled with precise depth measurements for training. This issue is softened by self-supervised approaches that leverage monocular sequences or stereo pairs in place of expensive ground truth depth annotations. This paper further improves monocular depth estimation by integrating a geometrical prior into existing self-supervised networks. Specifically, we propose a sparsity-invariant autoencoder able to process the output of conventional visual odometry algorithms working in synergy with depth-from-mono networks. Experimental results on the KITTI dataset show that by exploiting the geometrical prior, our proposal: i) outperforms existing approaches in the literature and ii) couples well with both compact and complex depth-from-mono architectures, allowing for its deployment on high-end GPUs as well as on embedded devices (e.g., NVIDIA Jetson TX2). |
Tasks | Depth And Camera Motion, Depth Estimation, Monocular Depth Estimation, Visual Odometry |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.03127v2 |
https://arxiv.org/pdf/1908.03127v2.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-self-supervised-monocular-depth |
Repo | |
Framework | |
Creating Lightweight Object Detectors with Model Compression for Deployment on Edge Devices
Title | Creating Lightweight Object Detectors with Model Compression for Deployment on Edge Devices |
Authors | Yiwu Yao, Weiqiang Yang, Haoqi Zhu |
Abstract | To achieve lightweight object detectors for deployment on edge devices, an effective model compression pipeline is proposed in this paper. The compression pipeline consists of automatic channel pruning for the backbone, fixed channel deletion for the branch layers and knowledge distillation for the guidance learning. As a result, the Resnet50-v1d is auto-pruned and fine-tuned on ImageNet to attain a compact base model as the backbone of the object detector. Then, lightweight object detectors are implemented with the proposed compression pipeline. For instance, the SSD-300 with model size=16.3MB, FLOPS=2.31G, and mAP=71.2 is created, revealing a better result than SSD-300-MobileNet. |
Tasks | Model Compression |
Published | 2019-05-06 |
URL | https://arxiv.org/abs/1905.01787v1 |
https://arxiv.org/pdf/1905.01787v1.pdf | |
PWC | https://paperswithcode.com/paper/creating-lightweight-object-detectors-with |
Repo | |
Framework | |
Causal relationship between eWOM topics and profit of rural tourism at Japanese Roadside Stations “MICHINOEKI”
Title | Causal relationship between eWOM topics and profit of rural tourism at Japanese Roadside Stations “MICHINOEKI” |
Authors | Elisa Claire Alemán Carreón, Tetsuro Ito, Hirofumi Nonaka, Minoru Kumano, Toru Hiraoka, Masaharu Hirota |
Abstract | Affected by urbanization, centralization and the decrease of overall population, Japan has been making efforts to revitalize the rural areas across the country. One particular effort is to increase tourism to these rural areas via regional branding, using local farm products as tourist attractions across Japan. Particularly, a program subsidized by the government called Michinoeki, which stands for ‘roadside station’, was created 20 years ago and it strives to provide a safe and comfortable space for cultural interaction between road travelers and the local community, as well as offering refreshment, and relevant information to travelers. However, despite its importance in the revitalization of the Japanese economy, studies with newer technologies and methodologies are lacking. Using sales data from establishments in the Kyushu area of Japan, we used Support Vector to classify content from Twitter into relevant topics and studied their causal relationship to the sales for each establishment using LiNGAM, a linear non-gaussian acyclic model built for causal structure analysis, to perform an improved market analysis considering more than just correlation. Under the hypotheses stated by the LiNGAM model, we discovered a positive causal relationship between the number of tweets mentioning those establishments, specially mentioning deserts, a need for better access and traffic options, and a potentially untapped customer base in motorcycle biker groups. |
Tasks | |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.12039v2 |
http://arxiv.org/pdf/1904.12039v2.pdf | |
PWC | https://paperswithcode.com/paper/190412039 |
Repo | |
Framework | |
Predicting intelligence based on cortical WM/GM contrast, cortical thickness and volumetry
Title | Predicting intelligence based on cortical WM/GM contrast, cortical thickness and volumetry |
Authors | Juan Miguel Valverde, Vandad Imani, John D. Lewis, Jussi Tohka |
Abstract | We propose a four-layer fully-connected neural network (FNN) for predicting fluid intelligence scores from T1-weighted MR images for the ABCD-challenge. In addition to the volumes of brain structures, the FNN uses cortical WM/GM contrast and cortical thickness at 78 cortical regions. These last two measurements were derived from the T1-weighted MR images using cortical surfaces produced by the CIVET pipeline. The age and gender of the subjects and the scanner manufacturer are also used as features for the learning algorithm. This yielded 283 features provided to the FNN with two hidden layers of 20 and 15 nodes. The method was applied to the data from the ABCD study. Trained with a training set of 3736 subjects, the proposed method achieved an MSE of 71.596 and a correlation of 0.151 in the validation set of 415 subjects. For the final submission, the model was trained with 3568 subjects and it achieved an MSE of 94.0270 in the test set comprised of 4383 subjects. |
Tasks | |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.05660v1 |
https://arxiv.org/pdf/1909.05660v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-intelligence-based-on-cortical |
Repo | |
Framework | |
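The network shape stated in the abstract above (283 input features, hidden layers of 20 and 15 nodes, one regression output) is small enough to write out directly. The sketch below is only an illustration of that shape. The ReLU activation, the Gaussian weight initialisation, and all function names are assumptions, since the abstract does not specify them.

```python
import random

# Layer widths from the abstract: 283 features, hidden layers of 20 and 15, one output.
LAYER_SIZES = [283, 20, 15, 1]

def init_params(sizes, seed=0):
    """Create (weights, biases) per layer with small random weights (assumed init)."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params

def forward(params, features):
    """Forward pass: ReLU on hidden layers (assumed), linear output for regression."""
    activations = features
    for index, (weights, biases) in enumerate(params):
        outputs = [sum(w * a for w, a in zip(row, activations)) + b
                   for row, b in zip(weights, biases)]
        is_last = index == len(params) - 1
        activations = outputs if is_last else [max(0.0, o) for o in outputs]
    return activations[0]

def count_params(params):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)

params = init_params(LAYER_SIZES)
n_params = count_params(params)
```

Under these assumptions the network has 283·20 + 20 + 20·15 + 15 + 15·1 + 1 = 6011 trainable parameters, which is small relative to the 3736 training subjects.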
Spoofing and Anti-Spoofing with Wax Figure Faces
Title | Spoofing and Anti-Spoofing with Wax Figure Faces |
Authors | Shan Jia, Xin Li, Chuanbo Hu, Zhengquan Xu |
Abstract | We have witnessed rapid advances in both face presentation attack models and presentation attack detection (PAD) in recent years. Compared to widely studied 2D face presentation attacks (e.g. printed photos and video replays), 3D face presentation attacks are more challenging because face recognition systems (FRS) are more easily confused by the 3D characteristics of materials similar to real faces. Existing 3D face spoofing databases, mostly based on 3D facial masks, are restricted to small data size and suffer from poor authenticity due to the difficulty and expense of mask production. In this work, we introduce a wax figure face database (WFFD) as a novel and super-realistic 3D face presentation attack. This database contains 2300 image pairs (4600 images in total) and 745 subjects including both real and wax figure faces with high diversity from online collections. On one hand, our experiments have demonstrated the spoofing potential of WFFD on three popular FRSs. On the other hand, we have developed a multi-feature voting scheme for wax figure face detection (anti-spoofing), which combines three discriminative features at the decision level. The proposed detection method was compared against several face PAD approaches and found to outperform other competing methods. Surprisingly, our fusion-based detection method achieves an Average Classification Error Rate (ACER) of 11.73% on the WFFD database, which is even better than human-based detection. |
Tasks | Face Detection, Face Recognition |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05457v1 |
https://arxiv.org/pdf/1910.05457v1.pdf | |
PWC | https://paperswithcode.com/paper/spoofing-and-anti-spoofing-with-wax-figure |
Repo | |
Framework | |
Deep Message Passing on Sets
Title | Deep Message Passing on Sets |
Authors | Yifeng Shi, Junier Oliva, Marc Niethammer |
Abstract | Modern methods for learning over graph input data have shown the fruitfulness of accounting for relationships among elements in a collection. However, most methods that learn over set input data use only rudimentary approaches to exploit intra-collection relationships. In this work we introduce Deep Message Passing on Sets (DMPS), a novel method that incorporates relational learning for sets. DMPS not only connects learning on graphs with learning on sets via deep kernel learning, but it also bridges message passing on sets and traditional diffusion dynamics commonly used in denoising models. Based on these connections, we develop two new blocks for relational learning on sets: the set-denoising block and the set-residual block. The former is motivated by the connection between message passing on general graphs and diffusion-based denoising models, whereas the latter is inspired by the well-known residual network. In addition to demonstrating the interpretability of our model by learning the true underlying relational structure experimentally, we also show the effectiveness of our approach on both synthetic and real-world datasets by achieving results that are competitive with or outperform the state-of-the-art. |
Tasks | Denoising, Relational Reasoning |
Published | 2019-09-21 |
URL | https://arxiv.org/abs/1909.09877v1 |
https://arxiv.org/pdf/1909.09877v1.pdf | |
PWC | https://paperswithcode.com/paper/190909877 |
Repo | |
Framework | |
LucidDream: Controlled Temporally-Consistent DeepDream on Videos
Title | LucidDream: Controlled Temporally-Consistent DeepDream on Videos |
Authors | Joel Ruben Antony Moniz, Eunsu Kang, Barnabás Póczos |
Abstract | In this work, we aim to propose a set of techniques to improve the controllability and aesthetic appeal when DeepDream, which uses a pre-trained neural network to modify images by hallucinating objects into them, is applied to videos. In particular, we demonstrate a simple modification that improves control over the class of object that DeepDream is induced to hallucinate. We also show that the flickering artifacts which frequently appear when DeepDream is applied on videos can be mitigated by the use of an additional temporal consistency loss term. |
Tasks | |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.11960v1 |
https://arxiv.org/pdf/1911.11960v1.pdf | |
PWC | https://paperswithcode.com/paper/luciddream-controlled-temporally-consistent |
Repo | |
Framework | |
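The abstract above mentions an additional temporal consistency loss term to reduce flicker but does not give its form. As a deliberately simplified sketch, the term below penalises the mean squared difference between consecutive output frames. Frames are assumed to be flat lists of pixel values, no optical-flow warping is applied, and the function name is hypothetical, so this should not be read as the authors' formulation.

```python
def temporal_consistency_loss(frames):
    """Mean squared difference between each frame and the one before it.

    frames: list of frames, each a flat list of pixel intensities (assumed layout).
    """
    if len(frames) < 2:
        return 0.0  # a single frame cannot flicker
    total = 0.0
    count = 0
    for previous, current in zip(frames, frames[1:]):
        for p, c in zip(previous, current):
            total += (c - p) ** 2
            count += 1
    return total / count
```

In a full pipeline this term would typically be added, with a weighting factor, to the usual DeepDream objective so that large frame-to-frame changes are discouraged.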
The Impact of Extraneous Variables on the Performance of Recurrent Neural Network Models in Clinical Tasks
Title | The Impact of Extraneous Variables on the Performance of Recurrent Neural Network Models in Clinical Tasks |
Authors | Eugene Laksana, Melissa Aczon, Long Ho, Cameron Carlin, David Ledbetter, Randall Wetzel |
Abstract | Electronic Medical Records (EMR) are a rich source of patient information, including measurements reflecting physiologic signs and administered therapies. Identifying which variables are useful in predicting clinical outcomes can be challenging. Advanced algorithms such as deep neural networks were designed to process high-dimensional inputs containing variables in their measured form, thus bypassing separate feature selection or engineering steps. We investigated the effect of extraneous input variables on the predictive performance of Recurrent Neural Networks (RNN) by including in the input vector extraneous variables randomly drawn from theoretical and empirical distributions. RNN models using different input vectors (EMR variables; EMR and extraneous variables; extraneous variables only) were trained to predict three clinical outcomes: in-ICU mortality, 72-hour ICU re-admission, and 30-day ICU-free days. The measured degradations of the RNN’s predictive performance with the addition of extraneous variables to EMR variables were negligible. |
Tasks | Feature Selection |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.01125v1 |
http://arxiv.org/pdf/1904.01125v1.pdf | |
PWC | https://paperswithcode.com/paper/the-impact-of-extraneous-variables-on-the |
Repo | |
Framework | |
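The three input configurations compared in the abstract above (EMR variables; EMR and extraneous variables; extraneous variables only) can be illustrated with a small data-preparation sketch. It assumes each patient record is a list of time steps, each a list of floats, and draws the extraneous values from a standard normal distribution. The function names, the toy values, and the choice of distribution are assumptions for illustration only.

```python
import random

def add_extraneous(sequence, n_extra, rng):
    """Append n_extra random values to every time step of a patient sequence."""
    return [list(step) + [rng.gauss(0.0, 1.0) for _ in range(n_extra)]
            for step in sequence]

def extraneous_only(sequence, n_extra, rng):
    """Replace every time step with n_extra random values, keeping the length."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_extra)] for _ in sequence]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
emr = [[98.6, 72.0], [99.1, 75.0], [98.9, 70.0]]  # toy values, 3 time steps
emr_plus_noise = add_extraneous(emr, n_extra=3, rng=rng)
noise_only = extraneous_only(emr, n_extra=3, rng=rng)
```

Each of the three variants (`emr`, `emr_plus_noise`, `noise_only`) would then be fed to a separately trained model so that their predictive performance can be compared.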