Paper Group NANR 238
Two-Step Training and Mixed Encoding-Decoding for Implementing a Generative Chatbot with a Small Dialogue Corpus
Title | Two-Step Training and Mixed Encoding-Decoding for Implementing a Generative Chatbot with a Small Dialogue Corpus |
Authors | Jintae Kim, Hyeon-Gu Lee, Harksoo Kim, Yeonsoo Lee, Young-Gil Kim |
Abstract | |
Tasks | Chatbot, Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6707/ |
https://www.aclweb.org/anthology/W18-6707 | |
PWC | https://paperswithcode.com/paper/two-step-training-and-mixed-encoding-decoding |
Repo | |
Framework | |
Rendering Portraitures from Monocular Camera and Beyond
Title | Rendering Portraitures from Monocular Camera and Beyond |
Authors | Xiangyu Xu, Deqing Sun, Sifei Liu, Wenqi Ren, Yu-Jin Zhang, Ming-Hsuan Yang, Jian Sun |
Abstract | Shallow Depth-of-Field (DoF) is a desirable effect in photography which renders artistic photos. Usually, it requires single-lens reflex cameras and certain photography skills to generate such effects. Recently, dual-lens on cellphones is used to estimate scene depth and simulate DoF effects for portrait shots. However, this technique cannot be applied to photos already taken and does not work well for whole-body scenes where the subject is at a distance from the cameras. In this work, we introduce an automatic system that achieves portrait DoF rendering for monocular cameras. Specifically, we first exploit Convolutional Neural Networks to estimate the relative depth and portrait segmentation maps from a single input image. Since these initial estimates from a single input are usually coarse and lack fine details, we further learn pixel affinities to refine the coarse estimation maps. With the refined estimation, we conduct depth and segmentation-aware blur rendering to the input image with a Conditional Random Field and image matting. In addition, we train a spatially-variant Recursive Neural Network to learn and accelerate this rendering process. We show that the proposed algorithm can effectively generate portraitures with realistic DoF effects using one single input. Experimental results also demonstrate that our depth and segmentation estimation modules perform favorably against the state-of-the-art methods both quantitatively and qualitatively. |
Tasks | Image Matting |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Xiangyu_Xu_Rendering_Portraitures_from_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Xiangyu_Xu_Rendering_Portraitures_from_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/rendering-portraitures-from-monocular-camera |
Repo | |
Framework | |
Training Autoencoders by Alternating Minimization
Title | Training Autoencoders by Alternating Minimization |
Authors | Sneha Kudugunta, Adepu Shankar, Surya Chavali, Vineeth Balasubramanian, Purushottam Kar |
Abstract | We present DANTE, a novel method for training neural networks, in particular autoencoders, using the alternating minimization principle. DANTE provides a distinct perspective in lieu of traditional gradient-based backpropagation techniques commonly used to train deep networks. It utilizes an adaptation of quasi-convex optimization techniques to cast autoencoder training as a bi-quasi-convex optimization problem. We show that for autoencoder configurations with both differentiable (e.g. sigmoid) and non-differentiable (e.g. ReLU) activation functions, we can perform the alternations very effectively. DANTE effortlessly extends to networks with multiple hidden layers and varying network configurations. In experiments on standard datasets, autoencoders trained using the proposed method were found to be very promising when compared to those trained using traditional backpropagation techniques, both in terms of training speed, as well as feature extraction and reconstruction performance. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=B1D6ty-A- |
https://openreview.net/pdf?id=B1D6ty-A- | |
PWC | https://paperswithcode.com/paper/training-autoencoders-by-alternating |
Repo | |
Framework | |
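The alternating-minimization idea in the abstract above can be illustrated on a much simpler model. The following is a minimal sketch, assuming a linear autoencoder with a single hidden unit and exact closed-form updates; it is not DANTE's quasi-convex solver, and the names `update_decoder`, `update_encoder`, and the toy data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def loss(w, v, data):
    """Sum of squared reconstruction errors for x -> v * (w . x)."""
    total = 0.0
    for x in data:
        h = dot(w, x)
        total += sum((xi - vi * h) ** 2 for xi, vi in zip(x, v))
    return total

def update_decoder(w, data):
    """With the encoder w fixed, the best decoder v is a least-squares fit."""
    hs = [dot(w, x) for x in data]
    denom = sum(h * h for h in hs)
    return [sum(h * x[j] for h, x in zip(hs, data)) / denom
            for j in range(len(data[0]))]

def update_encoder(v):
    """With the decoder v fixed, w = v / ||v||^2 gives each sample its best code."""
    norm_sq = dot(v, v)
    return [vi / norm_sq for vi in v]

# Toy 2-D data lying close to the direction (1, 1); values are illustrative.
data = [(1.0, 0.9), (-1.0, -1.1), (0.5, 0.6), (-0.5, -0.4), (2.0, 2.1), (-2.0, -1.9)]

w, v = [1.0, 0.0], [1.0, 0.0]
losses = [loss(w, v, data)]
for _ in range(5):
    v = update_decoder(w, data)   # minimize over the decoder only
    losses.append(loss(w, v, data))
    w = update_encoder(v)         # minimize over the encoder only
    losses.append(loss(w, v, data))
```

Because each half-step solves its subproblem exactly, the recorded loss never increases from one entry of `losses` to the next.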
Unsupervised Learning of Multi-Frame Optical Flow with Occlusions
Title | Unsupervised Learning of Multi-Frame Optical Flow with Occlusions |
Authors | Joel Janai, Fatma Guney, Anurag Ranjan, Michael Black, Andreas Geiger |
Abstract | Learning optical flow with neural networks is hampered by the need for obtaining training data with associated ground truth. Unsupervised learning is a promising direction, yet the performance of current unsupervised methods is still limited. In particular, the lack of proper occlusion handling in commonly used data terms constitutes a major source of error. While most optical flow methods process pairs of consecutive frames, more advanced occlusion reasoning can be realized when considering multiple frames. In this paper, we propose a framework for unsupervised learning of optical flow and occlusions over multiple frames. More specifically, we exploit the minimal configuration of three frames to strengthen the photometric loss and explicitly reason about occlusions. We demonstrate that our multi-frame, occlusion-sensitive formulation outperforms existing unsupervised two-frame methods and even produces results on par with some fully supervised methods. |
Tasks | Optical Flow Estimation |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Joel_Janai_Unsupervised_Learning_of_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Joel_Janai_Unsupervised_Learning_of_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-multi-frame-optical |
Repo | |
Framework | |
Faster Distributed Synchronous SGD with Weak Synchronization
Title | Faster Distributed Synchronous SGD with Weak Synchronization |
Authors | Cong Xie, Oluwasanmi O. Koyejo, Indranil Gupta |
Abstract | Distributed training of deep learning is widely conducted with large neural networks and large datasets. Besides asynchronous stochastic gradient descent (SGD), synchronous SGD is a reasonable alternative with better convergence guarantees. However, synchronous SGD suffers from stragglers. To make things worse, although there are some strategies dealing with slow workers, the issue of slow servers is commonly ignored. In this paper, we propose a new parameter server (PS) framework dealing with not only slow workers, but also slow servers by weakening the synchronization criterion. The empirical results show good performance when there are stragglers. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=H13WofbAb |
https://openreview.net/pdf?id=H13WofbAb | |
PWC | https://paperswithcode.com/paper/faster-distributed-synchronous-sgd-with-weak |
Repo | |
Framework | |
Pivot Based Language Modeling for Improved Neural Domain Adaptation
Title | Pivot Based Language Modeling for Improved Neural Domain Adaptation |
Authors | Yftah Ziser, Roi Reichart |
Abstract | Representation learning with pivot-based methods and with Neural Networks (NNs) has led to significant progress in domain adaptation for Natural Language Processing. However, most previous work that follows these approaches does not explicitly exploit the structure of the input text, and its output is most often a single representation vector for the entire text. In this paper we present the Pivot Based Language Model (PBLM), a representation learning model that marries together pivot-based and NN modeling in a structure aware manner. Particularly, our model processes the information in the text with a sequential NN (LSTM) and its output consists of a representation vector for every input word. Unlike most previous representation learning models in domain adaptation, PBLM can naturally feed structure aware text classifiers such as LSTM and CNN. We experiment with the task of cross-domain sentiment classification on 20 domain pairs and show substantial improvements over strong baselines. |
Tasks | Domain Adaptation, Language Modelling, Representation Learning, Sentiment Analysis |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-1112/ |
https://www.aclweb.org/anthology/N18-1112 | |
PWC | https://paperswithcode.com/paper/pivot-based-language-modeling-for-improved |
Repo | |
Framework | |
Action-dependent Control Variates for Policy Optimization via Stein Identity
Title | Action-dependent Control Variates for Policy Optimization via Stein Identity |
Authors | Hao Liu*, Yihao Feng*, Yi Mao, Dengyong Zhou, Jian Peng, Qiang Liu |
Abstract | Policy gradient methods have achieved remarkable successes in solving challenging reinforcement learning problems. However, they still often suffer from large variance in policy gradient estimation, which leads to poor sample efficiency during training. In this work, we propose a control variate method to effectively reduce variance for policy gradient methods. Motivated by Stein’s identity, our method extends the previous control variate methods used in REINFORCE and advantage actor-critic by introducing more flexible and general action-dependent baseline functions. Empirical studies show that our method essentially improves the sample efficiency of the state-of-the-art policy gradient approaches. |
Tasks | Policy Gradient Methods |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=H1mCp-ZRZ |
https://openreview.net/pdf?id=H1mCp-ZRZ | |
PWC | https://paperswithcode.com/paper/action-dependent-control-variates-for-policy |
Repo | |
Framework | |
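The variance-reduction idea in the abstract above can be seen in a one-dimensional toy problem. This is a minimal sketch, assuming a Gaussian policy a ~ N(mu, 1), for which Stein's lemma gives E[phi(a)(a - mu)] = E[phi'(a)]; the reward, the baseline `phi`, and all constants are hypothetical and do not come from the paper.

```python
import random
import statistics

random.seed(0)
mu = 0.0  # mean of the toy Gaussian policy a ~ N(mu, 1)

def reward(a):
    return 5.0 + a            # true gradient of E[reward] w.r.t. mu is 1

def phi(a):
    return 5.0 + 0.8 * a      # hypothetical action-dependent baseline

def dphi(a):
    return 0.8                # derivative of phi with respect to the action

plain, stein = [], []
for _ in range(5000):
    a = random.gauss(mu, 1.0)
    score = a - mu            # d/dmu log N(a; mu, 1)
    # Score-function estimator without any baseline.
    plain.append(reward(a) * score)
    # Subtract phi(a), then add back its expected contribution, E[phi'(a)].
    stein.append((reward(a) - phi(a)) * score + dphi(a))

mean_plain, var_plain = statistics.mean(plain), statistics.variance(plain)
mean_stein, var_stein = statistics.mean(stein), statistics.variance(stein)
```

Both estimators target the same gradient, but the closer `phi` tracks the reward, the less variance remains in the corrected estimator.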
Not-So-CLEVR: Visual Relations Strain Feedforward Neural Networks
Title | Not-So-CLEVR: Visual Relations Strain Feedforward Neural Networks |
Authors | Junkyung Kim, Matthew Ricci, Thomas Serre |
Abstract | The robust and efficient recognition of visual relations in images is a hallmark of biological vision. Here, we argue that, despite recent progress in visual recognition, modern machine vision algorithms are severely limited in their ability to learn visual relations. Through controlled experiments, we demonstrate that visual-relation problems strain convolutional neural networks (CNNs). The networks eventually break altogether when rote memorization becomes impossible such as when the intra-class variability exceeds their capacity. We further show that another type of feedforward network, called a relational network (RN), which was shown to successfully solve seemingly difficult visual question answering (VQA) problems on the CLEVR datasets, suffers similar limitations. Motivated by the comparable success of biological vision, we argue that feedback mechanisms including working memory and attention are the key computational components underlying abstract visual reasoning. |
Tasks | Question Answering, Visual Question Answering, Visual Reasoning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HymuJz-A- |
https://openreview.net/pdf?id=HymuJz-A- | |
PWC | https://paperswithcode.com/paper/not-so-clevr-visual-relations-strain |
Repo | |
Framework | |
Japanese Advertising Slogan Generator using Case Frame and Word Vector
Title | Japanese Advertising Slogan Generator using Case Frame and Word Vector |
Authors | Kango Iwama, Yoshinobu Kano |
Abstract | There have been many works published on automatic sentence generation for a variety of domains. However, no single method is available at present that can generate sentences for all domains. Each domain will require a suitable generation method. We focus on automatic generation of Japanese advertisement slogans in this paper. We use our advertisement slogan database, case frame information, and word vector information. We entered our system in a copy competition for human copywriters, where our advertisement slogan was selected as a finalist. Our system could be regarded as the world's first system that generates slogans at a practical level, as an advertising agency already employs our system in their business. |
Tasks | Machine Translation, Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6526/ |
https://www.aclweb.org/anthology/W18-6526 | |
PWC | https://paperswithcode.com/paper/japanese-advertising-slogan-generator-using |
Repo | |
Framework | |
Bringing Order to Neural Word Embeddings with Embeddings Augmented by Random Permutations (EARP)
Title | Bringing Order to Neural Word Embeddings with Embeddings Augmented by Random Permutations (EARP) |
Authors | Trevor Cohen, Dominic Widdows |
Abstract | Word order is clearly a vital part of human language, but it has been used comparatively lightly in distributional vector models. This paper presents a new method for incorporating word order information into word vector embedding models by combining the benefits of permutation-based order encoding with the more recent method of skip-gram with negative sampling. The new method introduced here is called Embeddings Augmented by Random Permutations (EARP). It operates by applying permutations to the coordinates of context vector representations during the process of training. Results show an 8% improvement in accuracy on the challenging Bigger Analogy Test Set, and smaller but consistent improvements on other analogy reference sets. These findings demonstrate the importance of order-based information in analogical retrieval tasks, and the utility of random permutations as a means to augment neural embeddings. |
Tasks | Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-1045/ |
https://www.aclweb.org/anthology/K18-1045 | |
PWC | https://paperswithcode.com/paper/bringing-order-to-neural-word-embeddings-with |
Repo | |
Framework | |
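The coordinate permutations mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration, assuming one fixed random permutation that is applied once per step of word offset (and inverted for negative offsets); that scheme, the helper names, and the vector size are assumptions, and the skip-gram training loop is omitted.

```python
import random

random.seed(0)
dim = 8

# One fixed random permutation of the coordinate indices.
perm = list(range(dim))
random.shuffle(perm)

# Its inverse, so that negative offsets undo positive ones.
inverse = [0] * dim
for i, p in enumerate(perm):
    inverse[p] = i

def permute(vec, p):
    """Reorder the coordinates of vec according to the index list p."""
    return [vec[i] for i in p]

def shift(vec, offset):
    """Apply perm |offset| times for a positive offset, its inverse for a negative one."""
    p = perm if offset > 0 else inverse
    for _ in range(abs(offset)):
        vec = permute(vec, p)
    return vec

context = [float(i) for i in range(dim)]
after = shift(context, 1)     # context vector as seen one position to the right
before = shift(context, -1)   # context vector as seen one position to the left
```

A permutation only reorders coordinates, so vector norms are unchanged while the same context word yields a different vector at each relative position.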
Proximal Dehaze-Net: A Prior Learning-Based Deep Network for Single Image Dehazing
Title | Proximal Dehaze-Net: A Prior Learning-Based Deep Network for Single Image Dehazing |
Authors | Dong Yang, Jian Sun |
Abstract | Photos taken in hazy weather are usually covered with white masks and often lose important details. In this paper, we propose a novel deep learning approach for single image dehazing by learning dark channel and transmission priors. First, we build an energy model for dehazing using dark channel and transmission priors and design an iterative optimization algorithm using proximal operators for these two priors. Second, we unfold the iterative algorithm to be a deep network, dubbed proximal dehaze-net, by learning the proximal operators using convolutional neural networks. Our network combines the advantages of traditional prior-based dehazing methods and deep learning methods by incorporating haze-related prior learning into a deep network. Experiments show that our method achieves state-of-the-art performance for single image dehazing. |
Tasks | Image Dehazing, Single Image Dehazing |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Dong_Yang_Proximal_Dehaze-Net_A_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Dong_Yang_Proximal_Dehaze-Net_A_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/proximal-dehaze-net-a-prior-learning-based |
Repo | |
Framework | |
Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View
Title | Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View |
Authors | Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, Thomas Funkhouser |
Abstract | We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360° panoramic view of an indoor scene when given only a partial observation (≤50%) in the form of an RGB-D image. To make this possible, Im2Pano3D leverages strong contextual priors learned from large-scale synthetic and real-world indoor scenes. To ease the prediction of 3D structure, we propose to parameterize 3D surfaces with their plane equations and train the model to predict these parameters directly. To provide meaningful training supervision, we make use of multiple loss functions that consider both pixel level accuracy and global context consistency. Experiments demonstrate that Im2Pano3D is able to predict the semantics and 3D structure of the unobserved scene with more than 56% pixel accuracy and less than 0.52m average distance error, which is significantly better than alternative approaches. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Song_Im2Pano3D_Extrapolating_360deg_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Song_Im2Pano3D_Extrapolating_360deg_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/im2pano3d-extrapolating-360a-structure-and |
Repo | |
Framework | |
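The plane-equation parameterization mentioned in the abstract above can be illustrated with basic geometry. This is a minimal sketch, assuming each pixel carries a plane written as normal . p = offset and a viewing ray through the camera origin; the function names and the example values are hypothetical and say nothing about the network's actual output format.

```python
def depth_along_ray(normal, offset, ray):
    """Scale t such that the point t * ray lies on the plane normal . p = offset."""
    denom = sum(n * r for n, r in zip(normal, ray))
    if abs(denom) < 1e-9:
        raise ValueError("ray is parallel to the plane")
    return offset / denom

def point_on_plane(normal, offset, ray):
    """3D point where the viewing ray meets the predicted plane."""
    t = depth_along_ray(normal, offset, ray)
    return [t * r for r in ray]

# A wall facing the camera at z = 2, viewed along a slightly tilted ray.
normal, offset = (0.0, 0.0, 1.0), 2.0
ray = (0.5, 0.0, 1.0)
point = point_on_plane(normal, offset, ray)
```

One appeal of such a parameterization is that every pixel on the same flat surface shares the same four numbers, whereas raw depth varies from pixel to pixel.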
Espresso: Efficient Forward Propagation for Binary Deep Neural Networks
Title | Espresso: Efficient Forward Propagation for Binary Deep Neural Networks |
Authors | Fabrizio Pedersoli, George Tzanetakis, Andrea Tagliasacchi |
Abstract | There are many application scenarios for which the computational performance and memory footprint of the prediction phase of Deep Neural Networks (DNNs) need to be optimized. Binary Deep Neural Networks (BDNNs) have been shown to be an effective way of achieving this objective. In this paper, we show how Convolutional Neural Networks (CNNs) can be implemented using binary representations. Espresso is a compact, yet powerful library written in C/CUDA that features all the functionalities required for the forward propagation of CNNs, in a binary file less than 400KB, without any external dependencies. Although it is mainly designed to take advantage of massive GPU parallelism, Espresso also provides an equivalent CPU implementation for CNNs. Espresso provides special convolutional and dense layers for BCNNs, leveraging bit-packing and bit-wise computations for efficient execution. These techniques provide a speed-up of matrix-multiplication routines, and at the same time, reduce memory usage when storing parameters and activations. We experimentally show that Espresso is significantly faster than existing implementations of optimized binary neural networks (~ 2 orders of magnitude). Espresso is released under the Apache 2.0 license and is available at http://github.com/organization/project. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=Sk6fD5yCb |
https://openreview.net/pdf?id=Sk6fD5yCb | |
PWC | https://paperswithcode.com/paper/espresso-efficient-forward-propagation-for-1 |
Repo | |
Framework | |
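The bit-packing and bit-wise computation mentioned in the abstract above rest on a standard identity: for vectors with entries in {-1, +1}, the dot product equals twice the number of matching positions minus the length. The following is a minimal sketch of that identity in plain Python, not Espresso's C/CUDA code; `pack_bits` and `binary_dot` are hypothetical names.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an int; bit i is set when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word

def binary_dot(a_bits, b_bits, n):
    """Dot product of two {-1, +1} vectors of length n from their packed words."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask     # XNOR: 1 where the signs match
    matches = bin(agree).count("1")       # population count
    return 2 * matches - n                # matches minus mismatches

a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]

reference = sum(x * y for x, y in zip(a, b))           # ordinary dot product
packed = binary_dot(pack_bits(a), pack_bits(b), len(a))
```

On real hardware the same computation would use fixed-width words and a native popcount instruction, so one machine word replaces dozens of multiply-accumulate operations.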
Post-training for Deep Learning
Title | Post-training for Deep Learning |
Authors | Thomas Moreau, Julien Audiffren |
Abstract | One of the main challenges of deep learning methods is the choice of an appropriate training strategy. In particular, additional steps, such as unsupervised pre-training, have been shown to greatly improve the performances of deep structures. In this article, we propose an extra training step, called post-training, which only optimizes the last layer of the network. We show that this procedure can be analyzed in the context of kernel theory, with the first layers computing an embedding of the data and the last layer a statistical model to solve the task based on this embedding. This step makes sure that the embedding, or representation, of the data is used in the best possible way for the considered task. This idea is then tested on multiple architectures with various data sets, showing that it consistently provides a boost in performance. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=H1O0KGC6b |
https://openreview.net/pdf?id=H1O0KGC6b | |
PWC | https://paperswithcode.com/paper/post-training-for-deep-learning |
Repo | |
Framework | |
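The post-training step described in the abstract above, optimizing only the last layer on top of a fixed embedding, can be sketched on a toy regression problem. This is a minimal illustration, assuming a frozen random tanh feature map in place of trained first layers and plain gradient descent on squared error; the paper's actual objective, regularization, and architectures are not reproduced, and all names and constants are hypothetical.

```python
import math
import random

random.seed(0)

# Hypothetical frozen "first layers": a fixed random tanh feature map.
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(8)]

def embed(x):
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]

def predict(w_last, feats):
    return sum(w * f for w, f in zip(w_last, feats))

def mse(w_last, data):
    return sum((predict(w_last, f) - y) ** 2 for f, y in data) / len(data)

# Toy regression data; embeddings are computed once because they never change.
points = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(50)]
data = [(embed(x), x[0] - 0.5 * x[1]) for x in points]

w_last = [random.uniform(-0.1, 0.1) for _ in range(8)]
loss_before = mse(w_last, data)

# Post-training: gradient descent on the last layer only (a convex problem).
lr = 0.05
for _ in range(200):
    grad = [0.0] * len(w_last)
    for feats, y in data:
        err = predict(w_last, feats) - y
        for j, f in enumerate(feats):
            grad[j] += 2 * err * f / len(data)
    w_last = [w - lr * g for w, g in zip(w_last, grad)]

loss_after = mse(w_last, data)
```

With the embedding fixed, the last-layer problem is convex for squared error, which is why a simple solver suffices here.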
Proposed Method for Annotation of Scientific Arguments in Terms of Semantic Relations and Argument Schemes
Title | Proposed Method for Annotation of Scientific Arguments in Terms of Semantic Relations and Argument Schemes |
Authors | Nancy Green |
Abstract | This paper presents a proposed method for annotation of scientific arguments in biological/biomedical journal articles. Semantic entities and relations are used to represent the propositional content of arguments in instances of argument schemes. We describe an experiment in which we encoded the arguments in a journal article to identify issues in this approach. Our catalogue of argument schemes and a copy of the annotated article are now publicly available. |
Tasks | Argument Mining |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-5213/ |
https://www.aclweb.org/anthology/W18-5213 | |
PWC | https://paperswithcode.com/paper/proposed-method-for-annotation-of-scientific |
Repo | |
Framework | |