Paper Group NANR 201
Academic-Industrial Perspective on the Development and Deployment of a Moderation System for a Newspaper Website
Title | Academic-Industrial Perspective on the Development and Deployment of a Moderation System for a Newspaper Website |
Authors | Dietmar Schabus, Marcin Skowron |
Abstract | |
Tasks | Information Retrieval, Text Classification |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1253/ |
https://www.aclweb.org/anthology/L18-1253 | |
PWC | https://paperswithcode.com/paper/academic-industrial-perspective-on-the |
Repo | |
Framework | |
Bayesian Distributed Stochastic Gradient Descent
Title | Bayesian Distributed Stochastic Gradient Descent |
Authors | Michael Teng, Frank Wood |
Abstract | We introduce Bayesian distributed stochastic gradient descent (BDSGD), a high-throughput algorithm for training deep neural networks on parallel clusters. This algorithm uses amortized inference in a deep generative model to perform joint posterior predictive inference of mini-batch gradient computation times in a compute cluster specific manner. Specifically, our algorithm mitigates the straggler effect in synchronous, gradient-based optimization by choosing an optimal cutoff beyond which mini-batch gradient messages from slow workers are ignored. In our experiments, we show that eagerly discarding the mini-batch gradient computations of stragglers not only increases throughput but actually increases the overall rate of convergence as a function of wall-clock time by virtue of eliminating idleness. The principal novel contribution and finding of this work goes beyond this by demonstrating that using the predicted run-times from a generative model of cluster worker performance improves substantially over the static-cutoff prior art, leading to reduced deep neural net training times on large computer clusters. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7874-bayesian-distributed-stochastic-gradient-descent |
http://papers.nips.cc/paper/7874-bayesian-distributed-stochastic-gradient-descent.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-distributed-stochastic-gradient |
Repo | |
Framework | |
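The cutoff idea in the abstract above can be illustrated with a small sketch. It assumes the predicted per-worker run-times are already available (the paper obtains them by amortized inference in a generative model, which is not reproduced here) and uses a simple gradients-per-second criterion to pick how many workers to wait for. The function name `choose_cutoff` and the throughput criterion are illustrative assumptions, not the authors' algorithm.

```python
def choose_cutoff(predicted_runtimes):
    """Return (k, rate): wait for the k fastest workers, ignore the rest.

    predicted_runtimes: predicted seconds per mini-batch gradient, one
    positive number per worker (assumed given; hypothetical input).
    rate is k gradients divided by the time the k-th fastest worker needs.
    """
    ordered = sorted(predicted_runtimes)
    best_k, best_rate = 1, 1.0 / ordered[0]
    for k, t in enumerate(ordered, start=1):
        rate = k / t  # gradients received per second if we stop at worker k
        if rate > best_rate:
            best_k, best_rate = k, rate
    return best_k, best_rate


# Four workers finish close together; the fifth is a straggler.
runtimes = [1.0, 1.1, 1.2, 1.3, 5.0]
k, rate = choose_cutoff(runtimes)
print(k, round(rate, 3))
```

With these toy numbers the straggler is dropped: waiting for four workers yields about three gradients per second, while waiting for all five yields one.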
AmbientGAN: Generative models from lossy measurements
Title | AmbientGAN: Generative models from lossy measurements |
Authors | Ashish Bora, Eric Price, Alexandros G. Dimakis |
Abstract | Generative models provide a way to model structure in complex distributions and have been shown to be useful for many tasks of practical interest. However, current techniques for training generative models require access to fully-observed samples. In many settings, it is expensive or even impossible to obtain fully-observed samples, but economical to obtain partial, noisy observations. We consider the task of learning an implicit generative model given only lossy measurements of samples from the distribution of interest. We show that the true underlying distribution can be provably recovered even in the presence of per-sample information loss for a class of measurement models. Based on this, we propose a new method of training Generative Adversarial Networks (GANs) which we call AmbientGAN. On three benchmark datasets, and for various measurement models, we demonstrate substantial qualitative and quantitative improvements. Generative models trained with our method can obtain 2-4x higher inception scores than the baselines. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=Hy7fDog0b |
https://openreview.net/pdf?id=Hy7fDog0b | |
PWC | https://paperswithcode.com/paper/ambientgan-generative-models-from-lossy |
Repo | |
Framework | |
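The training setup described in the abstract above can be sketched as follows: generator samples are passed through the same lossy measurement process as the real data before the discriminator sees them. The measurement model used here (dropping each pixel independently with probability `p`) and all function names are assumptions chosen for illustration; the generator and discriminator networks are omitted.

```python
import random


def block_pixels(image, p, rng):
    """Lossy measurement (assumed model): zero each pixel with probability p."""
    return [[0.0 if rng.random() < p else px for px in row] for row in image]


def discriminator_inputs(real_measurements, generated_images, p, rng):
    """Build the two batches a discriminator would compare.

    Real data is only ever available as measurements, so generated images
    are measured the same way before being shown to the discriminator.
    """
    fake_measurements = [block_pixels(img, p, rng) for img in generated_images]
    return real_measurements, fake_measurements


rng = random.Random(0)
clean = [[1.0] * 4 for _ in range(4)]      # stand-in for a generator sample
measured = block_pixels(clean, 0.5, rng)   # what the discriminator would see
```

Because the discriminator only compares measured samples, the generator is pushed to produce clean images whose measurements match the observed ones.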
Deep Network for the Integrated 3D Sensing of Multiple People in Natural Images
Title | Deep Network for the Integrated 3D Sensing of Multiple People in Natural Images |
Authors | Andrei Zanfir, Elisabeta Marinoiu, Mihai Zanfir, Alin-Ionut Popa, Cristian Sminchisescu |
Abstract | We present MubyNet – a feed-forward, multitask, bottom up system for the integrated localization, as well as 3d pose and shape estimation, of multiple people in monocular images. The challenge is the formal modeling of the problem that intrinsically requires discrete and continuous computation, e.g. grouping people vs. predicting 3d pose. The model identifies human body structures (joints and limbs) in images, groups them based on 2d and 3d information fused using learned scoring functions, and optimally aggregates such responses into partial or complete 3d human skeleton hypotheses under kinematic tree constraints, but without knowing in advance the number of people in the scene and their visibility relations. We design a multi-task deep neural network with differentiable stages where the person grouping problem is formulated as an integer program based on learned body part scores parameterized by both 2d and 3d information. This avoids suboptimality resulting from separate 2d and 3d reasoning, with grouping performed based on the combined representation. The final stage of 3d pose and shape prediction is based on a learned attention process where information from different human body parts is optimally integrated. State-of-the-art results are obtained in large scale datasets like Human3.6M and Panoptic, and qualitatively by reconstructing the 3d shape and pose of multiple people, under occlusion, in difficult monocular images. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8061-deep-network-for-the-integrated-3d-sensing-of-multiple-people-in-natural-images |
http://papers.nips.cc/paper/8061-deep-network-for-the-integrated-3d-sensing-of-multiple-people-in-natural-images.pdf | |
PWC | https://paperswithcode.com/paper/deep-network-for-the-integrated-3d-sensing-of |
Repo | |
Framework | |
Linear RGB-D SLAM for Planar Environments
Title | Linear RGB-D SLAM for Planar Environments |
Authors | Pyojin Kim, Brian Coltin, H. Jin Kim |
Abstract | We propose a new formulation for including orthogonal planar features as a global model into a linear SLAM approach based on sequential Bayesian filtering. Previous planar SLAM algorithms estimate the camera poses and multiple landmark planes in a pose graph optimization. However, since it is formulated as a high dimensional nonlinear optimization problem, there is no guarantee the algorithm will converge to the global optimum. To overcome these limitations, we present a new SLAM method that jointly estimates camera position and planar landmarks in the map within a linear Kalman filter framework. It is rotations that make the SLAM problem highly nonlinear. Therefore, we solve for the rotational motion of the camera using structural regularities in the Manhattan world (MW), resulting in a linear SLAM formulation. We test our algorithm on standard RGB-D benchmarks as well as additional large indoor environments, demonstrating comparable performance to other state-of-the-art SLAM methods without the use of expensive nonlinear optimization. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Pyojin_Kim_Linear_RGB-D_SLAM_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Pyojin_Kim_Linear_RGB-D_SLAM_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/linear-rgb-d-slam-for-planar-environments |
Repo | |
Framework | |
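To illustrate why a linear measurement model is convenient, here is a minimal scalar Kalman measurement update. It is a generic textbook step under the assumption that rotation has already been resolved, so that the measurement `z` depends linearly on the state `x`; it is not the paper's full filter, and the variable names are illustrative.

```python
def kalman_update(x, P, z, H, R):
    """Scalar Kalman measurement update for z = H * x + noise.

    x, P: prior state estimate and its variance
    z:    measurement
    H:    linear measurement coefficient
    R:    measurement noise variance
    """
    S = H * P * H + R            # innovation variance
    K = P * H / S                # Kalman gain
    x_new = x + K * (z - H * x)  # correct the estimate with the residual
    P_new = (1 - K * H) * P      # uncertainty shrinks after the update
    return x_new, P_new


x_post, P_post = kalman_update(x=0.0, P=1.0, z=2.0, H=1.0, R=1.0)
print(x_post, P_post)
```

The update is a closed-form computation with no iterative optimization, which is the property the abstract contrasts with nonlinear pose graph methods.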
Apollo at SemEval-2018 Task 9: Detecting Hypernymy Relations Using Syntactic Dependencies
Title | Apollo at SemEval-2018 Task 9: Detecting Hypernymy Relations Using Syntactic Dependencies |
Authors | Mihaela Onofrei, Ionuț Hulub, Diana Trandabăț, Daniela Gîfu |
Abstract | This paper presents the participation of Apollo's team in the SemEval-2018 Task 9 "Hypernym Discovery", Subtask 1: "General-Purpose Hypernym Discovery", which tries to produce a ranked list of hypernyms for a specific term. We propose a novel approach for automatic extraction of hypernymy relations from a corpus by using dependency patterns. We estimated that the application of these patterns leads to a higher score than using the traditional lexical patterns. |
Tasks | Hypernym Discovery |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1146/ |
https://www.aclweb.org/anthology/S18-1146 | |
PWC | https://paperswithcode.com/paper/apollo-at-semeval-2018-task-9-detecting |
Repo | |
Framework | |
NLP_HZ at SemEval-2018 Task 9: a Nearest Neighbor Approach
Title | NLP_HZ at SemEval-2018 Task 9: a Nearest Neighbor Approach |
Authors | Wei Qiu, Mosha Chen, Linlin Li, Luo Si |
Abstract | Hypernym discovery aims to discover the hypernym word sets given a hyponym word and a suitable corpus. This paper proposes a simple but effective method for the discovery of hypernym sets based on word embeddings, which can be used to measure the contextual similarities between words. Given a test hyponym word, we compute the similarities between it and the words in the training data, and fill the test word's hypernym list with the hypernym list of the most similar training word. In SemEval-2018 Task 9, our results rank 1st on Spanish, 2nd on Italian, and 6th on English on the MAP metric. |
Tasks | Hypernym Discovery, Information Retrieval, Natural Language Inference, Question Answering, Word Sense Disambiguation |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1148/ |
https://www.aclweb.org/anthology/S18-1148 | |
PWC | https://paperswithcode.com/paper/nlp_hz-at-semeval-2018-task-9-a-nearest |
Repo | |
Framework | |
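The nearest-neighbor procedure described in the abstract above can be sketched in a few lines. The toy vectors, the word lists, and the function names are made up for illustration, and cosine similarity is assumed as the similarity measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_hypernyms(query_vec, train_vecs, train_hypernyms):
    """Copy the hypernym list of the training word closest to the query."""
    nearest = max(train_vecs, key=lambda w: cosine(query_vec, train_vecs[w]))
    return train_hypernyms[nearest]


# Hypothetical 2-d embeddings and gold hypernym lists for two training words.
train_vecs = {"dog": [1.0, 0.1], "rose": [0.1, 1.0]}
train_hypernyms = {"dog": ["animal", "mammal"], "rose": ["plant", "flower"]}

prediction = predict_hypernyms([0.9, 0.2], train_vecs, train_hypernyms)
print(prediction)
```

The query vector lies closest to "dog", so the sketch returns that word's hypernym list unchanged.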
ADAPT at SemEval-2018 Task 9: Skip-Gram Word Embeddings for Unsupervised Hypernym Discovery in Specialised Corpora
Title | ADAPT at SemEval-2018 Task 9: Skip-Gram Word Embeddings for Unsupervised Hypernym Discovery in Specialised Corpora |
Authors | Alfredo Maldonado, Filip Klubička |
Abstract | This paper describes a simple but competitive unsupervised system for hypernym discovery. The system uses skip-gram word embeddings with negative sampling, trained on specialised corpora. Candidate hypernyms for an input word are predicted based on cosine similarity scores. Two sets of word embedding models were trained separately on two specialised corpora: a medical corpus and a music industry corpus. Our system scored highest in the medical domain among the competing unsupervised systems but performed poorly on the music industry domain. Our system does not depend on any external data other than raw specialised corpora. |
Tasks | Hypernym Discovery, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1151/ |
https://www.aclweb.org/anthology/S18-1151 | |
PWC | https://paperswithcode.com/paper/adapt-at-semeval-2018-task-9-skip-gram-word |
Repo | |
Framework | |
Improving a Neural-based Tagger for Multiword Expressions Identification
Title | Improving a Neural-based Tagger for Multiword Expressions Identification |
Authors | Dušan Variš, Natalia Klyueva |
Abstract | |
Tasks | Dependency Parsing, Machine Translation |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1401/ |
https://www.aclweb.org/anthology/L18-1401 | |
PWC | https://paperswithcode.com/paper/improving-a-neural-based-tagger-for-multiword |
Repo | |
Framework | |
TD Learning with Constrained Gradients
Title | TD Learning with Constrained Gradients |
Authors | Ishan Durugkar, Peter Stone |
Abstract | Temporal Difference Learning with function approximation is known to be unstable. Previous work like Sutton et al. (2009a;b) has presented alternative objectives that are stable to minimize. However, in practice, TD-learning with neural networks requires various tricks like using a target network that updates slowly (Mnih et al., 2015). In this work we propose a constraint on the TD update that minimizes change to the target values. This constraint can be applied to the gradients of any TD objective, and can be easily applied to nonlinear function approximation. We validate this update by applying our technique to deep Q-learning, and training without a target network. We also show that adding this constraint on Baird’s counterexample keeps Q-learning from diverging. |
Tasks | Q-Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=Bk-ofQZRb |
https://openreview.net/pdf?id=Bk-ofQZRb | |
PWC | https://paperswithcode.com/paper/td-learning-with-constrained-gradients |
Repo | |
Framework | |
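One way to picture a constraint that "minimizes change to the target values" is to remove from the TD gradient its component along the gradient of the next-state value, so that, to first order, the update leaves the target unchanged. The projection below is an assumed reading of the abstract, not a confirmed reproduction of the authors' update, and the function name is hypothetical.

```python
def constrain_gradient(g_td, g_next):
    """Project g_td onto the subspace orthogonal to g_next.

    g_td:   gradient of the TD objective with respect to the parameters
    g_next: gradient of the next-state (target) value with respect to
            the same parameters
    """
    dot = sum(a * b for a, b in zip(g_td, g_next))
    sq_norm = sum(b * b for b in g_next)
    if sq_norm == 0.0:
        return list(g_td)  # target does not depend on the parameters
    scale = dot / sq_norm
    return [a - scale * b for a, b in zip(g_td, g_next)]


g = constrain_gradient([1.0, 1.0], [1.0, 0.0])
print(g)
```

In the toy call, the component of the update along the target's gradient is removed and only the orthogonal part remains.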
Latent Gaussian Activity Propagation: Using Smoothness and Structure to Separate and Localize Sounds in Large Noisy Environments
Title | Latent Gaussian Activity Propagation: Using Smoothness and Structure to Separate and Localize Sounds in Large Noisy Environments |
Authors | Daniel Johnson, Daniel Gorelik, Ross E. Mawhorter, Kyle Suver, Weiqing Gu, Steven Xing, Cody Gabriel, Peter Sankhagowit |
Abstract | We present an approach for simultaneously separating and localizing multiple sound sources using recorded microphone data. Inspired by topic models, our approach is based on a probabilistic model of inter-microphone phase differences, and poses separation and localization as a Bayesian inference problem. We assume sound activity is locally smooth across time, frequency, and location, and use the known position of the microphones to obtain a consistent separation. We compare the performance of our method against existing algorithms on simulated anechoic voice data and find that it obtains high performance across a variety of input conditions. |
Tasks | Bayesian Inference, Topic Models |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7606-latent-gaussian-activity-propagation-using-smoothness-and-structure-to-separate-and-localize-sounds-in-large-noisy-environments |
http://papers.nips.cc/paper/7606-latent-gaussian-activity-propagation-using-smoothness-and-structure-to-separate-and-localize-sounds-in-large-noisy-environments.pdf | |
PWC | https://paperswithcode.com/paper/latent-gaussian-activity-propagation-using |
Repo | |
Framework | |
THE EFFECTIVENESS OF A TWO-LAYER NEURAL NETWORK FOR RECOMMENDATIONS
Title | THE EFFECTIVENESS OF A TWO-LAYER NEURAL NETWORK FOR RECOMMENDATIONS |
Authors | Oleg Rybakov, Vijai Mohan, Avishkar Misra, Scott LeGrand, Rejith Joseph, Kiuk Chung, Siddharth Singh, Qian You, Eric Nalisnick, Leo Dirac, Runfei Luo |
Abstract | We present a personalized recommender system using a neural network for recommending products, such as eBooks, audio-books, Mobile Apps, Video and Music. It produces recommendations based on a customer’s implicit feedback history such as purchases, listens or watches. Our key contribution is to formulate the recommendation problem as a model that encodes historical behavior to predict the future behavior using a soft data split, combining predictor and auto-encoder models. We introduce a convolutional layer for learning the importance (time decay) of the purchases depending on their purchase date and demonstrate that the shape of the time decay function can be well approximated by a parametric function. We present offline experimental results showing that neural networks with two hidden layers can capture seasonality changes, and at the same time outperform other modeling techniques, including our recommender in production. Most importantly, we demonstrate that our model can be scaled to all digital categories, and we observe significant improvements in an online A/B test. We also discuss key enhancements to the neural network model and describe our production pipeline. Finally, we open-sourced our deep learning library, which supports multi-GPU model-parallel training. This is an important feature in building neural network based recommenders with large dimensionality of input and output data. |
Tasks | Recommendation Systems |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=B1lMMx1CW |
https://openreview.net/pdf?id=B1lMMx1CW | |
PWC | https://paperswithcode.com/paper/the-effectiveness-of-a-two-layer-neural |
Repo | |
Framework | |
Modeling bilingual word associations as connected monolingual networks
Title | Modeling bilingual word associations as connected monolingual networks |
Authors | Yevgen Matusevych, Amir Ardalan Kalantari Dehaghi, Suzanne Stevenson |
Abstract | |
Tasks | |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/W18-0106/ |
https://www.aclweb.org/anthology/W18-0106 | |
PWC | https://paperswithcode.com/paper/modeling-bilingual-word-associations-as |
Repo | |
Framework | |
Prediction Under Uncertainty with Error Encoding Networks
Title | Prediction Under Uncertainty with Error Encoding Networks |
Authors | Mikael Henaff, Junbo Zhao, Yann LeCun |
Abstract | In this work we introduce a new framework for performing temporal predictions in the presence of uncertainty. It is based on a simple idea of disentangling components of the future state which are predictable from those which are inherently unpredictable, and encoding the unpredictable components into a low-dimensional latent variable which is fed into the forward model. Our method uses a simple supervised training objective which is fast and easy to train. We evaluate it in the context of video prediction on multiple datasets and show that it is able to consistently generate diverse predictions without the need for alternating minimization over a latent space or adversarial training. |
Tasks | Video Prediction |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HJIhGXWCZ |
https://openreview.net/pdf?id=HJIhGXWCZ | |
PWC | https://paperswithcode.com/paper/prediction-under-uncertainty-with-error-1 |
Repo | |
Framework | |
Igevorse at SemEval-2018 Task 10: Exploring an Impact of Word Embeddings Concatenation for Capturing Discriminative Attributes
Title | Igevorse at SemEval-2018 Task 10: Exploring an Impact of Word Embeddings Concatenation for Capturing Discriminative Attributes |
Authors | Maxim Grishin |
Abstract | This paper presents a comparison of several approaches for capturing discriminative attributes and considers the impact of concatenating several word embeddings of different nature on classification performance. A similarity-based method is proposed and compared with classical machine learning approaches. It is shown that this method outperforms the others on all the considered word vector models and that performance increases when concatenated datasets are used. |
Tasks | Semantic Textual Similarity, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1164/ |
https://www.aclweb.org/anthology/S18-1164 | |
PWC | https://paperswithcode.com/paper/igevorse-at-semeval-2018-task-10-exploring-an |
Repo | |
Framework | |