Paper Group AWR 94
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
Title | How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation |
Authors | Chia-Wei Liu, Ryan Lowe, Iulian V. Serban, Michael Noseworthy, Laurent Charlin, Joelle Pineau |
Abstract | We investigate evaluation metrics for dialogue response generation systems where supervised labels, such as task completion, are not available. Recent works in response generation have adopted metrics from machine translation to compare a model’s generated response to a single target response. We show that these metrics correlate very weakly with human judgements in the non-technical Twitter domain, and not at all in the technical Ubuntu domain. We provide quantitative and qualitative results highlighting specific weaknesses in existing metrics, and provide recommendations for future development of better automatic evaluation metrics for dialogue systems. |
Tasks | Machine Translation |
Published | 2016-03-25 |
URL | http://arxiv.org/abs/1603.08023v2 |
PDF | http://arxiv.org/pdf/1603.08023v2.pdf |
PWC | https://paperswithcode.com/paper/how-not-to-evaluate-your-dialogue-system-an |
Repo | https://github.com/piekey1994/IOM |
Framework | pytorch |
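
The paper's core measurement is easy to sketch: score each generated response against its single reference with a word-overlap metric, then correlate those scores with human ratings. A minimal illustration follows, using NLTK's BLEU and a Spearman correlation; all dialogues and ratings below are toy placeholders, not the paper's data.

```python
# Toy version of the paper's core experiment: does a word-overlap metric
# (BLEU here) track human judgements of response quality?
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from scipy.stats import spearmanr

references = [["try", "restarting", "the", "network", "service"],
              ["use", "apt", "to", "install", "it"],
              ["that", "sounds", "great"],
              ["i", "have", "no", "idea"]]
responses = [["have", "you", "tried", "rebooting"],
             ["apt", "install", "should", "work"],
             ["awesome", "see", "you", "there"],
             ["not", "sure", "sorry"]]
human_scores = [4.0, 5.0, 3.0, 2.0]   # e.g. 1-5 appropriateness ratings (toy)

smooth = SmoothingFunction().method1
bleu = [sentence_bleu([ref], hyp, smoothing_function=smooth)
        for ref, hyp in zip(references, responses)]

rho, p = spearmanr(bleu, human_scores)  # the paper finds this correlation is weak
print(f"Spearman rho = {rho:.3f} (p = {p:.3f})")
```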
Semantic Compositional Networks for Visual Captioning
Title | Semantic Compositional Networks for Visual Captioning |
Authors | Zhe Gan, Chuang Gan, Xiaodong He, Yunchen Pu, Kenneth Tran, Jianfeng Gao, Lawrence Carin, Li Deng |
Abstract | A Semantic Compositional Network (SCN) is developed for image captioning, in which semantic concepts (i.e., tags) are detected from the image, and the probability of each tag is used to compose the parameters in a long short-term memory (LSTM) network. The SCN extends each weight matrix of the LSTM to an ensemble of tag-dependent weight matrices. The degree to which each member of the ensemble is used to generate an image caption is tied to the image-dependent probability of the corresponding tag. In addition to captioning images, we also extend the SCN to generate captions for video clips. We qualitatively analyze semantic composition in SCNs, and quantitatively evaluate the algorithm on three benchmark datasets: COCO, Flickr30k, and Youtube2Text. Experimental results show that the proposed method significantly outperforms prior state-of-the-art approaches, across multiple evaluation metrics. |
Tasks | Image Captioning, Semantic Composition |
Published | 2016-11-23 |
URL | http://arxiv.org/abs/1611.08002v2 |
PDF | http://arxiv.org/pdf/1611.08002v2.pdf |
PWC | https://paperswithcode.com/paper/semantic-compositional-networks-for-visual |
Repo | https://github.com/zhegan27/Semantic_Compositional_Nets |
Framework | none |
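
The parameter-composition idea is compact enough to sketch. Below is the naive, unfactored form of the ensemble described in the abstract: each LSTM weight matrix becomes a tag-probability-weighted mixture of per-tag matrices. The paper actually uses a low-rank factorization of this ensemble to keep the parameter count manageable; all sizes and names here are illustrative.

```python
# Naive SCN-style weight composition: W(s) = sum_k s_k * W_k, where s holds
# the detected tag probabilities for one image.
import numpy as np

K, d_in, d_out = 10, 64, 128             # K semantic tags (toy sizes)
rng = np.random.default_rng(0)
W_k = rng.normal(size=(K, d_out, d_in))  # one weight matrix per tag

def compose_weights(tag_probs):
    """Image-dependent weights: mixture of tag-dependent matrices."""
    return np.einsum("k,kij->ij", tag_probs, W_k)

s = rng.dirichlet(np.ones(K))  # tag probabilities from the image tagger
W = compose_weights(s)         # used in place of a fixed W in each LSTM gate
x = rng.normal(size=d_in)
print((W @ x).shape)           # (128,)
```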
Learning Shape Abstractions by Assembling Volumetric Primitives
Title | Learning Shape Abstractions by Assembling Volumetric Primitives |
Authors | Shubham Tulsiani, Hao Su, Leonidas J. Guibas, Alexei A. Efros, Jitendra Malik |
Abstract | We present a learning framework for abstracting complex shapes by learning to assemble objects using 3D volumetric primitives. In addition to generating simple and geometrically interpretable explanations of 3D objects, our framework also allows us to automatically discover and exploit consistent structure in the data. We demonstrate that using our method allows predicting shape representations which can be leveraged for obtaining a consistent parsing across the instances of a shape collection and constructing an interpretable shape similarity measure. We also examine applications for image-based prediction as well as shape manipulation. |
Tasks | |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00404v4 |
PDF | http://arxiv.org/pdf/1612.00404v4.pdf |
PWC | https://paperswithcode.com/paper/learning-shape-abstractions-by-assembling |
Repo | https://github.com/paschalidoud/superquadric_parsing |
Framework | pytorch |
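
To make the output representation concrete, here is a toy sketch of a shape abstracted as a small set of cuboid primitives. The cuboids are axis-aligned for simplicity; the actual method also predicts rotations and trains the assembly end-to-end with coverage and consistency losses, so everything below is illustrative only.

```python
# A shape abstraction as an assembly of cuboid primitives (toy, axis-aligned).
import numpy as np

class Cuboid:
    def __init__(self, dims, center):
        self.dims = np.asarray(dims, float)      # half-extents (w, h, d)
        self.center = np.asarray(center, float)

    def contains(self, pts):
        """Boolean mask: which 3D query points fall inside this primitive."""
        return np.all(np.abs(pts - self.center) <= self.dims, axis=-1)

# A "chair" as a seat plus four legs
seat = Cuboid(dims=(0.5, 0.05, 0.5), center=(0, 0.5, 0))
legs = [Cuboid(dims=(0.05, 0.25, 0.05), center=(sx * 0.4, 0.25, sz * 0.4))
        for sx in (-1, 1) for sz in (-1, 1)]

pts = np.random.rand(1000, 3) * 2 - 1            # query points in [-1, 1]^3
covered = seat.contains(pts) | np.any([leg.contains(pts) for leg in legs], axis=0)
print(f"{covered.mean():.1%} of sampled points covered by the assembly")
```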
Search Personalization with Embeddings
Title | Search Personalization with Embeddings |
Authors | Thanh Vu, Dat Quoc Nguyen, Mark Johnson, Dawei Song, Alistair Willis |
Abstract | Recent research has shown that the performance of search personalization depends on the richness of user profiles which normally represent the user’s topical interests. In this paper, we propose a new embedding approach to learning user profiles, where users are embedded on a topical interest space. We then directly utilize the user profiles for search personalization. Experiments on query logs from a major commercial web search engine demonstrate that our embedding approach improves the performance of the search engine and also achieves better search performance than other strong baselines. |
Tasks | |
Published | 2016-12-12 |
URL | http://arxiv.org/abs/1612.03597v1 |
PDF | http://arxiv.org/pdf/1612.03597v1.pdf |
PWC | https://paperswithcode.com/paper/search-personalization-with-embeddings |
Repo | https://github.com/daiquocnguyen/CapsE |
Framework | tf |
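
The abstract does not spell out how the learned profiles enter ranking, so the sketch below makes an assumption: re-rank the engine's results by interpolating the original score with user-document cosine similarity in the shared topical space. The mixing weight `alpha` and the scoring details are illustrative, not the paper's exact formulation.

```python
# Assumed personalization scheme: blend engine score with user-document
# similarity in the topical-interest embedding space.
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(1)
user = rng.normal(size=50)                       # learned user-profile embedding
docs = {f"doc{i}": rng.normal(size=50) for i in range(5)}
engine_scores = dict(zip(docs, [0.9, 0.8, 0.7, 0.6, 0.5]))

alpha = 0.5                                      # interpolation weight (assumed)
personalized = {d: alpha * engine_scores[d] + (1 - alpha) * cosine(user, v)
                for d, v in docs.items()}
for d in sorted(personalized, key=personalized.get, reverse=True):
    print(d, round(personalized[d], 3))
```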
Hand Segmentation for Hand-Object Interaction from Depth map
Title | Hand Segmentation for Hand-Object Interaction from Depth map |
Authors | Byeongkeun Kang, Kar-Han Tan, Nan Jiang, Hung-Shuo Tai, Daniel Tretter, Truong Q. Nguyen |
Abstract | Hand segmentation for hand-object interaction is a necessary preprocessing step in many applications such as augmented reality, medical applications, and human-robot interaction. However, typical methods are based on color information, which is not robust to objects with skin color, differences in skin pigment, and variations in lighting conditions. Thus, we propose a hand segmentation method for hand-object interaction that uses only a depth map. The task is challenging because of the small depth difference between a hand and objects during an interaction. To overcome this challenge, we propose a two-stage random decision forest (RDF) method consisting of detecting hands and then segmenting hands. To validate the proposed method, we demonstrate results on the publicly available dataset of hand segmentation for hand-object interaction. The proposed method achieves high accuracy in a short processing time compared to other state-of-the-art methods. |
Tasks | Hand Segmentation |
Published | 2016-03-08 |
URL | http://arxiv.org/abs/1603.02345v3 |
PDF | http://arxiv.org/pdf/1603.02345v3.pdf |
PWC | https://paperswithcode.com/paper/hand-segmentation-for-hand-object-interaction |
Repo | https://github.com/byeongkeun-kang/HOI-dataset |
Framework | none |
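
A minimal sketch of the two-stage pipeline the abstract describes: a first forest detects hand regions, and a second forest labels hand pixels only within detected regions. The per-pixel features stand in for the depth-difference features commonly used in depth-based RDFs; the exact features and training protocol are assumptions, and the data below is synthetic.

```python
# Two-stage random decision forest: stage 1 detects hand regions,
# stage 2 segments hand pixels inside detected regions.
from sklearn.ensemble import RandomForestClassifier
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))                   # toy per-pixel depth features
region_labels = (X[:, 0] > 0).astype(int)        # stage 1: hand region vs. not
pixel_labels = (X[:, 1] > 0).astype(int)         # stage 2: hand pixel vs. object

detector = RandomForestClassifier(n_estimators=50).fit(X, region_labels)
segmenter = RandomForestClassifier(n_estimators=50).fit(
    X[region_labels == 1], pixel_labels[region_labels == 1])

X_test = rng.normal(size=(100, 8))
in_region = detector.predict(X_test) == 1        # segment only where a hand was detected
mask = np.zeros(len(X_test), dtype=int)
mask[in_region] = segmenter.predict(X_test[in_region])
print(mask.sum(), "pixels labelled as hand")
```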
Low-Rank Factorization of Determinantal Point Processes for Recommendation
Title | Low-Rank Factorization of Determinantal Point Processes for Recommendation |
Authors | Mike Gartrell, Ulrich Paquet, Noam Koenigstein |
Abstract | Determinantal point processes (DPPs) have garnered attention as an elegant probabilistic model of set diversity. They are useful for a number of subset selection tasks, including product recommendation. DPPs are parametrized by a positive semi-definite kernel matrix. In this work we present a new method for learning the DPP kernel from observed data using a low-rank factorization of this kernel. We show that this low-rank factorization enables a learning algorithm that is nearly an order of magnitude faster than previous approaches, while also providing a method for computing product recommendation predictions that is far faster (up to 20x faster or more for large item catalogs) than previous techniques that involve a full-rank DPP kernel. Furthermore, we show that our method provides equivalent or sometimes better predictive performance than prior full-rank DPP approaches, and better performance than several other competing recommendation methods in many cases. We conduct an extensive experimental evaluation using several real-world datasets in the domain of product recommendation to demonstrate the utility of our method, along with its limitations. |
Tasks | Point Processes, Product Recommendation |
Published | 2016-02-17 |
URL | http://arxiv.org/abs/1602.05436v1 |
PDF | http://arxiv.org/pdf/1602.05436v1.pdf |
PWC | https://paperswithcode.com/paper/low-rank-factorization-of-determinantal-point |
Repo | https://github.com/mankmonjre/k-DPP-reco-engine |
Framework | none |
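
A sketch of the low-rank parameterization and the set probability it defines: the kernel is L = VVᵀ with V of rank K ≪ N, and the probability of a basket Y is det(L_Y) / det(L + I). By Sylvester's determinant identity the normalizer reduces to a K×K determinant, which is where much of the speedup comes from. Item factors and the basket below are toy placeholders.

```python
# Low-rank DPP: L = V V^T, P(Y) = det(L_Y) / det(L + I).
import numpy as np

N, K = 100, 10                         # catalog size, latent rank
rng = np.random.default_rng(0)
V = rng.normal(size=(N, K))            # learned item factors
L = V @ V.T                            # low-rank DPP kernel

def subset_log_prob(items):
    L_Y = L[np.ix_(items, items)]
    _, logdet_Y = np.linalg.slogdet(L_Y)
    # Dual-form normalizer: det(L + I_N) = det(V^T V + I_K), only K x K
    _, logdet_Z = np.linalg.slogdet(V.T @ V + np.eye(K))
    return logdet_Y - logdet_Z

print(subset_log_prob([3, 17, 42]))    # log-probability of one observed basket
```

Learning amounts to maximizing this log-likelihood over observed baskets with respect to V, which is far cheaper than working with a full-rank N×N kernel.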
A Semisupervised Approach for Language Identification based on Ladder Networks
Title | A Semisupervised Approach for Language Identification based on Ladder Networks |
Authors | Ehud Ben-Reuven, Jacob Goldberger |
Abstract | In this study we address the problem of training a neural network for language identification using both labeled and unlabeled speech samples in the form of i-vectors. We propose a neural network architecture that can also handle out-of-set languages. We utilize a modified version of the recently proposed Ladder Network semisupervised training procedure that optimizes the reconstruction costs of a stack of denoising autoencoders. We show that this approach can be successfully applied to the case where the training dataset is composed of both labeled and unlabeled acoustic data. The results show enhanced language identification on the NIST 2015 language identification dataset. |
Tasks | Denoising, Language Identification |
Published | 2016-04-01 |
URL | http://arxiv.org/abs/1604.00317v1 |
PDF | http://arxiv.org/pdf/1604.00317v1.pdf |
PWC | https://paperswithcode.com/paper/a-semisupervised-approach-for-language |
Repo | https://github.com/udibr/LRE |
Framework | none |
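
A condensed sketch of the objective: a shared encoder serves a supervised path (labeled i-vectors → language posteriors) and a denoising path whose reconstruction cost is computed for all samples, labeled or not. A full Ladder Network adds per-layer lateral connections and per-layer reconstruction costs; the 400-dimensional i-vectors, the 11-way output (10 languages plus out-of-set), and all layer sizes are assumptions.

```python
# Condensed ladder-style semisupervised loss: supervised cross-entropy on
# labeled data + denoising reconstruction cost on labeled and unlabeled data.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(400, 128), nn.ReLU(), nn.Linear(128, 64), nn.ReLU())
classifier = nn.Linear(64, 11)           # 10 target languages + out-of-set (assumed)
decoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 400))

def ladder_loss(x_labeled, y, x_unlabeled, noise=0.3, recon_weight=1.0):
    sup = F.cross_entropy(classifier(encoder(x_labeled)), y)   # labeled only
    x_all = torch.cat([x_labeled, x_unlabeled])                # everything
    z = encoder(x_all + noise * torch.randn_like(x_all))       # corrupt, encode
    recon = F.mse_loss(decoder(z), x_all)                      # denoise
    return sup + recon_weight * recon

loss = ladder_loss(torch.randn(32, 400), torch.randint(0, 11, (32,)),
                   torch.randn(64, 400))
loss.backward()
```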
Learning Semantically and Additively Compositional Distributional Representations
Title | Learning Semantically and Additively Compositional Distributional Representations |
Authors | Ran Tian, Naoaki Okazaki, Kentaro Inui |
Abstract | This paper connects a vector-based composition model to a formal semantics, the Dependency-based Compositional Semantics (DCS). We show theoretical evidence that the vector compositions in our model conform to the logic of DCS. Experimentally, we show that vector-based composition maps similar phrases to similar vectors, achieving near state-of-the-art performance on a wide range of phrase similarity tasks and relation classification; meanwhile, DCS can guide the construction of vectors for structured queries that can be directly executed. We evaluate this utility on the sentence completion task and report a new state-of-the-art. |
Tasks | Relation Classification |
Published | 2016-06-08 |
URL | http://arxiv.org/abs/1606.02461v1 |
PDF | http://arxiv.org/pdf/1606.02461v1.pdf |
PWC | https://paperswithcode.com/paper/learning-semantically-and-additively |
Repo | https://github.com/tianran/vecdcs |
Framework | none |
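
A toy illustration of the additive-composition idea: word vectors are transformed according to their syntactic/semantic role and summed, so near-synonymous phrases land near each other while role swaps do not. The role matrices and the exact DCS-conformant operations in the paper are richer; everything below, including the synthetic "synonyms", is illustrative.

```python
# Additive composition with role-specific linear maps (toy).
import numpy as np

rng = np.random.default_rng(0)
d = 50
red, car = rng.normal(size=d), rng.normal(size=d)
word_vec = {
    "red": red,
    "crimson": red + 0.1 * rng.normal(size=d),     # near-synonym (synthetic)
    "car": car,
    "automobile": car + 0.1 * rng.normal(size=d),
}
M_mod = rng.normal(size=(d, d)) / np.sqrt(d)       # modifier-role map (assumed)
M_head = rng.normal(size=(d, d)) / np.sqrt(d)      # head-role map (assumed)

def phrase(modifier, head):
    return M_mod @ word_vec[modifier] + M_head @ word_vec[head]

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cos(phrase("red", "car"), phrase("crimson", "automobile")))  # high
print(cos(phrase("red", "car"), phrase("automobile", "crimson")))  # lower: roles matter
```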
Capacity and Trainability in Recurrent Neural Networks
Title | Capacity and Trainability in Recurrent Neural Networks |
Authors | Jasmine Collins, Jascha Sohl-Dickstein, David Sussillo |
Abstract | Two potential bottlenecks on the expressiveness of recurrent neural networks (RNNs) are their ability to store information about the task in their parameters, and to store information about the input history in their units. We show experimentally that all common RNN architectures achieve nearly the same per-task and per-unit capacity bounds with careful training, for a variety of tasks and stacking depths. They can store an amount of task information which is linear in the number of parameters, and is approximately 5 bits per parameter. They can additionally store approximately one real number from their input history per hidden unit. We further find that for several tasks it is the per-task parameter capacity bound that determines performance. These results suggest that many previous results comparing RNN architectures are driven primarily by differences in training effectiveness, rather than differences in capacity. Supporting this observation, we compare training difficulty for several architectures, and show that vanilla RNNs are far more difficult to train, yet have slightly higher capacity. Finally, we propose two novel RNN architectures, one of which is easier to train than the LSTM or GRU for deeply stacked architectures. |
Tasks | |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09913v3 |
PDF | http://arxiv.org/pdf/1611.09913v3.pdf |
PWC | https://paperswithcode.com/paper/capacity-and-trainability-in-recurrent-neural |
Repo | https://github.com/trevor-richardson/rnn_zoo |
Framework | pytorch |
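
The headline numbers translate into a quick capacity estimate for any given architecture. A back-of-envelope sketch, with illustrative sizes:

```python
# Applying the paper's empirical findings: ~5 bits of task information per
# parameter, and roughly one input value stored per hidden unit.
hidden, input_dim = 256, 256                      # illustrative sizes
lstm_params = 4 * (input_dim * hidden + hidden * hidden + hidden)
print(f"~{5 * lstm_params:,} bits (~{5 * lstm_params / 8 / 1024:.0f} KiB) of task storage")
print(f"~{hidden} recent input values recoverable from the hidden state")
```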
The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables
Title | The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables |
Authors | Chris J. Maddison, Andriy Mnih, Yee Whye Teh |
Abstract | The reparameterization trick enables optimizing large scale stochastic computation graphs via gradient descent. The essence of the trick is to refactor each stochastic node into a differentiable function of its parameters and a random variable with fixed distribution. After refactoring, the gradients of the loss propagated by the chain rule through the graph are low variance unbiased estimators of the gradients of the expected loss. While many continuous random variables have such reparameterizations, discrete random variables lack useful reparameterizations due to the discontinuous nature of discrete states. In this work we introduce Concrete random variables—continuous relaxations of discrete random variables. The Concrete distribution is a new family of distributions with closed form densities and a simple reparameterization. Whenever a discrete stochastic node of a computation graph can be refactored into a one-hot bit representation that is treated continuously, Concrete stochastic nodes can be used with automatic differentiation to produce low-variance biased gradients of objectives (including objectives that depend on the log-probability of latent stochastic nodes) on the corresponding discrete graph. We demonstrate the effectiveness of Concrete relaxations on density estimation and structured prediction tasks using neural networks. |
Tasks | Density Estimation, Structured Prediction |
Published | 2016-11-02 |
URL | http://arxiv.org/abs/1611.00712v3 |
PDF | http://arxiv.org/pdf/1611.00712v3.pdf |
PWC | https://paperswithcode.com/paper/the-concrete-distribution-a-continuous |
Repo | https://github.com/tensorflow/models |
Framework | tf |
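
The relaxation itself fits in a few lines: perturb the logits with Gumbel noise and push the result through a tempered softmax. As the temperature approaches zero the sample approaches a one-hot discrete sample, while the whole expression stays differentiable in the logits. A minimal sketch:

```python
# Sampling a Concrete (Gumbel-Softmax) relaxation of a categorical variable.
import torch

def concrete_sample(logits, temperature=0.5):
    gumbel = -torch.log(-torch.log(torch.rand_like(logits)))  # Gumbel(0, 1) noise
    return torch.softmax((logits + gumbel) / temperature, dim=-1)

logits = torch.tensor([1.0, 0.2, -0.5], requires_grad=True)
y = concrete_sample(logits)   # relaxed "one-hot" sample, e.g. [0.87, 0.09, 0.04]
y[0].backward()               # gradients flow through the sample to the logits
print(y, logits.grad)
```

PyTorch now ships this same relaxation as `torch.nn.functional.gumbel_softmax`.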
Improved Stereo Matching with Constant Highway Networks and Reflective Confidence Learning
Title | Improved Stereo Matching with Constant Highway Networks and Reflective Confidence Learning |
Authors | Amit Shaked, Lior Wolf |
Abstract | We present an improved three-step pipeline for the stereo matching problem and introduce multiple novelties at each stage. We propose a new highway network architecture for computing the matching cost at each possible disparity, based on multilevel weighted residual shortcuts, trained with a hybrid loss that supports multilevel comparison of image patches. A novel post-processing step is then introduced, which employs a second deep convolutional neural network for pooling global information from multiple disparities. This network outputs both the image disparity map, which replaces the conventional “winner takes all” strategy, and a confidence in the prediction. The confidence score is achieved by training the network with a new technique that we call the reflective loss. Lastly, the learned confidence is employed in order to better detect outliers in the refinement step. The proposed pipeline achieves state-of-the-art accuracy on the largest and most competitive stereo benchmarks, and the learned confidence is shown to outperform all existing alternatives. |
Tasks | Stereo Matching |
Published | 2016-12-31 |
URL | http://arxiv.org/abs/1701.00165v1 |
PDF | http://arxiv.org/pdf/1701.00165v1.pdf |
PWC | https://paperswithcode.com/paper/improved-stereo-matching-with-constant |
Repo | https://github.com/amitshaked/resmatch |
Framework | torch |
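
To ground the first stage of the pipeline: for every disparity d, each left-image position is compared with the right image shifted by d, producing a cost volume over disparities. In the paper that comparison is made by the constant highway network (and later pooled by a second CNN); the elementwise-product similarity below is a stand-in, and all sizes are toy.

```python
# Building a cost volume and the "winner takes all" baseline the paper replaces.
import numpy as np

def cost_volume(left, right, max_disp):
    H, W = left.shape
    vol = np.full((max_disp, H, W), -np.inf)
    for d in range(max_disp):
        vol[d, :, d:] = left[:, d:] * right[:, :W - d]   # stand-in similarity
    return vol

rng = np.random.default_rng(0)
left, right = rng.normal(size=(2, 60, 80))               # toy feature maps
vol = cost_volume(left, right, max_disp=32)
disparity = vol.argmax(axis=0)                           # winner-takes-all
print(disparity.shape)                                   # (60, 80)
```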
Adaptive Computation Time for Recurrent Neural Networks
Title | Adaptive Computation Time for Recurrent Neural Networks |
Authors | Alex Graves |
Abstract | This paper introduces Adaptive Computation Time (ACT), an algorithm that allows recurrent neural networks to learn how many computational steps to take between receiving an input and emitting an output. ACT requires minimal changes to the network architecture, is deterministic and differentiable, and does not add any noise to the parameter gradients. Experimental results are provided for four synthetic problems: determining the parity of binary vectors, applying binary logic operations, adding integers, and sorting real numbers. Overall, performance is dramatically improved by the use of ACT, which successfully adapts the number of computational steps to the requirements of the problem. We also present character-level language modelling results on the Hutter prize Wikipedia dataset. In this case ACT does not yield large gains in performance; however it does provide intriguing insight into the structure of the data, with more computation allocated to harder-to-predict transitions, such as spaces between words and ends of sentences. This suggests that ACT or other adaptive computation methods could provide a generic method for inferring segment boundaries in sequence data. |
Tasks | Language Modelling |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1603.08983v6 |
PDF | http://arxiv.org/pdf/1603.08983v6.pdf |
PWC | https://paperswithcode.com/paper/adaptive-computation-time-for-recurrent |
Repo | https://github.com/ceyzaguirre4/adaptive_computation |
Framework | pytorch |
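
The halting mechanism is the heart of ACT: the RNN "ponders" each input for a variable number of internal steps, emitting a halting probability at every step, and stops once the accumulated probability exceeds 1 − ε; the final state is the halting-weighted average of the intermediate states, with the last step receiving the leftover probability mass. A compact sketch (the per-batch scalar halting probability is a simplification; ε and sizes are illustrative):

```python
# ACT-style adaptive pondering for a single input step.
import torch
import torch.nn as nn

cell, halt = nn.GRUCell(16, 32), nn.Linear(32, 1)

def act_step(x, h, eps=0.01, max_steps=10):
    total_p, weighted_h, remainder = 0.0, torch.zeros_like(h), 1.0
    for _ in range(max_steps):
        h = cell(x, h)
        p = torch.sigmoid(halt(h)).mean().item()   # halting probability
        if total_p + p >= 1 - eps:                 # last step takes the remainder
            weighted_h = weighted_h + remainder * h
            break
        weighted_h = weighted_h + p * h
        total_p, remainder = total_p + p, remainder - p
    return weighted_h

h = act_step(torch.randn(1, 16), torch.zeros(1, 32))
print(h.shape)   # torch.Size([1, 32])
```

Training adds a small "ponder cost" to the loss so the network does not ponder indefinitely.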
Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions
Title | Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions |
Authors | Marcin Junczys-Dowmunt, Tomasz Dwojak, Hieu Hoang |
Abstract | In this paper we provide the largest published comparison of translation quality for phrase-based SMT and neural machine translation across 30 translation directions. For ten directions we also include hierarchical phrase-based MT. Experiments are performed for the recently published United Nations Parallel Corpus v1.0 and its large six-way sentence-aligned subcorpus. In the second part of the paper we investigate aspects of translation speed, introducing AmuNMT, our efficient neural machine translation decoder. We demonstrate that current neural machine translation could already be used for in-production systems when comparing words-per-second ratios. |
Tasks | Machine Translation |
Published | 2016-10-04 |
URL | http://arxiv.org/abs/1610.01108v3 |
PDF | http://arxiv.org/pdf/1610.01108v3.pdf |
PWC | https://paperswithcode.com/paper/is-neural-machine-translation-ready-for |
Repo | https://github.com/lkfo415579/amun |
Framework | none |
TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification
Title | TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification |
Authors | Georgios Balikas, Massih-Reza Amini |
Abstract | This paper describes the participation of the team “TwiSE” in the SemEval 2016 challenge. Specifically, we participated in Task 4, namely “Sentiment Analysis in Twitter”, for which we implemented sentiment classification systems for subtasks A, B, C and D. Our approach consists of two steps. In the first step, we generate and validate diverse feature sets for Twitter sentiment evaluation, inspired by the work of participants in previous editions of such challenges. In the second step, we focus on the optimization of the evaluation measures of the different subtasks. To this end, we examine different learning strategies by validating them on the data provided by the task organisers. For our final submissions we used an ensemble learning approach (stacked generalization) for Subtask A and single linear models for the rest of the subtasks. In the official leaderboard we were ranked 9/35, 8/19, 1/11 and 2/14 for subtasks A, B, C and D respectively. We make the code available for research purposes at https://github.com/balikasg/SemEval2016-Twitter_Sentiment_Evaluation. |
Tasks | Sentiment Analysis |
Published | 2016-06-14 |
URL | http://arxiv.org/abs/1606.04351v1 |
PDF | http://arxiv.org/pdf/1606.04351v1.pdf |
PWC | https://paperswithcode.com/paper/twise-at-semeval-2016-task-4-twitter |
Repo | https://github.com/balikasg/SemEval2016-Twitter_Sentiment_Evaluation |
Framework | none |
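
For the Subtask A approach, stacked generalization means training several diverse base classifiers and letting a linear meta-learner combine their cross-validated predictions. A minimal sketch with scikit-learn follows; the paper's diverse feature sets are reduced to a bag-of-words here, and the base learners, data, and labels are illustrative.

```python
# Stacked generalization for tweet sentiment (toy data, assumed base learners).
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

tweets = ["love this phone", "great battery life", "worst service ever",
          "totally broken on arrival", "it is okay i guess", "nothing special"]
labels = ["pos", "pos", "neg", "neg", "neu", "neu"]

stack = make_pipeline(
    CountVectorizer(),
    StackingClassifier(
        estimators=[("svm", LinearSVC()), ("rf", RandomForestClassifier())],
        final_estimator=LogisticRegression(),   # linear meta-learner
        cv=2,                                   # out-of-fold base predictions
    ),
)
stack.fit(tweets, labels)
print(stack.predict(["this is great"]))
```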
Geometric Mean Metric Learning
Title | Geometric Mean Metric Learning |
Authors | Pourya Habib Zadeh, Reshad Hosseini, Suvrit Sra |
Abstract | We revisit the task of learning a Euclidean metric from data. We approach this problem from first principles and formulate it as a surprisingly simple optimization problem. Indeed, our formulation even admits a closed form solution. This solution possesses several very attractive properties: (i) an innate geometric appeal through the Riemannian geometry of positive definite matrices; (ii) ease of interpretability; and (iii) computational speed several orders of magnitude faster than the widely used LMNN and ITML methods. Furthermore, on standard benchmark datasets, our closed-form solution consistently attains higher classification accuracy. |
Tasks | Metric Learning |
Published | 2016-07-18 |
URL | http://arxiv.org/abs/1607.05002v1 |
PDF | http://arxiv.org/pdf/1607.05002v1.pdf |
PWC | https://paperswithcode.com/paper/geometric-mean-metric-learning |
Repo | https://github.com/PouriaZ/GMML |
Framework | none |
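
The closed form is short enough to state and implement. With S the scatter matrix of similar-pair differences and D that of dissimilar pairs, minimizing tr(AS) + tr(A⁻¹D) gives the Riccati equation ASA = D, whose positive definite solution is the geometric mean of S⁻¹ and D: A = S^{-1/2}(S^{1/2} D S^{1/2})^{1/2} S^{-1/2}, i.e., the midpoint of the geodesic between S⁻¹ and D on the positive definite manifold. A sketch under that reading (the regularizer and synthetic pairs are illustrative):

```python
# GMML closed-form metric: A = S^{-1/2} (S^{1/2} D S^{1/2})^{1/2} S^{-1/2}.
import numpy as np
from scipy.linalg import inv, sqrtm

def gmml(sim_diffs, dis_diffs, reg=1e-6):
    d = sim_diffs.shape[1]
    S = sim_diffs.T @ sim_diffs + reg * np.eye(d)   # similar-pair scatter
    D = dis_diffs.T @ dis_diffs + reg * np.eye(d)   # dissimilar-pair scatter
    S_half = sqrtm(S)
    return np.real(inv(S_half) @ sqrtm(S_half @ D @ S_half) @ inv(S_half))

rng = np.random.default_rng(0)
A = gmml(rng.normal(size=(200, 5)), rng.normal(size=(200, 5)))  # toy pairs
x, y = rng.normal(size=(2, 5))
print(np.sqrt((x - y) @ A @ (x - y)))   # learned Mahalanobis distance
```

No iterative optimization is needed, which is what makes the method orders of magnitude faster than LMNN or ITML.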