Paper Group ANR 790
Accelerated Primal-Dual Proximal Block Coordinate Updating Methods for Constrained Convex Optimization
Title | Accelerated Primal-Dual Proximal Block Coordinate Updating Methods for Constrained Convex Optimization |
Authors | Yangyang Xu, Shuzhong Zhang |
Abstract | Block Coordinate Update (BCU) methods enjoy low per-update computational complexity because each update touches only one or a few block variables among a possibly large number of blocks. They are also easily parallelized and have thus been particularly popular for solving problems involving large-scale datasets and/or many variables. In this paper, we propose a primal-dual BCU method for solving linearly constrained convex programs in multi-block variables. The method is an accelerated version of a primal-dual algorithm proposed by the authors, which applies randomization in selecting block variables to update and establishes an $O(1/t)$ convergence rate under a weak convexity assumption. We show that the rate can be accelerated to $O(1/t^2)$ if the objective is strongly convex. In addition, if one block variable is independent of the others in the objective, we show that the algorithm can be modified to achieve a linear rate of convergence. The numerical experiments show that the accelerated method performs stably with a single set of parameters, while the original method needs its parameters tuned for different datasets in order to achieve a comparable level of performance. |
Tasks | |
Published | 2017-02-17 |
URL | http://arxiv.org/abs/1702.05423v3 |
http://arxiv.org/pdf/1702.05423v3.pdf | |
PWC | https://paperswithcode.com/paper/accelerated-primal-dual-proximal-block |
Repo | |
Framework | |
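To make the "update only one block at a time" idea in the abstract above concrete, here is a minimal sketch of plain randomized coordinate descent on an unconstrained strongly convex quadratic. It is not the authors' accelerated primal-dual method (there are no linear constraints or dual updates here), and the function name, matrix, and iteration count are illustrative assumptions.

```python
import random


def randomized_coordinate_descent(A, b, iters=200, seed=0):
    """Minimize 0.5 * x^T A x - b^T x for a symmetric positive definite A.

    Illustrative sketch only: each step picks one coordinate (a block of
    size 1) uniformly at random and minimizes the objective exactly over it.
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        i = rng.randrange(n)  # randomized block selection
        # Exact minimization over x[i] with all other coordinates fixed.
        residual = b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = residual / A[i][i]
    return x


# Hypothetical 2x2 example; the exact minimizer solves A x = b.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = randomized_coordinate_descent(A, b)
```

Each step costs O(n) instead of the O(n^2) of a full gradient evaluation, which is the low per-update cost the abstract refers to.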
Dataset Augmentation in Feature Space
Title | Dataset Augmentation in Feature Space |
Authors | Terrance DeVries, Graham W. Taylor |
Abstract | Dataset augmentation, the practice of applying a wide array of domain-specific transformations to synthetically expand a training set, is a standard tool in supervised learning. While effective in tasks such as visual recognition, the set of transformations must be carefully designed, implemented, and tested for every new domain, limiting its re-use and generality. In this paper, we adopt a simpler, domain-agnostic approach to dataset augmentation. We start with existing data points and apply simple transformations such as adding noise, interpolating, or extrapolating between them. Our main insight is to perform the transformation not in input space, but in a learned feature space. A re-kindling of interest in unsupervised representation learning makes this technique timely and more effective. It is a simple proposal, but to-date one that has not been tested empirically. Working in the space of context vectors generated by sequence-to-sequence models, we demonstrate a technique that is effective for both static and sequential data. |
Tasks | Representation Learning, Unsupervised Representation Learning |
Published | 2017-02-17 |
URL | http://arxiv.org/abs/1702.05538v1 |
http://arxiv.org/pdf/1702.05538v1.pdf | |
PWC | https://paperswithcode.com/paper/dataset-augmentation-in-feature-space |
Repo | |
Framework | |
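The three transformations named in the abstract above (adding noise, interpolating, extrapolating) can be sketched on plain feature vectors as follows. This is a minimal illustration, assuming the vectors are already outputs of some learned encoder; the function names, the step size `lam`, and the noise scale `sigma` are hypothetical and not taken from the paper.

```python
import random


def add_noise(c, sigma=0.1, rng=random):
    """Perturb a feature vector with independent Gaussian noise."""
    return [x + rng.gauss(0.0, sigma) for x in c]


def interpolate(c_j, c_k, lam=0.5):
    """Move c_j a fraction lam of the way toward a neighbour c_k."""
    return [a + lam * (b - a) for a, b in zip(c_j, c_k)]


def extrapolate(c_j, c_k, lam=0.5):
    """Move c_j a fraction lam of the distance away from a neighbour c_k."""
    return [a + lam * (a - b) for a, b in zip(c_j, c_k)]


# Hypothetical feature vectors standing in for encoder outputs.
c_j = [0.0, 2.0]
c_k = [2.0, 4.0]
between = interpolate(c_j, c_k)   # lies on the segment between c_j and c_k
beyond = extrapolate(c_j, c_k)    # lies on the same line, on the far side of c_j
noisy = add_noise(c_j, rng=random.Random(0))
```

In practice the neighbour `c_k` would presumably be drawn from the same class as `c_j` so that the synthetic point can keep the label of `c_j`; that choice is an assumption of this sketch.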
Proceedings of the 2017 AdKDD & TargetAd Workshop
Title | Proceedings of the 2017 AdKDD & TargetAd Workshop |
Authors | Abraham Bagherjeiran, Nemanja Djuric, Mihajlo Grbovic, Kuang-Chih Lee, Kun Liu, Vladan Radosavljevic, Suju Rajan |
Abstract | Proceedings of the 2017 AdKDD and TargetAd Workshop held in conjunction with the 23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining Halifax, Nova Scotia, Canada. |
Tasks | |
Published | 2017-07-11 |
URL | http://arxiv.org/abs/1707.03471v1 |
http://arxiv.org/pdf/1707.03471v1.pdf | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-2017-adkdd-targetad |
Repo | |
Framework | |
Light Field Super-Resolution Via Graph-Based Regularization
Title | Light Field Super-Resolution Via Graph-Based Regularization |
Authors | Mattia Rossi, Pascal Frossard |
Abstract | Light field cameras capture the 3D information in a scene with a single exposure. This special feature makes light field cameras very appealing for a variety of applications, from post-capture refocus to depth estimation and image-based rendering. However, light field cameras suffer by design from strong limitations in their spatial resolution, which should therefore be augmented by computational methods. On the one hand, off-the-shelf single-frame and multi-frame super-resolution algorithms are not ideal for light field data, as they do not consider its particular structure. On the other hand, the few super-resolution algorithms explicitly tailored for light field data exhibit significant limitations, such as the need to estimate an explicit disparity map at each view. In this work we propose a new light field super-resolution algorithm meant to address these limitations. We adopt a multi-frame-like super-resolution approach, where the complementary information in the different light field views is used to augment the spatial resolution of the whole light field. We show that coupling the multi-frame approach with a graph regularizer that enforces the light field structure via nonlocal self-similarities makes it possible to avoid the costly and challenging disparity estimation step for all the views. Extensive experiments show that the new algorithm compares favorably to the other state-of-the-art methods for light field super-resolution, both in terms of PSNR and visual quality. |
Tasks | Depth Estimation, Disparity Estimation, Multi-Frame Super-Resolution, Super-Resolution |
Published | 2017-01-09 |
URL | http://arxiv.org/abs/1701.02141v2 |
http://arxiv.org/pdf/1701.02141v2.pdf | |
PWC | https://paperswithcode.com/paper/light-field-super-resolution-via-graph-based |
Repo | |
Framework | |
Strawman: an Ensemble of Deep Bag-of-Ngrams for Sentiment Analysis
Title | Strawman: an Ensemble of Deep Bag-of-Ngrams for Sentiment Analysis |
Authors | Kyunghyun Cho |
Abstract | This paper describes a builder entry, named “strawman”, to the sentence-level sentiment analysis task of the “Build It, Break It” shared task of the First Workshop on Building Linguistically Generalizable NLP Systems. The goal of a builder is to provide an automated sentiment analyzer that would serve as a target for breakers whose goal is to find pairs of minimally-differing sentences that break the analyzer. |
Tasks | Sentiment Analysis |
Published | 2017-07-26 |
URL | http://arxiv.org/abs/1707.08939v1 |
http://arxiv.org/pdf/1707.08939v1.pdf | |
PWC | https://paperswithcode.com/paper/strawman-an-ensemble-of-deep-bag-of-ngrams |
Repo | |
Framework | |
Optical Character Recognition (OCR) for Telugu: Database, Algorithm and Application
Title | Optical Character Recognition (OCR) for Telugu: Database, Algorithm and Application |
Authors | Chandra Prakash Konkimalla, Manikanta Srikar Yellapragada, Trishal Gayam, Souraj Mandal, Sumohana S. Channappayya |
Abstract | Telugu is a Dravidian language spoken by more than 80 million people worldwide. The optical character recognition (OCR) of the Telugu script has wide ranging applications including education, health-care, administration etc. The beautiful Telugu script however is very different from Germanic scripts like English and German. This makes the use of transfer learning of Germanic OCR solutions to Telugu a non-trivial task. To address the challenge of OCR for Telugu, we make three contributions in this work: (i) a database of Telugu characters, (ii) a deep learning based OCR algorithm, and (iii) a client server solution for the online deployment of the algorithm. For the benefit of the Telugu people and the research community, we will make our code freely available at https://gayamtrishal.github.io/OCR_Telugu.github.io/ |
Tasks | Optical Character Recognition, Transfer Learning |
Published | 2017-11-20 |
URL | http://arxiv.org/abs/1711.07245v2 |
http://arxiv.org/pdf/1711.07245v2.pdf | |
PWC | https://paperswithcode.com/paper/optical-character-recognition-ocr-for-telugu |
Repo | |
Framework | |
Better Agnostic Clustering Via Relaxed Tensor Norms
Title | Better Agnostic Clustering Via Relaxed Tensor Norms |
Authors | Pravesh K. Kothari, Jacob Steinhardt |
Abstract | We develop a new family of convex relaxations for $k$-means clustering based on sum-of-squares norms, a relaxation of the injective tensor norm that is efficiently computable using the Sum-of-Squares algorithm. We give an algorithm based on this relaxation that recovers a faithful approximation to the true means in the given data whenever the low-degree moments of the points in each cluster have bounded sum-of-squares norms. We then prove a sharp upper bound on the sum-of-squares norms for moment tensors of any distribution that satisfies the Poincaré inequality. The Poincaré inequality is a central inequality in probability theory, and a large class of distributions satisfy it, including Gaussians, product distributions, strongly log-concave distributions, and any sum or uniformly continuous transformation of such distributions. As an immediate corollary, for any $\gamma > 0$, we obtain an efficient algorithm for learning the means of a mixture of $k$ arbitrary Poincaré distributions in $\mathbb{R}^d$ in time $d^{O(1/\gamma)}$ so long as the means have separation $\Omega(k^{\gamma})$. This in particular yields an algorithm for learning Gaussian mixtures with separation $\Omega(k^{\gamma})$, thus partially resolving an open problem of Regev and Vijayaraghavan (2017). Our algorithm works even in the outlier-robust setting where an $\epsilon$ fraction of arbitrary outliers are added to the data, as long as the fraction of outliers is smaller than the smallest cluster. We therefore obtain results in the strong agnostic setting where, in addition to not knowing the distribution family, the data itself may be arbitrarily corrupted. |
Tasks | |
Published | 2017-11-20 |
URL | http://arxiv.org/abs/1711.07465v1 |
http://arxiv.org/pdf/1711.07465v1.pdf | |
PWC | https://paperswithcode.com/paper/better-agnostic-clustering-via-relaxed-tensor |
Repo | |
Framework | |
Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning
Title | Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning |
Authors | Frank E. Curtis, Katya Scheinberg |
Abstract | The goal of this tutorial is to introduce key models, algorithms, and open questions related to the use of optimization methods for solving problems arising in machine learning. It is written with an INFORMS audience in mind, specifically those readers who are familiar with the basics of optimization algorithms, but less familiar with machine learning. We begin by deriving a formulation of a supervised learning problem and show how it leads to various optimization problems, depending on the context and underlying assumptions. We then discuss some of the distinctive features of these optimization problems, focusing on the examples of logistic regression and the training of deep neural networks. The latter half of the tutorial focuses on optimization algorithms, first for convex logistic regression, for which we discuss the use of first-order methods, the stochastic gradient method, variance reducing stochastic methods, and second-order methods. Finally, we discuss how these approaches can be employed to the training of deep neural networks, emphasizing the difficulties that arise from the complex, nonconvex structure of these models. |
Tasks | |
Published | 2017-06-30 |
URL | http://arxiv.org/abs/1706.10207v1 |
http://arxiv.org/pdf/1706.10207v1.pdf | |
PWC | https://paperswithcode.com/paper/optimization-methods-for-supervised-machine |
Repo | |
Framework | |
Leveraging Conversation Structure on Social Media to Identify Potentially Influential Users
Title | Leveraging Conversation Structure on Social Media to Identify Potentially Influential Users |
Authors | Dario De Nart, Dante Degl’Innocenti, Marco Pavan |
Abstract | Some social networks let their community provide feedback on comments, which makes it possible to identify opinion leaders and users whose positions are unwelcome. Other platforms are not backed by such tools. Obtaining a picture of the community’s reactions to published content is a non-trivial problem. In this work we propose a novel approach using Abstract Argumentation Frameworks (AAF) and machine learning to describe interactions between users. Our experiments provide evidence that modelling the flow of a conversation with the primitives of AAF can support the identification of users who produce consistently appreciated content without modelling such content. |
Tasks | Abstract Argumentation |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.10768v1 |
http://arxiv.org/pdf/1711.10768v1.pdf | |
PWC | https://paperswithcode.com/paper/leveraging-conversation-structure-on-social |
Repo | |
Framework | |
The Complete Extensions do not form a Complete Semilattice
Title | The Complete Extensions do not form a Complete Semilattice |
Authors | Anthony P. Young |
Abstract | In his seminal paper that inaugurated abstract argumentation, Dung proved that the set of complete extensions forms a complete semilattice with respect to set inclusion. In this note we demonstrate that this proof is incorrect with counterexamples. We then trace the error in the proof and explain why it arose. We then examine the implications for the grounded extension. [Reason for withdrawal continued] Page 4, Example 2 is not a counterexample to Dung 1995 Theorem 25(3). It was believed to be a counterexample because the author misunderstood “glb” to be set-theoretic intersection. But in this case, “glb” is defined to be other than set-theoretic intersection such that Theorem 25(3) is true. The author was motivated to fully understand the lattice-theoretic claims of Dung 1995 in writing this note and was not aware that this issue is probably folklore; the author bears full responsibility for this error. |
Tasks | Abstract Argumentation |
Published | 2017-10-15 |
URL | http://arxiv.org/abs/1710.05341v2 |
http://arxiv.org/pdf/1710.05341v2.pdf | |
PWC | https://paperswithcode.com/paper/the-complete-extensions-do-not-form-a |
Repo | |
Framework | |
Algorithmic detectability threshold of the stochastic block model
Title | Algorithmic detectability threshold of the stochastic block model |
Authors | Tatsuro Kawamoto |
Abstract | The assumption that the values of model parameters are known or correctly learned, i.e., the Nishimori condition, is one of the requirements for the detectability analysis of the stochastic block model in statistical inference. In practice, however, there is no example demonstrating that we can know the model parameters beforehand, and there is no guarantee that the model parameters can be learned accurately. In this study, we consider the expectation–maximization (EM) algorithm with belief propagation (BP) and derive its algorithmic detectability threshold. Our analysis is not restricted to the community structure, but includes general modular structures. Because the algorithm cannot always learn the planted model parameters correctly, the algorithmic detectability threshold is qualitatively different from the one with the Nishimori condition. |
Tasks | |
Published | 2017-10-24 |
URL | http://arxiv.org/abs/1710.08841v2 |
http://arxiv.org/pdf/1710.08841v2.pdf | |
PWC | https://paperswithcode.com/paper/algorithmic-detectability-threshold-of-the |
Repo | |
Framework | |
Beyond Planar Symmetry: Modeling human perception of reflection and rotation symmetries in the wild
Title | Beyond Planar Symmetry: Modeling human perception of reflection and rotation symmetries in the wild |
Authors | Christopher Funk, Yanxi Liu |
Abstract | Humans take advantage of real world symmetries for various tasks, yet capturing their superb symmetry perception mechanism with a computational model remains elusive. Motivated by a new study demonstrating the extremely high inter-person accuracy of human perceived symmetries in the wild, we have constructed the first deep-learning neural network for reflection and rotation symmetry detection (Sym-NET), trained on photos from MS-COCO (Microsoft-Common Object in COntext) dataset with nearly 11K consistent symmetry-labels from more than 400 human observers. We employ novel methods to convert discrete human labels into symmetry heatmaps, capture symmetry densely in an image and quantitatively evaluate Sym-NET against multiple existing computer vision algorithms. On CVPR 2013 symmetry competition testsets and unseen MS-COCO photos, Sym-NET significantly outperforms all other competitors. Beyond mathematically well-defined symmetries on a plane, Sym-NET demonstrates abilities to identify viewpoint-varied 3D symmetries, partially occluded symmetrical objects, and symmetries at a semantic level. |
Tasks | |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03568v2 |
http://arxiv.org/pdf/1704.03568v2.pdf | |
PWC | https://paperswithcode.com/paper/beyond-planar-symmetry-modeling-human |
Repo | |
Framework | |
Neural Response Generation with Dynamic Vocabularies
Title | Neural Response Generation with Dynamic Vocabularies |
Authors | Yu Wu, Wei Wu, Dejian Yang, Can Xu, Zhoujun Li, Ming Zhou |
Abstract | We study response generation for open domain conversation in chatbots. Existing methods assume that words in responses are generated from an identical vocabulary regardless of their inputs, which not only makes them vulnerable to generic patterns and irrelevant noise, but also causes a high cost in decoding. We propose a dynamic vocabulary sequence-to-sequence (DVS2S) model which allows each input to possess their own vocabulary in decoding. In training, vocabulary construction and response generation are jointly learned by maximizing a lower bound of the true objective with a Monte Carlo sampling method. In inference, the model dynamically allocates a small vocabulary for an input with the word prediction model, and conducts decoding only with the small vocabulary. Because of the dynamic vocabulary mechanism, DVS2S eludes many generic patterns and irrelevant words in generation, and enjoys efficient decoding at the same time. Experimental results on both automatic metrics and human annotations show that DVS2S can significantly outperform state-of-the-art methods in terms of response quality, but only requires 60% decoding time compared to the most efficient baseline. |
Tasks | |
Published | 2017-11-30 |
URL | http://arxiv.org/abs/1711.11191v1 |
http://arxiv.org/pdf/1711.11191v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-response-generation-with-dynamic |
Repo | |
Framework | |
Probabilistic Active Learning of Functions in Structural Causal Models
Title | Probabilistic Active Learning of Functions in Structural Causal Models |
Authors | Paul K. Rubenstein, Ilya Tolstikhin, Philipp Hennig, Bernhard Schoelkopf |
Abstract | We consider the problem of learning the functions computing children from parents in a Structural Causal Model once the underlying causal graph has been identified. This is in some sense the second step after causal discovery. Taking a probabilistic approach to estimating these functions, we derive a natural myopic active learning scheme that identifies the intervention which is optimally informative about all of the unknown functions jointly, given previously observed data. We test the derived algorithms on simple examples, to demonstrate that they produce a structured exploration policy that significantly improves on unstructured base-lines. |
Tasks | Active Learning, Causal Discovery |
Published | 2017-06-30 |
URL | http://arxiv.org/abs/1706.10234v1 |
http://arxiv.org/pdf/1706.10234v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-active-learning-of-functions-in |
Repo | |
Framework | |
A Performance Evaluation of Local Features for Image Based 3D Reconstruction
Title | A Performance Evaluation of Local Features for Image Based 3D Reconstruction |
Authors | Bin Fan, Qingqun Kong, Xinchao Wang, Zhiheng Wang, Shiming Xiang, Chunhong Pan, Pascal Fua |
Abstract | This paper performs a comprehensive and comparative evaluation of state-of-the-art local features for the task of image based 3D reconstruction. The evaluated local features cover both recently developed ones built with powerful machine learning techniques and elaborately designed handcrafted features. To obtain a comprehensive evaluation, we include both float type features and binary ones. Two kinds of datasets have been used in this evaluation. One is a dataset of many different scene types with groundtruth 3D points, containing images of different scenes captured at fixed positions, for quantitative performance evaluation of different local features in controlled image capturing situations. The other dataset contains Internet scale image sets of several landmarks with a lot of unrelated images, which is used for qualitative performance evaluation of different local features in free image collection situations. Our experimental results show that binary features are competent to reconstruct scenes from controlled image sequences with only a fraction of the processing time needed when using float type features. However, for the case of large scale image sets with many distracting images, float type features show a clear advantage over binary ones. |
Tasks | 3D Reconstruction |
Published | 2017-12-14 |
URL | http://arxiv.org/abs/1712.05271v1 |
http://arxiv.org/pdf/1712.05271v1.pdf | |
PWC | https://paperswithcode.com/paper/a-performance-evaluation-of-local-features |
Repo | |
Framework | |