July 29, 2019

2641 words 13 mins read

Paper Group ANR 136

On the Limitation of Convolutional Neural Networks in Recognizing Negative Images. Robust Conditional Probabilities. Optimal Densification for Fast and Accurate Minwise Hashing. MORSE: Semantic-ally Drive-n MORpheme SEgment-er. A Gamut-Mapping Framework for Color-Accurate Reproduction of HDR Images. A causal framework for explaining the predictions …

On the Limitation of Convolutional Neural Networks in Recognizing Negative Images

Title On the Limitation of Convolutional Neural Networks in Recognizing Negative Images
Authors Hossein Hosseini, Baicen Xiao, Mayoore Jaiswal, Radha Poovendran
Abstract Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance on a variety of computer vision tasks, particularly visual classification problems, where new algorithms have been reported to achieve or even surpass human performance. In this paper, we examine whether CNNs are capable of learning the semantics of training data. To this end, we evaluate CNNs on negative images, since they share the same structure and semantics as regular images and humans can classify them correctly. Our experimental results indicate that when training on regular images and testing on negative images, the model accuracy is significantly lower than when it is tested on regular images. This leads us to the conjecture that current training methods do not effectively train models to generalize the concepts. We then introduce the notion of semantic adversarial examples - transformed inputs that semantically represent the same objects, but that the model does not classify correctly - and present negative images as one class of such inputs.
Tasks
Published 2017-03-20
URL http://arxiv.org/abs/1703.06857v2
PDF http://arxiv.org/pdf/1703.06857v2.pdf
PWC https://paperswithcode.com/paper/on-the-limitation-of-convolutional-neural
Repo
Framework
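
The evaluation protocol described in the abstract is straightforward to reproduce with any trained classifier: invert the test images and compare accuracies. A minimal sketch, assuming a hypothetical `predict_fn` callable and uint8 images (both placeholders, not part of the paper's code):

```python
import numpy as np

def accuracy_on_negatives(predict_fn, images, labels):
    """Compare a classifier's accuracy on regular and negative (inverted) images.

    predict_fn: any callable mapping a batch of uint8 images to predicted labels.
    """
    negatives = 255 - images                       # the negative of each image
    acc_regular = float(np.mean(predict_fn(images) == labels))
    acc_negative = float(np.mean(predict_fn(negatives) == labels))
    return acc_regular, acc_negative
```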

Robust Conditional Probabilities

Title Robust Conditional Probabilities
Authors Yoav Wald, Amir Globerson
Abstract Conditional probabilities are a core concept in machine learning. For example, optimal prediction of a label $Y$ given an input $X$ corresponds to maximizing the conditional probability of $Y$ given $X$. A common approach to inference tasks is learning a model of conditional probabilities. However, these models are often based on strong assumptions (e.g., log-linear models), and hence their estimate of conditional probabilities is not robust and is highly dependent on the validity of their assumptions. Here we propose a framework for reasoning about conditional probabilities without assuming anything about the underlying distributions, except knowledge of their second-order marginals, which can be estimated from data. We show how this setting leads to guaranteed bounds on conditional probabilities, which can be calculated efficiently in a variety of settings, including structured prediction. Finally, we apply them to semi-supervised deep learning, obtaining results competitive with variational autoencoders.
Tasks Structured Prediction
Published 2017-08-08
URL http://arxiv.org/abs/1708.02406v1
PDF http://arxiv.org/pdf/1708.02406v1.pdf
PWC https://paperswithcode.com/paper/robust-conditional-probabilities
Repo
Framework

Optimal Densification for Fast and Accurate Minwise Hashing

Title Optimal Densification for Fast and Accurate Minwise Hashing
Authors Anshumali Shrivastava
Abstract Minwise hashing is a fundamental and one of the most successful hashing algorithms in the literature. Recent advances based on the idea of densification \cite{Proc:OneHashLSH_ICML14,Proc:Shrivastava_UAI14} have shown that it is possible to compute $k$ minwise hashes of a vector with $d$ nonzeros in a mere $(d + k)$ computations, a significant improvement over the classical $O(dk)$. These advances have led to an algorithmic improvement in the query complexity of traditional indexing algorithms based on minwise hashing. Unfortunately, the variance of the current densification techniques is unnecessarily high, which leads to significantly poorer accuracy compared to vanilla minwise hashing, especially when the data is sparse. In this paper, we provide a novel densification scheme which relies on carefully tailored 2-universal hashes. We show that the proposed scheme is variance-optimal and, without losing runtime efficiency, is significantly more accurate than existing densification techniques. As a result, we obtain a significantly more efficient hashing scheme which has the same variance and collision probability as minwise hashing. Experimental evaluations on real sparse and high-dimensional datasets validate our claims. We believe that, given its significant advantages, our method will replace minwise hashing implementations in practice.
Tasks
Published 2017-03-14
URL http://arxiv.org/abs/1703.04664v1
PDF http://arxiv.org/pdf/1703.04664v1.pdf
PWC https://paperswithcode.com/paper/optimal-densification-for-fast-and-accurate
Repo
Framework
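
The paper's contribution is the densification step itself; the sketch below only illustrates the surrounding machinery (one-permutation hashing in $O(d + k)$, plus a simple hash-based fill for empty bins) and is not the variance-optimal scheme:

```python
import random

def one_perm_minhash(nonzeros, k, seed=0, prime=2**31 - 1):
    """One-permutation minwise hashing with a simple densification pass (illustrative).

    nonzeros: iterable of integer feature ids (the nonzeros of a sparse binary
    vector; assumed non-empty). Returns k hash values; empty bins are filled by
    borrowing from occupied bins chosen with an independent 2-universal hash,
    a simplified stand-in for the paper's optimal densification.
    """
    rng = random.Random(seed)
    a, b = rng.randrange(1, prime), rng.randrange(prime)
    c, e = rng.randrange(1, prime), rng.randrange(prime)
    bins = [None] * k
    bin_width = (prime + k - 1) // k
    for x in nonzeros:                      # O(d): hash each nonzero once
        h = (a * x + b) % prime             # 2-universal hash plays the permutation
        i = h // bin_width                  # which of the k bins h falls into
        if bins[i] is None or h < bins[i]:
            bins[i] = h
    occupied = [i for i, v in enumerate(bins) if v is not None]
    for i in range(k):                      # O(k): densify the empty bins
        if bins[i] is None:
            j = occupied[((c * i + e) % prime) % len(occupied)]
            bins[i] = bins[j]
    return bins
```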

MORSE: Semantic-ally Drive-n MORpheme SEgment-er

Title MORSE: Semantic-ally Drive-n MORpheme SEgment-er
Authors Tarek Sakakini, Suma Bhat, Pramod Viswanath
Abstract We present in this paper a novel framework for morpheme segmentation which uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes. This framework is the first to consider vocabulary-wide syntactico-semantic information for this task. We also analyze the deficiencies of available benchmarking datasets and introduce our own dataset that was created on the basis of compositionality. We validate our algorithm across datasets and present state-of-the-art results.
Tasks
Published 2017-02-07
URL http://arxiv.org/abs/1702.02212v3
PDF http://arxiv.org/pdf/1702.02212v3.pdf
PWC https://paperswithcode.com/paper/morse-semantic-ally-drive-n-morpheme-segment
Repo
Framework
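
The "vocabulary-wide syntactico-semantic information" the abstract refers to can be illustrated with word embeddings: a split word = stem + suffix is supported when the difference vector (word, stem) aligns with those of other pairs sharing the suffix. A toy scoring function along those lines; the embedding table and support pairs are assumed inputs, and MORSE's full scoring model is richer:

```python
import numpy as np

def suffix_consistency(word, stem, vectors, support_pairs):
    """Score the split word = stem + suffix via embedding difference vectors.

    vectors: dict mapping words to numpy embedding vectors (assumed given).
    support_pairs: other (inflected, stem) pairs for the same suffix,
                   e.g. [("walking", "walk"), ("playing", "play")] for "-ing".
    Returns the mean cosine similarity between word - stem and the difference
    vectors of the support pairs; higher means a more consistent split.
    """
    def unit(v):
        return v / (np.linalg.norm(v) + 1e-12)
    d = unit(vectors[word] - vectors[stem])
    sims = [float(d @ unit(vectors[w] - vectors[s])) for w, s in support_pairs]
    return float(np.mean(sims)) if sims else 0.0
```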

A Gamut-Mapping Framework for Color-Accurate Reproduction of HDR Images

Title A Gamut-Mapping Framework for Color-Accurate Reproduction of HDR Images
Authors E. Sikudova, T. Pouli, A. Artusi, A. O. Akyuz, F. Banterle, Z. M. Mazlumoglu, E. Reinhard
Abstract Few tone mapping operators (TMOs) take color management into consideration, limiting compression to luminance values only. This may lead to changes in image chroma and hue, which are typically managed with a post-processing step. However, current post-processing techniques for tone reproduction do not explicitly consider the target display gamut. Gamut mapping, on the other hand, deals with mapping images from one color gamut to another, usually smaller, one, but has traditionally focused on small-scale chromatic changes. In this context, we present a novel gamut and tone management framework for color-accurate reproduction of high dynamic range (HDR) images, which is conceptually and computationally simple, parameter-free, and compatible with existing TMOs. In the CIE LCh color space, we compress chroma to fit the gamut of the output color space. This prevents hue and luminance shifts while taking gamut boundaries into consideration. We also propose a compatible lightness compression scheme that minimizes the number of color space conversions. Our results show that our gamut management method effectively compresses the chroma of tone mapped images, respecting the target gamut and without reducing image quality.
Tasks
Published 2017-11-24
URL http://arxiv.org/abs/1711.08925v1
PDF http://arxiv.org/pdf/1711.08925v1.pdf
PWC https://paperswithcode.com/paper/a-gamut-mapping-framework-for-color-accurate
Repo
Framework
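
The chroma-compression step can be pictured as a soft-knee curve that pulls out-of-gamut chroma toward a per-pixel boundary. A minimal sketch, assuming the maximum in-gamut chroma `C_max` for each pixel's lightness and hue is already available from the target display gamut; the exact compression function in the paper may differ:

```python
import numpy as np

def compress_chroma(C, C_max, knee=0.8):
    """Soft-knee chroma compression toward a gamut boundary (illustrative).

    C:     chroma of the tone-mapped image in CIE LCh.
    C_max: per-pixel maximum in-gamut chroma for the pixel's lightness and hue.
    Chroma below knee*C_max is left untouched; larger values are compressed
    smoothly so nothing exceeds C_max, leaving lightness and hue unchanged.
    """
    C, C_max = np.asarray(C, float), np.asarray(C_max, float)
    t = knee * C_max
    span = np.maximum(C_max - t, 1e-6)
    compressed = t + span * (1.0 - np.exp(-(C - t) / span))
    return np.where(C <= t, C, np.minimum(compressed, C_max))
```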

A causal framework for explaining the predictions of black-box sequence-to-sequence models

Title A causal framework for explaining the predictions of black-box sequence-to-sequence models
Authors David Alvarez-Melis, Tommi S. Jaakkola
Abstract We interpret the predictions of any black-box structured input-structured output model around a specific input-output pair. Our method returns an “explanation” consisting of groups of input-output tokens that are causally related. These dependencies are inferred by querying the black-box model with perturbed inputs, generating a graph over tokens from the responses, and solving a partitioning problem to select the most relevant components. We focus the general approach on sequence-to-sequence problems, adopting a variational autoencoder to yield meaningful input perturbations. We test our method across several NLP sequence generation tasks.
Tasks
Published 2017-07-06
URL http://arxiv.org/abs/1707.01943v3
PDF http://arxiv.org/pdf/1707.01943v3.pdf
PWC https://paperswithcode.com/paper/a-causal-framework-for-explaining-the
Repo
Framework
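
The core querying loop - perturb the input, read off the black box's outputs, and score input-output token associations - can be sketched directly; the paper additionally uses a variational autoencoder for perturbations and a graph-partitioning step, both omitted here. `black_box` is any callable from a token list to a token list:

```python
import numpy as np
from collections import Counter

def token_dependency_scores(sentence, black_box, n_samples=200, drop_p=0.3, seed=0):
    """Estimate input-output token dependencies by querying with perturbed inputs.

    For each (input token, output token) pair, compare how often the output token
    appears when the input token is kept versus dropped from the input.
    """
    rng = np.random.default_rng(seed)
    kept_n, dropped_n = Counter(), Counter()
    kept_co, dropped_co = Counter(), Counter()
    for _ in range(n_samples):
        mask = rng.random(len(sentence)) > drop_p          # which tokens to keep
        if not mask.any():
            continue
        outputs = set(black_box([t for t, keep in zip(sentence, mask) if keep]))
        for t, keep in zip(sentence, mask):
            (kept_n if keep else dropped_n)[t] += 1
            co = kept_co if keep else dropped_co
            for o in outputs:
                co[(t, o)] += 1
    scores = {}
    for (t, o), count in kept_co.items():
        p_kept = count / max(kept_n[t], 1)
        p_dropped = dropped_co[(t, o)] / max(dropped_n[t], 1)
        scores[(t, o)] = p_kept - p_dropped                # large positive = strong link
    return scores
```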

Consistency of Dirichlet Partitions

Title Consistency of Dirichlet Partitions
Authors Braxton Osting, Todd Harry Reeb
Abstract A Dirichlet $k$-partition of a domain $U \subseteq \mathbb{R}^d$ is a collection of $k$ pairwise disjoint open subsets such that the sum of their first Laplace-Dirichlet eigenvalues is minimal. A discrete version of Dirichlet partitions has been posed on graphs with applications in data analysis. Both versions admit variational formulations: solutions are characterized by minimizers of the Dirichlet energy of mappings from $U$ into a singular space $\Sigma_k \subseteq \mathbb{R}^k$. In this paper, we extend results of N. García Trillos and D. Slepčev to show that there exist solutions of the continuum problem arising as limits of solutions of a sequence of discrete problems. Specifically, a sequence of points $\{x_i\}_{i \in \mathbb{N}}$ from $U$ is sampled i.i.d. with respect to a given probability measure $\nu$ on $U$, and for all $n \in \mathbb{N}$, a geometric graph $G_n$ is constructed from the first $n$ points $x_1, x_2, \ldots, x_n$ and the pairwise distances between the points. With probability one with respect to the choice of points $\{x_i\}_{i \in \mathbb{N}}$, we show that as $n \to \infty$ the discrete Dirichlet energies for functions $G_n \to \Sigma_k$ $\Gamma$-converge to (a scalar multiple of) the continuum Dirichlet energy for functions $U \to \Sigma_k$ with respect to a metric coming from the theory of optimal transport. This, along with a compactness property for the aforementioned energies that we prove, implies the convergence of minimizers. When $\nu$ is the uniform distribution, our results also imply the statistical consistency statement that Dirichlet partitions of geometric graphs converge to partitions of the sampled space in the Hausdorff sense.
Tasks
Published 2017-08-18
URL http://arxiv.org/abs/1708.05472v1
PDF http://arxiv.org/pdf/1708.05472v1.pdf
PWC https://paperswithcode.com/paper/consistency-of-dirichlet-partitions
Repo
Framework
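
The discrete object whose Γ-limit is studied is the graph Dirichlet energy of a map into $\Sigma_k$. A small sketch of that energy on a geometric graph (unweighted and unnormalized, so without the scaling constants that appear in the convergence statement):

```python
import numpy as np

def graph_dirichlet_energy(points, u, eps):
    """Discrete Dirichlet energy of a map u on a geometric graph.

    points: (n, d) array of sampled points; points within distance eps are joined.
    u:      (n, k) array, the values of the map at each point.
    Returns the sum over edges of the squared differences of u.
    """
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    adjacency = (dist <= eps) & (dist > 0)                    # the geometric graph G_n
    du = np.sum((u[:, None, :] - u[None, :, :]) ** 2, axis=-1)
    return 0.5 * np.sum(adjacency * du)                       # each edge counted once
```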

On Nearest Neighbors in Non Local Means Denoising

Title On Nearest Neighbors in Non Local Means Denoising
Authors Iuri Frosio, Jan Kautz
Abstract To denoise a reference patch, the Non-Local Means denoising filter processes a set of neighbor patches. A few Nearest Neighbors (NN) are used to limit the computational burden of the algorithm. Here we show analytically that the NN approach introduces a bias in the denoised patch, and we propose a different neighbor-collection criterion, named Statistical NN (SNN), to alleviate this issue. Our approach outperforms the traditional one for both white and colored noise: fewer SNNs generate images of higher quality, at a lower computational cost.
Tasks Denoising
Published 2017-11-20
URL http://arxiv.org/abs/1711.07568v1
PDF http://arxiv.org/pdf/1711.07568v1.pdf
PWC https://paperswithcode.com/paper/on-nearest-neighbors-in-non-local-means
Repo
Framework
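
The selection rule can be sketched as follows. Classical NN keeps the k patches closest to the reference, which favors patches sharing the reference's own noise; the SNN idea is to keep the patches whose distance is closest to the value expected between two independent noisy observations of the same clean patch. The 2σ² target below is an assumption for additive white Gaussian noise, not the paper's exact formula for every noise model:

```python
import numpy as np

def statistical_nearest_neighbors(ref, patches, k, sigma):
    """Pick k neighbor patches with an SNN-style criterion (illustrative).

    ref:     reference patch, shape (p, p) or (p, p, c).
    patches: candidate patches, shape (m, ...) matching ref.
    Instead of the smallest mean squared distance (classical NN), select the
    patches whose distance is closest to 2*sigma**2, the expected MSE between
    two independent noisy copies of the same clean patch.
    """
    d2 = np.mean((patches - ref) ** 2, axis=tuple(range(1, patches.ndim)))
    target = 2.0 * sigma ** 2
    return np.argsort(np.abs(d2 - target))[:k]
```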

An Aposteriorical Clusterability Criterion for $k$-Means++ and Simplicity of Clustering

Title An Aposteriorical Clusterability Criterion for $k$-Means++ and Simplicity of Clustering
Authors Mieczysław A. Kłopotek
Abstract We define the notion of a well-clusterable data set, combining the point of view of the objective of the $k$-means clustering algorithm (minimising the centric spread of data elements) and common sense (clusters should be separated by gaps). We identify conditions under which the optimum of the $k$-means objective coincides with a clustering under which the data is separated by predefined gaps. We investigate two cases: when whole clusters are separated by some gap, and when only the cores of the clusters meet some separation condition. We overcome a major obstacle in using clusterability criteria: known approaches to clusterability checking are tied to the optimal clustering, which is NP-hard to identify. Compared to other approaches to clusterability, the novelty consists in the possibility of an a posteriori check (after running $k$-means) of whether the data set is well-clusterable. As the $k$-means algorithm applied for this purpose has polynomial complexity, so does the corresponding check. Additionally, if $k$-means++ fails to identify a clustering that meets the clusterability criteria, then with high probability the data is not well-clusterable.
Tasks Common Sense Reasoning
Published 2017-04-24
URL http://arxiv.org/abs/1704.07139v2
PDF http://arxiv.org/pdf/1704.07139v2.pdf
PWC https://paperswithcode.com/paper/an-aposteriorical-clusterability-criterion
Repo
Framework
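
The a posteriori flavour of the check can be illustrated with a crude gap test after running k-means++; this is only in the spirit of the paper, not its precise clusterability condition:

```python
import numpy as np
from sklearn.cluster import KMeans

def gap_separated(X, k, seed=0):
    """Run k-means++ and test whether the found clusters are separated by gaps.

    Returns (passed, fitted_model): passed is True when the smallest distance
    between points of different clusters exceeds the largest distance from a
    point to its own cluster center (a crude notion of "gap" for illustration).
    """
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=seed).fit(X)
    labels, centers = km.labels_, km.cluster_centers_
    spread = max(np.linalg.norm(X[labels == j] - centers[j], axis=1).max()
                 for j in range(k))
    gap = min(np.linalg.norm(X[labels == i][:, None] - X[labels == j][None, :],
                             axis=-1).min()
              for i in range(k) for j in range(i + 1, k))
    return gap > spread, km
```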

End-to-End Multi-View Networks for Text Classification

Title End-to-End Multi-View Networks for Text Classification
Authors Hongyu Guo, Colin Cherry, Jiang Su
Abstract We propose a multi-view network for text classification. Our method automatically creates various views of its input text, each taking the form of soft attention weights that distribute the classifier’s focus among a set of base features. For a bag-of-words representation, each view focuses on a different subset of the text’s words. Aggregating many such views results in a more discriminative and robust representation. Through a novel architecture that both stacks and concatenates views, we produce a network that emphasizes both depth and width, allowing training to converge quickly. Using our multi-view architecture, we establish new state-of-the-art accuracies on two benchmark tasks.
Tasks Text Classification
Published 2017-04-19
URL http://arxiv.org/abs/1704.05907v1
PDF http://arxiv.org/pdf/1704.05907v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-multi-view-networks-for-text
Repo
Framework
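
A stripped-down view of the architectural idea - several soft-attention "views" over the same bag-of-words input, concatenated before classification - might look like the following; the paper's actual network also stacks views in depth, which this sketch omits:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiViewBoWClassifier(nn.Module):
    """Simplified multi-view text classifier over bag-of-words features.

    Each view learns soft attention weights over the vocabulary dimensions,
    yielding a differently focused re-weighting of the same input; all views
    are concatenated and fed to a small softmax classifier.
    """

    def __init__(self, vocab_size, n_views, n_classes, hidden=64):
        super().__init__()
        self.attn = nn.ModuleList([nn.Linear(vocab_size, vocab_size)
                                   for _ in range(n_views)])
        self.proj = nn.Linear(vocab_size * n_views, hidden)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, bow):                           # bow: (batch, vocab_size) counts
        views = [torch.softmax(a(bow), dim=-1) * bow  # soft attention over word features
                 for a in self.attn]
        h = F.relu(self.proj(torch.cat(views, dim=-1)))
        return self.out(h)                            # unnormalized class scores
```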

Ensemble Distillation for Neural Machine Translation

Title Ensemble Distillation for Neural Machine Translation
Authors Markus Freitag, Yaser Al-Onaizan, Baskaran Sankaran
Abstract Knowledge distillation describes a method for training a student network to perform better by learning from a stronger teacher network. Translating a sentence with a Neural Machine Translation (NMT) engine is computationally expensive, and a smaller model speeds up this process. We demonstrate how to transfer the translation quality of an ensemble and of an oracle BLEU teacher network into a single NMT system. Further, we present translation improvements from a teacher network that has the same architecture and dimensions as the student network. As the training of the student model is still expensive, we introduce a data filtering method based on the knowledge of the teacher model that not only speeds up the training, but also leads to better translation quality. Our techniques need no code change and can be easily reproduced with any NMT architecture to speed up the decoding process.
Tasks Machine Translation
Published 2017-02-06
URL http://arxiv.org/abs/1702.01802v2
PDF http://arxiv.org/pdf/1702.01802v2.pdf
PWC https://paperswithcode.com/paper/ensemble-distillation-for-neural-machine
Repo
Framework
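
Word-level knowledge distillation for NMT typically mixes the usual cross-entropy on the reference with cross-entropy against the teacher's per-token distribution. A generic sketch of that loss (not the paper's exact recipe, which distills from ensemble and oracle-BLEU teachers and adds data filtering):

```python
import torch
import torch.nn.functional as F

def word_level_kd_loss(student_logits, teacher_logits, reference, alpha=0.5, pad_id=0):
    """Mix reference cross-entropy with teacher cross-entropy, token by token.

    student_logits, teacher_logits: (batch, seq_len, vocab) unnormalized scores.
    reference: (batch, seq_len) gold target token ids; pad_id positions are
    ignored in the reference term (padding is not masked in the teacher term,
    for brevity).
    """
    vocab = student_logits.size(-1)
    ce = F.cross_entropy(student_logits.reshape(-1, vocab), reference.reshape(-1),
                         ignore_index=pad_id)
    teacher_probs = F.softmax(teacher_logits, dim=-1)
    log_student = F.log_softmax(student_logits, dim=-1)
    kd = -(teacher_probs * log_student).sum(dim=-1).mean()
    return alpha * ce + (1.0 - alpha) * kd
```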

Simple Classification using Binary Data

Title Simple Classification using Binary Data
Authors Deanna Needell, Rayan Saab, Tina Woolf
Abstract Binary, or one-bit, representations of data arise naturally in many applications, and are appealing in both hardware implementations and algorithm design. In this work, we study the problem of data classification from binary data and propose a framework with low computation and resource costs. We illustrate the utility of the proposed approach through stylized and realistic numerical experiments, and provide a theoretical analysis for a simple case. We hope that our framework and analysis will serve as a foundation for studying similar types of approaches.
Tasks
Published 2017-07-06
URL http://arxiv.org/abs/1707.01945v1
PDF http://arxiv.org/pdf/1707.01945v1.pdf
PWC https://paperswithcode.com/paper/simple-classification-using-binary-data
Repo
Framework
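
One way to make the setting concrete: binarize the data as the sign pattern of random projections, then fit the simplest possible classifier on the bits. This is only an assumed illustration of classification from one-bit data, not the paper's algorithm or analysis:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((256, 20))   # 256 random hyperplanes for 20-dim data (assumed sizes)

def binarize(X):
    """One-bit representation: the sign pattern of random projections of X."""
    return np.sign(X @ A.T)

def fit_centroids(bits, labels):
    """Per-class mean sign pattern - a deliberately simple classifier on binary data."""
    classes = np.unique(labels)
    return classes, np.stack([bits[labels == c].mean(axis=0) for c in classes])

def predict(bits, classes, centroids):
    """Assign each binarized point to the class whose mean pattern it agrees with most."""
    return classes[np.argmax(bits @ centroids.T, axis=1)]
```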

Semi-Supervised Learning with IPM-based GANs: an Empirical Study

Title Semi-Supervised Learning with IPM-based GANs: an Empirical Study
Authors Tom Sercu, Youssef Mroueh
Abstract We present an empirical investigation of a recent class of Generative Adversarial Networks (GANs) using Integral Probability Metrics (IPM) and their performance for semi-supervised learning. IPM-based GANs like Wasserstein GAN, Fisher GAN and Sobolev GAN have desirable properties in terms of theoretical understanding, training stability, and a meaningful loss. In this work we investigate how the design of the critic (or discriminator) influences the performance in semi-supervised learning. We distill three key take-aways which are important for good SSL performance: (1) the K+1 formulation, (2) avoiding batch normalization in the critic and (3) avoiding gradient penalty constraints on the classification layer.
Tasks
Published 2017-12-07
URL http://arxiv.org/abs/1712.02505v1
PDF http://arxiv.org/pdf/1712.02505v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-learning-with-ipm-based-gans
Repo
Framework
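
Take-away (1), the K+1 formulation, is the standard semi-supervised GAN construction in which the critic outputs K class logits and the implicit (K+1)-th "fake" logit is fixed at 0. A sketch of the resulting critic losses (a generic version of that construction, not the exact IPM-specific objectives studied in the paper):

```python
import torch
import torch.nn.functional as F

def k_plus_one_losses(logits_labeled, labels, logits_unlabeled, logits_fake):
    """Critic losses for the K+1 semi-supervised formulation.

    All logits are (batch, K) class scores; with the fake logit fixed at 0,
    p(real | x) = Z / (1 + Z) where Z = sum_k exp(logit_k).
    """
    # supervised term: ordinary K-way cross-entropy on labeled real examples
    loss_sup = F.cross_entropy(logits_labeled, labels)
    # unsupervised terms: unlabeled real data should look real, generated samples fake
    z_real = torch.logsumexp(logits_unlabeled, dim=-1)
    z_fake = torch.logsumexp(logits_fake, dim=-1)
    loss_real = (F.softplus(z_real) - z_real).mean()   # -log p(real | x)
    loss_fake = F.softplus(z_fake).mean()              # -log p(fake | generated x)
    return loss_sup, loss_real + loss_fake
```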

What matters in a transferable neural network model for relation classification in the biomedical domain?

Title What matters in a transferable neural network model for relation classification in the biomedical domain?
Authors Sunil Kumar Sahu, Ashish Anand
Abstract A lack of sufficient labeled data often limits the applicability of advanced machine learning algorithms to real-life problems. However, efficient use of Transfer Learning (TL) has been shown to be very useful across domains. TL transfers valuable knowledge learned in one task (the source task), where sufficient data is available, to the task of interest (the target task). In the biomedical and clinical domains, it is quite common that a lack of sufficient training data prevents machine learning models from being fully exploited. In this work, we present two unified recurrent neural models leading to three transfer learning frameworks for relation classification tasks. We systematically investigate the effectiveness of the proposed frameworks in transferring knowledge under multiple aspects related to the source and target tasks, such as the similarity or relatedness between them and the size of the training data for the source task. Our empirical results show that the proposed frameworks generally improve model performance, but the improvements depend on aspects of the source and target tasks; this dependence ultimately determines the choice of a particular TL framework.
Tasks Relation Classification, Transfer Learning
Published 2017-08-11
URL http://arxiv.org/abs/1708.03446v2
PDF http://arxiv.org/pdf/1708.03446v2.pdf
PWC https://paperswithcode.com/paper/what-matters-in-a-transferable-neural-network
Repo
Framework

No, This is not a Circle

Title No, This is not a Circle
Authors Zoltán Kovács
Abstract A popular curve shown in introductory maths textbooks seems like a circle, but it is actually a different curve. This paper discusses some elementary approaches to identifying the geometric object, including novel technological means using GeoGebra. We demonstrate two ways to refute the false impression, two suggestions for finding a correct conjecture, and four ways to confirm the result by proving it rigorously. All of the discussed approaches can be introduced in classrooms at various levels, from middle school to high school.
Tasks
Published 2017-04-27
URL http://arxiv.org/abs/1704.08483v3
PDF http://arxiv.org/pdf/1704.08483v3.pdf
PWC https://paperswithcode.com/paper/no-this-is-not-a-circle
Repo
Framework