January 29, 2020


Paper Group ANR 484

ID3 Learns Juntas for Smoothed Product Distributions. CloudifierNet – Deep Vision Models for Artificial Image Processing. SetRank: Learning a Permutation-Invariant Ranking Model for Information Retrieval. Complex Evolution Recurrent Neural Networks (ceRNNs). Robust conditional GANs under missing or uncertain labels. Equiprobable mappings in weight …

ID3 Learns Juntas for Smoothed Product Distributions

Title ID3 Learns Juntas for Smoothed Product Distributions
Authors Alon Brutzkus, Amit Daniely, Eran Malach
Abstract In recent years, there have been many attempts to understand popular heuristics. One example is the ID3 algorithm for learning decision trees. This algorithm is commonly used in practice, but very few theoretical works study its behavior. In this paper, we analyze the ID3 algorithm when the target function is a $k$-Junta, a function that depends on $k$ out of the $n$ input variables. We prove that when $k = \log n$, the ID3 algorithm learns $k$-Juntas in polynomial time in the smoothed analysis model of Kalai & Teng. That is, we show a learnability result when the observed distribution is a “noisy” variant of the original distribution.
Tasks
Published 2019-06-20
URL https://arxiv.org/abs/1906.08654v1
PDF https://arxiv.org/pdf/1906.08654v1.pdf
PWC https://paperswithcode.com/paper/id3-learns-juntas-for-smoothed-product
Repo
Framework
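
As a toy illustration of what ID3’s greedy criterion does on a junta (my own sketch, not the authors’ code), the snippet below enumerates all inputs for a 2-junta target $y = x_0 \wedge x_1$ over $n = 6$ bits and checks that the information-gain rule picks a relevant variable first:

```python
import itertools
import numpy as np

def entropy(y):
    # Shannon entropy of a binary label vector.
    p = y.mean()
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def information_gain(X, y, j):
    # ID3's greedy split criterion: reduction in label entropy
    # from splitting on feature j.
    mask = X[:, j] == 1
    p = mask.mean()
    return entropy(y) - p * entropy(y[mask]) - (1 - p) * entropy(y[~mask])

# Enumerate all inputs over n = 6 bits; the target is a 2-junta: y = x0 AND x1.
n = 6
X = np.array(list(itertools.product([0, 1], repeat=n)))
y = X[:, 0] & X[:, 1]

gains = [information_gain(X, y, j) for j in range(n)]
best = int(np.argmax(gains))  # a relevant variable (0 or 1) wins
```

Under the uniform distribution the irrelevant variables have exactly zero gain here; the smoothed-analysis setting of the paper is what rules out adversarial distributions where the greedy criterion is misled.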

CloudifierNet – Deep Vision Models for Artificial Image Processing

Title CloudifierNet – Deep Vision Models for Artificial Image Processing
Authors Andrei Damian, Laurentiu Piciu, Alexandru Purdila, Nicolae Tapus
Abstract Today, it is increasingly necessary for applications and documents developed in previous or current technologies to be accessible online on cloud-based infrastructures. The migration of legacy systems, including their hosts of documents, to new technologies and online infrastructures using modern Artificial Intelligence techniques is therefore essential. With the advancement of Artificial Intelligence and Deep Learning and their multitude of applications, a new area of research is emerging: automated systems development and maintenance. The work underlying this paper aims to research and develop truly intelligent systems able to analyze user interfaces from various sources and generate real and usable inferences, ranging from architecture analysis to actual code generation. One key element of such systems is artificial scene detection and analysis based on deep learning computer vision systems. Computer vision models, particularly deep directed acyclic graphs based on convolutional modules, are generally constructed and trained on natural image datasets. As a result, during training these models develop natural image feature detectors, apart from the base graph modules that learn basic primitive features. In this paper, we present the basic principles of a deep neural pipeline for computer vision applied to artificial scenes (scenes generated by user interfaces or similar sources). Finally, we present conclusions based on experimental development and benchmarking against state-of-the-art transfer-learning deep vision models.
Tasks Code Generation, Transfer Learning
Published 2019-11-04
URL https://arxiv.org/abs/1911.01346v1
PDF https://arxiv.org/pdf/1911.01346v1.pdf
PWC https://paperswithcode.com/paper/cloudifiernet-deep-vision-models-for
Repo
Framework

SetRank: Learning a Permutation-Invariant Ranking Model for Information Retrieval

Title SetRank: Learning a Permutation-Invariant Ranking Model for Information Retrieval
Authors Liang Pang, Jun Xu, Qingyao Ai, Yanyan Lan, Xueqi Cheng, Jirong Wen
Abstract In learning-to-rank for information retrieval, a ranking model is automatically learned from the data and then utilized to rank the sets of retrieved documents. Therefore, an ideal ranking model would be a mapping from a document set to a permutation on the set, and should satisfy two critical requirements: (1)~it should be able to model cross-document interactions so as to capture local context information in a query; (2)~it should be permutation-invariant, meaning that any permutation of the inputted documents does not change the output ranking. Previous studies on learning-to-rank either design univariate scoring functions that score each document separately, and thus fail to model cross-document interactions, or construct multivariate scoring functions that score documents sequentially, which inevitably sacrifices the permutation invariance requirement. In this paper, we propose a neural learning-to-rank model called SetRank which directly learns a permutation-invariant ranking model defined on document sets of any size. SetRank employs a stack of (induced) multi-head self-attention blocks as its key component for jointly learning the embeddings of all retrieved documents. The self-attention mechanism not only helps SetRank capture local context information from cross-document interactions, but also learns permutation-equivariant representations for the inputted documents, thereby achieving a permutation-invariant ranking model. Experimental results on three large-scale benchmarks show that SetRank significantly outperforms baselines, including traditional learning-to-rank models and state-of-the-art neural IR models.
Tasks Information Retrieval, Learning-To-Rank
Published 2019-12-12
URL https://arxiv.org/abs/1912.05891v1
PDF https://arxiv.org/pdf/1912.05891v1.pdf
PWC https://paperswithcode.com/paper/setrank-learning-a-permutation-invariant
Repo
Framework
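
The permutation-equivariance argument is easy to check numerically. Below is a minimal single-head self-attention scorer in NumPy (a hypothetical simplification of SetRank’s induced multi-head blocks, with made-up weights): with no positional encoding, permuting the input documents permutes the scores identically, so the induced ranking is permutation-invariant.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_scores(X, Wq, Wk, Wv, w_out):
    # One self-attention block over a document set X (n_docs x d),
    # followed by a linear scoring head. No positional encoding, so
    # the block is permutation-equivariant.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[1]), axis=-1)
    H = A @ V
    return H @ w_out

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(5, d))                   # 5 retrieved documents
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
w_out = rng.normal(size=d)

scores = self_attention_scores(X, Wq, Wk, Wv, w_out)
perm = rng.permutation(5)
scores_perm = self_attention_scores(X[perm], Wq, Wk, Wv, w_out)
```

Sorting by `scores` therefore yields the same ranking of the same documents regardless of input order.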

Complex Evolution Recurrent Neural Networks (ceRNNs)

Title Complex Evolution Recurrent Neural Networks (ceRNNs)
Authors Izhak Shafran, Tom Bagby, R. J. Skerry-Ryan
Abstract Unitary Evolution Recurrent Neural Networks (uRNNs) have three attractive properties: (a) the unitary property, (b) the complex-valued nature, and (c) their efficient linear operators. The literature so far does not address how critical the unitary property of the model is. Furthermore, uRNNs have not been evaluated on large tasks. To study these shortcomings, we propose complex evolution Recurrent Neural Networks (ceRNNs), which are similar to uRNNs but selectively drop the unitary property. On a simple multivariate linear regression task, we illustrate that dropping the constraints improves the learning trajectory. On the copy memory task, ceRNNs and uRNNs perform identically, demonstrating that their superior performance over LSTMs is due to their complex-valued nature and linear operators. On a large-scale real-world speech recognition task, we find that prepending a uRNN degrades the performance of our baseline LSTM acoustic models, while prepending a ceRNN improves performance over the baseline by 0.8% absolute WER.
Tasks Speech Recognition
Published 2019-06-05
URL https://arxiv.org/abs/1906.02246v1
PDF https://arxiv.org/pdf/1906.02246v1.pdf
PWC https://paperswithcode.com/paper/complex-evolution-recurrent-neural-networks
Repo
Framework
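
A minimal sketch of the distinction (assumed toy matrices, not the paper’s architecture): a uRNN’s recurrent matrix is unitary and thus exactly norm-preserving, while a ceRNN keeps the complex-valued linear operator but lets it be an arbitrary complex matrix, which can expand or contract the hidden state.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4

# uRNN-style recurrence: constrain the recurrent matrix to be unitary
# (here obtained via a QR decomposition of a random complex matrix).
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
Q, _ = np.linalg.qr(A)

# ceRNN-style recurrence: a free complex matrix, scaled so its
# spectral norm is 0.9 (no unitary constraint).
W = 0.9 * A / np.linalg.norm(A, 2)

h = rng.normal(size=d) + 1j * rng.normal(size=d)
norm0 = np.linalg.norm(h)

h_unitary = Q @ h   # norm preserved exactly
h_free = W @ h      # norm may shrink (or grow, if the norm exceeded 1)
```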

Robust conditional GANs under missing or uncertain labels

Title Robust conditional GANs under missing or uncertain labels
Authors Kiran Koshy Thekumparampil, Sewoong Oh, Ashish Khetan
Abstract Matching the performance of conditional Generative Adversarial Networks with little supervision is an important task, especially when venturing into new domains. We design a new training algorithm which is robust to missing or ambiguous labels. The main idea is to intentionally corrupt the labels of generated examples to match the statistics of the real data, and have a discriminator process the real and generated examples with corrupted labels. We showcase the robustness of this proposed approach both theoretically and empirically. We show that minimizing the proposed loss is equivalent to minimizing the true divergence between real and generated data up to a multiplicative factor, and characterize this factor as a function of the statistics of the uncertain labels. Experiments on the MNIST dataset demonstrate that the proposed architecture achieves high accuracy in generating examples faithful to the class, even with only a few examples per class.
Tasks
Published 2019-06-09
URL https://arxiv.org/abs/1906.03579v1
PDF https://arxiv.org/pdf/1906.03579v1.pdf
PWC https://paperswithcode.com/paper/robust-conditional-gans-under-missing-or
Repo
Framework
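
The label-corruption idea can be sketched as follows (a toy NumPy version with an assumed 2-class confusion matrix, not the authors’ implementation): the generator’s labels are flipped with the same confusion matrix that governs the real data’s noisy labels, so the discriminator compares matched label statistics.

```python
import numpy as np

def corrupt_labels(labels, C, rng):
    # Flip each clean label y to a noisy label drawn from row C[y],
    # where C[i, j] = P(noisy = j | clean = i). Applying the same
    # confusion matrix to generated labels matches the statistics
    # of the noisily-labeled real data.
    return np.array([rng.choice(len(C), p=C[y]) for y in labels])

rng = np.random.default_rng(0)
C = np.array([[0.8, 0.2],
              [0.1, 0.9]])                 # assumed/estimated noise model
clean = rng.integers(0, 2, size=20000)     # stand-in for generated labels
noisy = corrupt_labels(clean, C, rng)

# Empirical flip rates should match the confusion matrix.
flip0 = (noisy[clean == 0] == 1).mean()    # ~0.2
flip1 = (noisy[clean == 1] == 0).mean()    # ~0.1
```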

Equiprobable mappings in weighted constraint grammars

Title Equiprobable mappings in weighted constraint grammars
Authors Arto Anttila, Scott Borgeson, Giorgio Magri
Abstract We show that MaxEnt is so rich that it can distinguish between any two different mappings: there always exists a nonnegative weight vector which assigns them different MaxEnt probabilities. Stochastic HG instead does admit equiprobable mappings and we give a complete formal characterization of them. We compare these different predictions of the two frameworks on a test case of Finnish stress.
Tasks
Published 2019-07-12
URL https://arxiv.org/abs/1907.05839v1
PDF https://arxiv.org/pdf/1907.05839v1.pdf
PWC https://paperswithcode.com/paper/equiprobable-mappings-in-weighted-constraint
Repo
Framework
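
The MaxEnt side of the claim is easy to illustrate (toy constraint violation profiles of my own choosing, not the paper’s Finnish data): two candidates with different violation profiles receive different MaxEnt probabilities under a suitable nonnegative weight vector.

```python
import numpy as np

def maxent_prob(violations, w, candidates):
    # MaxEnt grammar: P(candidate) ∝ exp(-w · violation profile),
    # normalized over the candidate set.
    scores = np.exp(-(candidates @ w))
    return np.exp(-(violations @ w)) / scores.sum()

# Two candidate mappings with distinct violation profiles over 2 constraints.
candidates = np.array([[1.0, 0.0],
                       [0.0, 2.0]])
w = np.array([1.0, 1.0])      # a nonnegative weight vector separating them
p1 = maxent_prob(candidates[0], w, candidates)
p2 = maxent_prob(candidates[1], w, candidates)
```

Since the profiles differ, this weight vector assigns the two mappings different probabilities, in line with the paper’s richness result for MaxEnt.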

Environment Reconstruction with Hidden Confounders for Reinforcement Learning based Recommendation

Title Environment Reconstruction with Hidden Confounders for Reinforcement Learning based Recommendation
Authors Wenjie Shang, Yang Yu, Qingyang Li, Zhiwei Qin, Yiping Meng, Jieping Ye
Abstract Reinforcement learning aims at finding the best policy for decision making and has been shown to be powerful for sequential recommendation. Training a policy by reinforcement learning, however, takes place in an environment, and in many real-world applications training in the real environment incurs an unbearable cost due to the exploration it requires. Reconstructing the environment from past data is thus an appealing way to unleash the power of reinforcement learning in these applications. Reconstructing the environment essentially means extracting the causal effect model from the data. However, real-world applications are often too complex to offer fully observable environment information, so there may well be unobserved confounding variables lying behind the data. Such hidden confounders can obstruct an effective reconstruction of the environment. In this paper, by treating the hidden confounder as a hidden policy, we propose a deconfounded multi-agent environment reconstruction (DEMER) approach that learns the environment together with the hidden confounder. DEMER adopts a multi-agent generative adversarial imitation learning framework: it introduces a confounder-embedded policy and uses a compatible discriminator to train the policies. We then apply DEMER to driver program recommendation. We first use an artificial driver program recommendation environment, abstracted from the real application, to verify and analyze the effectiveness of DEMER. We then test DEMER in the real application of Didi Chuxing. Experimental results show that DEMER can effectively reconstruct the hidden confounder and thus build a better environment. DEMER also derives a recommendation policy with significantly improved performance in the test phase of the real application.
Tasks Decision Making, Imitation Learning
Published 2019-07-12
URL https://arxiv.org/abs/1907.06584v1
PDF https://arxiv.org/pdf/1907.06584v1.pdf
PWC https://paperswithcode.com/paper/environment-reconstruction-with-hidden
Repo
Framework

Null Space Analysis for Class-Specific Discriminant Learning

Title Null Space Analysis for Class-Specific Discriminant Learning
Authors Jenni Raitoharju, Alexandros Iosifidis
Abstract In this paper, we carry out null space analysis for Class-Specific Discriminant Analysis (CSDA) and formulate a number of solutions based on the analysis. We analyze both theoretically and experimentally the significance of each algorithmic step. The innate subspace dimensionality resulting from the proposed solutions is typically quite high and we discuss how the need for further dimensionality reduction changes the situation. Experimental evaluation of the proposed solutions shows that the straightforward extension of null space analysis approaches to the class-specific setting can outperform the standard CSDA method. Furthermore, by exploiting a recently proposed out-of-class scatter definition encoding the multi-modality of the negative class naturally appearing in class-specific problems, null space projections can lead to a performance comparable to or outperforming the most recent CSDA methods.
Tasks Dimensionality Reduction
Published 2019-08-13
URL https://arxiv.org/abs/1908.04562v1
PDF https://arxiv.org/pdf/1908.04562v1.pdf
PWC https://paperswithcode.com/paper/null-space-analysis-for-class-specific
Repo
Framework
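
A minimal sketch of the starting point of null-space analysis (toy data, not the paper’s CSDA solutions): when a class has fewer samples than dimensions, its scatter matrix has a nontrivial null space, and projecting the class samples onto that null space collapses them to a single point.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_pos = 10, 4   # fewer class samples than dimensions -> nontrivial null space

X_pos = rng.normal(size=(n_pos, d))
Xc = X_pos - X_pos.mean(axis=0)
S = Xc.T @ Xc                       # class scatter matrix, rank <= n_pos - 1

# Null space of the scatter via SVD: directions in which the class
# has zero variance.
U, s, _ = np.linalg.svd(S)
null_basis = U[:, s < 1e-8]         # columns spanning the null space

proj = Xc @ null_basis              # class samples collapse to ~0 here
```

Discriminant methods built on this observation then look for null-space directions in which the other (negative) class still spreads out.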

Template-Instance Loss for Offline Handwritten Chinese Character Recognition

Title Template-Instance Loss for Offline Handwritten Chinese Character Recognition
Authors Yao Xiao, Dan Meng, Cewu Lu, Chi-Keung Tang
Abstract The long-standing challenges for offline handwritten Chinese character recognition (HCCR) are twofold: Chinese characters can be very diverse and complicated while looking similar, and cursive handwriting (due to increased writing speed and infrequent pen lifting) makes strokes and even characters run together in a flowing manner. In this paper, we propose template and instance loss functions for the relevant machine learning tasks in offline handwritten Chinese character recognition. First, the character template is designed to deal with the intrinsic similarities among Chinese characters. Second, the instance loss reduces category variance according to classification difficulty, giving a large penalty to outlier instances of handwritten Chinese characters. Our extensive experiments show that, trained with the new loss functions, our HCCR14Layer deep network architecture, which consists of simple layers, yields state-of-the-art performance and beyond for offline HCCR.
Tasks Offline Handwritten Chinese Character Recognition
Published 2019-10-12
URL https://arxiv.org/abs/1910.05545v1
PDF https://arxiv.org/pdf/1910.05545v1.pdf
PWC https://paperswithcode.com/paper/template-instance-loss-for-offline
Repo
Framework

Slim-CNN: A Light-Weight CNN for Face Attribute Prediction

Title Slim-CNN: A Light-Weight CNN for Face Attribute Prediction
Authors Ankit Sharma, Hassan Foroosh
Abstract We introduce a computationally efficient CNN micro-architecture, the Slim Module, to design a lightweight deep neural network, Slim-Net, for face attribute prediction. Slim Modules are constructed by assembling depthwise separable convolutions with pointwise convolutions to produce a computationally efficient module. The problem of facial attribute prediction is challenging because of the large variations in pose, background, and illumination, and because of dataset imbalance. We stack these Slim Modules to devise a compact CNN which still maintains very high accuracy. Additionally, the neural network has a very low memory footprint, which makes it suitable for mobile and embedded applications. Experiments on the CelebA dataset show that Slim-Net achieves an accuracy of 91.24% with at least 25 times fewer parameters than comparably performing methods, which reduces the memory storage requirement of Slim-Net by at least 87%.
Tasks
Published 2019-07-03
URL https://arxiv.org/abs/1907.02157v1
PDF https://arxiv.org/pdf/1907.02157v1.pdf
PWC https://paperswithcode.com/paper/slim-cnn-a-light-weight-cnn-for-face
Repo
Framework
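
The parameter savings of the depthwise-separable building block are easy to see in a naive NumPy sketch (a generic depthwise + pointwise pair, not the exact Slim Module):

```python
import numpy as np

def depthwise_separable(x, depth_k, point_w):
    # Depthwise conv: one k x k filter per input channel (no channel
    # mixing), then a pointwise 1x1 conv to mix channels -- the cheap
    # substitute for a full convolution that such modules assemble.
    c_in, h, w = x.shape
    k = depth_k.shape[-1]
    out = np.zeros((c_in, h - k + 1, w - k + 1))
    for c in range(c_in):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[c, i, j] = np.sum(x[c, i:i + k, j:j + k] * depth_k[c])
    # Pointwise: a (c_out, c_in) mixing matrix at every spatial location.
    return np.einsum('oc,chw->ohw', point_w, out)

c_in, c_out, k = 16, 32, 3
rng = np.random.default_rng(0)
x = rng.normal(size=(c_in, 8, 8))
depth_k = rng.normal(size=(c_in, k, k))
point_w = rng.normal(size=(c_out, c_in))
y = depthwise_separable(x, depth_k, point_w)

# Parameter counts (ignoring biases): separable is far cheaper.
params_separable = c_in * k * k + c_out * c_in   # 656
params_full = c_out * c_in * k * k               # 4608
```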

Multimedia Search and Temporal Reasoning

Title Multimedia Search and Temporal Reasoning
Authors Marcio Ferreira Moreno, Rodrigo Costa Mesquita Santos, Wallas Henrique Sousa dos Santos, Sandro Rama Fiorini, Reinaldo Mozart da Gama Silva
Abstract Properly modelling dynamic information that changes over time is still an open issue. Most modern knowledge bases are unable to represent relationships that are valid only during a given time interval. In this work, we revisit a previous extension to the hyperknowledge framework for dealing with temporal facts and propose a temporal query language and engine. We validate our proposal by discussing a qualitative analysis of the modelling of a real-world use case in the Oil & Gas industry.
Tasks
Published 2019-11-19
URL https://arxiv.org/abs/1911.08225v1
PDF https://arxiv.org/pdf/1911.08225v1.pdf
PWC https://paperswithcode.com/paper/multimedia-search-and-temporal-reasoning
Repo
Framework

From Digitalization to Data-Driven Decision Making in Container Terminals

Title From Digitalization to Data-Driven Decision Making in Container Terminals
Authors Leonard Heilig, Robert Stahlbock, Stefan Voß
Abstract With the new opportunities emerging from the current wave of digitalization, terminal planning and management need to be revisited by taking a data-driven perspective. Business analytics, as a practice of extracting insights from operational data, assists in reducing uncertainties using predictions and helps to identify and understand causes of inefficiencies, disruptions, and anomalies in intra- and inter-organizational terminal operations. Despite the growing complexity of data within and around container terminals, a lack of data-driven approaches in the context of container terminals can be identified. In this chapter, the concept of business analytics for supporting terminal planning and management is introduced. The chapter specifically focuses on data mining approaches and provides a comprehensive overview on applications in container terminals and related research. As such, we aim to establish a data-driven perspective on terminal planning and management, complementing the traditional optimization perspective.
Tasks Decision Making
Published 2019-04-29
URL http://arxiv.org/abs/1904.13251v1
PDF http://arxiv.org/pdf/1904.13251v1.pdf
PWC https://paperswithcode.com/paper/from-digitalization-to-data-driven-decision
Repo
Framework

Similarity-Preserving Knowledge Distillation

Title Similarity-Preserving Knowledge Distillation
Authors Frederick Tung, Greg Mori
Abstract Knowledge distillation is a widely applicable technique for training a student neural network under the guidance of a trained teacher network. For example, in neural network compression, a high-capacity teacher is distilled to train a compact student; in privileged learning, a teacher trained with privileged data is distilled to train a student without access to that data. The distillation loss determines how a teacher’s knowledge is captured and transferred to the student. In this paper, we propose a new form of knowledge distillation loss that is inspired by the observation that semantically similar inputs tend to elicit similar activation patterns in a trained network. Similarity-preserving knowledge distillation guides the training of a student network such that input pairs that produce similar (dissimilar) activations in the teacher network produce similar (dissimilar) activations in the student network. In contrast to previous distillation methods, the student is not required to mimic the representation space of the teacher, but rather to preserve the pairwise similarities in its own representation space. Experiments on three public datasets demonstrate the potential of our approach.
Tasks Neural Network Compression
Published 2019-07-23
URL https://arxiv.org/abs/1907.09682v2
PDF https://arxiv.org/pdf/1907.09682v2.pdf
PWC https://paperswithcode.com/paper/similarity-preserving-knowledge-distillation
Repo
Framework
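
The similarity-preserving loss itself is compact enough to sketch directly (a NumPy version under my reading of the paper; the exact normalization detail is an assumption): each network produces a batch-wise pairwise similarity matrix from its activations, and the student is penalized for deviating from the teacher’s matrix.

```python
import numpy as np

def similarity_matrix(A):
    # Row-normalized pairwise similarity of a batch of activations
    # (b x d): G = normalize(A A^T) row-wise.
    G = A @ A.T
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    return G / np.maximum(norms, 1e-12)

def sp_loss(A_teacher, A_student):
    # Frobenius distance between teacher and student similarity
    # matrices, averaged over the b^2 pairs. Note the student's
    # feature width may differ from the teacher's: only the b x b
    # similarity structure is matched.
    b = A_teacher.shape[0]
    Gt = similarity_matrix(A_teacher)
    Gs = similarity_matrix(A_student)
    return np.sum((Gt - Gs) ** 2) / b ** 2

rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 64))               # batch of teacher activations
student = teacher @ rng.normal(size=(64, 16))    # narrower student layer
loss_self = sp_loss(teacher, teacher)            # identical activations -> 0
loss_cross = sp_loss(teacher, student)           # mismatch -> positive
```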

Identification of Model Uncertainty via Optimal Design of Experiments applied to a Mechanical Press

Title Identification of Model Uncertainty via Optimal Design of Experiments applied to a Mechanical Press
Authors Tristan Gally, Peter Groche, Florian Hoppe, Anja Kuttich, Alexander Matei, Marc E. Pfetsch, Martin Rakowitsch, Stefan Ulbrich
Abstract In engineering applications almost all processes are described with the aid of models. Especially forming machines heavily rely on mathematical models for control and condition monitoring. Inaccuracies during the modeling, manufacturing and assembly of these machines induce model uncertainty which impairs the controller’s performance. In this paper we propose an approach to identify model uncertainty using parameter identification and optimal design of experiments. The experimental setup is characterized by optimal sensor positions such that specific model parameters can be determined with minimal variance. This allows for the computation of confidence regions, in which the real parameters or the parameter estimates from different test sets have to lie. We claim that inconsistencies in the estimated parameter values, considering their approximated confidence ellipsoids as well, cannot be explained by data uncertainty but are indicators of model uncertainty. The proposed method is demonstrated using a component of the 3D Servo Press, a multi-technology forming machine that combines spindles with eccentric servo drives.
Tasks
Published 2019-10-18
URL https://arxiv.org/abs/1910.08408v2
PDF https://arxiv.org/pdf/1910.08408v2.pdf
PWC https://paperswithcode.com/paper/identification-of-model-uncertainty-via
Repo
Framework
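
A toy version of the sensor-selection step (an A-optimality criterion on an assumed linear model, much simpler than the paper’s press model): among candidate sensor positions, choose the subset minimizing the total variance of the parameter estimate, i.e. the trace of the inverse information matrix.

```python
import numpy as np
from itertools import combinations

# Linear measurement model y = F @ theta + noise: each row of F is one
# candidate sensor position's sensitivity to the 2 model parameters.
rng = np.random.default_rng(0)
F = rng.normal(size=(6, 2))   # 6 candidate positions, 2 parameters

def a_criterion(rows):
    # A-optimality: trace of the parameter covariance (up to noise scale)
    # for the sensor subset `rows`.
    Fs = F[list(rows)]
    return np.trace(np.linalg.inv(Fs.T @ Fs))

# Exhaustively pick the best 3-sensor subset (fine at this toy scale).
best = min(combinations(range(6), 3), key=a_criterion)
```

The resulting covariance also yields the confidence ellipsoids the paper uses to flag inconsistent parameter estimates as model (rather than data) uncertainty.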

DeepHashing using TripletLoss

Title DeepHashing using TripletLoss
Authors Jithin James
Abstract Hashing is one of the most efficient techniques for approximate nearest neighbour search in large-scale image retrieval. Most techniques are based on hand-engineered features and do not always give optimal results. Deep Convolutional Neural Networks have proven to generate very effective representations of images for various computer vision tasks, and inspired by this, several Deep Hashing models, such as that of Wang et al. (2016), have been proposed. These models train on the triplet loss function, which can be used to train models with superior representation capabilities. Building on the latest advancements in triplet-loss training, I propose new techniques that help Deep Hashing models train faster and more efficiently. Experimental results show that with these more efficient triplet-loss training techniques, we obtain a 5% improvement in our model compared to the original work of Wang et al. (2016). Using a larger model and more training data, we can drastically improve performance using the techniques we propose.
Tasks Image Retrieval
Published 2019-12-17
URL https://arxiv.org/abs/1912.10822v1
PDF https://arxiv.org/pdf/1912.10822v1.pdf
PWC https://paperswithcode.com/paper/deephashing-using-tripletloss
Repo
Framework
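
The triplet objective these models train on can be sketched in a few lines (generic triplet loss plus sign binarization for hash codes; the margin and data are illustrative, not the paper’s setup):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Standard triplet loss: require the positive to be at least
    # `margin` closer to the anchor (in squared distance) than the
    # negative; otherwise pay the shortfall.
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

def binarize(embedding):
    # Hash codes for retrieval: the sign pattern of the embedding.
    return (embedding > 0).astype(np.uint8)

a = np.array([[0.0, 0.0]])
p = np.array([[0.1, 0.0]])   # close to the anchor
n = np.array([[2.0, 0.0]])   # far from the anchor
loss_good = triplet_loss(a, p, n)   # triplet already satisfied -> 0
loss_bad = triplet_loss(a, n, p)    # roles swapped -> positive loss
```

At retrieval time, `binarize` turns the learned embedding into a compact code compared via Hamming distance.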