October 18, 2019

2959 words 14 mins read

Paper Group ANR 535

Non-Convex Min-Max Optimization: Provable Algorithms and Applications in Machine Learning. Recovering Trees with Convex Clustering. Generating Synthetic Data for Neural Keyword-to-Question Models. Column generation based math-heuristic for classification trees. The Time-SIFT method : detecting 3-D changes from archival photogrammetric analysis with …

Non-Convex Min-Max Optimization: Provable Algorithms and Applications in Machine Learning


Title	Non-Convex Min-Max Optimization: Provable Algorithms and Applications in Machine Learning
Authors	Hassan Rafique, Mingrui Liu, Qihang Lin, Tianbao Yang
Abstract	Min-max saddle-point problems have broad applications in many tasks in machine learning, e.g., distributionally robust learning, learning with non-decomposable loss, or learning with uncertain data. Although convex-concave saddle-point problems have been broadly studied with efficient algorithms and solid theories available, it remains a challenge to design provably efficient algorithms for non-convex saddle-point problems, especially when the objective function involves an expectation or a large-scale finite sum. Motivated by recent literature on non-convex non-smooth minimization, this paper studies a family of non-convex min-max problems where the minimization component is non-convex (weakly convex) and the maximization component is concave. We propose a proximally guided stochastic subgradient method and a proximally guided stochastic variance-reduced method for expected and finite-sum saddle-point problems, respectively. We establish the computation complexities of both methods for finding a nearly stationary point of the corresponding minimization problem.
Tasks
Published	2018-10-04
URL	http://arxiv.org/abs/1810.02060v3
PDF	http://arxiv.org/pdf/1810.02060v3.pdf
PWC	https://paperswithcode.com/paper/non-convex-min-max-optimization-provable
Repo
Framework

Recovering Trees with Convex Clustering


Title	Recovering Trees with Convex Clustering
Authors	Eric C. Chi, Stefan Steinerberger
Abstract	Convex clustering refers, for given $\left{x_1, \dots, x_n\right} \subset \mathbb{R}^p$, to the minimization of \begin{eqnarray} u(\gamma) & = & \underset{u_1, \dots, u_n }{\arg\min};\sum_{i=1}^{n}{\lVert x_i - u_i \rVert^2} + \gamma \sum_{i,j=1}^{n}{w_{ij} \lVert u_i - u_j\rVert},\ \end{eqnarray} where $w_{ij} \geq 0$ is an affinity that quantifies the similarity between $x_i$ and $x_j$. We prove that if the affinities $w_{ij}$ reflect a tree structure in the $\left{x_1, \dots, x_n\right}$, then the convex clustering solution path reconstructs the tree exactly. The main technical ingredient implies the following combinatorial byproduct: for every set $\left{x_1, \dots, x_n \right} \subset \mathbb{R}^p$ of $n \geq 2$ distinct points, there exist at least $n/6$ points with the property that for any of these points $x$ there is a unit vector $v \in \mathbb{R}^p$ such that, when viewed from $x$, `most’ points lie in the direction $v$ \begin{eqnarray} \frac{1}{n-1}\sum_{i=1 \atop x_i \neq x}^{n}{ \left\langle \frac{x_i - x}{\lVert x_i - x \rVert}, v \right\rangle} & \geq & \frac{1}{4}. \end{eqnarray} \|
Tasks
Published	2018-06-28
URL	http://arxiv.org/abs/1806.11096v2
PDF	http://arxiv.org/pdf/1806.11096v2.pdf
PWC	https://paperswithcode.com/paper/recovering-trees-with-convex-clustering
Repo
Framework

Generating Synthetic Data for Neural Keyword-to-Question Models


Title	Generating Synthetic Data for Neural Keyword-to-Question Models
Authors	Heng Ding, Krisztian Balog
Abstract	Search typically relies on keyword queries, but these are often semantically ambiguous. We propose to overcome this by offering users natural language questions, based on their keyword queries, to disambiguate their intent. This keyword-to-question task may be addressed using neural machine translation techniques. Neural translation models, however, require massive amounts of training data (keyword-question pairs), which is unavailable for this task. The main idea of this paper is to generate large amounts of synthetic training data from a small seed set of hand-labeled keyword-question pairs. Since natural language questions are available in large quantities, we develop models to automatically generate the corresponding keyword queries. Further, we introduce various filtering mechanisms to ensure that synthetic training data is of high quality. We demonstrate the feasibility of our approach using both automatic and manual evaluation. This is an extended version of the article published with the same title in the Proceedings of ICTIR’18.
Tasks	Machine Translation
Published	2018-07-14
URL	http://arxiv.org/abs/1807.05324v1
PDF	http://arxiv.org/pdf/1807.05324v1.pdf
PWC	https://paperswithcode.com/paper/generating-synthetic-data-for-neural-keyword
Repo
Framework

Column generation based math-heuristic for classification trees


Title	Column generation based math-heuristic for classification trees
Authors	Murat Firat, Guillaume Crognier, Adriana F. Gabor, C. A. J. Hurkens, Yingqian Zhang
Abstract	This paper explores the use of Column Generation (CG) techniques in constructing univariate binary decision trees for classification tasks. We propose a novel Integer Linear Programming (ILP) formulation, based on root-to-leaf paths in decision trees. The model is solved via a Column Generation based heuristic. To speed up the heuristic, we use a restricted instance data by considering a subset of decision splits, sampled from the solutions of the well-known CART algorithm. Extensive numerical experiments show that our approach is competitive with the state-of-the-art ILP-based algorithms. In particular, the proposed approach is capable of handling big data sets with tens of thousands of data rows. Moreover, for large data sets, it finds solutions competitive to CART.
Tasks
Published	2018-10-15
URL	https://arxiv.org/abs/1810.06684v2
PDF	https://arxiv.org/pdf/1810.06684v2.pdf
PWC	https://paperswithcode.com/paper/constructing-classification-trees-using
Repo
Framework

The Time-SIFT method : detecting 3-D changes from archival photogrammetric analysis with almost exclusively image information


Title	The Time-SIFT method : detecting 3-D changes from archival photogrammetric analysis with almost exclusively image information
Authors	Denis Feurer, Fabrice Vinatier
Abstract	Archival aerial imagery is a source of worldwide very high resolution data for documenting paste 3-D changes. However, external information is required so that accurate 3-D models can be computed from archival aerial imagery. In this research, we propose and test a new method, termed Time-SIFT (Scale Invariant Feature Transform), which allows for computing coherent multi-temporal Digital Elevation Models (DEMs) with almost exclusively image information. This method is based on the invariance properties of the SIFT-like methods which are at the root of the Structure from Motion (SfM) algorithms. On a test site of 170 km2, we applied SfM algorithms to a unique image block with all the images of four different dates covering forty years. We compared this method to more classical methods based on the use of affordable additional data such as ground control points collected in recent orthophotos. We did extensive tests to determine which processing choices were most impacting on the final result. With these tests, we aimed at evaluating the potential of the proposed Time-SIFT method for the detection and mapping of 3-D changes. Our study showed that the Time-SIFT method was the prime criteria that allowed for computing informative DEMs of difference with almost exclusively image information and limited photogrammetric expertise and human intervention. Due to the fact that the proposed Time-SIFT method can be automatically applied with exclusively image information, our results pave the way to a systematic processing of the archival aerial imagery on very large spatio-temporal windows, and should hence greatly help the unlocking of archival aerial imagery for the documenting of past 3-D changes.
Tasks
Published	2018-07-25
URL	http://arxiv.org/abs/1807.09700v1
PDF	http://arxiv.org/pdf/1807.09700v1.pdf
PWC	https://paperswithcode.com/paper/the-time-sift-method-detecting-3-d-changes
Repo
Framework

LORM: Learning to Optimize for Resource Management in Wireless Networks with Few Training Samples


Title	LORM: Learning to Optimize for Resource Management in Wireless Networks with Few Training Samples
Authors	Yifei Shen, Yuanming Shi, Jun Zhang, Khaled B. Letaief
Abstract	Effective resource management plays a pivotal role in wireless networks, which, unfortunately, results in challenging mixed-integer nonlinear programming (MINLP) problems in most cases. Machine learning-based methods have recently emerged as a disruptive way to obtain near-optimal performance for MINLPs with affordable computational complexity. There have been some attempts in applying such methods to resource management in wireless networks, but these attempts require huge amounts of training samples and lack the capability to handle constrained problems. Furthermore, they suffer from severe performance deterioration when the network parameters change, which commonly happens and is referred to as the task mismatch problem. In this paper, to reduce the sample complexity and address the feasibility issue, we propose a framework of Learning to Optimize for Resource Management (LORM). Instead of the end-to-end learning approach adopted in previous studies, LORM learns the optimal pruning policy in the branch-and-bound algorithm for MINLPs via a sample-efficient method, namely, imitation learning. To further address the task mismatch problem, we develop a transfer learning method via self-imitation in LORM, named LORM-TL, which can quickly adapt a pre-trained machine learning model to the new task with only a few additional unlabeled training samples. Numerical simulations will demonstrate that LORM outperforms specialized state-of-the-art algorithms and achieves near-optimal performance, while achieving significant speedup compared with the branch-and-bound algorithm. Moreover, LORM-TL, by relying on a few unlabeled samples, achieves comparable performance with the model trained from scratch with sufficient labeled samples.
Tasks	Imitation Learning, Transfer Learning
Published	2018-12-18
URL	https://arxiv.org/abs/1812.07998v2
PDF	https://arxiv.org/pdf/1812.07998v2.pdf
PWC	https://paperswithcode.com/paper/lora-learning-to-optimize-for-resource
Repo
Framework

Learning to Remember, Forget and Ignore using Attention Control in Memory


Title	Learning to Remember, Forget and Ignore using Attention Control in Memory
Authors	T. S. Jayram, Younes Bouhadjar, Ryan L. McAvoy, Tomasz Kornuta, Alexis Asseman, Kamil Rocki, Ahmet S. Ozcan
Abstract	Typical neural networks with external memory do not effectively separate capacity for episodic and working memory as is required for reasoning in humans. Applying knowledge gained from psychological studies, we designed a new model called Differentiable Working Memory (DWM) in order to specifically emulate human working memory. As it shows the same functional characteristics as working memory, it robustly learns psychology inspired tasks and converges faster than comparable state-of-the-art models. Moreover, the DWM model successfully generalizes to sequences two orders of magnitude longer than the ones used in training. Our in-depth analysis shows that the behavior of DWM is interpretable and that it learns to have fine control over memory, allowing it to retain, ignore or forget information based on its relevance.
Tasks
Published	2018-09-28
URL	http://arxiv.org/abs/1809.11087v1
PDF	http://arxiv.org/pdf/1809.11087v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-remember-forget-and-ignore-using
Repo
Framework

AI4AI: Quantitative Methods for Classifying Host Species from Avian Influenza DNA Sequence


Title	AI4AI: Quantitative Methods for Classifying Host Species from Avian Influenza DNA Sequence
Authors	Woo Yong Choi, Kyu Ye Song, Chan Woo Lee
Abstract	Avian Influenza breakouts cause millions of dollars in damage each year globally, especially in Asian countries such as China and South Korea. The impact magnitude of a breakout directly correlates to time required to fully understand the influenza virus, particularly the interspecies pathogenicity. The procedure requires laboratory tests that require resources typically lacking in a breakout emergency. In this study, we propose new quantitative methods utilizing machine learning and deep learning to correctly classify host species given raw DNA sequence data of the influenza virus, and provide probabilities for each classification. The best deep learning models achieve top-1 classification accuracy of 47%, and top-3 classification accuracy of 82%, on a dataset of 11 host species classes.
Tasks
Published	2018-02-26
URL	http://arxiv.org/abs/1802.09197v1
PDF	http://arxiv.org/pdf/1802.09197v1.pdf
PWC	https://paperswithcode.com/paper/ai4ai-quantitative-methods-for-classifying
Repo
Framework

Learning On-Road Visual Control for Self-Driving Vehicles with Auxiliary Tasks


Title	Learning On-Road Visual Control for Self-Driving Vehicles with Auxiliary Tasks
Authors	Yilun Chen, Praveen Palanisamy, Priyantha Mudalige, Katharina Muelling, John M. Dolan
Abstract	A safe and robust on-road navigation system is a crucial component of achieving fully automated vehicles. NVIDIA recently proposed an End-to-End algorithm that can directly learn steering commands from raw pixels of a front camera by using one convolutional neural network. In this paper, we leverage auxiliary information aside from raw images and design a novel network structure, called Auxiliary Task Network (ATN), to help boost the driving performance while maintaining the advantage of minimal training data and an End-to-End training method. In this network, we introduce human prior knowledge into vehicle navigation by transferring features from image recognition tasks. Image semantic segmentation is applied as an auxiliary task for navigation. We consider temporal information by introducing an LSTM module and optical flow to the network. Finally, we combine vehicle kinematics with a sensor fusion step. We discuss the benefits of our method over state-of-the-art visual navigation methods both in the Udacity simulation environment and on the real-world Comma.ai dataset.
Tasks	Optical Flow Estimation, Semantic Segmentation, Sensor Fusion, Visual Navigation
Published	2018-12-19
URL	http://arxiv.org/abs/1812.07760v1
PDF	http://arxiv.org/pdf/1812.07760v1.pdf
PWC	https://paperswithcode.com/paper/learning-on-road-visual-control-for-self
Repo
Framework

Lamarckian Evolution of Convolutional Neural Networks


Title	Lamarckian Evolution of Convolutional Neural Networks
Authors	Jonas Prellberg, Oliver Kramer
Abstract	Convolutional neural networks belong to the most successul image classifiers, but the adaptation of their network architecture to a particular problem is computationally expensive. We show that an evolutionary algorithm saves training time during the network architecture optimization, if learned network weights are inherited over generations by Lamarckian evolution. Experiments on typical image datasets show similar or significantly better test accuracies and improved convergence speeds compared to two different baselines without weight inheritance. On CIFAR-10 and CIFAR-100 a 75 % improvement in data efficiency is observed.
Tasks
Published	2018-06-21
URL	http://arxiv.org/abs/1806.08099v2
PDF	http://arxiv.org/pdf/1806.08099v2.pdf
PWC	https://paperswithcode.com/paper/lamarckian-evolution-of-convolutional-neural
Repo
Framework

Zero-Shot Kernel Learning


Title	Zero-Shot Kernel Learning
Authors	Hongguang Zhang, Piotr Koniusz
Abstract	In this paper, we address an open problem of zero-shot learning. Its principle is based on learning a mapping that associates feature vectors extracted from i.e. images and attribute vectors that describe objects and/or scenes of interest. In turns, this allows classifying unseen object classes and/or scenes by matching feature vectors via mapping to a newly defined attribute vector describing a new class. Due to importance of such a learning task, there exist many methods that learn semantic, probabilistic, linear or piece-wise linear mappings. In contrast, we apply well-established kernel methods to learn a non-linear mapping between the feature and attribute spaces. We propose an easy learning objective inspired by the Linear Discriminant Analysis, Kernel-Target Alignment and Kernel Polarization methods that promotes incoherence. We evaluate performance of our algorithm on the Polynomial as well as shift-invariant Gaussian and Cauchy kernels. Despite simplicity of our approach, we obtain state-of-the-art results on several zero-shot learning datasets and benchmarks including a recent AWA2 dataset.
Tasks	Zero-Shot Learning
Published	2018-02-05
URL	http://arxiv.org/abs/1802.01279v2
PDF	http://arxiv.org/pdf/1802.01279v2.pdf
PWC	https://paperswithcode.com/paper/zero-shot-kernel-learning
Repo
Framework

Risk-averse Behavior Planning for Autonomous Driving under Uncertainty


Title	Risk-averse Behavior Planning for Autonomous Driving under Uncertainty
Authors	Mohammad Naghshvar, Ahmed K. Sadek, Auke J. Wiggers
Abstract	Autonomous vehicles have to navigate the surrounding environment with partial observability of other objects sharing the road. Sources of uncertainty in autonomous vehicle measurements include sensor fusion errors, limited sensor range due to weather or object detection latency, occlusion, and hidden parameters such as other human driver intentions. Behavior planning must consider all sources of uncertainty in deciding future vehicle maneuvers. This paper presents a scalable framework for risk-averse behavior planning under uncertainty by incorporating QMDP, unscented transform, and Monte Carlo tree search (MCTS). It is shown that upper confidence bound (UCB) for expanding the tree results in noisy Q-value estimates by the MCTS and a degraded performance of QMDP. A modification to action selection procedure in MCTS is proposed to achieve robust performance.
Tasks	Autonomous Driving, Autonomous Vehicles, Object Detection, Sensor Fusion
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01254v1
PDF	http://arxiv.org/pdf/1812.01254v1.pdf
PWC	https://paperswithcode.com/paper/risk-averse-behavior-planning-for-autonomous
Repo
Framework

Graph Bayesian Optimization: Algorithms, Evaluations and Applications


Title	Graph Bayesian Optimization: Algorithms, Evaluations and Applications
Authors	Jiaxu Cui, Bo Yang
Abstract	Network structure optimization is a fundamental task in complex network analysis. However, almost all the research on Bayesian optimization is aimed at optimizing the objective functions with vectorial inputs. In this work, we first present a flexible framework, denoted graph Bayesian optimization, to handle arbitrary graphs in the Bayesian optimization community. By combining the proposed framework with graph kernels, it can take full advantage of implicit graph structural features to supplement explicit features guessed according to the experience, such as tags of nodes and any attributes of graphs. The proposed framework can identify which features are more important during the optimization process. We apply the framework to solve four problems including two evaluations and two applications to demonstrate its efficacy and potential applications.
Tasks
Published	2018-05-03
URL	http://arxiv.org/abs/1805.01157v4
PDF	http://arxiv.org/pdf/1805.01157v4.pdf
PWC	https://paperswithcode.com/paper/graph-bayesian-optimization-algorithms
Repo
Framework

Unsupervised Intuitive Physics from Visual Observations


Title	Unsupervised Intuitive Physics from Visual Observations
Authors	Sebastien Ehrhardt, Aron Monszpart, Niloy Mitra, Andrea Vedaldi
Abstract	While learning models of intuitive physics is an increasingly active area of research, current approaches still fall short of natural intelligences in one important regard: they require external supervision, such as explicit access to physical states, at training and sometimes even at test times. Some authors have relaxed such requirements by supplementing the model with an handcrafted physical simulator. Still, the resulting methods are unable to automatically learn new complex environments and to understand physical interactions within them. In this work, we demonstrated for the first time learning such predictors directly from raw visual observations and without relying on simulators. We do so in two steps: first, we learn to track mechanically-salient objects in videos using causality and equivariance, two unsupervised learning principles that do not require auto-encoding. Second, we demonstrate that the extracted positions are sufficient to successfully train visual motion predictors that can take the underlying environment into account. We validate our predictors on synthetic datasets; then, we introduce a new dataset, ROLL4REAL, consisting of real objects rolling on complex terrains (pool table, elliptical bowl, and random height-field). We show that in all such cases it is possible to learn reliable extrapolators of the object trajectories from raw videos alone, without any form of external supervision and with no more prior knowledge than the choice of a convolutional neural network architecture.
Tasks
Published	2018-05-14
URL	http://arxiv.org/abs/1805.05086v2
PDF	http://arxiv.org/pdf/1805.05086v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-intuitive-physics-from-visual
Repo
Framework

Learning Optimal Policies from Observational Data


Title	Learning Optimal Policies from Observational Data
Authors	Onur Atan, William R. Zame, M van der Schaar
Abstract	Choosing optimal (or at least better) policies is an important problem in domains from medicine to education to finance and many others. One approach to this problem is through controlled experiments/trials - but controlled experiments are expensive. Hence it is important to choose the best policies on the basis of observational data. This presents two difficult challenges: (i) missing counterfactuals, and (ii) selection bias. This paper presents theoretical bounds on estimation errors of counterfactuals from observational data by making connections to domain adaptation theory. It also presents a principled way of choosing optimal policies using domain adversarial neural networks. We illustrate the effectiveness of domain adversarial training together with various features of our algorithm on a semi-synthetic breast cancer dataset and a supervised UCI dataset (Statlog).
Tasks	Domain Adaptation
Published	2018-02-23
URL	http://arxiv.org/abs/1802.08679v1
PDF	http://arxiv.org/pdf/1802.08679v1.pdf
PWC	https://paperswithcode.com/paper/learning-optimal-policies-from-observational
Repo
Framework