Paper Group ANR 535
Non-Convex Min-Max Optimization: Provable Algorithms and Applications in Machine Learning. Recovering Trees with Convex Clustering. Generating Synthetic Data for Neural Keyword-to-Question Models. Column generation based math-heuristic for classification trees. The Time-SIFT method : detecting 3-D changes from archival photogrammetric analysis with …
Non-Convex Min-Max Optimization: Provable Algorithms and Applications in Machine Learning
Title | Non-Convex Min-Max Optimization: Provable Algorithms and Applications in Machine Learning |
Authors | Hassan Rafique, Mingrui Liu, Qihang Lin, Tianbao Yang |
Abstract | Min-max saddle-point problems have broad applications in many tasks in machine learning, e.g., distributionally robust learning, learning with non-decomposable loss, or learning with uncertain data. Although convex-concave saddle-point problems have been broadly studied with efficient algorithms and solid theories available, it remains a challenge to design provably efficient algorithms for non-convex saddle-point problems, especially when the objective function involves an expectation or a large-scale finite sum. Motivated by recent literature on non-convex non-smooth minimization, this paper studies a family of non-convex min-max problems where the minimization component is non-convex (weakly convex) and the maximization component is concave. We propose a proximally guided stochastic subgradient method and a proximally guided stochastic variance-reduced method for expected and finite-sum saddle-point problems, respectively. We establish the computation complexities of both methods for finding a nearly stationary point of the corresponding minimization problem. |
Tasks | |
Published | 2018-10-04 |
URL | http://arxiv.org/abs/1810.02060v3 |
http://arxiv.org/pdf/1810.02060v3.pdf | |
PWC | https://paperswithcode.com/paper/non-convex-min-max-optimization-provable |
Repo | |
Framework | |
Recovering Trees with Convex Clustering
Title | Recovering Trees with Convex Clustering |
Authors | Eric C. Chi, Stefan Steinerberger |
Abstract | Convex clustering refers, for given $\left{x_1, \dots, x_n\right} \subset \mathbb{R}^p$, to the minimization of \begin{eqnarray*} u(\gamma) & = & \underset{u_1, \dots, u_n }{\arg\min};\sum_{i=1}^{n}{\lVert x_i - u_i \rVert^2} + \gamma \sum_{i,j=1}^{n}{w_{ij} \lVert u_i - u_j\rVert},\ \end{eqnarray*} where $w_{ij} \geq 0$ is an affinity that quantifies the similarity between $x_i$ and $x_j$. We prove that if the affinities $w_{ij}$ reflect a tree structure in the $\left{x_1, \dots, x_n\right}$, then the convex clustering solution path reconstructs the tree exactly. The main technical ingredient implies the following combinatorial byproduct: for every set $\left{x_1, \dots, x_n \right} \subset \mathbb{R}^p$ of $n \geq 2$ distinct points, there exist at least $n/6$ points with the property that for any of these points $x$ there is a unit vector $v \in \mathbb{R}^p$ such that, when viewed from $x$, `most’ points lie in the direction $v$ \begin{eqnarray*} \frac{1}{n-1}\sum_{i=1 \atop x_i \neq x}^{n}{ \left\langle \frac{x_i - x}{\lVert x_i - x \rVert}, v \right\rangle} & \geq & \frac{1}{4}. \end{eqnarray*} | |
Tasks | |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.11096v2 |
http://arxiv.org/pdf/1806.11096v2.pdf | |
PWC | https://paperswithcode.com/paper/recovering-trees-with-convex-clustering |
Repo | |
Framework | |
Generating Synthetic Data for Neural Keyword-to-Question Models
Title | Generating Synthetic Data for Neural Keyword-to-Question Models |
Authors | Heng Ding, Krisztian Balog |
Abstract | Search typically relies on keyword queries, but these are often semantically ambiguous. We propose to overcome this by offering users natural language questions, based on their keyword queries, to disambiguate their intent. This keyword-to-question task may be addressed using neural machine translation techniques. Neural translation models, however, require massive amounts of training data (keyword-question pairs), which is unavailable for this task. The main idea of this paper is to generate large amounts of synthetic training data from a small seed set of hand-labeled keyword-question pairs. Since natural language questions are available in large quantities, we develop models to automatically generate the corresponding keyword queries. Further, we introduce various filtering mechanisms to ensure that synthetic training data is of high quality. We demonstrate the feasibility of our approach using both automatic and manual evaluation. This is an extended version of the article published with the same title in the Proceedings of ICTIR’18. |
Tasks | Machine Translation |
Published | 2018-07-14 |
URL | http://arxiv.org/abs/1807.05324v1 |
http://arxiv.org/pdf/1807.05324v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-synthetic-data-for-neural-keyword |
Repo | |
Framework | |
Column generation based math-heuristic for classification trees
Title | Column generation based math-heuristic for classification trees |
Authors | Murat Firat, Guillaume Crognier, Adriana F. Gabor, C. A. J. Hurkens, Yingqian Zhang |
Abstract | This paper explores the use of Column Generation (CG) techniques in constructing univariate binary decision trees for classification tasks. We propose a novel Integer Linear Programming (ILP) formulation, based on root-to-leaf paths in decision trees. The model is solved via a Column Generation based heuristic. To speed up the heuristic, we use a restricted instance data by considering a subset of decision splits, sampled from the solutions of the well-known CART algorithm. Extensive numerical experiments show that our approach is competitive with the state-of-the-art ILP-based algorithms. In particular, the proposed approach is capable of handling big data sets with tens of thousands of data rows. Moreover, for large data sets, it finds solutions competitive to CART. |
Tasks | |
Published | 2018-10-15 |
URL | https://arxiv.org/abs/1810.06684v2 |
https://arxiv.org/pdf/1810.06684v2.pdf | |
PWC | https://paperswithcode.com/paper/constructing-classification-trees-using |
Repo | |
Framework | |
The Time-SIFT method : detecting 3-D changes from archival photogrammetric analysis with almost exclusively image information
Title | The Time-SIFT method : detecting 3-D changes from archival photogrammetric analysis with almost exclusively image information |
Authors | Denis Feurer, Fabrice Vinatier |
Abstract | Archival aerial imagery is a source of worldwide very high resolution data for documenting paste 3-D changes. However, external information is required so that accurate 3-D models can be computed from archival aerial imagery. In this research, we propose and test a new method, termed Time-SIFT (Scale Invariant Feature Transform), which allows for computing coherent multi-temporal Digital Elevation Models (DEMs) with almost exclusively image information. This method is based on the invariance properties of the SIFT-like methods which are at the root of the Structure from Motion (SfM) algorithms. On a test site of 170 km2, we applied SfM algorithms to a unique image block with all the images of four different dates covering forty years. We compared this method to more classical methods based on the use of affordable additional data such as ground control points collected in recent orthophotos. We did extensive tests to determine which processing choices were most impacting on the final result. With these tests, we aimed at evaluating the potential of the proposed Time-SIFT method for the detection and mapping of 3-D changes. Our study showed that the Time-SIFT method was the prime criteria that allowed for computing informative DEMs of difference with almost exclusively image information and limited photogrammetric expertise and human intervention. Due to the fact that the proposed Time-SIFT method can be automatically applied with exclusively image information, our results pave the way to a systematic processing of the archival aerial imagery on very large spatio-temporal windows, and should hence greatly help the unlocking of archival aerial imagery for the documenting of past 3-D changes. |
Tasks | |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09700v1 |
http://arxiv.org/pdf/1807.09700v1.pdf | |
PWC | https://paperswithcode.com/paper/the-time-sift-method-detecting-3-d-changes |
Repo | |
Framework | |
LORM: Learning to Optimize for Resource Management in Wireless Networks with Few Training Samples
Title | LORM: Learning to Optimize for Resource Management in Wireless Networks with Few Training Samples |
Authors | Yifei Shen, Yuanming Shi, Jun Zhang, Khaled B. Letaief |
Abstract | Effective resource management plays a pivotal role in wireless networks, which, unfortunately, results in challenging mixed-integer nonlinear programming (MINLP) problems in most cases. Machine learning-based methods have recently emerged as a disruptive way to obtain near-optimal performance for MINLPs with affordable computational complexity. There have been some attempts in applying such methods to resource management in wireless networks, but these attempts require huge amounts of training samples and lack the capability to handle constrained problems. Furthermore, they suffer from severe performance deterioration when the network parameters change, which commonly happens and is referred to as the task mismatch problem. In this paper, to reduce the sample complexity and address the feasibility issue, we propose a framework of Learning to Optimize for Resource Management (LORM). Instead of the end-to-end learning approach adopted in previous studies, LORM learns the optimal pruning policy in the branch-and-bound algorithm for MINLPs via a sample-efficient method, namely, imitation learning. To further address the task mismatch problem, we develop a transfer learning method via self-imitation in LORM, named LORM-TL, which can quickly adapt a pre-trained machine learning model to the new task with only a few additional unlabeled training samples. Numerical simulations will demonstrate that LORM outperforms specialized state-of-the-art algorithms and achieves near-optimal performance, while achieving significant speedup compared with the branch-and-bound algorithm. Moreover, LORM-TL, by relying on a few unlabeled samples, achieves comparable performance with the model trained from scratch with sufficient labeled samples. |
Tasks | Imitation Learning, Transfer Learning |
Published | 2018-12-18 |
URL | https://arxiv.org/abs/1812.07998v2 |
https://arxiv.org/pdf/1812.07998v2.pdf | |
PWC | https://paperswithcode.com/paper/lora-learning-to-optimize-for-resource |
Repo | |
Framework | |
Learning to Remember, Forget and Ignore using Attention Control in Memory
Title | Learning to Remember, Forget and Ignore using Attention Control in Memory |
Authors | T. S. Jayram, Younes Bouhadjar, Ryan L. McAvoy, Tomasz Kornuta, Alexis Asseman, Kamil Rocki, Ahmet S. Ozcan |
Abstract | Typical neural networks with external memory do not effectively separate capacity for episodic and working memory as is required for reasoning in humans. Applying knowledge gained from psychological studies, we designed a new model called Differentiable Working Memory (DWM) in order to specifically emulate human working memory. As it shows the same functional characteristics as working memory, it robustly learns psychology inspired tasks and converges faster than comparable state-of-the-art models. Moreover, the DWM model successfully generalizes to sequences two orders of magnitude longer than the ones used in training. Our in-depth analysis shows that the behavior of DWM is interpretable and that it learns to have fine control over memory, allowing it to retain, ignore or forget information based on its relevance. |
Tasks | |
Published | 2018-09-28 |
URL | http://arxiv.org/abs/1809.11087v1 |
http://arxiv.org/pdf/1809.11087v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-remember-forget-and-ignore-using |
Repo | |
Framework | |
AI4AI: Quantitative Methods for Classifying Host Species from Avian Influenza DNA Sequence
Title | AI4AI: Quantitative Methods for Classifying Host Species from Avian Influenza DNA Sequence |
Authors | Woo Yong Choi, Kyu Ye Song, Chan Woo Lee |
Abstract | Avian Influenza breakouts cause millions of dollars in damage each year globally, especially in Asian countries such as China and South Korea. The impact magnitude of a breakout directly correlates to time required to fully understand the influenza virus, particularly the interspecies pathogenicity. The procedure requires laboratory tests that require resources typically lacking in a breakout emergency. In this study, we propose new quantitative methods utilizing machine learning and deep learning to correctly classify host species given raw DNA sequence data of the influenza virus, and provide probabilities for each classification. The best deep learning models achieve top-1 classification accuracy of 47%, and top-3 classification accuracy of 82%, on a dataset of 11 host species classes. |
Tasks | |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09197v1 |
http://arxiv.org/pdf/1802.09197v1.pdf | |
PWC | https://paperswithcode.com/paper/ai4ai-quantitative-methods-for-classifying |
Repo | |
Framework | |
Learning On-Road Visual Control for Self-Driving Vehicles with Auxiliary Tasks
Title | Learning On-Road Visual Control for Self-Driving Vehicles with Auxiliary Tasks |
Authors | Yilun Chen, Praveen Palanisamy, Priyantha Mudalige, Katharina Muelling, John M. Dolan |
Abstract | A safe and robust on-road navigation system is a crucial component of achieving fully automated vehicles. NVIDIA recently proposed an End-to-End algorithm that can directly learn steering commands from raw pixels of a front camera by using one convolutional neural network. In this paper, we leverage auxiliary information aside from raw images and design a novel network structure, called Auxiliary Task Network (ATN), to help boost the driving performance while maintaining the advantage of minimal training data and an End-to-End training method. In this network, we introduce human prior knowledge into vehicle navigation by transferring features from image recognition tasks. Image semantic segmentation is applied as an auxiliary task for navigation. We consider temporal information by introducing an LSTM module and optical flow to the network. Finally, we combine vehicle kinematics with a sensor fusion step. We discuss the benefits of our method over state-of-the-art visual navigation methods both in the Udacity simulation environment and on the real-world Comma.ai dataset. |
Tasks | Optical Flow Estimation, Semantic Segmentation, Sensor Fusion, Visual Navigation |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.07760v1 |
http://arxiv.org/pdf/1812.07760v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-on-road-visual-control-for-self |
Repo | |
Framework | |
Lamarckian Evolution of Convolutional Neural Networks
Title | Lamarckian Evolution of Convolutional Neural Networks |
Authors | Jonas Prellberg, Oliver Kramer |
Abstract | Convolutional neural networks belong to the most successul image classifiers, but the adaptation of their network architecture to a particular problem is computationally expensive. We show that an evolutionary algorithm saves training time during the network architecture optimization, if learned network weights are inherited over generations by Lamarckian evolution. Experiments on typical image datasets show similar or significantly better test accuracies and improved convergence speeds compared to two different baselines without weight inheritance. On CIFAR-10 and CIFAR-100 a 75 % improvement in data efficiency is observed. |
Tasks | |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08099v2 |
http://arxiv.org/pdf/1806.08099v2.pdf | |
PWC | https://paperswithcode.com/paper/lamarckian-evolution-of-convolutional-neural |
Repo | |
Framework | |
Zero-Shot Kernel Learning
Title | Zero-Shot Kernel Learning |
Authors | Hongguang Zhang, Piotr Koniusz |
Abstract | In this paper, we address an open problem of zero-shot learning. Its principle is based on learning a mapping that associates feature vectors extracted from i.e. images and attribute vectors that describe objects and/or scenes of interest. In turns, this allows classifying unseen object classes and/or scenes by matching feature vectors via mapping to a newly defined attribute vector describing a new class. Due to importance of such a learning task, there exist many methods that learn semantic, probabilistic, linear or piece-wise linear mappings. In contrast, we apply well-established kernel methods to learn a non-linear mapping between the feature and attribute spaces. We propose an easy learning objective inspired by the Linear Discriminant Analysis, Kernel-Target Alignment and Kernel Polarization methods that promotes incoherence. We evaluate performance of our algorithm on the Polynomial as well as shift-invariant Gaussian and Cauchy kernels. Despite simplicity of our approach, we obtain state-of-the-art results on several zero-shot learning datasets and benchmarks including a recent AWA2 dataset. |
Tasks | Zero-Shot Learning |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01279v2 |
http://arxiv.org/pdf/1802.01279v2.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-kernel-learning |
Repo | |
Framework | |
Risk-averse Behavior Planning for Autonomous Driving under Uncertainty
Title | Risk-averse Behavior Planning for Autonomous Driving under Uncertainty |
Authors | Mohammad Naghshvar, Ahmed K. Sadek, Auke J. Wiggers |
Abstract | Autonomous vehicles have to navigate the surrounding environment with partial observability of other objects sharing the road. Sources of uncertainty in autonomous vehicle measurements include sensor fusion errors, limited sensor range due to weather or object detection latency, occlusion, and hidden parameters such as other human driver intentions. Behavior planning must consider all sources of uncertainty in deciding future vehicle maneuvers. This paper presents a scalable framework for risk-averse behavior planning under uncertainty by incorporating QMDP, unscented transform, and Monte Carlo tree search (MCTS). It is shown that upper confidence bound (UCB) for expanding the tree results in noisy Q-value estimates by the MCTS and a degraded performance of QMDP. A modification to action selection procedure in MCTS is proposed to achieve robust performance. |
Tasks | Autonomous Driving, Autonomous Vehicles, Object Detection, Sensor Fusion |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01254v1 |
http://arxiv.org/pdf/1812.01254v1.pdf | |
PWC | https://paperswithcode.com/paper/risk-averse-behavior-planning-for-autonomous |
Repo | |
Framework | |
Graph Bayesian Optimization: Algorithms, Evaluations and Applications
Title | Graph Bayesian Optimization: Algorithms, Evaluations and Applications |
Authors | Jiaxu Cui, Bo Yang |
Abstract | Network structure optimization is a fundamental task in complex network analysis. However, almost all the research on Bayesian optimization is aimed at optimizing the objective functions with vectorial inputs. In this work, we first present a flexible framework, denoted graph Bayesian optimization, to handle arbitrary graphs in the Bayesian optimization community. By combining the proposed framework with graph kernels, it can take full advantage of implicit graph structural features to supplement explicit features guessed according to the experience, such as tags of nodes and any attributes of graphs. The proposed framework can identify which features are more important during the optimization process. We apply the framework to solve four problems including two evaluations and two applications to demonstrate its efficacy and potential applications. |
Tasks | |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01157v4 |
http://arxiv.org/pdf/1805.01157v4.pdf | |
PWC | https://paperswithcode.com/paper/graph-bayesian-optimization-algorithms |
Repo | |
Framework | |
Unsupervised Intuitive Physics from Visual Observations
Title | Unsupervised Intuitive Physics from Visual Observations |
Authors | Sebastien Ehrhardt, Aron Monszpart, Niloy Mitra, Andrea Vedaldi |
Abstract | While learning models of intuitive physics is an increasingly active area of research, current approaches still fall short of natural intelligences in one important regard: they require external supervision, such as explicit access to physical states, at training and sometimes even at test times. Some authors have relaxed such requirements by supplementing the model with an handcrafted physical simulator. Still, the resulting methods are unable to automatically learn new complex environments and to understand physical interactions within them. In this work, we demonstrated for the first time learning such predictors directly from raw visual observations and without relying on simulators. We do so in two steps: first, we learn to track mechanically-salient objects in videos using causality and equivariance, two unsupervised learning principles that do not require auto-encoding. Second, we demonstrate that the extracted positions are sufficient to successfully train visual motion predictors that can take the underlying environment into account. We validate our predictors on synthetic datasets; then, we introduce a new dataset, ROLL4REAL, consisting of real objects rolling on complex terrains (pool table, elliptical bowl, and random height-field). We show that in all such cases it is possible to learn reliable extrapolators of the object trajectories from raw videos alone, without any form of external supervision and with no more prior knowledge than the choice of a convolutional neural network architecture. |
Tasks | |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05086v2 |
http://arxiv.org/pdf/1805.05086v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-intuitive-physics-from-visual |
Repo | |
Framework | |
Learning Optimal Policies from Observational Data
Title | Learning Optimal Policies from Observational Data |
Authors | Onur Atan, William R. Zame, M van der Schaar |
Abstract | Choosing optimal (or at least better) policies is an important problem in domains from medicine to education to finance and many others. One approach to this problem is through controlled experiments/trials - but controlled experiments are expensive. Hence it is important to choose the best policies on the basis of observational data. This presents two difficult challenges: (i) missing counterfactuals, and (ii) selection bias. This paper presents theoretical bounds on estimation errors of counterfactuals from observational data by making connections to domain adaptation theory. It also presents a principled way of choosing optimal policies using domain adversarial neural networks. We illustrate the effectiveness of domain adversarial training together with various features of our algorithm on a semi-synthetic breast cancer dataset and a supervised UCI dataset (Statlog). |
Tasks | Domain Adaptation |
Published | 2018-02-23 |
URL | http://arxiv.org/abs/1802.08679v1 |
http://arxiv.org/pdf/1802.08679v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-optimal-policies-from-observational |
Repo | |
Framework | |