Paper Group ANR 42
GANtruth - an unpaired image-to-image translation method for driving scenarios. Towards Robust Interpretability with Self-Explaining Neural Networks. Identification of Seed Cells in Multispectral Images for GrowCut Segmentation. Integrating omics and MRI data with kernel-based tests and CNNs to identify rare genetic markers for Alzheimer’s disease. …
GANtruth - an unpaired image-to-image translation method for driving scenarios
Title | GANtruth - an unpaired image-to-image translation method for driving scenarios |
Authors | Sebastian Bujwid, Miquel Martí, Hossein Azizpour, Alessandro Pieropan |
Abstract | Synthetic image translation has significant potential in autonomous transportation systems, owing to the expense of data collection and annotation as well as the unmanageable diversity of real-world situations. The main issue with unpaired image-to-image translation is the ill-posed nature of the problem. In this work, we propose a novel method for constraining the output space of unpaired image-to-image translation. We assume that the environment of the source domain is known (e.g. synthetically generated), and we propose to explicitly enforce preservation of the ground-truth labels on the translated images. We experiment on preserving ground-truth information such as semantic segmentation, disparity, and instance segmentation. We show significant evidence that our method achieves improved performance over the state-of-the-art UNIT model for translating images from SYNTHIA to Cityscapes. The generated images are perceived as more realistic in human surveys and outperform UNIT when used in a domain adaptation scenario for semantic segmentation. |
Tasks | Domain Adaptation, Image-to-Image Translation, Instance Segmentation, Semantic Segmentation |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1812.01710v1 |
http://arxiv.org/pdf/1812.01710v1.pdf | |
PWC | https://paperswithcode.com/paper/gantruth-an-unpaired-image-to-image |
Repo | |
Framework | |
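Below is a minimal sketch of the label-preservation idea described in the abstract, assuming a PyTorch-style setup: alongside the usual adversarial term, a frozen task network is applied to the translated image and penalized against the known synthetic ground truth. The names `generator`, `discriminator`, `seg_net`, and `lambda_task` are illustrative assumptions, not the authors' code.

```python
# Hedged sketch of GANtruth-style ground-truth preservation; all component
# names are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn.functional as F

def generator_loss(generator, discriminator, seg_net, synthetic_img, seg_labels,
                   lambda_task=1.0):
    """Adversarial loss plus a label-preservation term on the translated image."""
    fake_real = generator(synthetic_img)   # synthetic -> "real"-style image
    logits = discriminator(fake_real)
    # Non-saturating adversarial objective for the generator.
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    # Preservation: a frozen segmentation net applied to the translated image
    # should still predict the known synthetic ground-truth labels.
    task = F.cross_entropy(seg_net(fake_real), seg_labels)
    return adv + lambda_task * task
```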
Towards Robust Interpretability with Self-Explaining Neural Networks
Title | Towards Robust Interpretability with Self-Explaining Neural Networks |
Authors | David Alvarez-Melis, Tommi S. Jaakkola |
Abstract | Most recent work on interpretability of complex machine learning models has focused on estimating $\textit{a posteriori}$ explanations for previously trained models around specific predictions. $\textit{Self-explaining}$ models where interpretability plays a key role already during learning have received much less attention. We propose three desiderata for explanations in general – explicitness, faithfulness, and stability – and show that existing methods do not satisfy them. In response, we design self-explaining models in stages, progressively generalizing linear classifiers to complex yet architecturally explicit models. Faithfulness and stability are enforced via regularization specifically tailored to such models. Experimental results across various benchmark datasets show that our framework offers a promising direction for reconciling model complexity and interpretability. |
Tasks | |
Published | 2018-06-20 |
URL | http://arxiv.org/abs/1806.07538v2 |
http://arxiv.org/pdf/1806.07538v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-robust-interpretability-with-self-1 |
Repo | |
Framework | |
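As a concrete illustration of the staged generalization of linear classifiers, here is a minimal sketch of a self-explaining model in the simplest case, where the concepts are the raw input features: the prediction is $\theta(x) \cdot x$, and a gradient penalty keeps $\theta(x)$ behaving like a stable local linear explanation. Layer sizes are assumptions, and the training loop is omitted.

```python
# A minimal sketch of a self-explaining model in the special case where the
# concepts are the raw input features; layer sizes are assumptions.
import torch
import torch.nn as nn

class SelfExplainingLinear(nn.Module):
    def __init__(self, d_in, d_hidden=64):
        super().__init__()
        # theta(x): input-dependent coefficients, one per feature -- these
        # coefficients ARE the explanation attached to each prediction.
        self.theta = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU(),
                                   nn.Linear(d_hidden, d_in))

    def forward(self, x):
        coeffs = self.theta(x)
        return (coeffs * x).sum(dim=1), coeffs   # f(x) = theta(x) . x

def stability_penalty(model, x):
    """Regularizer: grad_x f(x) should agree with theta(x), so the coefficients
    act as a faithful, locally stable linear explanation."""
    x = x.clone().requires_grad_(True)
    y, coeffs = model(x)
    grad, = torch.autograd.grad(y.sum(), x, create_graph=True)
    return ((grad - coeffs) ** 2).sum(dim=1).mean()
```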
Identification of Seed Cells in Multispectral Images for GrowCut Segmentation
Title | Identification of Seed Cells in Multispectral Images for GrowCut Segmentation |
Authors | Wuilan Torres, Antonio Rueda-Toicen |
Abstract | The segmentation of satellite images is a necessary step for object-oriented image classification, which has become relevant due to its applicability to images with high spatial resolution. To perform object-oriented image classification, the studied image must first be segmented into uniform regions. This segmentation requires manual work by an expert user, who must exhaustively explore the image to establish thresholds that generate useful and representative segments without oversegmenting and without discarding representative segments. We propose a technique that automatically segments the multispectral image while addressing these issues. Using morphological filters, we identify zones of the image that are homogeneous in their spectral signatures. These homogeneous zones are representative of different types of land cover in the image and are used as seeds for the GrowCut multispectral segmentation algorithm. GrowCut is a cellular automaton with competitive region growing; its cells are linked to every pixel in the image through three parameters: the pixel's spectral signature, a label, and a strength factor that represents the strength with which a cell defends its label. The seed cells possess maximum strength and maintain their state throughout the automaton's evolution. Starting from the seed cells, each cell in the image is iteratively attacked by its neighboring cells. When the automaton stops updating its states, we obtain a segmented image where each pixel has taken the label of one of the seed cells. In this paper, the algorithm is applied to an image acquired by Landsat 8 over agricultural land in Calabozo, Guárico, Venezuela, where there are different types of land cover: agriculture, urban regions, water bodies, and savannas with varying degrees of human intervention. The resulting segmentation is presented as irregular polygons enclosing geographical objects. |
Tasks | Image Classification |
Published | 2018-01-17 |
URL | http://arxiv.org/abs/1801.05525v1 |
http://arxiv.org/pdf/1801.05525v1.pdf | |
PWC | https://paperswithcode.com/paper/identification-of-seed-cells-in-multispectral |
Repo | |
Framework | |
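A compact sketch of the GrowCut update described in the abstract, under simplifying assumptions (4-connected neighborhood, a linear attack function `g`, wrap-around borders via `np.roll`); it is meant to fix ideas, not to reproduce the authors' implementation.

```python
# Hedged GrowCut sketch: 4-connected neighborhood, linear attack function g,
# wrap-around borders via np.roll (a real implementation would mask edges).
import numpy as np

def growcut(image, labels, strength, n_iters=100):
    """image: (H, W, B) spectral signatures; labels: (H, W) ints, 0 = unlabeled;
    strength: (H, W) in [0, 1], seed cells start at 1.0. Seeds are never
    overwritten, because an attack must STRICTLY exceed the defender's strength
    and g * strength <= 1."""
    max_norm = np.linalg.norm(image, axis=2).max() + 1e-12
    for _ in range(n_iters):
        changed = False
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            atk_lab = np.roll(labels, (dy, dx), axis=(0, 1))
            atk_str = np.roll(strength, (dy, dx), axis=(0, 1))
            atk_sig = np.roll(image, (dy, dx), axis=(0, 1))
            # g decreases with the spectral distance between attacker and defender.
            g = 1.0 - np.linalg.norm(image - atk_sig, axis=2) / max_norm
            win = (atk_lab != 0) & (g * atk_str > strength)
            if win.any():
                labels[win] = atk_lab[win]
                strength[win] = (g * atk_str)[win]
                changed = True
        if not changed:          # automaton has converged
            break
    return labels
```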
Integrating omics and MRI data with kernel-based tests and CNNs to identify rare genetic markers for Alzheimer’s disease
Title | Integrating omics and MRI data with kernel-based tests and CNNs to identify rare genetic markers for Alzheimer’s disease |
Authors | Stefan Konigorski, Shahryar Khorasani, Christoph Lippert |
Abstract | For precision medicine and personalized treatment, we need to identify predictive markers of disease. We focus on Alzheimer’s disease (AD), where magnetic resonance imaging scans provide information about the disease status. By combining imaging with genome sequencing, we aim at identifying rare genetic markers associated with quantitative traits predicted from convolutional neural networks (CNNs), which traditionally have been derived manually by experts. Kernel-based tests are a powerful tool for associating sets of genetic variants, but how to optimally model rare genetic variants is still an open research question. We propose a generalized set of kernels that incorporate prior information from various annotations and multi-omics data. In the analysis of data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), we evaluate whether (i) CNNs yield precise and reliable brain traits, and (ii) the novel kernel-based tests can help to identify loci associated with AD. The results indicate that CNNs provide a fast, scalable and precise tool to derive quantitative AD traits and that new kernels integrating domain knowledge can yield higher power in association tests of very rare variants. |
Tasks | |
Published | 2018-12-02 |
URL | http://arxiv.org/abs/1812.00448v2 |
http://arxiv.org/pdf/1812.00448v2.pdf | |
PWC | https://paperswithcode.com/paper/integrating-omics-and-mri-data-with-kernel |
Repo | |
Framework | |
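To fix ideas, here is a hedged sketch of an annotation-weighted variant-set kernel and a variance-component-style statistic, in the spirit of SKAT-type tests; the specific weighting and statistic are illustrative assumptions, not the paper's generalized kernel family.

```python
# Hedged sketch of an annotation-weighted variant-set kernel; the weighting
# scheme is an illustrative assumption, not the paper's kernel family.
import numpy as np

def annotation_weighted_kernel(G, annot_weights):
    """G: (n_samples, n_variants) minor-allele counts (0/1/2);
    annot_weights: (n_variants,) nonnegative prior weights derived from
    functional annotations / multi-omics. Returns K = G W G^T."""
    return (G * annot_weights) @ G.T

def score_statistic(K, y_resid):
    """Q = y^T K y for a quantitative (e.g. CNN-derived) trait, with y_resid
    the trait residuals after covariate adjustment."""
    return float(y_resid @ K @ y_resid)
```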
Automated Theorem Proving in Intuitionistic Propositional Logic by Deep Reinforcement Learning
Title | Automated Theorem Proving in Intuitionistic Propositional Logic by Deep Reinforcement Learning |
Authors | Mitsuru Kusumoto, Keisuke Yahata, Masahiro Sakai |
Abstract | Problem-solving in automated theorem proving (ATP) can be interpreted as a search problem where the prover constructs a proof tree step by step. In this paper, we propose a deep reinforcement learning algorithm for proof search in intuitionistic propositional logic. The most significant challenge in applying deep learning to ATP is the absence of a large, public theorem database. We overcome this issue by applying a novel data augmentation procedure at each iteration of the reinforcement learning. We also improve the efficiency of the algorithm by encoding the syntactic structure of formulas in a novel compact graph representation. Using the large volume of augmented data, we train highly accurate graph neural networks that approximate the value function over the syntactic structures of formulas. Our method is also cost-efficient in terms of computational time. We show that our prover outperforms Coq's $\texttt{tauto}$ tactic, a prover based on human-engineered heuristics. Within the specified time limit, our prover solved 84% of the theorems in a benchmark library, while $\texttt{tauto}$ was able to solve only 52%. |
Tasks | Automated Theorem Proving, Data Augmentation |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.00796v1 |
http://arxiv.org/pdf/1811.00796v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-theorem-proving-in-intuitionistic |
Repo | |
Framework | |
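As a rough illustration of value-guided proof search, the following sketch runs a depth-limited AND-OR search in which a learned value function orders the applicable tactics; `apply_tactics` and `value_fn` are hypothetical stand-ins for the paper's prover components.

```python
# Illustrative value-guided proof search: depth-limited AND-OR search with a
# learned value function ordering tactic applications. `apply_tactics` and
# `value_fn` are hypothetical stand-ins, not the paper's components.

def prove(goal, apply_tactics, value_fn, depth=20):
    """apply_tactics(goal) -> list of subgoal-lists (one per applicable rule);
    an empty subgoal-list closes the goal. A goal is proven if SOME rule yields
    subgoals that are ALL proven (AND-OR search)."""
    if depth == 0:
        return False
    options = apply_tactics(goal)
    # Try rules whose hardest resulting subgoal still looks most provable.
    options.sort(key=lambda sgs: min((value_fn(s) for s in sgs), default=1.0),
                 reverse=True)
    for subgoals in options:
        if all(prove(s, apply_tactics, value_fn, depth - 1) for s in subgoals):
            return True
    return False
```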
Parameterless Stochastic Natural Gradient Method for Discrete Optimization and its Application to Hyper-Parameter Optimization for Neural Network
Title | Parameterless Stochastic Natural Gradient Method for Discrete Optimization and its Application to Hyper-Parameter Optimization for Neural Network |
Authors | Kouhei Nishida, Hernan Aguirre, Shota Saito, Shinichi Shirakawa, Youhei Akimoto |
Abstract | Black-box discrete optimization (BBDO) appears in a wide range of engineering tasks. Evolutionary and other BBDO approaches have been applied with the aim of automating the tuning of system parameters, such as the hyper-parameters of machine-learning-based systems being installed for a specific task. However, this automation is often jeopardized by the need to tune the strategy parameters of the BBDO algorithm itself: an expert with domain knowledge must carry out time-consuming strategy-parameter tuning. This paper proposes a parameterless BBDO algorithm based on information geometric optimization, a recent framework for black-box optimization using the stochastic natural gradient. Guided by theoretical implications, we develop an adaptation mechanism for the strategy parameters of the stochastic natural gradient method on discrete search domains. The proposed algorithm is evaluated on commonly used test problems. It is further extended to two examples of simultaneous optimization of the hyper-parameters and the connection weights of deep learning models, leading to faster optimization than existing approaches without any parameter-tuning effort. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06517v1 |
http://arxiv.org/pdf/1809.06517v1.pdf | |
PWC | https://paperswithcode.com/paper/parameterless-stochastic-natural-gradient |
Repo | |
Framework | |
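For intuition about the underlying method, here is a minimal sketch of stochastic natural gradient ascent over a family of Bernoulli distributions on bit-strings (the information-geometric-optimization setup the paper builds on). The fixed learning rate and population size are exactly the strategy parameters the paper's adaptation mechanism would replace; that mechanism is not implemented here.

```python
# Minimal IGO-style sketch on Bernoulli distributions over bit-strings. The
# fixed lr and pop_size are the strategy parameters the paper adapts away.
import numpy as np

def sng_optimize(f, dim, pop_size=16, lr=0.1, iters=200, rng=None):
    rng = rng or np.random.default_rng(0)
    p = np.full(dim, 0.5)                      # Bernoulli parameters
    for _ in range(iters):
        X = (rng.random((pop_size, dim)) < p).astype(float)
        fx = np.array([f(x) for x in X])
        # Ranking-based utilities: +1 for the better half, -1 for the worse half.
        order = np.argsort(-fx)
        u = np.empty(pop_size)
        u[order[:pop_size // 2]] = 1.0
        u[order[pop_size // 2:]] = -1.0
        # Natural gradient for Bernoulli in expectation parameters: move p
        # toward utility-weighted samples.
        p += lr * (u[:, None] * (X - p)).mean(axis=0)
        p = np.clip(p, 1.0 / dim, 1.0 - 1.0 / dim)   # keep exploring
    return (p > 0.5).astype(int)

# Example: maximize OneMax (number of ones).
best = sng_optimize(lambda x: x.sum(), dim=20)
```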
From the EM Algorithm to the CM-EM Algorithm for Global Convergence of Mixture Models
Title | From the EM Algorithm to the CM-EM Algorithm for Global Convergence of Mixture Models |
Authors | Chenguang Lu |
Abstract | The Expectation-Maximization (EM) algorithm for mixture models often suffers from slow or invalid convergence. The popular convergence proof asserts that the likelihood increases with Q: Q increases in the M-step and is non-decreasing in the E-step. The author found that (1) Q may, and should, decrease in some E-steps; and (2) the Shannon channel from the E-step is improper, and hence the expectation is improper. The author proposes the CM-EM algorithm (CM means Channel's Matching), which adds a step to optimize the mixture ratios for a proper Shannon channel and maximizes G, the average log-normalized likelihood, in the M-step. Neal and Hinton's Maximization-Maximization (MM) algorithm uses F instead of Q to speed up convergence; maximizing G is similar to maximizing F. The new convergence proof is similar to Beal's proof using the variational method. It first proves that the minimum relative entropy equals the minimum of R-G (where R is the mutual information), and then uses the variational and iterative methods that Shannon et al. use for rate-distortion functions to prove global convergence. Some examples show that Q and F may, and should, decrease in some E-steps. For the same example, the EM, MM, and CM-EM algorithms need about 36, 18, and 9 iterations, respectively. |
Tasks | |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.11227v1 |
http://arxiv.org/pdf/1810.11227v1.pdf | |
PWC | https://paperswithcode.com/paper/from-the-em-algorithm-to-the-cm-em-algorithm |
Repo | |
Framework | |
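For reference, here is a baseline EM sketch for a one-dimensional Gaussian mixture, to make the E-step "channel" and the M-step concrete; per the abstract, CM-EM additionally re-optimizes the mixture ratios for a proper Shannon channel and maximizes G instead of Q, which this sketch does not implement.

```python
# Baseline EM for a 1-D Gaussian mixture; CM-EM's extra ratio-optimization
# step and G-maximization are NOT shown here.
import numpy as np

def em_gmm(x, k, iters=50, rng=None):
    rng = rng or np.random.default_rng(0)
    n = len(x)
    w = np.full(k, 1.0 / k)                       # mixture ratios
    mu = rng.choice(x, k, replace=False)
    var = np.full(k, x.var())
    for _ in range(iters):
        # E-step: responsibilities r[i, j] = P(component j | x_i) -- the
        # "Shannon channel" from data to components.
        logp = (-0.5 * (x[:, None] - mu) ** 2 / var
                - 0.5 * np.log(2 * np.pi * var) + np.log(w))
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: maximize the expected complete-data log-likelihood Q.
        nk = r.sum(axis=0)
        w = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var
```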
Exploiting Anti-monotonicity of Multi-label Evaluation Measures for Inducing Multi-label Rules
Title | Exploiting Anti-monotonicity of Multi-label Evaluation Measures for Inducing Multi-label Rules |
Authors | Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz |
Abstract | Exploiting dependencies between labels is considered to be crucial for multi-label classification. Rules are able to expose label dependencies such as implications, subsumptions or exclusions in a human-comprehensible and interpretable manner. However, the induction of rules with multiple labels in the head is particularly challenging, as the number of label combinations which must be taken into account for each rule grows exponentially with the number of available labels. To overcome this limitation, algorithms for exhaustive rule mining typically use properties such as anti-monotonicity or decomposability in order to prune the search space. In the present paper, we examine whether commonly used multi-label evaluation metrics satisfy these properties and therefore are suited to prune the search space for multi-label heads. |
Tasks | Multi-Label Classification |
Published | 2018-12-14 |
URL | http://arxiv.org/abs/1812.06833v1 |
http://arxiv.org/pdf/1812.06833v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-anti-monotonicity-of-multi-label |
Repo | |
Framework | |
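As an illustration of how anti-monotonicity prunes the exponential space of multi-label heads, here is an Apriori-style sketch: if a measure can never increase as labels are added to a head, every superset of a below-threshold head can be discarded. The `evaluate` function and `threshold` are hypothetical stand-ins.

```python
# Apriori-style sketch: with an anti-monotone measure, a head below the
# threshold cannot be repaired by adding labels, so its supersets are pruned.

def search_heads(labels, evaluate, threshold):
    """Level-wise enumeration of label sets usable as multi-label rule heads."""
    level = {frozenset([l]) for l in labels
             if evaluate(frozenset([l])) >= threshold}
    results = set(level)
    while level:
        nxt = set()
        for head in level:
            for l in labels:
                if l not in head:
                    cand = head | {l}
                    # Every (|cand|-1)-subset must have survived; otherwise an
                    # anti-monotone measure cannot reach the threshold on cand.
                    if all(cand - {m} in results for m in cand) \
                            and evaluate(cand) >= threshold:
                        nxt.add(cand)
        results |= nxt
        level = nxt
    return results
```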
A multidisciplinary task-based perspective for evaluating the impact of AI autonomy and generality on the future of work
Title | A multidisciplinary task-based perspective for evaluating the impact of AI autonomy and generality on the future of work |
Authors | Enrique Fernández-Macías, Emilia Gómez, José Hernández-Orallo, Bao Sheng Loe, Bertin Martens, Fernando Martínez-Plumed, Songül Tolan |
Abstract | This paper presents a multidisciplinary task-based approach for assessing the impact of artificial intelligence on the future of work. We provide definitions of a task from two main perspectives: socio-economic and computational. We propose to explore ways in which these perspectives can be integrated or mapped onto each other, and linked with the skills or capabilities they require of humans and AI systems. Finally, we argue that in order to understand the dynamics of tasks, we have to explore the relevance of the autonomy and generality of AI systems for the automation or alteration of the workplace. |
Tasks | |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02416v1 |
http://arxiv.org/pdf/1807.02416v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multidisciplinary-task-based-perspective |
Repo | |
Framework | |
Victory Probability in the Fire Emblem Arena
Title | Victory Probability in the Fire Emblem Arena |
Authors | Andrew Brockmann |
Abstract | We demonstrate how to efficiently compute the probability of victory in Fire Emblem arena battles. The probability can be expressed in terms of a multivariate recurrence relation which lends itself to a straightforward dynamic programming solution. Some implementation issues are addressed, and a full implementation is provided in code. |
Tasks | |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.10750v1 |
http://arxiv.org/pdf/1808.10750v1.pdf | |
PWC | https://paperswithcode.com/paper/victory-probability-in-the-fire-emblem-arena |
Repo | |
Framework | |
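A hedged sketch of the dynamic-programming idea under deliberately simplified mechanics (fixed damage, independent hit rolls, strict alternation, positive damage, and at least one nonzero hit rate); the paper's recurrence covers the game's full rules. The key step is collapsing a full round so that the miss-miss cycle, which returns to the same state, is solved in closed form rather than recursed.

```python
# Hedged DP sketch under simplified mechanics: fixed damage, independent hit
# rolls, strict alternation; assumes dmg_a, dmg_b > 0 and hit_a + hit_b > 0.
from functools import lru_cache

def victory_probability(hp_a, hp_b, hit_a, dmg_a, hit_b, dmg_b):
    """P(unit A reduces B to 0 HP first), with A striking first each round."""
    @lru_cache(maxsize=None)
    def p_win(ha, hb):                 # both units alive, A about to attack
        # Outcome if A's attack lands:
        if hb - dmg_a <= 0:
            a_hits = 1.0               # B dies before retaliating
        else:
            a_hits = (hit_b * (0.0 if ha - dmg_b <= 0
                               else p_win(ha - dmg_b, hb - dmg_a))
                      + (1 - hit_b) * p_win(ha, hb - dmg_a))
        # Outcome if A misses and B's counter lands:
        a_miss_b_hits = 0.0 if ha - dmg_b <= 0 else p_win(ha - dmg_b, hb)
        # A round where both miss returns to the same state, so that geometric
        # cycle is solved in closed form instead of recursed.
        both_miss = (1 - hit_a) * (1 - hit_b)
        return (hit_a * a_hits + (1 - hit_a) * hit_b * a_miss_b_hits) \
            / (1 - both_miss)
    return p_win(hp_a, hp_b)

# e.g. victory_probability(20, 18, hit_a=0.8, dmg_a=5, hit_b=0.7, dmg_b=4)
```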
Direct Runge-Kutta Discretization Achieves Acceleration
Title | Direct Runge-Kutta Discretization Achieves Acceleration |
Authors | Jingzhao Zhang, Aryan Mokhtari, Suvrit Sra, Ali Jadbabaie |
Abstract | We study gradient-based optimization methods obtained by directly discretizing a second-order ordinary differential equation (ODE) related to the continuous limit of Nesterov’s accelerated gradient method. When the function is smooth enough, we show that acceleration can be achieved by a stable discretization of this ODE using standard Runge-Kutta integrators. Specifically, we prove that under Lipschitz-gradient, convexity and order-$(s+2)$ differentiability assumptions, the sequence of iterates generated by discretizing the proposed second-order ODE converges to the optimal solution at a rate of $\mathcal{O}({N^{-2\frac{s}{s+1}}})$, where $s$ is the order of the Runge-Kutta numerical integrator. Furthermore, we introduce a new local flatness condition on the objective, under which rates even faster than $\mathcal{O}(N^{-2})$ can be achieved with low-order integrators and only gradient information. Notably, this flatness condition is satisfied by several standard loss functions used in machine learning. We provide numerical experiments that verify the theoretical rates predicted by our results. |
Tasks | |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.00521v5 |
http://arxiv.org/pdf/1805.00521v5.pdf | |
PWC | https://paperswithcode.com/paper/direct-runge-kutta-discretization-achieves |
Repo | |
Framework | |
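To illustrate "direct discretization", the sketch below rewrites a second-order accelerated-gradient ODE as a first-order system and integrates it with classical RK4; the specific ODE used, $\ddot{x} + (3/t)\dot{x} + \nabla f(x) = 0$ (the Su-Boyd-Candès limit of Nesterov's method), is an illustrative stand-in for the related ODE the paper analyzes.

```python
# Illustrative RK4 discretization of an accelerated-gradient ODE; the ODE here
# is the Su-Boyd-Candes limit, a stand-in for the paper's (related) ODE.
import numpy as np

def rk4_accelerated(grad_f, x0, t0=1.0, h=0.1, steps=1000):
    def F(t, z):                       # z = (x, v) with v = x'
        x, v = z
        return np.array([v, -(3.0 / t) * v - grad_f(x)])
    z = np.array([x0, np.zeros_like(x0)])
    t = t0
    for _ in range(steps):
        k1 = F(t, z)
        k2 = F(t + h / 2, z + h / 2 * k1)
        k3 = F(t + h / 2, z + h / 2 * k2)
        k4 = F(t + h, z + h * k3)
        z = z + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return z[0]                        # final iterate x

# e.g. minimize f(x) = 0.5 ||x||^2:  rk4_accelerated(lambda x: x, np.ones(5))
```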
A Study of Question Effectiveness Using Reddit “Ask Me Anything” Threads
Title | A Study of Question Effectiveness Using Reddit “Ask Me Anything” Threads |
Authors | Kristjan Arumae, Guo-Jun Qi, Fei Liu |
Abstract | Asking effective questions is a powerful social skill. In this paper we seek to build computational models that learn to discriminate effective questions from ineffective ones. Armed with such a capability, future advanced systems can evaluate the quality of questions and provide suggestions for effective question wording. We create a large-scale, real-world dataset that contains over 400,000 questions collected from Reddit “Ask Me Anything” threads. Each thread resembles an online press conference where questions compete with each other for attention from the host. This dataset enables the development of a class of computational models for predicting whether a question will be answered. We develop a new convolutional neural network architecture with variable-length context and demonstrate the efficacy of the model by comparing it with state-of-the-art baselines and human judges. |
Tasks | |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.10389v1 |
http://arxiv.org/pdf/1805.10389v1.pdf | |
PWC | https://paperswithcode.com/paper/a-study-of-question-effectiveness-using |
Repo | |
Framework | |
Neural Algebra of Classifiers
Title | Neural Algebra of Classifiers |
Authors | Rodrigo Santa Cruz, Basura Fernando, Anoop Cherian, Stephen Gould |
Abstract | The world is fundamentally compositional, so it is natural to think of visual recognition as the recognition of basic visual primitives that are composed according to well-defined rules. This strategy allows us to recognize unseen complex concepts from simple visual primitives. However, the current trend in visual recognition follows a data-greedy approach, where huge amounts of data are required to learn models for any desired visual concept. In this paper, we build on the compositionality principle and develop an "algebra" for composing classifiers for complex visual concepts. To this end, we learn neural network modules that perform boolean algebra operations on simple visual classifiers. Since these modules form a complete functional set, a classifier for any complex visual concept defined as a boolean expression of primitives can be obtained by recursively applying the learned modules, even without a single training sample of that concept. As our experiments show, classifiers composed in this framework outperform standard baselines for complex visual concepts on two well-known visual recognition benchmarks. Finally, we present a qualitative analysis of our method and its properties. |
Tasks | |
Published | 2018-01-26 |
URL | http://arxiv.org/abs/1801.08676v1 |
http://arxiv.org/pdf/1801.08676v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-algebra-of-classifiers |
Repo | |
Framework | |
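A hedged sketch of the composition mechanism: small learned modules implement boolean operations on classifier scores, and a boolean expression over primitives is evaluated recursively. The module architecture is an assumed simplification of the paper's design, and training of the modules is omitted.

```python
# Hedged sketch of learned boolean modules over classifier scores; the module
# architecture is an assumed simplification, and training is omitted.
import torch
import torch.nn as nn

class BoolModule(nn.Module):
    """Maps one or two classifier scores to a composed score in [0, 1]."""
    def __init__(self, arity):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(arity, 16), nn.ReLU(),
                                 nn.Linear(16, 1), nn.Sigmoid())

    def forward(self, *scores):
        return self.net(torch.stack(scores, dim=-1)).squeeze(-1)

AND, OR, NOT = BoolModule(2), BoolModule(2), BoolModule(1)  # trained elsewhere

def evaluate(expr, primitives, x):
    """Recursively score a boolean expression over primitive classifiers,
    e.g. ('and', 'striped', ('not', 'cat')) -- no training sample of the
    composite concept is needed, only the primitives and the modules."""
    if isinstance(expr, str):
        return primitives[expr](x)     # primitive classifier score
    op, *args = expr
    vals = [evaluate(a, primitives, x) for a in args]
    return {'and': AND, 'or': OR, 'not': NOT}[op](*vals)
```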
Greedy Frank-Wolfe Algorithm for Exemplar Selection
Title | Greedy Frank-Wolfe Algorithm for Exemplar Selection |
Authors | Gary Cheng, Armin Askari, Kannan Ramchandran, Laurent El Ghaoui |
Abstract | In this paper, we consider the problem of selecting representatives from a data set for arbitrary supervised/unsupervised learning tasks. We identify a subset $S$ of a data set $A$ such that 1) the size of $S$ is much smaller than that of $A$ and 2) $S$ efficiently describes the entire data set, in a way formalized via convex optimization. In order to generate $|S| = k$ exemplars, our kernelizable algorithm, Frank-Wolfe Sparse Representation (FWSR), only needs to execute $\approx k$ iterations with a per-iteration cost that is quadratic in the size of $A$. This is in contrast to other state-of-the-art methods, which need to run until convergence, with each iteration costing an extra factor of $d$ (the dimension of the data). Moreover, we provide a proof of linear convergence for our method. We support our results with empirical experiments: we test our algorithm against current methods in three different experimental setups on four different data sets. FWSR outperforms other exemplar-finding methods in both speed and accuracy in almost all scenarios. |
Tasks | Dictionary Learning |
Published | 2018-11-06 |
URL | https://arxiv.org/abs/1811.02702v3 |
https://arxiv.org/pdf/1811.02702v3.pdf | |
PWC | https://paperswithcode.com/paper/greedy-frank-wolfe-algorithm-for-exemplar |
Repo | |
Framework | |
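To illustrate the "one exemplar per iteration" behavior, here is a simplified Frank-Wolfe sketch on a simplex-constrained reconstruction problem: each linear-minimization step activates at most one new column, so after $k$ iterations at most $k$ exemplars are selected. This is an illustrative stand-in for FWSR's actual matrix-valued objective.

```python
# Illustrative Frank-Wolfe sketch of "one exemplar per iteration": a simplex-
# constrained reconstruction of the dataset mean, a simplified stand-in for
# FWSR's matrix-valued objective.
import numpy as np

def frank_wolfe_exemplars(A, k):
    """A: (d, n) data matrix, columns are points. Returns indices of at most k
    columns selected as exemplars."""
    b = A.mean(axis=1)
    w = np.zeros(A.shape[1])           # first step (gamma = 1) lands on a vertex
    for t in range(k):
        grad = A.T @ (A @ w - b)       # gradient of 0.5 * ||A w - b||^2
        i = int(np.argmin(grad))       # linear minimization over the simplex
        gamma = 2.0 / (t + 2)          # standard Frank-Wolfe step size
        w *= 1 - gamma
        w[i] += gamma                  # at most one new column activated
    return np.nonzero(w)[0]

# e.g. exemplars = frank_wolfe_exemplars(np.random.randn(10, 200), k=5)
```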
Byzantine Stochastic Gradient Descent
Title | Byzantine Stochastic Gradient Descent |
Authors | Dan Alistarh, Zeyuan Allen-Zhu, Jerry Li |
Abstract | This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the $m$ machines which allegedly compute stochastic gradients every iteration, an $\alpha$-fraction are Byzantine, and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds $\varepsilon$-approximate minimizers of convex functions in $T = \tilde{O}\big( \frac{1}{\varepsilon^2 m} + \frac{\alpha^2}{\varepsilon^2} \big)$ iterations. In contrast, traditional mini-batch SGD needs $T = O\big( \frac{1}{\varepsilon^2 m} \big)$ iterations, but cannot tolerate Byzantine failures. Further, we provide a lower bound showing that, up to logarithmic factors, our algorithm is information-theoretically optimal both in terms of sampling complexity and time complexity. |
Tasks | Stochastic Optimization |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08917v1 |
http://arxiv.org/pdf/1803.08917v1.pdf | |
PWC | https://paperswithcode.com/paper/byzantine-stochastic-gradient-descent |
Repo | |
Framework | |
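The template below sketches Byzantine-robust SGD with a coordinate-wise median as a simple stand-in aggregator; the paper's actual aggregation rule and its optimality guarantees are different and stronger.

```python
# Hedged template for Byzantine-robust SGD: aggregate the m workers' gradients
# with a robust rule before stepping. Coordinate-wise median is a simple
# stand-in; the paper's aggregation and guarantees are different and stronger.
import numpy as np

def robust_sgd(grad_oracle, x0, lr=0.1, iters=100):
    """grad_oracle(x) -> (m, d) stochastic gradients, an alpha-fraction of which
    may be arbitrary/adversarial. The per-coordinate median ignores a minority
    of outlying values in each coordinate."""
    x = x0
    for _ in range(iters):
        g = np.median(grad_oracle(x), axis=0)
        x = x - lr * g
    return x
```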