July 29, 2019


Paper Group ANR 124


Probability Reversal and the Disjunction Effect in Reasoning Systems


Title	Probability Reversal and the Disjunction Effect in Reasoning Systems
Authors	Subhash Kak
Abstract	Data based judgments go into artificial intelligence applications but they undergo paradoxical reversal when seemingly unnecessary additional data is provided. Examples of this are Simpson’s reversal and the disjunction effect where the beliefs about the data change once it is presented or aggregated differently. Sometimes the significance of the difference can be evaluated using statistical tests such as Pearson’s chi-squared or Fisher’s exact test, but this may not be helpful in threshold-based decision systems that operate with incomplete information. To mitigate risks in the use of algorithms in decision-making, we consider the question of modeling of beliefs. We argue that evidence supports that beliefs are not classical statistical variables and they should, in the general case, be considered as superposition states of disjoint or polar outcomes. We analyze the disjunction effect from the perspective of the belief as a quantum vector.
Tasks	Decision Making
Published	2017-09-12
URL	http://arxiv.org/abs/1709.04029v1
PDF	http://arxiv.org/pdf/1709.04029v1.pdf
PWC	https://paperswithcode.com/paper/probability-reversal-and-the-disjunction
Repo
Framework
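
The Simpson's reversal mentioned in the abstract above can be made concrete with a small worked example. The sketch below uses illustrative success counts for two hypothetical options, A and B, split into two subgroups; the counts and all names are assumptions for illustration and are not taken from the paper. Option A has the higher success rate in each subgroup, yet option B has the higher rate once the subgroups are pooled.

```python
from fractions import Fraction

# (successes, trials) per subgroup. Illustrative counts only, not from the paper.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def subgroup_rates(arm):
    """Exact success rate of one option within each subgroup."""
    return {group: Fraction(s, t) for group, (s, t) in counts[arm].items()}


def pooled_rate(arm):
    """Exact success rate of one option after aggregating all subgroups."""
    successes = sum(s for s, _ in counts[arm].values())
    trials = sum(t for _, t in counts[arm].values())
    return Fraction(successes, trials)


# A beats B inside every subgroup ...
a_wins_each_subgroup = all(
    subgroup_rates("A")[g] > subgroup_rates("B")[g] for g in ("small", "large")
)
# ... but B beats A once the data is aggregated.
b_wins_pooled = pooled_rate("B") > pooled_rate("A")

print(a_wins_each_subgroup, b_wins_pooled)
```

The reversal arises because the two options are distributed very unevenly across the subgroups, so the pooled rates weight the subgroups differently. A threshold-based decision rule would therefore flip depending on whether it sees the split or the aggregated data.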

An Online Development Environment for Answer Set Programming


Title	An Online Development Environment for Answer Set Programming
Authors	Elias Marcopoulos, Christian Reotutar, Yuanlin Zhang
Abstract	Recent progress in logic programming (e.g., the development of the Answer Set Programming paradigm) has made it possible to teach it to general undergraduate and even high school students. Given the limited exposure of these students to computer science, the complexity of downloading, installing and using tools for writing logic programs could be a major barrier for logic programming to reach a much wider audience. We developed an online answer set programming environment with a self contained file system and a simple interface, allowing users to write logic programs and perform several tasks over the programs.
Tasks
Published	2017-06-20
URL	http://arxiv.org/abs/1707.01865v1
PDF	http://arxiv.org/pdf/1707.01865v1.pdf
PWC	https://paperswithcode.com/paper/an-online-development-environment-for-answer
Repo
Framework

Memory Augmented Control Networks


Title	Memory Augmented Control Networks
Authors	Arbaaz Khan, Clark Zhang, Nikolay Atanasov, Konstantinos Karydis, Vijay Kumar, Daniel D. Lee
Abstract	Planning problems in partially observable environments cannot be solved directly with convolutional networks and require some form of memory. But, even memory networks with sophisticated addressing schemes are unable to learn intelligent reasoning satisfactorily due to the complexity of simultaneously learning to access memory and plan. To mitigate these challenges we introduce the Memory Augmented Control Network (MACN). The proposed network architecture consists of three main parts. The first part uses convolutions to extract features and the second part uses a neural network-based planning module to pre-plan in the environment. The third part uses a network controller that learns to store those specific instances of past information that are necessary for planning. The performance of the network is evaluated in discrete grid world environments for path planning in the presence of simple and complex obstacles. We show that our network learns to plan and can generalize to new environments.
Tasks
Published	2017-09-17
URL	http://arxiv.org/abs/1709.05706v6
PDF	http://arxiv.org/pdf/1709.05706v6.pdf
PWC	https://paperswithcode.com/paper/memory-augmented-control-networks
Repo
Framework

Catalyst Acceleration for Gradient-Based Non-Convex Optimization


Title	Catalyst Acceleration for Gradient-Based Non-Convex Optimization
Authors	Courtney Paquette, Hongzhou Lin, Dmitriy Drusvyatskiy, Julien Mairal, Zaid Harchaoui
Abstract	We introduce a generic scheme to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions. Even though these methods may originally require convexity to operate, the proposed approach allows one to use them on weakly convex objectives, which covers a large class of non-convex functions typically appearing in machine learning and signal processing. In general, the scheme is guaranteed to produce a stationary point with a worst-case efficiency typical of first-order methods, and when the objective turns out to be convex, it automatically accelerates in the sense of Nesterov and achieves near-optimal convergence rate in function values. These properties are achieved without assuming any knowledge about the convexity of the objective, by automatically adapting to the unknown weak convexity constant. We conclude the paper by showing promising experimental results obtained by applying our approach to incremental algorithms such as SVRG and SAGA for sparse matrix factorization and for learning neural networks.
Tasks
Published	2017-03-31
URL	http://arxiv.org/abs/1703.10993v3
PDF	http://arxiv.org/pdf/1703.10993v3.pdf
PWC	https://paperswithcode.com/paper/catalyst-acceleration-for-gradient-based-non
Repo
Framework

Towards Neural Machine Translation with Partially Aligned Corpora


Title	Towards Neural Machine Translation with Partially Aligned Corpora
Authors	Yining Wang, Yang Zhao, Jiajun Zhang, Chengqing Zong, Zhengshan Xue
Abstract	While neural machine translation (NMT) has become the new paradigm, the parameter optimization requires large-scale parallel data which is scarce in many domains and language pairs. In this paper, we address a new translation scenario in which there only exists monolingual corpora and phrase pairs. We propose a new method towards translation with partially aligned sentence pairs which are derived from the phrase pairs and monolingual corpora. To make full use of the partially aligned corpora, we adapt the conventional NMT training method in two aspects. On one hand, different generation strategies are designed for aligned and unaligned target words. On the other hand, a different objective function is designed to model the partially aligned parts. The experiments demonstrate that our method can achieve a relatively good result in such a translation scenario, and tiny bitexts can boost translation quality to a large extent.
Tasks	Machine Translation
Published	2017-11-03
URL	http://arxiv.org/abs/1711.01006v1
PDF	http://arxiv.org/pdf/1711.01006v1.pdf
PWC	https://paperswithcode.com/paper/towards-neural-machine-translation-with-1
Repo
Framework

Learning Aerial Image Segmentation from Online Maps


Title	Learning Aerial Image Segmentation from Online Maps
Authors	Pascal Kaiser, Jan Dirk Wegner, Aurelien Lucchi, Martin Jaggi, Thomas Hofmann, Konrad Schindler
Abstract	This study deals with semantic segmentation of high-resolution (aerial) images where a semantic class label is assigned to each pixel via supervised classification as a basis for automatic map generation. Recently, deep convolutional neural networks (CNNs) have shown impressive performance and have quickly become the de-facto standard for semantic segmentation, with the added benefit that task-specific feature design is no longer necessary. However, a major downside of deep learning methods is that they are extremely data-hungry, thus aggravating the perennial bottleneck of supervised classification, to obtain enough annotated training data. On the other hand, it has been observed that they are rather robust against noise in the training labels. This opens up the intriguing possibility to avoid annotating huge amounts of training data, and instead train the classifier from existing legacy data or crowd-sourced maps which can exhibit high levels of noise. The question addressed in this paper is: can training with large-scale, publicly available labels replace a substantial part of the manual labeling effort and still achieve sufficient performance? Such data will inevitably contain a significant portion of errors, but in return virtually unlimited quantities of it are available in larger parts of the world. We adapt a state-of-the-art CNN architecture for semantic segmentation of buildings and roads in aerial images, and compare its performance when using different training data sets, ranging from manually labeled, pixel-accurate ground truth of the same city to automatic training data derived from OpenStreetMap data from distant locations. We report our results that indicate that satisfying performance can be obtained with significantly less manual annotation effort, by exploiting noisy large-scale training data.
Tasks	Semantic Segmentation
Published	2017-07-21
URL	http://arxiv.org/abs/1707.06879v1
PDF	http://arxiv.org/pdf/1707.06879v1.pdf
PWC	https://paperswithcode.com/paper/learning-aerial-image-segmentation-from
Repo
Framework

Microaneurysm Detection in Fundus Images Using a Two-step Convolutional Neural Networks


Title	Microaneurysm Detection in Fundus Images Using a Two-step Convolutional Neural Networks
Authors	Noushin Eftekheri, Mojtaba Masoudi, Hamidreza Pourreza, Kamaledin Ghiasi Shirazi, Ehsan Saeedi
Abstract	Diabetic Retinopathy (DR) is a prominent cause of blindness in the world. Early treatment of DR can begin with the detection of microaneurysms (MAs), which appear as reddish spots in retinal images. An automated microaneurysm detection system can be helpful for ophthalmologists. In this paper, deep learning, in particular a convolutional neural network (CNN), is used as a powerful tool to efficiently detect MAs from fundus images. Our method uses a new two-stage training process, which results in accurate detection while decreasing computational complexity in comparison with previous works. To validate our proposed method, an experiment is conducted using the Keras library to implement our proposed CNN on two standard publicly available datasets. Our results show a promising sensitivity value of about 0.8 at an average number of false positives per image greater than 6, which is competitive with state-of-the-art approaches.
Tasks
Published	2017-10-14
URL	http://arxiv.org/abs/1710.05191v2
PDF	http://arxiv.org/pdf/1710.05191v2.pdf
PWC	https://paperswithcode.com/paper/microaneurysm-detection-in-fundus-images
Repo
Framework

Multi-Task Label Embedding for Text Classification


Title	Multi-Task Label Embedding for Text Classification
Authors	Honglun Zhang, Liqiang Xiao, Wenqing Chen, Yongkun Wang, Yaohui Jin
Abstract	Multi-task learning in text classification leverages implicit correlations among related tasks to extract common features and yield performance gains. However, most previous works treat the labels of each task as independent and meaningless one-hot vectors, which causes a loss of potential information and makes it difficult for these models to jointly learn three or more tasks. In this paper, we propose Multi-Task Label Embedding to convert labels in text classification into semantic vectors, thereby turning the original tasks into vector matching tasks. We implement unsupervised, supervised and semi-supervised models of Multi-Task Label Embedding, all utilizing semantic correlations among tasks and making it particularly convenient to scale and transfer as more tasks are involved. Extensive experiments on five benchmark datasets for text classification show that our models can effectively improve the performance of related tasks with semantic representations of labels and additional information from each other.
Tasks	Multi-Task Learning, Text Classification
Published	2017-10-17
URL	http://arxiv.org/abs/1710.07210v1
PDF	http://arxiv.org/pdf/1710.07210v1.pdf
PWC	https://paperswithcode.com/paper/multi-task-label-embedding-for-text
Repo
Framework

3DOF Pedestrian Trajectory Prediction Learned from Long-Term Autonomous Mobile Robot Deployment Data


Title	3DOF Pedestrian Trajectory Prediction Learned from Long-Term Autonomous Mobile Robot Deployment Data
Authors	Li Sun, Zhi Yan, Sergi Molina Mellado, Marc Hanheide, Tom Duckett
Abstract	This paper presents a novel 3DOF pedestrian trajectory prediction approach for autonomous mobile service robots. While most previously reported methods are based on learning of 2D positions in monocular camera images, our approach uses range-finder sensors to learn and predict 3DOF pose trajectories (i.e. 2D position plus 1D rotation within the world coordinate system). Our approach, T-Pose-LSTM (Temporal 3DOF-Pose Long-Short-Term Memory), is trained using long-term data from real-world robot deployments and aims to learn context-dependent (environment- and time-specific) human activities. Our approach incorporates long-term temporal information (i.e. date and time) with short-term pose observations as input. A sequence-to-sequence LSTM encoder-decoder is trained, which encodes observations into LSTM and then decodes as predictions. For deployment, it can perform on-the-fly prediction in real-time. Instead of using manually annotated data, we rely on a robust human detection, tracking and SLAM system, providing us with examples in a global coordinate system. We validate the approach using more than 15K pedestrian trajectories recorded in a care home environment over a period of three months. The experiment shows that the proposed T-Pose-LSTM model advances the state-of-the-art 2D-based method for human trajectory prediction in long-term mobile robot deployments.
Tasks	Human Detection, Trajectory Prediction
Published	2017-09-30
URL	http://arxiv.org/abs/1710.00126v1
PDF	http://arxiv.org/pdf/1710.00126v1.pdf
PWC	https://paperswithcode.com/paper/3dof-pedestrian-trajectory-prediction-learned
Repo
Framework

Stoic Ethics for Artificial Agents


Title	Stoic Ethics for Artificial Agents
Authors	Gabriel Murray
Abstract	We present a position paper advocating the notion that Stoic philosophy and ethics can inform the development of ethical A.I. systems. This is in sharp contrast to most work on building ethical A.I., which has focused on Utilitarian or Deontological ethical theories. We relate ethical A.I. to several core Stoic notions, including the dichotomy of control, the four cardinal virtues, the ideal Sage, Stoic practices, and Stoic perspectives on emotion or affect. More generally, we put forward an ethical view of A.I. that focuses more on internal states of the artificial agent rather than on external actions of the agent. We provide examples relating to near-term A.I. systems as well as hypothetical superintelligent agents.
Tasks
Published	2017-01-09
URL	http://arxiv.org/abs/1701.02388v2
PDF	http://arxiv.org/pdf/1701.02388v2.pdf
PWC	https://paperswithcode.com/paper/stoic-ethics-for-artificial-agents
Repo
Framework

Compositional Approaches for Representing Relations Between Words: A Comparative Study


Title	Compositional Approaches for Representing Relations Between Words: A Comparative Study
Authors	Huda Hakami, Danushka Bollegala
Abstract	Identifying the relations that exist between words (or entities) is important for various natural language processing tasks such as relational search, noun-modifier classification and analogy detection. A popular approach to representing the relations between a pair of words is to extract the patterns in which the words co-occur in a corpus, and assign each word pair a vector of pattern frequencies. Despite the simplicity of this approach, it suffers from data sparseness, information scalability and linguistic creativity, as the model is unable to handle previously unseen word pairs in a corpus. In contrast, a compositional approach for representing relations between words overcomes these issues by using the attributes of each individual word to indirectly compose a representation for the common relations that hold between the two words. This study aims to compare different operations for creating relation representations from word-level representations. We investigate the performance of the compositional methods by measuring the relational similarities using several benchmark datasets for word analogy. Moreover, we evaluate the different relation representations in a knowledge base completion task.
Tasks	Knowledge Base Completion
Published	2017-09-04
URL	http://arxiv.org/abs/1709.01193v1
PDF	http://arxiv.org/pdf/1709.01193v1.pdf
PWC	https://paperswithcode.com/paper/compositional-approaches-for-representing
Repo
Framework
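
To illustrate what composing a relation representation from word-level representations can look like, here is a minimal sketch. The three operators (vector offset, concatenation, elementwise product) are common choices assumed for illustration, not a confirmed list of the operators the study compares. The three-dimensional word vectors are hypothetical toy values, and the function names are made up for this example.

```python
import math

# Hypothetical toy word vectors, chosen by hand for illustration only.
vectors = {
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
}


def pair_diff(a, b):
    """Relation as the vector offset b - a."""
    return [y - x for x, y in zip(a, b)]


def concat(a, b):
    """Relation as the concatenation of both word vectors."""
    return list(a) + list(b)


def elementwise_product(a, b):
    """Relation as the elementwise product of both word vectors."""
    return [x * y for x, y in zip(a, b)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def relational_similarity(op, pair1, pair2):
    """Compare two word pairs by the similarity of their composed relations."""
    r1 = op(vectors[pair1[0]], vectors[pair1[1]])
    r2 = op(vectors[pair2[0]], vectors[pair2[1]])
    return cosine(r1, r2)


sim = relational_similarity(pair_diff, ("man", "king"), ("woman", "queen"))
print(sim)
```

Because the relation vector is built only from the two word vectors, a pair never seen together in a corpus still gets a representation, which is the advantage the abstract attributes to compositional approaches.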

One Model for the Learning of Language


Title	One Model for the Learning of Language
Authors	Yuan Yang
Abstract	A major target of linguistics and cognitive science has been to understand what class of learning systems can acquire the key structures of natural language. Until recently, the computational requirements of language have been used to argue that learning is impossible without a highly constrained hypothesis space. Here, we describe a learning system that is maximally unconstrained, operating over the space of all computations, and is able to acquire several of the key structures present in natural language from positive evidence alone. The model successfully acquires regular (e.g. $(ab)^n$), context-free (e.g. $a^n b^n$, $x x^R$), and context-sensitive (e.g. $a^n b^n c^n$, $a^n b^m c^n d^m$, $xx$) formal languages. Our approach develops the concept of factorized programs in Bayesian program induction in order to help manage the complexity of representation. We show that, in learning, the model predicts several phenomena empirically observed in human grammar acquisition experiments.
Tasks
Published	2017-11-16
URL	http://arxiv.org/abs/1711.06301v2
PDF	http://arxiv.org/pdf/1711.06301v2.pdf
PWC	https://paperswithcode.com/paper/one-model-for-the-learning-of-language
Repo
Framework
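
For readers unfamiliar with the formal languages named in the abstract above, the following sketch gives simple membership tests for a few of them. These are hand-written recognizers for illustration, not the paper's learning model, and the function names are assumptions. The sketch takes n >= 0, so the empty string belongs to each language.

```python
import re


def is_ab_n(s):
    """Regular language (ab)^n: zero or more repetitions of 'ab'."""
    return re.fullmatch(r"(ab)*", s) is not None


def is_an_bn(s):
    """Context-free language a^n b^n: a's then b's, in equal numbers."""
    m = re.fullmatch(r"(a*)(b*)", s)
    return m is not None and len(m.group(1)) == len(m.group(2))


def is_an_bn_cn(s):
    """Context-sensitive language a^n b^n c^n: three equal-length runs."""
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return m is not None and len(m.group(1)) == len(m.group(2)) == len(m.group(3))


def is_copy(s):
    """Context-sensitive copy language xx: the second half repeats the first."""
    half, odd = divmod(len(s), 2)
    return odd == 0 and s[:half] == s[half:]


print(is_ab_n("abab"), is_an_bn("aabb"), is_an_bn_cn("aabbcc"), is_copy("abcabc"))
```

The regular expression only checks the order of the symbols; the counting that lifts the language beyond regular is done by the explicit length comparison.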

A Systematic Review of Hindi Prosody


Title	A Systematic Review of Hindi Prosody
Authors	Somnath Roy
Abstract	Prosody describes both the form and the function of a sentence using the suprasegmental features of speech. Prosodic phenomena are explored in the domain of higher phonological constituents such as the word, the phonological phrase and the intonational phrase. The study of prosody at the word level is called word prosody, and above the word level it is called sentence prosody. Word prosody describes the stress pattern by comparing the prosodic features of a word's constituent syllables. Sentence prosody involves the study of the phrasing pattern and intonational pattern of a language. The aim of this study is to summarize the existing works on Hindi prosody carried out in different domains of language and speech processing. The review is presented in a systematic fashion so that it can be a useful resource for anyone who wants to build on the existing works.
Tasks
Published	2017-05-09
URL	http://arxiv.org/abs/1705.03247v1
PDF	http://arxiv.org/pdf/1705.03247v1.pdf
PWC	https://paperswithcode.com/paper/a-systematic-review-of-hindi-prosody
Repo
Framework

FlashProfile: A Framework for Synthesizing Data Profiles


Title	FlashProfile: A Framework for Synthesizing Data Profiles
Authors	Saswat Padhi, Prateek Jain, Daniel Perelman, Oleksandr Polozov, Sumit Gulwani, Todd Millstein
Abstract	We address the problem of learning a syntactic profile for a collection of strings, i.e. a set of regex-like patterns that succinctly describe the syntactic variations in the strings. Real-world datasets, typically curated from multiple sources, often contain data in various syntactic formats. Thus, any data processing task is preceded by the critical step of data format identification. However, manual inspection of data to identify the different formats is infeasible in standard big-data scenarios. Prior techniques are restricted to a small set of pre-defined patterns (e.g. digits, letters, words, etc.), and provide no control over granularity of profiles. We define syntactic profiling as a problem of clustering strings based on syntactic similarity, followed by identifying patterns that succinctly describe each cluster. We present a technique for synthesizing such profiles over a given language of patterns, that also allows for interactive refinement by requesting a desired number of clusters. Using a state-of-the-art inductive synthesis framework, PROSE, we have implemented our technique as FlashProfile. Across $153$ tasks over $75$ large real datasets, we observe a median profiling time of only $\sim 0.7$ s. Furthermore, we show that access to syntactic profiles may allow for more accurate synthesis of programs, i.e. using fewer examples, in programming-by-example (PBE) workflows such as FlashFill.
Tasks
Published	2017-09-17
URL	http://arxiv.org/abs/1709.05725v2
PDF	http://arxiv.org/pdf/1709.05725v2.pdf
PWC	https://paperswithcode.com/paper/flashprofile-interactive-synthesis-of
Repo
Framework
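
To make the idea of a syntactic profile concrete, the sketch below groups strings by a coarse character-class pattern. This is not FlashProfile's technique and does not use PROSE; it is closer to the fixed, pre-defined patterns that the abstract attributes to prior techniques, and it offers no control over granularity. The class tokens (D for digits, U for uppercase, L for lowercase) and the function names are assumptions made for this example.

```python
from collections import defaultdict
from itertools import groupby

CLASSES = {"D", "U", "L"}  # digit, uppercase letter, lowercase letter


def char_class(ch):
    """Map a character to a class token, or to itself if it is a literal."""
    if ch.isdigit():
        return "D"
    if ch.isupper():
        return "U"
    if ch.islower():
        return "L"
    return ch


def coarse_pattern(s):
    """Collapse each run of same-class characters into 'X+'; keep literals as-is."""
    parts = []
    for cls, run in groupby(s, key=char_class):
        length = sum(1 for _ in run)
        parts.append(cls + "+" if cls in CLASSES else cls * length)
    return "".join(parts)


def profile(strings):
    """Cluster strings by their coarse pattern."""
    clusters = defaultdict(list)
    for s in strings:
        clusters[coarse_pattern(s)].append(s)
    return dict(clusters)


for pattern, members in profile(["2017-09-17", "2017-10-27", "1709.05725"]).items():
    print(pattern, members)
```

Here the pattern is fixed first and the clusters follow from it. The abstract describes the opposite order: strings are first clustered by syntactic similarity, and patterns are then synthesized to describe each cluster.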

Revisit Fuzzy Neural Network: Demystifying Batch Normalization and ReLU with Generalized Hamming Network


Title	Revisit Fuzzy Neural Network: Demystifying Batch Normalization and ReLU with Generalized Hamming Network
Authors	Lixin Fan
Abstract	We revisit fuzzy neural network with a cornerstone notion of generalized hamming distance, which provides a novel and theoretically justified framework to re-interpret many useful neural network techniques in terms of fuzzy logic. In particular, we conjecture and empirically illustrate that, the celebrated batch normalization (BN) technique actually adapts the normalized bias such that it approximates the rightful bias induced by the generalized hamming distance. Once the due bias is enforced analytically, neither the optimization of bias terms nor the sophisticated batch normalization is needed. Also in the light of generalized hamming distance, the popular rectified linear units (ReLU) can be treated as setting a minimal hamming distance threshold between network inputs and weights. This thresholding scheme, on the one hand, can be improved by introducing double thresholding on both extremes of neuron outputs. On the other hand, ReLUs turn out to be non-essential and can be removed from networks trained for simple tasks like MNIST classification. The proposed generalized hamming network (GHN) as such not only lends itself to rigorous analysis and interpretation within the fuzzy logic theory but also demonstrates fast learning speed, well-controlled behaviour and state-of-the-art performances on a variety of learning tasks.
Tasks
Published	2017-10-27
URL	http://arxiv.org/abs/1710.10328v1
PDF	http://arxiv.org/pdf/1710.10328v1.pdf
PWC	https://paperswithcode.com/paper/revisit-fuzzy-neural-network-demystifying
Repo
Framework