Paper Group ANR 124
Probability Reversal and the Disjunction Effect in Reasoning Systems. An Online Development Environment for Answer Set Programming. Memory Augmented Control Networks. Catalyst Acceleration for Gradient-Based Non-Convex Optimization. Towards Neural Machine Translation with Partially Aligned Corpora. Learning Aerial Image Segmentation from Online Map …
Probability Reversal and the Disjunction Effect in Reasoning Systems
Title | Probability Reversal and the Disjunction Effect in Reasoning Systems |
Authors | Subhash Kak |
Abstract | Data based judgments go into artificial intelligence applications but they undergo paradoxical reversal when seemingly unnecessary additional data is provided. Examples of this are Simpson’s reversal and the disjunction effect where the beliefs about the data change once it is presented or aggregated differently. Sometimes the significance of the difference can be evaluated using statistical tests such as Pearson’s chi-squared or Fisher’s exact test, but this may not be helpful in threshold-based decision systems that operate with incomplete information. To mitigate risks in the use of algorithms in decision-making, we consider the question of modeling of beliefs. We argue that evidence supports that beliefs are not classical statistical variables and they should, in the general case, be considered as superposition states of disjoint or polar outcomes. We analyze the disjunction effect from the perspective of the belief as a quantum vector. |
Tasks | Decision Making |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.04029v1 |
http://arxiv.org/pdf/1709.04029v1.pdf | |
PWC | https://paperswithcode.com/paper/probability-reversal-and-the-disjunction |
Repo | |
Framework | |
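The Simpson's reversal mentioned in the abstract can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the counts are in the style of the well-known kidney-stone example and are not taken from the paper, and the names `data`, `rate`, and `pooled` are hypothetical.

```python
# Illustrative (successes, trials) counts per option and subgroup.
# These numbers are NOT from the paper; they are chosen to show the reversal.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, trials):
    """Success rate within a single subgroup."""
    return successes / trials


def pooled(groups):
    """Success rate after aggregating all subgroups of one option."""
    successes = sum(s for s, _ in groups.values())
    trials = sum(n for _, n in groups.values())
    return successes / trials


# Within every subgroup, option A has the higher success rate ...
a_wins_each_subgroup = all(
    rate(*data["A"][k]) > rate(*data["B"][k]) for k in ("small", "large")
)
# ... yet once the subgroups are pooled, option B looks better.
b_wins_pooled = pooled(data["B"]) > pooled(data["A"])

print(a_wins_each_subgroup, b_wins_pooled)
```

A threshold-based decision rule that only sees the pooled rates would therefore pick a different option than one that sees the subgroup rates, which is the kind of risk the abstract points to.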
An Online Development Environment for Answer Set Programming
Title | An Online Development Environment for Answer Set Programming |
Authors | Elias Marcopoulos, Christian Reotutar, Yuanlin Zhang |
Abstract | Recent progress in logic programming (e.g., the development of the Answer Set Programming paradigm) has made it possible to teach it to general undergraduate and even high school students. Given the limited exposure of these students to computer science, the complexity of downloading, installing and using tools for writing logic programs could be a major barrier for logic programming to reach a much wider audience. We developed an online answer set programming environment with a self contained file system and a simple interface, allowing users to write logic programs and perform several tasks over the programs. |
Tasks | |
Published | 2017-06-20 |
URL | http://arxiv.org/abs/1707.01865v1 |
http://arxiv.org/pdf/1707.01865v1.pdf | |
PWC | https://paperswithcode.com/paper/an-online-development-environment-for-answer |
Repo | |
Framework | |
Memory Augmented Control Networks
Title | Memory Augmented Control Networks |
Authors | Arbaaz Khan, Clark Zhang, Nikolay Atanasov, Konstantinos Karydis, Vijay Kumar, Daniel D. Lee |
Abstract | Planning problems in partially observable environments cannot be solved directly with convolutional networks and require some form of memory. But, even memory networks with sophisticated addressing schemes are unable to learn intelligent reasoning satisfactorily due to the complexity of simultaneously learning to access memory and plan. To mitigate these challenges we introduce the Memory Augmented Control Network (MACN). The proposed network architecture consists of three main parts. The first part uses convolutions to extract features and the second part uses a neural network-based planning module to pre-plan in the environment. The third part uses a network controller that learns to store those specific instances of past information that are necessary for planning. The performance of the network is evaluated in discrete grid world environments for path planning in the presence of simple and complex obstacles. We show that our network learns to plan and can generalize to new environments. |
Tasks | |
Published | 2017-09-17 |
URL | http://arxiv.org/abs/1709.05706v6 |
http://arxiv.org/pdf/1709.05706v6.pdf | |
PWC | https://paperswithcode.com/paper/memory-augmented-control-networks |
Repo | |
Framework | |
Catalyst Acceleration for Gradient-Based Non-Convex Optimization
Title | Catalyst Acceleration for Gradient-Based Non-Convex Optimization |
Authors | Courtney Paquette, Hongzhou Lin, Dmitriy Drusvyatskiy, Julien Mairal, Zaid Harchaoui |
Abstract | We introduce a generic scheme to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions. Even though these methods may originally require convexity to operate, the proposed approach allows one to use them on weakly convex objectives, which covers a large class of non-convex functions typically appearing in machine learning and signal processing. In general, the scheme is guaranteed to produce a stationary point with a worst-case efficiency typical of first-order methods, and when the objective turns out to be convex, it automatically accelerates in the sense of Nesterov and achieves near-optimal convergence rate in function values. These properties are achieved without assuming any knowledge about the convexity of the objective, by automatically adapting to the unknown weak convexity constant. We conclude the paper by showing promising experimental results obtained by applying our approach to incremental algorithms such as SVRG and SAGA for sparse matrix factorization and for learning neural networks. |
Tasks | |
Published | 2017-03-31 |
URL | http://arxiv.org/abs/1703.10993v3 |
http://arxiv.org/pdf/1703.10993v3.pdf | |
PWC | https://paperswithcode.com/paper/catalyst-acceleration-for-gradient-based-non |
Repo | |
Framework | |
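To make the idea of applying a convex solver to a weakly convex objective concrete, here is a minimal sketch of a plain inexact proximal-point loop. It is not the Catalyst scheme itself: the extrapolation step and the automatic adaptation to the weak convexity constant described in the abstract are omitted, and the objective, `kappa`, step size, and iteration counts are all assumptions chosen for illustration.

```python
def grad_f(x):
    # Gradient of f(x) = x**4 / 4 - x**2 / 2, which is non-convex but
    # weakly convex with constant 1, since f''(x) = 3 * x**2 - 1 >= -1.
    return x**3 - x


def prox_step(y, kappa, lr=0.1, inner_iters=100):
    # Approximately minimise the regularised subproblem
    #     f(x) + (kappa / 2) * (x - y)**2
    # by gradient descent. For kappa > 1 the subproblem is strongly convex,
    # so a method designed for convex functions can be applied to it.
    x = y
    for _ in range(inner_iters):
        x -= lr * (grad_f(x) + kappa * (x - y))
    return x


def proximal_point(x0, kappa=2.0, outer_iters=50):
    # Outer loop: repeatedly re-centre the quadratic penalty at the
    # latest iterate.
    x = x0
    for _ in range(outer_iters):
        x = prox_step(x, kappa)
    return x


x_star = proximal_point(0.5)
print(x_star, grad_f(x_star))
```

Starting from 0.5, the iterates approach the stationary point at x = 1, where the gradient of the original non-convex objective vanishes.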
Towards Neural Machine Translation with Partially Aligned Corpora
Title | Towards Neural Machine Translation with Partially Aligned Corpora |
Authors | Yining Wang, Yang Zhao, Jiajun Zhang, Chengqing Zong, Zhengshan Xue |
Abstract | While neural machine translation (NMT) has become the new paradigm, the parameter optimization requires large-scale parallel data which is scarce in many domains and language pairs. In this paper, we address a new translation scenario in which there exist only monolingual corpora and phrase pairs. We propose a new method towards translation with partially aligned sentence pairs which are derived from the phrase pairs and monolingual corpora. To make full use of the partially aligned corpora, we adapt the conventional NMT training method in two aspects. On one hand, different generation strategies are designed for aligned and unaligned target words. On the other hand, a different objective function is designed to model the partially aligned parts. The experiments demonstrate that our method can achieve a relatively good result in such a translation scenario, and tiny bitexts can boost translation quality to a large extent. |
Tasks | Machine Translation |
Published | 2017-11-03 |
URL | http://arxiv.org/abs/1711.01006v1 |
http://arxiv.org/pdf/1711.01006v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-neural-machine-translation-with-1 |
Repo | |
Framework | |
Learning Aerial Image Segmentation from Online Maps
Title | Learning Aerial Image Segmentation from Online Maps |
Authors | Pascal Kaiser, Jan Dirk Wegner, Aurelien Lucchi, Martin Jaggi, Thomas Hofmann, Konrad Schindler |
Abstract | This study deals with semantic segmentation of high-resolution (aerial) images where a semantic class label is assigned to each pixel via supervised classification as a basis for automatic map generation. Recently, deep convolutional neural networks (CNNs) have shown impressive performance and have quickly become the de-facto standard for semantic segmentation, with the added benefit that task-specific feature design is no longer necessary. However, a major downside of deep learning methods is that they are extremely data-hungry, thus aggravating the perennial bottleneck of supervised classification, to obtain enough annotated training data. On the other hand, it has been observed that they are rather robust against noise in the training labels. This opens up the intriguing possibility to avoid annotating huge amounts of training data, and instead train the classifier from existing legacy data or crowd-sourced maps which can exhibit high levels of noise. The question addressed in this paper is: can training with large-scale, publicly available labels replace a substantial part of the manual labeling effort and still achieve sufficient performance? Such data will inevitably contain a significant portion of errors, but in return virtually unlimited quantities of it are available in larger parts of the world. We adapt a state-of-the-art CNN architecture for semantic segmentation of buildings and roads in aerial images, and compare its performance when using different training data sets, ranging from manually labeled, pixel-accurate ground truth of the same city to automatic training data derived from OpenStreetMap data from distant locations. We report our results that indicate that satisfying performance can be obtained with significantly less manual annotation effort, by exploiting noisy large-scale training data. |
Tasks | Semantic Segmentation |
Published | 2017-07-21 |
URL | http://arxiv.org/abs/1707.06879v1 |
http://arxiv.org/pdf/1707.06879v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-aerial-image-segmentation-from |
Repo | |
Framework | |
Microaneurysm Detection in Fundus Images Using a Two-step Convolutional Neural Networks
Title | Microaneurysm Detection in Fundus Images Using a Two-step Convolutional Neural Networks |
Authors | Noushin Eftekheri, Mojtaba Masoudi, Hamidreza Pourreza, Kamaledin Ghiasi Shirazi, Ehsan Saeedi |
Abstract | Diabetic Retinopathy (DR) is a prominent cause of blindness in the world. The early treatment of DR can be conducted from detection of microaneurysms (MAs), which appear as reddish spots in retinal images. Automated microaneurysm detection can be a helpful system for ophthalmologists. In this paper, deep learning, in particular a convolutional neural network (CNN), is used as a powerful tool to efficiently detect MAs from fundus images. In our method a new technique is used to utilise a two-stage training process which results in an accurate detection, while decreasing computational complexity in comparison with previous works. To validate our proposed method, an experiment is conducted using the Keras library to implement our proposed CNN on two standard publicly available datasets. Our results show a promising sensitivity value of about 0.8 at an average number of false positives per image greater than 6, which is a competitive value with the state-of-the-art approaches. |
Tasks | |
Published | 2017-10-14 |
URL | http://arxiv.org/abs/1710.05191v2 |
http://arxiv.org/pdf/1710.05191v2.pdf | |
PWC | https://paperswithcode.com/paper/microaneurysm-detection-in-fundus-images |
Repo | |
Framework | |
Multi-Task Label Embedding for Text Classification
Title | Multi-Task Label Embedding for Text Classification |
Authors | Honglun Zhang, Liqiang Xiao, Wenqing Chen, Yongkun Wang, Yaohui Jin |
Abstract | Multi-task learning in text classification leverages implicit correlations among related tasks to extract common features and yield performance gains. However, most previous works treat labels of each task as independent and meaningless one-hot vectors, which causes a loss of potential information and makes it difficult for these models to jointly learn three or more tasks. In this paper, we propose Multi-Task Label Embedding to convert labels in text classification into semantic vectors, thereby turning the original tasks into vector matching tasks. We implement unsupervised, supervised and semi-supervised models of Multi-Task Label Embedding, all utilizing semantic correlations among tasks and making it particularly convenient to scale and transfer as more tasks are involved. Extensive experiments on five benchmark datasets for text classification show that our models can effectively improve performances of related tasks with semantic representations of labels and additional information from each other. |
Tasks | Multi-Task Learning, Text Classification |
Published | 2017-10-17 |
URL | http://arxiv.org/abs/1710.07210v1 |
http://arxiv.org/pdf/1710.07210v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-label-embedding-for-text |
Repo | |
Framework | |
3DOF Pedestrian Trajectory Prediction Learned from Long-Term Autonomous Mobile Robot Deployment Data
Title | 3DOF Pedestrian Trajectory Prediction Learned from Long-Term Autonomous Mobile Robot Deployment Data |
Authors | Li Sun, Zhi Yan, Sergi Molina Mellado, Marc Hanheide, Tom Duckett |
Abstract | This paper presents a novel 3DOF pedestrian trajectory prediction approach for autonomous mobile service robots. While most previously reported methods are based on learning of 2D positions in monocular camera images, our approach uses range-finder sensors to learn and predict 3DOF pose trajectories (i.e. 2D position plus 1D rotation within the world coordinate system). Our approach, T-Pose-LSTM (Temporal 3DOF-Pose Long-Short-Term Memory), is trained using long-term data from real-world robot deployments and aims to learn context-dependent (environment- and time-specific) human activities. Our approach incorporates long-term temporal information (i.e. date and time) with short-term pose observations as input. A sequence-to-sequence LSTM encoder-decoder is trained, which encodes observations into LSTM and then decodes as predictions. For deployment, it can perform on-the-fly prediction in real-time. Instead of using manually annotated data, we rely on a robust human detection, tracking and SLAM system, providing us with examples in a global coordinate system. We validate the approach using more than 15K pedestrian trajectories recorded in a care home environment over a period of three months. The experiment shows that the proposed T-Pose-LSTM model advances the state-of-the-art 2D-based method for human trajectory prediction in long-term mobile robot deployments. |
Tasks | Human Detection, Trajectory Prediction |
Published | 2017-09-30 |
URL | http://arxiv.org/abs/1710.00126v1 |
http://arxiv.org/pdf/1710.00126v1.pdf | |
PWC | https://paperswithcode.com/paper/3dof-pedestrian-trajectory-prediction-learned |
Repo | |
Framework | |
Stoic Ethics for Artificial Agents
Title | Stoic Ethics for Artificial Agents |
Authors | Gabriel Murray |
Abstract | We present a position paper advocating the notion that Stoic philosophy and ethics can inform the development of ethical A.I. systems. This is in sharp contrast to most work on building ethical A.I., which has focused on Utilitarian or Deontological ethical theories. We relate ethical A.I. to several core Stoic notions, including the dichotomy of control, the four cardinal virtues, the ideal Sage, Stoic practices, and Stoic perspectives on emotion or affect. More generally, we put forward an ethical view of A.I. that focuses more on internal states of the artificial agent rather than on external actions of the agent. We provide examples relating to near-term A.I. systems as well as hypothetical superintelligent agents. |
Tasks | |
Published | 2017-01-09 |
URL | http://arxiv.org/abs/1701.02388v2 |
http://arxiv.org/pdf/1701.02388v2.pdf | |
PWC | https://paperswithcode.com/paper/stoic-ethics-for-artificial-agents |
Repo | |
Framework | |
Compositional Approaches for Representing Relations Between Words: A Comparative Study
Title | Compositional Approaches for Representing Relations Between Words: A Comparative Study |
Authors | Huda Hakami, Danushka Bollegala |
Abstract | Identifying the relations that exist between words (or entities) is important for various natural language processing tasks such as relational search, noun-modifier classification and analogy detection. A popular approach to represent the relations between a pair of words is to extract the patterns in which the words co-occur in a corpus, and assign each word-pair a vector of pattern frequencies. Despite the simplicity of this approach, it suffers from data sparseness, information scalability and linguistic creativity as the model is unable to handle previously unseen word pairs in a corpus. In contrast, a compositional approach for representing relations between words overcomes these issues by using the attributes of each individual word to indirectly compose a representation for the common relations that hold between the two words. This study aims to compare different operations for creating relation representations from word-level representations. We investigate the performance of the compositional methods by measuring the relational similarities using several benchmark datasets for word analogy. Moreover, we evaluate the different relation representations in a knowledge base completion task. |
Tasks | Knowledge Base Completion |
Published | 2017-09-04 |
URL | http://arxiv.org/abs/1709.01193v1 |
http://arxiv.org/pdf/1709.01193v1.pdf | |
PWC | https://paperswithcode.com/paper/compositional-approaches-for-representing |
Repo | |
Framework | |
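As a rough illustration of what composing a relation representation from word-level representations can look like, the sketch below applies four simple operators (difference, addition, elementwise product, concatenation) to toy vectors and measures relational similarity with cosine. The vectors are made up, and the operator set is an assumption for illustration rather than the exact set evaluated in the paper.

```python
import math

# Toy 3-dimensional word vectors, invented for this example only.
vec = {
    "man": [1.0, 0.0, 1.0],
    "woman": [1.0, 1.0, 1.0],
    "king": [3.0, 0.0, 2.0],
    "queen": [3.0, 1.0, 2.0],
}


def diff(a, b):
    """Relation as the vector offset from the first word to the second."""
    return [y - x for x, y in zip(a, b)]


def add(a, b):
    """Relation as the elementwise sum of the two word vectors."""
    return [x + y for x, y in zip(a, b)]


def mult(a, b):
    """Relation as the elementwise product of the two word vectors."""
    return [x * y for x, y in zip(a, b)]


def concat(a, b):
    """Relation as the concatenation of the two word vectors."""
    return a + b


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def relational_similarity(op, pair1, pair2):
    """Cosine similarity between the composed representations of two pairs."""
    r1 = op(vec[pair1[0]], vec[pair1[1]])
    r2 = op(vec[pair2[0]], vec[pair2[1]])
    return cosine(r1, r2)


for op in (diff, add, mult, concat):
    score = relational_similarity(op, ("man", "woman"), ("king", "queen"))
    print(op.__name__, round(score, 3))
```

Because each representation is built from the individual word vectors, it is defined even for word pairs that never co-occur in a corpus, which is the advantage over pattern-frequency vectors that the abstract describes.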
One Model for the Learning of Language
Title | One Model for the Learning of Language |
Authors | Yuan Yang |
Abstract | A major target of linguistics and cognitive science has been to understand what class of learning systems can acquire the key structures of natural language. Until recently, the computational requirements of language have been used to argue that learning is impossible without a highly constrained hypothesis space. Here, we describe a learning system that is maximally unconstrained, operating over the space of all computations, and is able to acquire several of the key structures present in natural language from positive evidence alone. The model successfully acquires regular (e.g. $(ab)^n$), context-free (e.g. $a^n b^n$, $x x^R$), and context-sensitive (e.g. $a^nb^nc^n$, $a^nb^mc^nd^m$, $xx$) formal languages. Our approach develops the concept of factorized programs in Bayesian program induction in order to help manage the complexity of representation. We show that, in learning, the model predicts several phenomena empirically observed in human grammar acquisition experiments. |
Tasks | |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.06301v2 |
http://arxiv.org/pdf/1711.06301v2.pdf | |
PWC | https://paperswithcode.com/paper/one-model-for-the-learning-of-language |
Repo | |
Framework | |
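For readers unfamiliar with the notation, the formal languages named in the abstract can be pinned down with hand-written membership checks. The sketch below only defines the target languages; it says nothing about how the paper's model learns them, and the function names are hypothetical. The empty string (n = 0) is accepted by each check.

```python
def is_ab_n(s):
    # Regular: (ab)^n, e.g. "ababab".
    return len(s) % 2 == 0 and s == "ab" * (len(s) // 2)


def is_an_bn(s):
    # Context-free: a^n b^n, e.g. "aaabbb".
    n = len(s) // 2
    return s == "a" * n + "b" * n


def is_mirror(s):
    # Context-free: x x^R, a string followed by its own reversal.
    return len(s) % 2 == 0 and s == s[::-1]


def is_an_bn_cn(s):
    # Context-sensitive: a^n b^n c^n, e.g. "aabbcc".
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def is_copy(s):
    # Context-sensitive: xx, a string followed by an exact copy of itself.
    half = len(s) // 2
    return len(s) % 2 == 0 and s[:half] == s[half:]


for name, check, example in [
    ("(ab)^n", is_ab_n, "ababab"),
    ("a^n b^n", is_an_bn, "aaabbb"),
    ("x x^R", is_mirror, "abba"),
    ("a^n b^n c^n", is_an_bn_cn, "aabbcc"),
    ("xx", is_copy, "abab"),
]:
    print(name, example, check(example))
```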
A Systematic Review of Hindi Prosody
Title | A Systematic Review of Hindi Prosody |
Authors | Somnath Roy |
Abstract | Prosody describes both form and function of a sentence using the suprasegmental features of speech. Prosody phenomena are explored in the domain of higher phonological constituents such as word, phonological phrase and intonational phrase. The study of prosody at the word level is called word prosody and above word level is called sentence prosody. Word prosody describes the stress pattern by comparing the prosodic features of its constituent syllables. Sentence prosody involves the study of the phrasing pattern and intonational pattern of a language. The aim of this study is to summarize the existing works on Hindi prosody carried out in different domains of language and speech processing. The review is presented in a systematic fashion so that it could be a useful resource for one who wants to build on the existing works. |
Tasks | |
Published | 2017-05-09 |
URL | http://arxiv.org/abs/1705.03247v1 |
http://arxiv.org/pdf/1705.03247v1.pdf | |
PWC | https://paperswithcode.com/paper/a-systematic-review-of-hindi-prosody |
Repo | |
Framework | |
FlashProfile: A Framework for Synthesizing Data Profiles
Title | FlashProfile: A Framework for Synthesizing Data Profiles |
Authors | Saswat Padhi, Prateek Jain, Daniel Perelman, Oleksandr Polozov, Sumit Gulwani, Todd Millstein |
Abstract | We address the problem of learning a syntactic profile for a collection of strings, i.e. a set of regex-like patterns that succinctly describe the syntactic variations in the strings. Real-world datasets, typically curated from multiple sources, often contain data in various syntactic formats. Thus, any data processing task is preceded by the critical step of data format identification. However, manual inspection of data to identify the different formats is infeasible in standard big-data scenarios. Prior techniques are restricted to a small set of pre-defined patterns (e.g. digits, letters, words, etc.), and provide no control over granularity of profiles. We define syntactic profiling as a problem of clustering strings based on syntactic similarity, followed by identifying patterns that succinctly describe each cluster. We present a technique for synthesizing such profiles over a given language of patterns, that also allows for interactive refinement by requesting a desired number of clusters. Using a state-of-the-art inductive synthesis framework, PROSE, we have implemented our technique as FlashProfile. Across $153$ tasks over $75$ large real datasets, we observe a median profiling time of only $\sim 0.7$ s. Furthermore, we show that access to syntactic profiles may allow for more accurate synthesis of programs, i.e. using fewer examples, in programming-by-example (PBE) workflows such as FlashFill. |
Tasks | |
Published | 2017-09-17 |
URL | http://arxiv.org/abs/1709.05725v2 |
http://arxiv.org/pdf/1709.05725v2.pdf | |
PWC | https://paperswithcode.com/paper/flashprofile-interactive-synthesis-of |
Repo | |
Framework | |
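The notion of grouping strings by syntactic format can be sketched with a very coarse profiler. The code below is not FlashProfile or PROSE: it uses a small fixed set of token classes, which is closer to the pre-defined-pattern baselines the abstract criticizes, and it offers no control over the number of clusters. The class names and sample strings are assumptions for illustration.

```python
import re
from collections import defaultdict

# A small, fixed set of token classes (an assumption for this sketch).
TOKEN_CLASSES = [
    ("Digit+", re.compile(r"[0-9]+")),
    ("Upper+", re.compile(r"[A-Z]+")),
    ("Lower+", re.compile(r"[a-z]+")),
    ("Space+", re.compile(r"\s+")),
]


def coarse_pattern(s):
    """Map a string to a sequence of token-class names and quoted literals."""
    parts = []
    i = 0
    while i < len(s):
        for name, rx in TOKEN_CLASSES:
            m = rx.match(s, i)
            if m:
                parts.append(name)
                i = m.end()
                break
        else:
            # No class matched at position i: keep the character as a literal.
            parts.append(repr(s[i]))
            i += 1
    return " ".join(parts)


def profile(strings):
    """Cluster strings that share the same coarse pattern."""
    clusters = defaultdict(list)
    for s in strings:
        clusters[coarse_pattern(s)].append(s)
    return dict(clusters)


samples = ["2017-09-17", "2017-10-27", "Sep 17", "Oct 27", "N/A"]
for pattern, members in profile(samples).items():
    print(pattern, "<-", members)
```

Even this crude grouping separates the ISO-style dates, the month-day strings, and the missing-value marker into distinct clusters, which is the kind of format identification step the abstract describes as preceding any data processing task.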
Revisit Fuzzy Neural Network: Demystifying Batch Normalization and ReLU with Generalized Hamming Network
Title | Revisit Fuzzy Neural Network: Demystifying Batch Normalization and ReLU with Generalized Hamming Network |
Authors | Lixin Fan |
Abstract | We revisit fuzzy neural network with a cornerstone notion of generalized hamming distance, which provides a novel and theoretically justified framework to re-interpret many useful neural network techniques in terms of fuzzy logic. In particular, we conjecture and empirically illustrate that, the celebrated batch normalization (BN) technique actually adapts the normalized bias such that it approximates the rightful bias induced by the generalized hamming distance. Once the due bias is enforced analytically, neither the optimization of bias terms nor the sophisticated batch normalization is needed. Also in the light of generalized hamming distance, the popular rectified linear units (ReLU) can be treated as setting a minimal hamming distance threshold between network inputs and weights. This thresholding scheme, on the one hand, can be improved by introducing double thresholding on both extremes of neuron outputs. On the other hand, ReLUs turn out to be non-essential and can be removed from networks trained for simple tasks like MNIST classification. The proposed generalized hamming network (GHN) as such not only lends itself to rigorous analysis and interpretation within the fuzzy logic theory but also demonstrates fast learning speed, well-controlled behaviour and state-of-the-art performances on a variety of learning tasks. |
Tasks | |
Published | 2017-10-27 |
URL | http://arxiv.org/abs/1710.10328v1 |
http://arxiv.org/pdf/1710.10328v1.pdf | |
PWC | https://paperswithcode.com/paper/revisit-fuzzy-neural-network-demystifying |
Repo | |
Framework | |
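The link between a Hamming-style distance and a bias term can be checked numerically. The sketch below assumes the generalized Hamming distance takes the form h(a, b) = a + b - 2ab, which reduces to XOR on {0, 1}; that form, and the function names, are assumptions here and should be checked against the paper. Summing h over all input-weight pairs gives a dot product plus a term that depends only on the sums of the inputs and weights, which plays the role of a bias.

```python
def ghd(a, b):
    # Assumed form of the generalized Hamming distance.
    # On binary inputs it coincides with XOR.
    return a + b - 2 * a * b


def total_ghd(x, w):
    """Sum of elementwise generalized Hamming distances."""
    return sum(ghd(xi, wi) for xi, wi in zip(x, w))


def dot_with_induced_bias(x, w):
    """The same quantity rewritten as a dot product plus a bias-like term."""
    dot = sum(xi * wi for xi, wi in zip(x, w))
    return sum(x) + sum(w) - 2 * dot


x = [0.2, 0.9, 0.5]   # hypothetical neuron inputs
w = [1.0, -0.3, 0.4]  # hypothetical neuron weights

print(total_ghd(x, w), dot_with_induced_bias(x, w))
```

Under this assumption the bias is fixed analytically by the inputs and weights rather than learned, which is consistent with the abstract's claim that neither bias optimization nor batch normalization is needed once the due bias is enforced.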