Paper Group ANR 182
Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation. Question Answering for Privacy Policies: Combining Computational and Legal Perspectives. Hybrid Block Successive Approximation for One-Sided Non-Convex Min-Max Problems: Algorithms and Applications. GreyReID: A Two-stream Deep Framework with RGB-grey Information f …
Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation
Title | Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation |
Authors | Yu Chen, Lingfei Wu, Mohammed J. Zaki |
Abstract | Natural question generation (QG) aims to generate questions from a passage and an answer. Previous works on QG either (i) ignore the rich structure information hidden in text, (ii) solely rely on cross-entropy loss that leads to issues like exposure bias and inconsistency between train/test measurement, or (iii) fail to fully exploit the answer information. To address these limitations, in this paper, we propose a reinforcement learning (RL) based graph-to-sequence (Graph2Seq) model for QG. Our model consists of a Graph2Seq generator with a novel Bidirectional Gated Graph Neural Network based encoder to embed the passage, and a hybrid evaluator with a mixed objective combining both cross-entropy and RL losses to ensure the generation of syntactically and semantically valid text. We also introduce an effective Deep Alignment Network for incorporating the answer information into the passage at both the word and contextual levels. Our model is end-to-end trainable and achieves new state-of-the-art scores, outperforming existing methods by a significant margin on the standard SQuAD benchmark. |
Tasks | Graph-to-Sequence, Question Generation |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.04942v3 |
https://arxiv.org/pdf/1908.04942v3.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-based-graph-to |
Repo | |
Framework | |
Question Answering for Privacy Policies: Combining Computational and Legal Perspectives
Title | Question Answering for Privacy Policies: Combining Computational and Legal Perspectives |
Authors | Abhilasha Ravichander, Alan W Black, Shomir Wilson, Thomas Norton, Norman Sadeh |
Abstract | Privacy policies are long and complex documents that are difficult for users to read and understand, and yet, they have legal effects on how user data is collected, managed and used. Ideally, we would like to empower users to inform themselves about issues that matter to them, and enable them to selectively explore those issues. We present PrivacyQA, a corpus consisting of 1750 questions about the privacy policies of mobile applications, and over 3500 expert annotations of relevant answers. We observe that a strong neural baseline underperforms human performance by almost 0.3 F1 on PrivacyQA, suggesting considerable room for improvement for future systems. Further, we use this dataset to shed light on challenges to question answerability, with domain-general implications for any question answering system. The PrivacyQA corpus offers a challenging corpus for question answering, with genuine real-world utility. |
Tasks | Question Answering |
Published | 2019-11-03 |
URL | https://arxiv.org/abs/1911.00841v1 |
https://arxiv.org/pdf/1911.00841v1.pdf | |
PWC | https://paperswithcode.com/paper/question-answering-for-privacy-policies-1 |
Repo | |
Framework | |
Hybrid Block Successive Approximation for One-Sided Non-Convex Min-Max Problems: Algorithms and Applications
Title | Hybrid Block Successive Approximation for One-Sided Non-Convex Min-Max Problems: Algorithms and Applications |
Authors | Songtao Lu, Ioannis Tsaknakis, Mingyi Hong, Yongxin Chen |
Abstract | The min-max problem, also known as the saddle point problem, is a class of optimization problems in which we minimize and maximize two subsets of variables simultaneously. This class of problems can be used to formulate a wide range of signal processing and communication (SPCOM) problems. Despite its popularity, existing theory for this class has been mainly developed for problems with certain special convex-concave structure. Therefore, it cannot be used to guide the algorithm design for many interesting problems in SPCOM, where some kind of non-convexity often arises. In this work, we consider a general block-wise one-sided non-convex min-max problem, in which the minimization problem consists of multiple blocks and is non-convex, while the maximization problem is (strongly) concave. We propose a class of simple algorithms named Hybrid Block Successive Approximation (HiBSA), which alternatingly performs gradient descent-type steps for the minimization blocks and one gradient ascent-type step for the maximization problem. A key element in the proposed algorithm is the introduction of certain properly designed regularization and penalty terms, which are used to stabilize the algorithm and ensure convergence. For the first time, we show that such simple alternating min-max algorithms converge to first-order stationary solutions, with quantifiable global rates. To validate the efficiency of the proposed algorithms, we conduct numerical tests on a number of information processing and wireless communication problems, including the robust learning problem, the non-convex min-utility maximization problems, and certain wireless jamming problem arising in interfering channels. |
Tasks | |
Published | 2019-02-21 |
URL | http://arxiv.org/abs/1902.08294v1 |
http://arxiv.org/pdf/1902.08294v1.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-block-successive-approximation-for-one |
Repo | |
Framework | |
GreyReID: A Two-stream Deep Framework with RGB-grey Information for Person Re-identification
Title | GreyReID: A Two-stream Deep Framework with RGB-grey Information for Person Re-identification |
Authors | Lei Qi, Lei Wang, Jing Huo, Yinghuan Shi, Yang Gao |
Abstract | In this paper, we observe that most false positive images (i.e., different identities with query images) in the top ranking list usually have the similar color information with the query image in person re-identification (Re-ID). Meanwhile, when we use the greyscale images generated from RGB images to conduct the person Re-ID task, some hard query images can obtain better performance compared with using RGB images. Therefore, RGB and greyscale images seem to be complementary to each other for person Re-ID. In this paper, we aim to utilize both RGB and greyscale images to improve the person Re-ID performance. To this end, we propose a novel two-stream deep neural network with RGB-grey information, which can effectively fuse RGB and greyscale feature representations to enhance the generalization ability of Re-ID. Firstly, we convert RGB images to greyscale images in each training batch. Based on these RGB and greyscale images, we train the RGB and greyscale branches, respectively. Secondly, to build up connections between RGB and greyscale branches, we merge the RGB and greyscale branches into a new joint branch. Finally, we concatenate the features of all three branches as the final feature representation for Re-ID. Moreover, in the training process, we adopt the joint learning scheme to simultaneously train each branch by the independent loss function, which can enhance the generalization ability of each branch. Besides, a global loss function is utilized to further fine-tune the final concatenated feature. The extensive experiments on multiple benchmark datasets fully show that the proposed method can outperform the state-of-the-art person Re-ID methods. Furthermore, using greyscale images can indeed improve the person Re-ID performance. |
Tasks | Person Re-Identification |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05142v1 |
https://arxiv.org/pdf/1908.05142v1.pdf | |
PWC | https://paperswithcode.com/paper/greyreid-a-two-stream-deep-framework-with-rgb |
Repo | |
Framework | |
Trainable Time Warping: Aligning Time-Series in the Continuous-Time Domain
Title | Trainable Time Warping: Aligning Time-Series in the Continuous-Time Domain |
Authors | Soheil Khorram, Melvin G McInnis, Emily Mower Provost |
Abstract | DTW calculates the similarity or alignment between two signals, subject to temporal warping. However, its computational complexity grows exponentially with the number of time-series. Although there have been algorithms developed that are linear in the number of time-series, they are generally quadratic in time-series length. The exception is generalized time warping (GTW), which has linear computational cost. Yet, it can only identify simple time warping functions. There is a need for a new fast, high-quality multisequence alignment algorithm. We introduce trainable time warping (TTW), whose complexity is linear in both the number and the length of time-series. TTW performs alignment in the continuous-time domain using a sinc convolutional kernel and a gradient-based optimization technique. We compare TTW and GTW on 85 UCR datasets in time-series averaging and classification. TTW outperforms GTW on 67.1% of the datasets for the averaging tasks, and 61.2% of the datasets for the classification tasks. |
Tasks | Time Series, Time Series Averaging |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.09245v1 |
http://arxiv.org/pdf/1903.09245v1.pdf | |
PWC | https://paperswithcode.com/paper/trainable-time-warping-aligning-time-series |
Repo | |
Framework | |
Learning to Reason in Large Theories without Imitation
Title | Learning to Reason in Large Theories without Imitation |
Authors | Kshitij Bansal, Sarah M. Loos, Markus N. Rabe, Christian Szegedy |
Abstract | Automated theorem proving in large theories can be learned via reinforcement learning over an indefinitely growing action space. In order to select actions, one performs nearest neighbor lookups in the knowledge base to find premises to be applied. Here we address the exploration for reinforcement learning in this space. Approaches (like epsilon-greedy strategy) that sample actions uniformly do not scale to this scenario as most actions lead to dead ends and unsuccessful proofs which are not useful for training our models. In this paper, we compare approaches that select premises using randomly initialized similarity measures and mixing them with the proposals of the learned model. We evaluate these on the HOList benchmark for tactics based higher order theorem proving. We implement an automated theorem prover named DeepHOL-Zero that does not use any of the human proofs and show that our improved exploration method manages to expand the training set continuously. DeepHOL-Zero outperforms the best theorem prover trained by imitation learning alone. |
Tasks | Automated Theorem Proving, Imitation Learning |
Published | 2019-05-25 |
URL | https://arxiv.org/abs/1905.10501v2 |
https://arxiv.org/pdf/1905.10501v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-reason-in-large-theories-without |
Repo | |
Framework | |
Complexity-Scalable Neural Network Based MIMO Detection With Learnable Weight Scaling
Title | Complexity-Scalable Neural Network Based MIMO Detection With Learnable Weight Scaling |
Authors | Abdullahi Mohammad, Christos Masouros, Yiannis Andreopoulos |
Abstract | This paper introduces a framework for systematic complexity scaling of deep neural network (DNN) based MIMO detectors. The model uses a fraction of the DNN inputs by scaling their values through weights that follow monotonically non-increasing functions. This allows for weight scaling across and within the different DNN layers in order to achieve scalable complexity-accuracy results. To reduce complexity further, we introduce a regularization constraint on the layer weights such that, at inference, parts (or the entirety) of network layers can be removed with minimal impact on the detection accuracy. We also introduce trainable weight-scaling functions for increased robustness to changes in the activation patterns and a further improvement in the detection accuracy at the same inference complexity. Numerical results show that our approach is 10 and 100-fold less complex than classical approaches based on semi-definite relaxation and ML detection, respectively. |
Tasks | |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.06943v1 |
https://arxiv.org/pdf/1909.06943v1.pdf | |
PWC | https://paperswithcode.com/paper/complexity-scalable-neural-network-based-mimo |
Repo | |
Framework | |
Learning Deformable Point Set Registration with Regularized Dynamic Graph CNNs for Large Lung Motion in COPD Patients
Title | Learning Deformable Point Set Registration with Regularized Dynamic Graph CNNs for Large Lung Motion in COPD Patients |
Authors | Lasse Hansen, Doris Dittmer, Mattias P. Heinrich |
Abstract | Deformable registration continues to be one of the key challenges in medical image analysis. While iconic registration methods have started to benefit from the recent advances in medical deep learning, the same does not yet apply for the registration of point sets, e.g. registration based on surfaces, keypoints or landmarks. This is mainly due to the restriction of the convolution operator in modern CNNs to densely gridded input. However, with the newly developed methods from the field of geometric deep learning suitable tools are now emerging, which enable powerful analysis of medical data on irregular domains. In this work, we present a new method that enables the learning of regularized feature descriptors with dynamic graph CNNs. By incorporating the learned geometric features as prior probabilities into the well-established coherent point drift (CPD) algorithm, formulated as differentiable network layer, we establish an end-to-end framework for robust registration of two point sets. Our approach is evaluated on the challenging task of aligning keypoints extracted from lung CT scans in inhale and exhale states with large deformations and without any additional intensity information. Our results indicate that the inherent geometric structure of the extracted keypoints is sufficient to establish descriptive point features, which yield a significantly improved performance and robustness of our registration framework. |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07818v1 |
https://arxiv.org/pdf/1909.07818v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-deformable-point-set-registration |
Repo | |
Framework | |
A linear bound on the k-rendezvous time for primitive sets of NZ matrices
Title | A linear bound on the k-rendezvous time for primitive sets of NZ matrices |
Authors | Costanza Catalano, Umer Azfar, Ludovic Charlier, Raphaël Jungers |
Abstract | A set of nonnegative matrices is called primitive if there exists a product of these matrices that is entrywise positive. Motivated by recent results relating synchronizing automata and primitive sets, we study the length of the shortest product of a primitive set having a column or a row with k positive entries (the k-RT). We prove that this value is at most linear w.r.t. the matrix size n for small k, while the problem is still open for synchronizing automata. We provide two upper bounds on the k-RT: the second is an improvement of the first one, although the latter can be written in closed form. We then report numerical results comparing our upper bounds on the k-RT with heuristic approximation methods. |
Tasks | |
Published | 2019-03-25 |
URL | https://arxiv.org/abs/1903.10421v2 |
https://arxiv.org/pdf/1903.10421v2.pdf | |
PWC | https://paperswithcode.com/paper/a-linear-bound-on-the-k-rendezvous-time-for |
Repo | |
Framework | |
From What to How: An Initial Review of Publicly Available AI Ethics Tools, Methods and Research to Translate Principles into Practices
Title | From What to How: An Initial Review of Publicly Available AI Ethics Tools, Methods and Research to Translate Principles into Practices |
Authors | Jessica Morley, Luciano Floridi, Libby Kinsey, Anat Elhalal |
Abstract | The debate about the ethical implications of Artificial Intelligence dates from the 1960s. However, in recent years symbolic AI has been complemented and sometimes replaced by Neural Networks and Machine Learning techniques. This has vastly increased its potential utility and impact on society, with the consequence that the ethical debate has gone mainstream. Such debate has primarily focused on principles - the what of AI ethics - rather than on practices, the how. Awareness of the potential issues is increasing at a fast rate, but the AI community’s ability to take action to mitigate the associated risks is still at its infancy. Therefore, our intention in presenting this research is to contribute to closing the gap between principles and practices by constructing a typology that may help practically-minded developers apply ethics at each stage of the pipeline, and to signal to researchers where further work is needed. The focus is exclusively on Machine Learning, but it is hoped that the results of this research may be easily applicable to other branches of AI. The article outlines the research method for creating this typology, the initial findings, and provides a summary of future research needs. |
Tasks | |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06876v2 |
https://arxiv.org/pdf/1905.06876v2.pdf | |
PWC | https://paperswithcode.com/paper/from-what-to-how-an-overview-of-ai-ethics |
Repo | |
Framework | |
Adaptive Kernel Learning in Heterogeneous Networks
Title | Adaptive Kernel Learning in Heterogeneous Networks |
Authors | Hrusikesha Pradhan, Amrit Singh Bedi, Alec Koppel, Ketan Rajawat |
Abstract | We consider the framework of learning over decentralized networks, where nodes observe unique, possibly correlated, observation streams. We focus on the case where agents learn a regression \emph{function} that belongs to a reproducing kernel Hilbert space (RKHS). In this setting, a decentralized network aims to learn nonlinear statistical models that are optimal in terms of a global stochastic convex functional that aggregates data across the network, with only access to a local data stream. We incentivize coordination while respecting network heterogeneity through the introduction of nonlinear proximity constraints. To solve it, we propose applying a functional variant of stochastic primal-dual (Arrow-Hurwicz) method which yields a decentralized algorithm. To handle the fact that the RKHS parameterization has complexity proportionate with the iteration index, we project the primal iterates onto Hilbert subspaces that are greedily constructed from the observation sequence of each node. The resulting proximal stochastic variant of Arrow-Hurwicz, dubbed Heterogeneous Adaptive Learning with Kernels (HALK), is shown to converge in expectation, both in terms of primal sub-optimality and constraint violation to a neighborhood that depends on a given constant step-size selection. Simulations on a correlated spatio-temporal random field estimation problem validate our theoretical results, which are born out in practice for networked oceanic sensing buoys estimating temperature and salinity from depth measurements. |
Tasks | |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.00510v1 |
https://arxiv.org/pdf/1908.00510v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-kernel-learning-in-heterogeneous |
Repo | |
Framework | |
An Optimized Recurrent Unit for Ultra-Low-Power Keyword Spotting
Title | An Optimized Recurrent Unit for Ultra-Low-Power Keyword Spotting |
Authors | Justice Amoh, Kofi Odame |
Abstract | There is growing interest in being able to run neural networks on sensors, wearables and internet-of-things (IoT) devices. However, the computational demands of neural networks make them difficult to deploy on resource-constrained edge devices. To meet this need, our work introduces a new recurrent unit architecture that is specifically adapted for on-device low power acoustic event detection (AED). The proposed architecture is based on the gated recurrent unit (`GRU’) but features optimizations that make it implementable on ultra-low power micro-controllers such as the Arm Cortex M0+. Our new architecture, the Embedded Gated Recurrent Unit (eGRU) is demonstrated to be highly efficient and suitable for short-duration AED and keyword spotting tasks. A single eGRU cell is 60x faster and 10x smaller than a GRU cell. Despite its optimizations, eGRU compares well with GRU across tasks of varying complexities. The practicality of eGRU is investigated in a wearable acoustic event detection application. An eGRU model is implemented and tested on the Arm Cortex M0-based Atmel ATSAMD21E18 processor. The Arm M0+ implementation of the eGRU model compares favorably with a full precision GRU that is running on a workstation. The embedded eGRU model achieves a classification accuracy 95.3%, which is only 2% less than the full precision GRU. | |
Tasks | Keyword Spotting |
Published | 2019-02-13 |
URL | http://arxiv.org/abs/1902.05026v1 |
http://arxiv.org/pdf/1902.05026v1.pdf | |
PWC | https://paperswithcode.com/paper/an-optimized-recurrent-unit-for-ultra-low |
Repo | |
Framework | |
Answering Comparative Questions: Better than Ten-Blue-Links?
Title | Answering Comparative Questions: Better than Ten-Blue-Links? |
Authors | Matthias Schildwächter, Alexander Bondarenko, Julian Zenker, Matthias Hagen, Chris Biemann, Alexander Panchenko |
Abstract | We present CAM (comparative argumentative machine), a novel open-domain IR system to argumentatively compare objects with respect to information extracted from the Common Crawl. In a user study, the participants obtained 15% more accurate answers using CAM compared to a “traditional” keyword-based search and were 20% faster in finding the answer to comparative questions. |
Tasks | |
Published | 2019-01-15 |
URL | http://arxiv.org/abs/1901.05041v1 |
http://arxiv.org/pdf/1901.05041v1.pdf | |
PWC | https://paperswithcode.com/paper/answering-comparative-questions-better-than |
Repo | |
Framework | |
First steps to a constructor theory of cognition
Title | First steps to a constructor theory of cognition |
Authors | Riccardo Franco |
Abstract | This article applies the conceptual framework of constructor theory of information to cognition theory. The main result of this work is that cognition theory, in specific situations concerning for example the conjunction fallacy heuristic, requires the use of superinformation media, just as quantum theory. This result entails that quantum and cognition theories can be considered as elements of a general class of superinformation-based subsidiary theories. |
Tasks | |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1904.09829v1 |
http://arxiv.org/pdf/1904.09829v1.pdf | |
PWC | https://paperswithcode.com/paper/190409829 |
Repo | |
Framework | |
Towards Testing Monotonicity of Distributions Over General Posets
Title | Towards Testing Monotonicity of Distributions Over General Posets |
Authors | Maryam Aliakbarpour, Themis Gouleakis, John Peebles, Ronitt Rubinfeld, Anak Yodpinyanee |
Abstract | In this work, we consider the sample complexity required for testing the monotonicity of distributions over partial orders. A distribution $p$ over a poset is monotone if, for any pair of domain elements $x$ and $y$ such that $x \preceq y$, $p(x) \leq p(y)$. To understand the sample complexity of this problem, we introduce a new property called bigness over a finite domain, where the distribution is $T$-big if the minimum probability for any domain element is at least $T$. We establish a lower bound of $\Omega(n/\log n)$ for testing bigness of distributions on domains of size $n$. We then build on these lower bounds to give $\Omega(n/\log{n})$ lower bounds for testing monotonicity over a matching poset of size $n$ and significantly improved lower bounds over the hypercube poset. We give sublinear sample complexity bounds for testing bigness and for testing monotonicity over the matching poset. We then give a number of tools for analyzing upper bounds on the sample complexity of the monotonicity testing problem. |
Tasks | |
Published | 2019-07-06 |
URL | https://arxiv.org/abs/1907.03182v1 |
https://arxiv.org/pdf/1907.03182v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-testing-monotonicity-of-distributions |
Repo | |
Framework | |