Paper Group ANR 839
Performance Analysis and Dynamic Evolution of Deep Convolutional Neural Network for Nonlinear Inverse Scattering
Title | Performance Analysis and Dynamic Evolution of Deep Convolutional Neural Network for Nonlinear Inverse Scattering |
Authors | Lianlin Li, Long Gang Wang, Fernando L. Teixeira |
Abstract | The solution of nonlinear electromagnetic (EM) inverse scattering problems is typically hindered by several challenges such as ill-posedness, strong nonlinearity, and high computational costs. Recently, deep learning has been demonstrated to be a promising tool for addressing these challenges. In particular, it is possible to establish a connection between a deep convolutional neural network (CNN) and iterative solution methods of nonlinear EM inverse scattering. This has led to the development of an efficient CNN-based solution to nonlinear EM inverse problems, termed DeepNIS. It has been shown that DeepNIS can outperform conventional nonlinear inverse scattering methods in terms of both image quality and computational time. In this work, we quantitatively evaluate the performance of DeepNIS as a function of the number of layers using the structural similarity (SSIM) and mean-squared error (MSE) metrics. In addition, we probe the dynamic evolution behavior of DeepNIS by examining its near-isometry property. It is shown that after a proper training stage the proposed CNN is near-optimal in terms of stability and generalization ability. |
Tasks | |
Published | 2019-01-09 |
URL | http://arxiv.org/abs/1901.02610v1
PDF | http://arxiv.org/pdf/1901.02610v1.pdf
PWC | https://paperswithcode.com/paper/performance-analysis-and-dynamic-evolution-of |
Repo | |
Framework | |
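The two metrics named in the abstract are standard image-quality measures. Below is a minimal numpy sketch of MSE and a single-window SSIM; note that the usual SSIM is computed over sliding windows, so the global version here is a simplification for illustration, and the arrays are placeholders.

```python
import numpy as np

def mse(x, y):
    """Mean-squared error between a reconstruction x and ground truth y."""
    return float(np.mean((x - y) ** 2))

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over the whole image (a simplification of the
    usual sliding-window SSIM, sufficient to illustrate the metric)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

# Score a reconstruction against ground truth (placeholder arrays here).
truth = np.random.rand(64, 64)
recon = truth + 0.05 * np.random.randn(64, 64)
print(mse(recon, truth), global_ssim(recon, truth))
```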
An Argument-Marker Model for Syntax-Agnostic Proto-Role Labeling
Title | An Argument-Marker Model for Syntax-Agnostic Proto-Role Labeling |
Authors | Juri Opitz, Anette Frank |
Abstract | Semantic proto-role labeling (SPRL) is an alternative to semantic role labeling (SRL) that moves beyond a categorical definition of roles, following Dowty’s feature-based view of proto-roles. This theory determines agenthood vs. patienthood based on a participant’s instantiation of more or less typical agent vs. patient properties, such as volition in an event. To perform SPRL, we develop an ensemble of hierarchical models with self-attention and concurrently learned predicate-argument markers. Our method is competitive with the state-of-the-art, overall outperforming previous work in two formulations of the task (multi-label and multi-variate Likert scale prediction). In contrast to previous work, our results do not depend on gold argument heads derived from supplementary gold treebanks. |
Tasks | Semantic Role Labeling |
Published | 2019-02-04 |
URL | http://arxiv.org/abs/1902.01349v2
PDF | http://arxiv.org/pdf/1902.01349v2.pdf
PWC | https://paperswithcode.com/paper/an-argument-marker-model-for-syntax-agnostic |
Repo | |
Framework | |
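In the multi-label formulation of SPRL, each proto-role property (volition, sentience, and so on) is scored independently for a given predicate-argument pair. A minimal PyTorch sketch of such a multi-label head follows; the encoder producing the argument representation, the hidden size, and the number of properties are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ProtoRoleHead(nn.Module):
    """Multi-label proto-role property classifier (sketch): one sigmoid
    output per property, e.g. volition, sentience, change-of-state."""
    def __init__(self, hidden_dim, num_properties):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, num_properties)

    def forward(self, arg_repr):
        # arg_repr: (batch, hidden_dim) encoding of a predicate-argument pair
        return torch.sigmoid(self.scorer(arg_repr))

head = ProtoRoleHead(hidden_dim=256, num_properties=18)
probs = head(torch.randn(4, 256))   # (4, 18) per-property probabilities
```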
Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement
Title | Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement |
Authors | Chao Yang, Xiaojian Ma, Wenbing Huang, Fuchun Sun, Huaping Liu, Junzhou Huang, Chuang Gan |
Abstract | This paper studies Learning from Observations (LfO) for imitation learning with access to state-only demonstrations. In contrast to Learning from Demonstration (LfD), which involves both action and state supervision, LfO is more practical in leveraging previously inapplicable resources (e.g., videos), yet more challenging due to the incomplete expert guidance. In this paper, we investigate LfO and its difference from LfD from both theoretical and practical perspectives. We first prove that the gap between LfD and LfO actually lies in the disagreement of inverse dynamics models between the imitator and the expert, when following the modeling approach of GAIL. More importantly, the upper bound of this gap is revealed by a negative causal entropy, which can be minimized in a model-free way. We term our method Inverse-Dynamics-Disagreement-Minimization (IDDM); it enhances the conventional LfO method by further bridging the gap to LfD. Extensive empirical results on challenging benchmarks indicate that our method attains consistent improvements over other LfO counterparts. |
Tasks | Imitation Learning |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04417v4
PDF | https://arxiv.org/pdf/1910.04417v4.pdf
PWC | https://paperswithcode.com/paper/imitation-learning-from-observations-by |
Repo | |
Framework | |
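The gap statement in the abstract has a clean reading via the chain rule of the KL divergence over occupancy measures (the notation here is ours, not necessarily the paper's): matching state-action-next-state occupancies (LfD) decomposes into matching state-transition occupancies (LfO) plus an inverse-dynamics disagreement term,

$$
D_{\mathrm{KL}}\big(\rho_{\pi}(s,a,s') \,\|\, \rho_{E}(s,a,s')\big)
= D_{\mathrm{KL}}\big(\rho_{\pi}(s,s') \,\|\, \rho_{E}(s,s')\big)
+ \mathbb{E}_{\rho_{\pi}(s,s')}\Big[ D_{\mathrm{KL}}\big(P_{\pi}(a \mid s,s') \,\|\, P_{E}(a \mid s,s')\big)\Big],
$$

so driving the second term toward zero is what bridges LfO to LfD.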
When and Why Metaheuristics Researchers Can Ignore “No Free Lunch” Theorems
Title | When and Why Metaheuristics Researchers Can Ignore “No Free Lunch” Theorems |
Authors | James McDermott |
Abstract | The No Free Lunch (NFL) theorem for search and optimisation states that averaged across all possible objective functions on a fixed search space, all search algorithms perform equally well. Several refined versions of the theorem find a similar outcome when averaging across smaller sets of functions. This paper argues that NFL results continue to be misunderstood by many researchers, and addresses this issue in several ways. Existing arguments against real-world implications of NFL results are collected and re-stated for accessibility, and new ones are added. Specific misunderstandings extant in the literature are identified, with speculation as to how they may have arisen. This paper presents an argument against a common paraphrase of NFL findings – that algorithms must be specialised to problem domains in order to do well – after problematising the usually undefined term “domain”. It provides novel concrete counter-examples illustrating cases where NFL theorems do not apply. In conclusion it offers a novel view of the real meaning of NFL, incorporating the anthropic principle and justifying the position that in many common situations researchers can ignore NFL. |
Tasks | |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.03280v1
PDF | https://arxiv.org/pdf/1906.03280v1.pdf
PWC | https://paperswithcode.com/paper/when-and-why-metaheuristics-researchers-can |
Repo | |
Framework | |
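The averaging claim in the NFL theorem can be checked exhaustively on a toy search space. The sketch below enumerates all eight boolean objective functions on three points and shows that two deterministic search orders produce identical averaged best-so-far traces (the search space and algorithms are illustrative choices, not from the paper).

```python
from itertools import product

points = [0, 1, 2]                          # a tiny search space
orders = {"A": [0, 1, 2], "B": [2, 1, 0]}   # two deterministic searchers

# Average the best-so-far trace of each algorithm over ALL objective
# functions f: points -> {0, 1} (maximisation, no resampling).
fns = list(product([0, 1], repeat=len(points)))
for name, order in orders.items():
    totals = [0.0] * len(points)
    for f in fns:
        best = 0
        for t, x in enumerate(order):
            best = max(best, f[x])
            totals[t] += best
    print(name, [tot / len(fns) for tot in totals])
# Both print [0.5, 0.75, 0.875]: identical averaged performance, as NFL predicts.
```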
PoD: Positional Dependency-Based Word Embedding for Aspect Term Extraction
Title | PoD: Positional Dependency-Based Word Embedding for Aspect Term Extraction |
Authors | Yichun Yin, Chenguang Wang, Ming Zhang |
Abstract | Dependency context-based word embedding jointly learns the representations of words and dependency contexts, and has proven effective in aspect term extraction. In this paper, we design the positional dependency-based word embedding (PoD), which considers both dependency context and positional context for aspect term extraction. Specifically, the positional context is modeled via relative position encoding. In addition, we enhance the dependency context by integrating more lexical information (e.g., POS tags) along dependency paths. Experiments on the SemEval 2014/2015/2016 datasets show that our approach outperforms other embedding methods in aspect term extraction. The source code will be made publicly available soon. |
Tasks | |
Published | 2019-11-09 |
URL | https://arxiv.org/abs/1911.03785v1
PDF | https://arxiv.org/pdf/1911.03785v1.pdf
PWC | https://paperswithcode.com/paper/pod-positional-dependency-based-word |
Repo | |
Framework | |
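One simple way to realize a positional context of the kind described is to weight each context word's embedding by its relative distance to the candidate aspect word. The sketch below uses a linear decay; the weighting scheme, window size, and dimensions are illustrative assumptions, not the paper's exact encoding.

```python
import numpy as np

def positional_context(embeddings, center, max_dist=5):
    """Sum context word embeddings weighted by relative distance to the
    candidate aspect word at index `center` (illustrative scheme only)."""
    out = np.zeros_like(embeddings[0])
    for i, e in enumerate(embeddings):
        d = abs(i - center)
        if 0 < d <= max_dist:
            out += (1.0 - d / (max_dist + 1)) * e   # closer words weigh more
    return out

sent = np.random.randn(8, 50)            # 8 tokens, 50-dim embeddings
ctx = positional_context(sent, center=3)  # positional context for token 3
```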
Equilibrated Recurrent Neural Network: Neuronal Time-Delayed Self-Feedback Improves Accuracy and Stability
Title | Equilibrated Recurrent Neural Network: Neuronal Time-Delayed Self-Feedback Improves Accuracy and Stability |
Authors | Ziming Zhang, Anil Kag, Alan Sullivan, Venkatesh Saligrama |
Abstract | We propose a novel Equilibrated Recurrent Neural Network (ERNN) to combat the issues of inaccuracy and instability in conventional RNNs. Drawing upon the concept of autapse in neuroscience, we propose augmenting an RNN with a time-delayed self-feedback loop. Our sole purpose is to modify the dynamics of each internal RNN state and, at any time, enforce it to evolve close to the equilibrium point associated with the input signal at that time. We show that such self-feedback helps stabilize the hidden state transitions, leading to fast convergence during training while efficiently learning discriminative latent features that yield state-of-the-art results on several benchmark datasets at test time. We propose a novel inexact Newton method to solve the fixed-point conditions given model parameters for generating the latent features at each hidden state. We prove that our inexact Newton method converges locally at a linear rate (under mild conditions). We leverage this result for efficient training of ERNNs based on backpropagation. |
Tasks | |
Published | 2019-03-02 |
URL | http://arxiv.org/abs/1903.00755v1
PDF | http://arxiv.org/pdf/1903.00755v1.pdf
PWC | https://paperswithcode.com/paper/equilibrated-recurrent-neural-network |
Repo | |
Framework | |
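The self-feedback idea amounts to solving a fixed-point condition for each hidden state. A minimal numpy sketch follows, assuming an equilibrium of the form h* = tanh(Wx + Uh* + b) and using damped fixed-point iteration in place of the paper's inexact Newton solver.

```python
import numpy as np

def equilibrated_step(x, h_prev, W, U, b, n_iter=20):
    """Drive the hidden state toward the equilibrium h* = tanh(W x + U h* + b)
    by damped fixed-point iteration (the paper solves this fixed-point
    condition with an inexact Newton method; plain iteration shown here)."""
    h = h_prev
    for _ in range(n_iter):
        h = 0.5 * h + 0.5 * np.tanh(W @ x + U @ h + b)
    return h

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))
U = 0.1 * rng.normal(size=(16, 16))   # small recurrence keeps iteration stable
b = np.zeros(16)
h = equilibrated_step(rng.normal(size=8), np.zeros(16), W, U, b)
```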
Deep Learning-Based Decoding of Constrained Sequence Codes
Title | Deep Learning-Based Decoding of Constrained Sequence Codes |
Authors | Congzhe Cao, Duanshun Li, Ivan Fair |
Abstract | Constrained sequence (CS) codes, including fixed-length CS codes and variable-length CS codes, have been widely used in modern wireless communication and data storage systems. Sequences encoded with constrained sequence codes satisfy constraints imposed by the physical channel to enable efficient and reliable transmission of coded symbols. In this paper, we propose using deep learning approaches to decode fixed-length and variable-length CS codes. Traditional encoding and decoding of fixed-length CS codes rely on look-up tables (LUTs), which are prone to errors that occur during transmission. We introduce fixed-length constrained sequence decoding based on multilayer perceptron (MLP) networks and convolutional neural networks (CNNs), and demonstrate that we are able to achieve low bit error rates that are close to maximum a posteriori probability (MAP) decoding as well as improve the system throughput. Further, implementation of capacity-achieving fixed-length codes, where the complexity is prohibitively high with LUT decoding, becomes practical with deep learning-based decoding. We then consider CNN-aided decoding of variable-length CS codes. Different from conventional decoding, where the received sequence is processed bit by bit, we propose using CNNs to perform one-shot batch processing of variable-length CS codes such that an entire batch is decoded at once, which improves the system throughput. Moreover, since the CNNs can exploit global information with batch processing instead of only making use of local information as in conventional bit-by-bit processing, the error rates can be reduced. We present simulation results that show excellent performance with both fixed-length and variable-length CS codes that are used at the frontiers of wireless communication systems. |
Tasks | |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.06172v1
PDF | https://arxiv.org/pdf/1906.06172v1.pdf
PWC | https://paperswithcode.com/paper/deep-learning-based-decoding-of-constrained |
Repo | |
Framework | |
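A minimal PyTorch sketch of the fixed-length case: an MLP that maps a block of received (noisy) coded bits to per-bit probabilities of the source message. The block lengths, layer sizes, and omitted training loop are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MLPDecoder(nn.Module):
    """Toy MLP decoder: map a block of received coded bits back to
    per-bit probabilities of the source message (sketch only)."""
    def __init__(self, code_len, msg_len, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(code_len, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, msg_len), nn.Sigmoid())

    def forward(self, received):
        return self.net(received)

decoder = MLPDecoder(code_len=12, msg_len=8)
bit_probs = decoder(torch.randn(32, 12))   # batch of 32 received blocks
```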
Feature Level Fusion from Facial Attributes for Face Recognition
Title | Feature Level Fusion from Facial Attributes for Face Recognition |
Authors | Mohammad Rasool Izadi |
Abstract | We introduce a deep convolutional neural network (CNN) architecture that classifies facial attributes and recognizes face images simultaneously via a shared learning paradigm, improving both facial attribute prediction accuracy and face recognition performance. In this method, we use facial attributes as an auxiliary source of information to assist the CNN features extracted from the face images in improving face recognition performance. Specifically, we use a shared CNN architecture that jointly predicts facial attributes and recognizes face images via shared learning parameters, and we then use the facial attribute features as an auxiliary source of information, concatenated with the face features, to increase the discrimination of the CNN for face recognition. This process assists the CNN classifier in better recognizing face images. The experimental results show that our model increases both face recognition and facial attribute prediction performance, especially for identity-related attributes such as gender and race. We evaluated our method on several standard datasets labeled with identities and face attributes, and the results show that the proposed method outperforms state-of-the-art face recognition models. |
Tasks | Face Recognition |
Published | 2019-09-28 |
URL | https://arxiv.org/abs/1909.13126v1
PDF | https://arxiv.org/pdf/1909.13126v1.pdf
PWC | https://paperswithcode.com/paper/feature-level-fusion-from-facial-attributes |
Repo | |
Framework | |
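The fusion step the abstract describes is a concatenation of face features with attribute features ahead of the identity classifier. A minimal PyTorch sketch, with the backbone and attribute predictor omitted and all dimensions chosen for illustration:

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Concatenate face features with attribute features before identity
    classification (feature-level fusion sketch; the shared backbone and
    attribute predictor that produce the two inputs are omitted)."""
    def __init__(self, face_dim, attr_dim, num_ids):
        super().__init__()
        self.classifier = nn.Linear(face_dim + attr_dim, num_ids)

    def forward(self, face_feat, attr_feat):
        fused = torch.cat([face_feat, attr_feat], dim=1)
        return self.classifier(fused)

head = FusionHead(face_dim=512, attr_dim=40, num_ids=1000)
logits = head(torch.randn(8, 512), torch.randn(8, 40))
```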
Robust subsampling-based sparse Bayesian inference to tackle four challenges (large noise, outliers, data integration, and extrapolation) in the discovery of physical laws from data
Title | Robust subsampling-based sparse Bayesian inference to tackle four challenges (large noise, outliers, data integration, and extrapolation) in the discovery of physical laws from data |
Authors | Sheng Zhang, Guang Lin |
Abstract | The derivation of physical laws is a dominant topic in scientific research. We propose a new method capable of discovering physical laws from data that tackles four challenges in previous methods: (1) large noise in the data, (2) outliers in the data, (3) integrating data collected from different experiments, and (4) extrapolating solutions to areas that have no available data. To resolve these four challenges, we seek to discover the governing differential equations and develop a model-discovering method based on sparse Bayesian inference and subsampling. The subsampling technique is used here to improve the accuracy of the Bayesian learning algorithm, whereas it is usually employed elsewhere to estimate statistics or speed up algorithms. The optimal subsampling size is moderate, neither too small nor too big. Another merit of our method is that it can work with limited data by virtue of Bayesian inference. We demonstrate how to use our method to tackle the four aforementioned challenges step by step through numerical examples: (1) a predator-prey model with noise, (2) shallow water equations with outliers, (3) heat diffusion with random initial and boundary conditions, and (4) a fish-harvesting problem with bifurcations. Numerical results show that the robustness and accuracy of our new method are significantly better than those of other model-discovering methods and traditional regression methods. |
Tasks | Bayesian Inference |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07788v2
PDF | https://arxiv.org/pdf/1907.07788v2.pdf
PWC | https://paperswithcode.com/paper/robust-data-driven-discovery-of-governing |
Repo | |
Framework | |
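A rough stand-in for the subsampling idea: fit a sparse model on many random subsamples of the data and average the coefficients. The sketch below uses thresholded least squares in place of the paper's sparse Bayesian inference, so it illustrates the subsampling mechanics rather than the actual algorithm.

```python
import numpy as np

def subsampled_sparse_fit(library, dxdt, n_rounds=50, frac=0.3, thresh=0.1):
    """Thresholded least squares on random subsamples, then average.
    `library` holds candidate terms (columns) evaluated on the data;
    `dxdt` are the observed derivatives. Illustrative stand-in for the
    paper's subsampling-based sparse Bayesian inference."""
    n = library.shape[0]
    coefs = []
    for _ in range(n_rounds):
        idx = np.random.choice(n, int(frac * n), replace=False)
        w, *_ = np.linalg.lstsq(library[idx], dxdt[idx], rcond=None)
        w[np.abs(w) < thresh] = 0.0        # sparsify small coefficients
        coefs.append(w)
    return np.mean(coefs, axis=0)
```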
Multi-Granularity Representations of Dialog
Title | Multi-Granularity Representations of Dialog |
Authors | Shikib Mehri, Maxine Eskenazi |
Abstract | Neural models of dialog rely on generalized latent representations of language. This paper introduces a novel training procedure which explicitly learns multiple representations of language at several levels of granularity. The multi-granularity training algorithm modifies the mechanism by which negative candidate responses are sampled in order to control the granularity of learned latent representations. Strong performance gains are observed on the next utterance retrieval task using both the MultiWOZ dataset and the Ubuntu dialog corpus. Analysis demonstrates that multiple granularities of representation are indeed being learned, and that multi-granularity training facilitates better transfer to downstream tasks. |
Tasks | Conversational Response Selection |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.09890v1
PDF | https://arxiv.org/pdf/1908.09890v1.pdf
PWC | https://paperswithcode.com/paper/multi-granularity-representations-of-dialog |
Repo | |
Framework | |
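The key mechanism here is how negative candidate responses are sampled. A hedged sketch of one plausible scheme: fine-grained negatives come from the same conversation (forcing the model to make fine distinctions), coarse-grained negatives from anywhere in the corpus. The paper's exact sampling procedure may differ.

```python
import random

def sample_negative(dialogs, dialog_id, granularity):
    """Sample a negative candidate response (illustrative scheme).
    `dialogs` maps a dialog id to its list of utterances."""
    if granularity == "fine":
        pool = dialogs[dialog_id]                     # same conversation: hard negatives
    else:
        other = random.choice([d for d in dialogs if d != dialog_id])
        pool = dialogs[other]                         # any other dialog: easy negatives
    return random.choice(pool)
```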
Deep k-NN Defense against Clean-label Data Poisoning Attacks
Title | Deep k-NN Defense against Clean-label Data Poisoning Attacks |
Authors | Neehar Peri, Neal Gupta, W. Ronny Huang, Liam Fowl, Chen Zhu, Soheil Feizi, Tom Goldstein, John P. Dickerson |
Abstract | Targeted clean-label data poisoning is a type of adversarial attack on machine learning systems in which an adversary injects a few correctly-labeled, minimally-perturbed samples into the training data, causing a model to misclassify a particular test sample during inference. Although defenses have been proposed for general poisoning attacks, no reliable defense for clean-label attacks has been demonstrated, despite the attacks’ effectiveness and realistic applications. In this work, we propose a simple, yet highly-effective Deep k-NN defense against both feature collision and convex polytope clean-label attacks on the CIFAR-10 dataset. We demonstrate that our proposed strategy is able to detect over 99% of poisoned examples in both attacks and remove them without compromising model performance. Additionally, through ablation studies, we discover simple guidelines for selecting the value of k as well as for implementing the Deep k-NN defense on real-world datasets with class imbalance. Our proposed defense shows that current clean-label poisoning attack strategies can be annulled, and serves as a strong yet simple-to-implement baseline defense to test future clean-label poisoning attacks. Our code is available at https://github.com/neeharperi/DeepKNNDefense |
Tasks | Adversarial Attack, data poisoning |
Published | 2019-09-29 |
URL | https://arxiv.org/abs/1909.13374v2
PDF | https://arxiv.org/pdf/1909.13374v2.pdf
PWC | https://paperswithcode.com/paper/strong-baseline-defenses-against-clean-label |
Repo | |
Framework | |
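The defense itself is compact enough to sketch directly: a training point is flagged as poisoned when its label disagrees with the majority label of its k nearest neighbours in deep feature space. A brute-force numpy version follows; the authors' released code (linked in the abstract) is the authoritative implementation.

```python
import numpy as np

def deep_knn_filter(features, labels, k=50):
    """Keep a training point only if its label matches the majority label
    of its k nearest neighbours in deep feature space.
    features: (n, d) float array; labels: (n,) non-negative int array."""
    keep = []
    for i, f in enumerate(features):
        dists = np.linalg.norm(features - f, axis=1)
        neighbours = np.argsort(dists)[1:k + 1]       # exclude the point itself
        majority = np.bincount(labels[neighbours]).argmax()
        keep.append(majority == labels[i])
    return np.array(keep)                             # True = keep for training
```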
Applying machine learning to improve simulations of a chaotic dynamical system using empirical error correction
Title | Applying machine learning to improve simulations of a chaotic dynamical system using empirical error correction |
Authors | Peter A. G. Watson |
Abstract | Dynamical weather and climate prediction models underpin many studies of the Earth system and hold the promise of being able to make robust projections of future climate change based on physical laws. However, simulations from these models still show many differences compared with observations. Machine learning has been applied to solve certain prediction problems with great success, and recently it has been proposed that this could replace the role of physically-derived dynamical weather and climate models to give better quality simulations. Here, instead, a framework using machine learning together with physically-derived models is tested, in which the machine learning component learns how to correct the errors of the latter from timestep to timestep. This maintains the physical understanding built into the models, whilst allowing performance improvements, and also requires much simpler algorithms and less training data. The framework is tested in the context of simulating the chaotic Lorenz ‘96 system, and it is shown that the approach yields models that are stable and that give both improved skill in initialised predictions and better long-term climate statistics. Improvements in long-term statistics are smaller than for single time-step tendencies, however, indicating that it would be valuable to develop methods that target improvements on longer time scales. Future strategies for the development of this approach and possible applications to making progress on important scientific problems are discussed. |
Tasks | |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.10904v1
PDF | http://arxiv.org/pdf/1904.10904v1.pdf
PWC | https://paperswithcode.com/paper/applying-machine-learning-to-improve |
Repo | |
Framework | |
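A sketch of the hybrid stepping loop for Lorenz '96, whose tendency is dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F: step the (imperfect) physical model, then add a learned per-timestep correction. The forward-Euler integrator and the correction interface are simplifications for illustration.

```python
import numpy as np

def lorenz96_tendency(x, forcing=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F (periodic indices)."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def hybrid_step(x, correction_model, dt=0.01):
    """One forward-Euler step of the physical model plus a learned
    per-timestep error correction; `correction_model` is any fitted
    regressor mapping a state to its expected model error."""
    x_phys = x + dt * lorenz96_tendency(x)
    return x_phys + correction_model(x_phys)

# With no learned correction, this reduces to the pure physical model:
state = hybrid_step(np.ones(40) + 0.01, lambda x: np.zeros_like(x))
```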
Reducing Catastrophic Forgetting in Modular Neural Networks by Dynamic Information Balancing
Title | Reducing Catastrophic Forgetting in Modular Neural Networks by Dynamic Information Balancing |
Authors | Mohammed Amer, Tomás Maul |
Abstract | Lifelong learning is a very important step toward realizing robust autonomous artificial agents. Neural networks are the main engine of deep learning, which is the current state-of-the-art technique in formulating adaptive artificial intelligent systems. However, neural networks suffer from catastrophic forgetting when stressed with the challenge of continual learning. We investigate how to exploit modular topology in neural networks in order to dynamically balance the information load between different modules by routing inputs based on the information content in each module so that information interference is minimized. Our dynamic information balancing (DIB) technique adapts a reinforcement learning technique to guide the routing of different inputs based on a reward signal derived from a measure of the information load in each module. Our empirical results show that DIB combined with elastic weight consolidation (EWC) regularization outperforms models with similar capacity and EWC regularization across different task formulations and datasets. |
Tasks | Continual Learning |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.04508v1
PDF | https://arxiv.org/pdf/1912.04508v1.pdf
PWC | https://paperswithcode.com/paper/reducing-catastrophic-forgetting-in-modular |
Repo | |
Framework | |
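A heavily simplified sketch of reward-driven routing: a softmax preference over modules updated REINFORCE-style, where the reward would penalise routing into modules with a high information load. This is a toy bandit version of the idea; the paper adapts a full reinforcement learning technique.

```python
import numpy as np

class InformationBalancedRouter:
    """Softmax-preference router trained with a REINFORCE-style update
    (toy sketch of reward-driven routing between modules)."""
    def __init__(self, n_modules, lr=0.1):
        self.prefs = np.zeros(n_modules)
        self.lr = lr

    def route(self):
        p = np.exp(self.prefs - self.prefs.max())
        p /= p.sum()
        module = np.random.choice(len(p), p=p)
        return module, p

    def update(self, module, reward, p):
        grad = -p                      # gradient of log-softmax w.r.t. prefs
        grad[module] += 1.0
        self.prefs += self.lr * reward * grad

router = InformationBalancedRouter(n_modules=4)
m, p = router.route()
router.update(m, reward=1.0, p=p)      # reward = low load / low interference
```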
Making Neural Networks FAIR
Title | Making Neural Networks FAIR |
Authors | Anna Nguyen, Tobias Weller, York Sure-Vetter |
Abstract | Research on neural networks has gained significant momentum over the past few years. A plethora of neural networks is currently being trained on available data in research as well as in industry. Because training is a resource-intensive process and training data cannot always be made available to everyone, there has been a recent trend to attempt to re-use already-trained neural networks. As such, neural networks themselves have become research data. In this paper, we present the Neural Network Ontology, an ontology to make neural networks findable, accessible, interoperable and reusable as suggested by the well-established FAIR guiding principles for scientific data management and stewardship. We created the new FAIRnets Dataset that comprises about 2,000 neural networks openly accessible on the internet and uses the Neural Network Ontology to semantically annotate and represent the neural networks. For each of the neural networks in the FAIRnets Dataset, the relevant properties according to the Neural Network Ontology such as the description and the architecture are stored. Ultimately, the FAIRnets Dataset can be queried with a set of desired properties and responds with a set of neural networks that have these properties. We provide the service FAIRnets Search which is implemented on top of a SPARQL endpoint and allows for querying, searching and finding trained neural networks annotated with the Neural Network Ontology. The service is demonstrated by a browser-based frontend to the SPARQL endpoint. |
Tasks | |
Published | 2019-07-26 |
URL | https://arxiv.org/abs/1907.11569v1
PDF | https://arxiv.org/pdf/1907.11569v1.pdf
PWC | https://paperswithcode.com/paper/making-neural-networks-fair |
Repo | |
Framework | |
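Querying such an ontology-annotated dataset would look roughly like the following SPARQLWrapper sketch. The endpoint URL and property IRIs below are placeholders, not the real FAIRnets ones; consult the FAIRnets Search service for those.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical endpoint and ontology IRIs, for illustration only.
sparql = SPARQLWrapper("https://example.org/fairnets/sparql")
sparql.setQuery("""
    SELECT ?net ?description WHERE {
        ?net a <https://example.org/ontology#NeuralNetwork> ;
             <https://example.org/ontology#description> ?description .
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["net"]["value"], row["description"]["value"])
```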
Coalitional Games with Stochastic Characteristic Functions and Private Types
Title | Coalitional Games with Stochastic Characteristic Functions and Private Types |
Authors | Dengji Zhao, Yiqing Huang, Liat Cohen, Tal Grinshpoun |
Abstract | Research on coalitional games has focused on how to share the reward among a coalition such that players are incentivized to collaborate. It assumes that the (deterministic or stochastic) characteristic function is known in advance. This paper studies a new setting (a task allocation problem) where the characteristic function is not known and is controlled by some private information held by the players. Hence, the challenge here is twofold: (i) incentivize players to reveal their private information truthfully, and (ii) incentivize them to collaborate. We show that existing reward distribution mechanisms and auctions cannot solve this challenge. Hence, we propose the first mechanism for the problem from the perspective of both mechanism design and coalitional games. |
Tasks | |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11737v1
PDF | https://arxiv.org/pdf/1910.11737v1.pdf
PWC | https://paperswithcode.com/paper/coalitional-games-with-stochastic |
Repo | |
Framework | |
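For contrast with the paper's unknown-characteristic-function setting, the classic known-function case has a canonical reward-sharing rule: the Shapley value. A small exact computation by enumerating orderings (exponential in the number of players, so for toy games only):

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values by averaging each player's marginal contribution
    over all orderings; `value` maps a set of players to a coalition worth."""
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = set()
        for p in order:
            phi[p] += value(coalition | {p}) - value(coalition)
            coalition.add(p)
    return {p: v / len(perms) for p, v in phi.items()}

# Example: two complementary players and a free rider.
v = lambda S: 10.0 if {"a", "b"} <= S else 0.0
print(shapley_values(["a", "b", "c"], v))   # a and b each get 5.0; c gets 0.0
```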