Paper Group ANR 459
Machine learning non-local correlations
Title | Machine learning non-local correlations |
Authors | Askery Canabarro, Samuraí Brito, Rafael Chaves |
Abstract | The ability to witness non-local correlations lies at the core of foundational aspects of quantum mechanics and its application in the processing of information. Commonly, this is achieved via the violation of Bell inequalities. Unfortunately, however, their systematic derivation quickly becomes unfeasible as the scenario of interest grows in complexity. To cope with that, we propose here a machine learning approach for the detection and quantification of non-locality. It consists of an ensemble of multilayer perceptrons blended with genetic algorithms achieving a high performance in a number of relevant Bell scenarios. Our results offer a novel method and a proof-of-principle for the relevance of machine learning for understanding non-locality. |
Tasks | |
Published | 2018-08-21 |
URL | http://arxiv.org/abs/1808.07069v1 |
http://arxiv.org/pdf/1808.07069v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-non-local-correlations |
Repo | |
Framework | |
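The abstract above describes witnessing non-locality through the violation of Bell inequalities. As a small illustration of what such a test looks like, the sketch below evaluates the CHSH expression, one well-known Bell inequality that the abstract itself does not name. The function names `chsh_value` and `violates_chsh`, the correlator layout, and the example values are assumptions made for illustration; this is not the authors' machine learning method.

```python
import math

def chsh_value(E):
    """Return the CHSH combination S for correlators E[x][y], x, y in {0, 1}.

    E[x][y] is the expected product of the two parties' +/-1 outcomes
    when they choose measurement settings x and y (hypothetical layout).
    """
    return E[0][0] + E[0][1] + E[1][0] - E[1][1]

def violates_chsh(E, tol=1e-9):
    """True if |S| exceeds the local (classical) bound of 2."""
    return abs(chsh_value(E)) > 2.0 + tol

# Deterministic local strategy: both parties always output +1.
local = [[1.0, 1.0], [1.0, 1.0]]

# Correlators that reach the quantum maximum 2*sqrt(2) (Tsirelson's bound).
c = 1.0 / math.sqrt(2.0)
quantum = [[c, c], [c, -c]]

print(chsh_value(local), violates_chsh(local))
print(chsh_value(quantum), violates_chsh(quantum))
```

A single inequality like this is easy to check by hand in the simplest scenario; the abstract's point is that deriving the full set of such inequalities becomes infeasible as the scenario grows.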
MS-BACO: A new Model Selection algorithm using Binary Ant Colony Optimization for neural complexity and error reduction
Title | MS-BACO: A new Model Selection algorithm using Binary Ant Colony Optimization for neural complexity and error reduction |
Authors | Saman Sadeghyan, Shahrokh Asadi |
Abstract | Stabilizing the complexity of Feedforward Neural Networks (FNNs) for the given approximation task can be managed by defining an appropriate model magnitude which is also greatly correlated with the generalization quality and computational efficiency. However, deciding on the right level of model complexity can be highly challenging in FNN applications. In this paper, a new Model Selection algorithm using Binary Ant Colony Optimization (MS-BACO) is proposed in order to achieve the optimal FNN model in terms of neural complexity and cross-entropy error. MS-BACO is a meta-heuristic algorithm that treats the problem as a combinatorial optimization problem. By quantifying both the amount of correlation that exists among hidden neurons and the sensitivity of the FNN output to the hidden neurons, using a sample-based sensitivity analysis method called the extended Fourier amplitude sensitivity test, the algorithm tends to select the FNN model containing hidden neurons with the most distinct hyperplanes and a high contribution percentage. The performance of the proposed algorithm with three different designs of heuristic information is investigated. Comparison of the findings verifies that the newly introduced algorithm is able to provide a more compact and accurate FNN model. |
Tasks | Combinatorial Optimization, Model Selection |
Published | 2018-10-21 |
URL | http://arxiv.org/abs/1810.08944v1 |
http://arxiv.org/pdf/1810.08944v1.pdf | |
PWC | https://paperswithcode.com/paper/ms-baco-a-new-model-selection-algorithm-using |
Repo | |
Framework | |
Goal-constrained Planning Domain Model Verification of Safety Properties
Title | Goal-constrained Planning Domain Model Verification of Safety Properties |
Authors | Anas Shrinah, Kerstin Eder |
Abstract | The verification of planning domain models is crucial to ensure the safety, integrity and correctness of planning-based automated systems. This task is usually performed using model checking techniques. However, unconstrained application of model checkers to verify planning domain models can result in false positives, i.e., counterexamples that are unreachable by a sound planner when using the domain under verification during a planning task. In this paper, we discuss the downside of unconstrained planning domain model verification. We then introduce the notion of a valid planning counterexample, and demonstrate how model checkers, as well as state trajectory constraints planning techniques, should be used to verify planning domain models so that invalid planning counterexamples are not returned. |
Tasks | |
Published | 2018-11-22 |
URL | https://arxiv.org/abs/1811.09231v4 |
https://arxiv.org/pdf/1811.09231v4.pdf | |
PWC | https://paperswithcode.com/paper/goal-constrained-planning-domain-model-formal |
Repo | |
Framework | |
Graph Planning with Expected Finite Horizon
Title | Graph Planning with Expected Finite Horizon |
Authors | Krishnendu Chatterjee, Laurent Doyen |
Abstract | Graph planning gives rise to fundamental algorithmic questions such as shortest path, traveling salesman problem, etc. A classical problem in discrete planning is to consider a weighted graph and construct a path that maximizes the sum of weights for a given time horizon $T$. However, in many scenarios, the time horizon is not fixed, but the stopping time is chosen according to some distribution such that the expected stopping time is $T$. If the stopping time distribution is not known, then to ensure robustness, the distribution is chosen by an adversary, to represent the worst-case scenario. A stationary plan for every vertex always chooses the same outgoing edge. For fixed horizon or fixed stopping-time distribution, stationary plans are not sufficient for optimality. Quite surprisingly we show that when an adversary chooses the stopping-time distribution with expected stopping time $T$, then stationary plans are sufficient. While computing optimal stationary plans for fixed horizon is NP-complete, we show that computing optimal stationary plans under adversarial stopping-time distribution can be achieved in polynomial time. Consequently, our polynomial-time algorithm for adversarial stopping time also computes an optimal plan among all possible plans. |
Tasks | |
Published | 2018-02-10 |
URL | http://arxiv.org/abs/1802.03642v1 |
http://arxiv.org/pdf/1802.03642v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-planning-with-expected-finite-horizon |
Repo | |
Framework | |
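The abstract above states that, for a fixed horizon, stationary plans are not sufficient for optimality. The sketch below illustrates that claim on a toy graph: a plain dynamic program computes the best total weight over exactly T steps, and a second helper evaluates a stationary plan. The graph, the function names, and the dictionary layout are all hypothetical choices for illustration; this is not the authors' polynomial-time algorithm for the adversarial stopping-time setting.

```python
def best_fixed_horizon_value(edges, start, T):
    """Maximum total weight of a path with exactly T edges from `start`.

    edges: dict mapping every vertex to a list of (successor, weight) pairs.
    Returns -inf if no path with T edges exists from `start`.
    """
    # value[v] = best weight collectable from v in the steps still remaining
    value = {v: 0.0 for v in edges}
    for _ in range(T):
        new_value = {}
        for v, out in edges.items():
            best = float("-inf")
            for u, w in out:
                best = max(best, w + value[u])
            new_value[v] = best
        value = new_value
    return value[start]

def stationary_plan_value(edges, plan, start, T):
    """Total weight over T steps when each vertex always uses the same
    outgoing edge; plan[v] is the fixed successor of v (no parallel edges)."""
    total, v = 0.0, start
    for _ in range(T):
        u = plan[v]
        total += dict(edges[v])[u]
        v = u
    return total

# Toy graph: looping at "a" pays 1 per step, leaving to "b" pays 5 once,
# and "b" only has a zero-weight self-loop.
edges = {"a": [("a", 1.0), ("b", 5.0)], "b": [("b", 0.0)]}

print(best_fixed_horizon_value(edges, "a", 3))                         # loop, loop, leave
print(stationary_plan_value(edges, {"a": "a", "b": "b"}, "a", 3))      # always loop
print(stationary_plan_value(edges, {"a": "b", "b": "b"}, "a", 3))      # always leave
```

On this graph the best path for T = 3 loops twice and then leaves, collecting 7, while the two stationary plans collect only 3 and 5. The optimal choice at "a" depends on the remaining time, which is exactly what a stationary plan cannot express.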
Context and Humor: Understanding Amul advertisements of India
Title | Context and Humor: Understanding Amul advertisements of India |
Authors | Radhika Mamidi |
Abstract | Contextual knowledge is the most important element in understanding language. By contextual knowledge we mean both general knowledge and discourse knowledge, i.e., knowledge of the situational context, background knowledge and the co-textual context [10]. In this paper, we will discuss the importance of contextual knowledge in understanding the humor present in the cartoon-based Amul advertisements in India. In the process, we will analyze these advertisements and also see if humor is an effective tool for advertising and thereby, for marketing. These bilingual advertisements also expect the audience to have the appropriate linguistic knowledge, which includes knowledge of English and Hindi vocabulary, morphology and syntax. Different techniques like punning, portmanteaus and parodies of popular proverbs, expressions, acronyms, famous dialogues, songs, etc., are employed to convey the message in a humorous way. The present study will concentrate on these linguistic cues and the required context for understanding wit and humor. |
Tasks | |
Published | 2018-04-15 |
URL | http://arxiv.org/abs/1804.05398v1 |
http://arxiv.org/pdf/1804.05398v1.pdf | |
PWC | https://paperswithcode.com/paper/context-and-humor-understanding-amul |
Repo | |
Framework | |
Generative Adversarial Network Training is a Continual Learning Problem
Title | Generative Adversarial Network Training is a Continual Learning Problem |
Authors | Kevin J Liang, Chunyuan Li, Guoyin Wang, Lawrence Carin |
Abstract | Generative Adversarial Networks (GANs) have proven to be a powerful framework for learning to draw samples from complex distributions. However, GANs are also notoriously difficult to train, with mode collapse and oscillations a common problem. We hypothesize that this is at least in part due to the evolution of the generator distribution and the catastrophic forgetting tendency of neural networks, which leads to the discriminator losing the ability to remember synthesized samples from previous instantiations of the generator. Recognizing this, our contributions are twofold. First, we show that GAN training makes for a more interesting and realistic benchmark for continual learning methods evaluation than some of the more canonical datasets. Second, we propose leveraging continual learning techniques to augment the discriminator, preserving its ability to recognize previous generator samples. We show that the resulting methods add only a light amount of computation, involve minimal changes to the model, and result in better overall performance on the examined image and text generation tasks. |
Tasks | Continual Learning, Text Generation |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.11083v1 |
http://arxiv.org/pdf/1811.11083v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-network-training-is-a |
Repo | |
Framework | |
Deep Class-Wise Hashing: Semantics-Preserving Hashing via Class-wise Loss
Title | Deep Class-Wise Hashing: Semantics-Preserving Hashing via Class-wise Loss |
Authors | Xuefei Zhe, Shifeng Chen, Hong Yan |
Abstract | Deep supervised hashing has emerged as an influential solution to large-scale semantic image retrieval problems in computer vision. In the light of recent progress, convolutional neural network based hashing methods typically seek pair-wise or triplet labels to conduct similarity-preserving learning. However, complex semantic concepts of visual contents are hard to capture by similar/dissimilar labels, which limits the retrieval performance. Generally, pair-wise or triplet losses not only suffer from expensive training costs but also fall short in extracting sufficient semantic information. In this regard, we propose a novel deep supervised hashing model to learn more compact class-level similarity-preserving binary codes. Our deep learning based model is motivated by deep metric learning that directly takes semantic labels as supervised information in training and generates corresponding discriminant hashing code. Specifically, a novel cubic constraint loss function based on Gaussian distribution is proposed, which preserves semantic variations while penalizing the overlapping part of different classes in the embedding space. To address the discrete optimization problem introduced by binary codes, a two-step optimization strategy is proposed to provide efficient training and avoid the problem of gradient vanishing. Extensive experiments on four large-scale benchmark databases show that our model can achieve state-of-the-art retrieval performance. Moreover, when training samples are limited, our method surpasses other supervised deep hashing methods with non-negligible margins. |
Tasks | Image Retrieval, Metric Learning |
Published | 2018-03-12 |
URL | http://arxiv.org/abs/1803.04137v1 |
http://arxiv.org/pdf/1803.04137v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-class-wise-hashing-semantics-preserving |
Repo | |
Framework | |
Adaptive neural network classifier for decoding MEG signals
Title | Adaptive neural network classifier for decoding MEG signals |
Authors | Ivan Zubarev, Rasmus Zetter, Hanna-Leena Halme, Lauri Parkkonen |
Abstract | Convolutional Neural Networks (CNNs) outperform traditional classification methods in many domains. Recently, these methods have gained attention in neuroscience and particularly in the brain-computer interface (BCI) community. Here, we introduce a CNN optimized for classification of brain states from magnetoencephalographic (MEG) measurements. Our CNN design is based on a generative model of the electromagnetic (EEG and MEG) brain signals and is readily interpretable in neurophysiological terms. We show here that the proposed network is able to decode event-related responses as well as modulations of oscillatory brain activity and that it outperforms more complex neural networks and traditional classifiers used in the field. Importantly, the model is robust to inter-individual differences and can successfully generalize to new subjects in offline and online classification. |
Tasks | EEG |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.10981v2 |
http://arxiv.org/pdf/1805.10981v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-neural-network-classifier-for |
Repo | |
Framework | |
Improved ASR for Under-Resourced Languages Through Multi-Task Learning with Acoustic Landmarks
Title | Improved ASR for Under-Resourced Languages Through Multi-Task Learning with Acoustic Landmarks |
Authors | Di He, Boon Pang Lim, Xuesong Yang, Mark Hasegawa-Johnson, Deming Chen |
Abstract | Furui first demonstrated that the identity of both consonant and vowel can be perceived from the C-V transition; later, Stevens proposed that acoustic landmarks are the primary cues for speech perception, and that steady-state regions are secondary or supplemental. Acoustic landmarks are perceptually salient, even in a language one doesn’t speak, and it has been demonstrated that non-speakers of the language can identify features such as the primary articulator of the landmark. These factors suggest a strategy for developing language-independent automatic speech recognition: landmarks can potentially be learned once from a suitably labeled corpus and rapidly applied to many other languages. This paper proposes enhancing the cross-lingual portability of a neural network by using landmarks as the secondary task in multi-task learning (MTL). The network is trained in a well-resourced source language with both phone and landmark labels (English), then adapted to an under-resourced target language with only word labels (Iban). Landmark-tasked MTL reduces source-language phone error rate by 2.9% relative, and reduces target-language word error rate by 1.9%-5.9% depending on the amount of target-language training data. These results suggest that landmark-tasked MTL causes the DNN to learn hidden-node features that are useful for cross-lingual adaptation. |
Tasks | Multi-Task Learning, Speech Recognition |
Published | 2018-05-15 |
URL | http://arxiv.org/abs/1805.05574v1 |
http://arxiv.org/pdf/1805.05574v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-asr-for-under-resourced-languages |
Repo | |
Framework | |
Sim-to-Real Optimization of Complex Real World Mobile Network with Imperfect Information via Deep Reinforcement Learning from Self-play
Title | Sim-to-Real Optimization of Complex Real World Mobile Network with Imperfect Information via Deep Reinforcement Learning from Self-play |
Authors | Yongxi Tan, Jin Yang, Xin Chen, Qitao Song, Yunjun Chen, Zhangxiang Ye, Zhenqiang Su |
Abstract | The mobile network that millions of people use every day is one of the most complex systems in the world. Optimizing mobile networks to meet exploding customer demand and reduce capital/operation expenditures poses great challenges. Despite recent progress, the application of deep reinforcement learning (DRL) to complex real-world problems remains unsolved, given data scarcity, partial observability, risk and complex rules/dynamics in the real world, as well as the huge reality gap between simulation and the real world. To bridge the reality gap, we introduce a Sim-to-Real framework to directly transfer learning from simulation to the real world via a graph convolutional neural network (CNN), domain randomization and multi-task learning; the graph CNN works by abstracting the partially observable mobile network into a graph, then distilling the domain-variant irregular graph into a domain-invariant tensor in locally Euclidean space as input to the CNN. We use a novel self-play mechanism to encourage competition among DRL agents for the best record on multiple tasks via simulated annealing, just as athletes compete for the world record in the decathlon. We also propose a decentralized multi-agent, competitive and cooperative DRL method to coordinate the actions of multiple cells to maximize global reward and minimize negative impact on neighbor cells. Using 6 field trials on commercial mobile networks, we demonstrate for the first time that a DRL agent can successfully transfer learning from simulation to a complex real-world problem with imperfect information, complex rules/dynamics, huge state/action space, and multi-agent interactions, without any training in the real world. |
Tasks | Multi-Task Learning, Transfer Learning |
Published | 2018-02-18 |
URL | http://arxiv.org/abs/1802.06416v3 |
http://arxiv.org/pdf/1802.06416v3.pdf | |
PWC | https://paperswithcode.com/paper/sim-to-real-optimization-of-complex-real |
Repo | |
Framework | |
Stochastic Distributed Optimization for Machine Learning from Decentralized Features
Title | Stochastic Distributed Optimization for Machine Learning from Decentralized Features |
Authors | Yaochen Hu, Di Niu, Jianming Yang, Shengping Zhou |
Abstract | Distributed machine learning has been widely studied in the literature to scale up machine learning model training in the presence of an ever-increasing amount of data. We study distributed machine learning from another perspective, where the information about the same training samples is inherently decentralized and located on different parties. We propose an asynchronous stochastic gradient descent (SGD) algorithm for such a feature distributed machine learning (FDML) problem, to jointly learn from decentralized features, with theoretical convergence guarantees under bounded asynchrony. Our algorithm does not require sharing the original feature data or even local model parameters between parties, thus preserving a high level of data confidentiality. We implement our algorithm for FDML in a parameter server architecture. We compare our system with fully centralized training (which violates data locality requirements) and training only based on local features, through extensive experiments performed on a large amount of data from a real-world application, involving 5 million samples and 8700 features in total. Experimental results have demonstrated the effectiveness and efficiency of the proposed FDML system. |
Tasks | Distributed Optimization |
Published | 2018-12-16 |
URL | https://arxiv.org/abs/1812.06415v2 |
https://arxiv.org/pdf/1812.06415v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-distributed-optimization-for |
Repo | |
Framework | |
Neural CRF transducers for sequence labeling
Title | Neural CRF transducers for sequence labeling |
Authors | Kai Hu, Zhijian Ou, Min Hu, Junlan Feng |
Abstract | Conditional random fields (CRFs) have been shown to be one of the most successful approaches to sequence labeling. Various linear-chain neural CRFs (NCRFs) have been developed to implement the non-linear node potentials in CRFs while still keeping the linear-chain hidden structure. In this paper, we propose NCRF transducers, which consist of two RNNs, one extracting features from observations and the other capturing (theoretically infinite) long-range dependencies between labels. Different sequence labeling methods are evaluated over POS tagging, chunking and NER (English, Dutch). Experimental results show that NCRF transducers achieve consistent improvements over linear-chain NCRFs and RNN transducers across all four tasks, and can improve state-of-the-art results. |
Tasks | Chunking |
Published | 2018-11-04 |
URL | http://arxiv.org/abs/1811.01382v1 |
http://arxiv.org/pdf/1811.01382v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-crf-transducers-for-sequence-labeling |
Repo | |
Framework | |
Hypergraph Neural Networks
Title | Hypergraph Neural Networks |
Authors | Yifan Feng, Haoxuan You, Zizhao Zhang, Rongrong Ji, Yue Gao |
Abstract | In this paper, we present a hypergraph neural networks (HGNN) framework for data representation learning, which can encode high-order data correlation in a hypergraph structure. Confronting the challenges of learning representation for complex data in real practice, we propose to incorporate such data structure in a hypergraph, which is more flexible on data modeling, especially when dealing with complex data. In this method, a hyperedge convolution operation is designed to handle the data correlation during representation learning. In this way, traditional hypergraph learning procedure can be conducted using hyperedge convolution operations efficiently. HGNN is able to learn the hidden layer representation considering the high-order data structure, which is a general framework considering the complex data correlations. We have conducted experiments on citation network classification and visual object recognition tasks and compared HGNN with graph convolutional networks and other traditional methods. Experimental results demonstrate that the proposed HGNN method outperforms recent state-of-the-art methods. We can also reveal from the results that the proposed HGNN is superior when dealing with multi-modal data compared with existing methods. |
Tasks | Object Recognition, Representation Learning |
Published | 2018-09-25 |
URL | http://arxiv.org/abs/1809.09401v3 |
http://arxiv.org/pdf/1809.09401v3.pdf | |
PWC | https://paperswithcode.com/paper/hypergraph-neural-networks |
Repo | |
Framework | |
Unsupervised Deep Slow Feature Analysis for Change Detection in Multi-Temporal Remote Sensing Images
Title | Unsupervised Deep Slow Feature Analysis for Change Detection in Multi-Temporal Remote Sensing Images |
Authors | Bo Du, Lixiang Ru, Chen Wu, Liangpei Zhang |
Abstract | Change detection has been a hotspot in remote sensing technology for a long time. With the increasing availability of multi-temporal remote sensing images, numerous change detection algorithms have been proposed. Among these methods, image transformation methods with feature extraction and mapping can effectively highlight the changed information and thus have better change detection performance. However, changes in multi-temporal images are usually complex, and existing methods are not effective enough. In recent years, deep networks have shown brilliant performance in many fields, including feature extraction and projection. Therefore, in this paper, based on deep networks and slow feature analysis (SFA) theory, we propose a new change detection algorithm for multi-temporal remote sensing images called Deep Slow Feature Analysis (DSFA). In the DSFA model, two symmetric deep networks are utilized for projecting the input data of bi-temporal imagery. Then, the SFA module is deployed to suppress the unchanged components and highlight the changed components of the transformed features. The CVA pre-detection is employed to find unchanged pixels with high confidence as training samples. Finally, the change intensity is calculated with chi-square distance and the changes are determined by threshold algorithms. The experiments are performed on two real-world datasets and a public hyperspectral dataset. The visual comparison and quantitative evaluation have both shown that DSFA could outperform the other state-of-the-art algorithms, including other SFA-based and deep learning methods. |
Tasks | |
Published | 2018-12-03 |
URL | https://arxiv.org/abs/1812.00645v2 |
https://arxiv.org/pdf/1812.00645v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-deep-slow-feature-analysis-for |
Repo | |
Framework | |
Automatic Three-Dimensional Cephalometric Annotation System Using Three-Dimensional Convolutional Neural Networks
Title | Automatic Three-Dimensional Cephalometric Annotation System Using Three-Dimensional Convolutional Neural Networks |
Authors | Sung Ho Kang, Kiwan Jeon, Hak-Jin Kim, Jin Keun Seo, Sang-Hwy Lee |
Abstract | Background: Three-dimensional (3D) cephalometric analysis using computerized tomography data has been rapidly adopted for dysmorphosis and anthropometry. Several different approaches to automatic 3D annotation have been proposed to overcome the limitations of traditional cephalometry. The purpose of this study was to evaluate the accuracy of our newly-developed system using a deep learning algorithm for automatic 3D cephalometric annotation. Methods: To overcome current technical limitations, some measures were developed to directly annotate 3D human skull data. Our deep learning-based model system mainly consisted of a 3D convolutional neural network and image data resampling. Results: The discrepancies between the referenced and predicted coordinate values in three axes and in 3D distance were calculated to evaluate system accuracy. Our new model system yielded prediction errors of 3.26, 3.18, and 4.81 mm (for three axes) and 7.61 mm (for 3D). Moreover, there was no difference among the landmarks of the three groups, including the midsagittal plane, horizontal plane, and mandible (p>0.05). Conclusion: A new 3D convolutional neural network-based automatic annotation system for 3D cephalometry was developed. The strategies used to implement the system were detailed and measurement results were evaluated for accuracy. Further development of this system is planned for full clinical application of automatic 3D cephalometric annotation. |
Tasks | |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07889v1 |
http://arxiv.org/pdf/1811.07889v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-three-dimensional-cephalometric |
Repo | |
Framework | |