Paper Group ANR 362
Community Detection in Degree-Corrected Block Models. Exploiting Sentence and Context Representations in Deep Neural Models for Spoken Language Understanding. Robust Learning with Kernel Mean p-Power Error Loss. A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation. Reasoning with Memory Augmented Neural Networks for Language Comprehension. What do different evaluation metrics tell us about saliency models? Comparative Deep Learning of Hybrid Representations for Image Recommendations. Mapping Temporal Variables into the NeuCube for Improved Pattern Recognition, Predictive Modelling and Understanding of Stream Data. Hypothesis Transfer Learning via Transformation Functions. A New Learning Method for Inference Accuracy, Core Occupation, and Performance Co-optimization on TrueNorth Chip. Incremental Semiparametric Inverse Dynamics Learning. Deep Recurrent Convolutional Neural Network: Improving Performance For Speech Recognition. Skill-Based Differences in Spatio-Temporal Team Behavior in Defence of The Ancients 2. Coin Betting and Parameter-Free Online Learning. Total variation reconstruction for compressive sensing using nonlocal Lagrangian multiplier.
Community Detection in Degree-Corrected Block Models
Title | Community Detection in Degree-Corrected Block Models |
Authors | Chao Gao, Zongming Ma, Anderson Y. Zhang, Harrison H. Zhou |
Abstract | Community detection is a central problem of network data analysis. Given a network, the goal of community detection is to partition the network nodes into a small number of clusters, which could often help reveal interesting structures. The present paper studies community detection in Degree-Corrected Block Models (DCBMs). We first derive asymptotic minimax risks of the problem for a misclassification proportion loss under appropriate conditions. The minimax risks are shown to depend on degree-correction parameters, community sizes, and average within and between community connectivities in an intuitive and interpretable way. In addition, we propose a polynomial time algorithm to adaptively perform consistent and even asymptotically optimal community detection in DCBMs. |
Tasks | Community Detection |
Published | 2016-07-24 |
URL | http://arxiv.org/abs/1607.06993v1 |
http://arxiv.org/pdf/1607.06993v1.pdf | |
PWC | https://paperswithcode.com/paper/community-detection-in-degree-corrected-block |
Repo | |
Framework | |
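The paper's adaptive, minimax-optimal algorithm is beyond a short snippet, but the degree-correction idea is easy to illustrate. Below is a minimal sketch of a SCORE-style spectral method for DCBMs, which clusters entrywise eigenvector ratios so the degree parameters θ_i cancel; this is a standard approach to the model, not the authors' own procedure, and all sizes and parameters are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def score_communities(A, K):
    """Cluster DCBM nodes after dividing out degree heterogeneity."""
    vals, vecs = np.linalg.eigh(A)
    lead = vecs[:, np.argsort(-np.abs(vals))[:K]]   # top-K eigenvectors by |eigenvalue|
    # Entrywise ratios against the leading eigenvector cancel the
    # degree-correction parameters theta_i (SCORE normalization).
    denom = lead[:, [0]]
    denom = np.where(np.abs(denom) < 1e-12, 1e-12, denom)
    return KMeans(n_clusters=K, n_init=10).fit_predict(lead[:, 1:] / denom)

# Toy DCBM: theta_i controls each node's expected degree.
rng = np.random.default_rng(0)
n, K = 200, 2
z = rng.integers(0, K, n)                       # ground-truth communities
theta = rng.uniform(0.5, 1.5, n)                # degree-correction parameters
B = np.array([[0.10, 0.02], [0.02, 0.10]])      # within/between connectivities
P = np.outer(theta, theta) * B[z][:, z]
A = np.triu(rng.random((n, n)) < P, 1).astype(float)
A = A + A.T                                     # symmetric, no self-loops
labels = score_communities(A, K)
```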
Exploiting Sentence and Context Representations in Deep Neural Models for Spoken Language Understanding
Title | Exploiting Sentence and Context Representations in Deep Neural Models for Spoken Language Understanding |
Authors | Lina M. Rojas Barahona, Milica Gasic, Nikola Mrkšić, Pei-Hao Su, Stefan Ultes, Tsung-Hsien Wen, Steve Young |
Abstract | This paper presents a deep learning architecture for the semantic decoder component of a Statistical Spoken Dialogue System. In a slot-filling dialogue, the semantic decoder predicts the dialogue act and a set of slot-value pairs from a set of n-best hypotheses returned by automatic speech recognition (ASR). Most current models for spoken language understanding assume (i) word-aligned semantic annotations as in sequence taggers and (ii) delexicalisation, or a mapping of input words to domain-specific concepts using heuristics that try to capture morphological variation but that do not scale to other domains nor to language variation (e.g., morphology, synonyms, paraphrasing). In this work, the semantic decoder is trained using unaligned semantic annotations and it uses distributed semantic representation learning to overcome the limitations of explicit delexicalisation. The proposed architecture uses a convolutional neural network for the sentence representation and a long short-term memory network for the context representation. Results are presented for the publicly available DSTC2 corpus and an In-car corpus which is similar to DSTC2 but has a significantly higher word error rate (WER). |
Tasks | Representation Learning, Slot Filling, Speech Recognition, Spoken Language Understanding |
Published | 2016-10-13 |
URL | http://arxiv.org/abs/1610.04120v1 |
http://arxiv.org/pdf/1610.04120v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-sentence-and-context |
Repo | |
Framework | |
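A minimal PyTorch sketch of the architectural idea follows: a CNN produces the sentence representation of the current utterance and an LSTM summarizes the dialogue context. The layer sizes, the single dialogue-act head, and the max-pooling choice are assumptions for illustration; the paper's full model also predicts slot-value pairs from n-best ASR hypotheses.

```python
import torch
import torch.nn as nn

class SentenceContextModel(nn.Module):
    """CNN over the current utterance, LSTM over preceding turns."""
    def __init__(self, vocab_size, emb=100, conv=64, ctx=64, n_acts=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.conv = nn.Conv1d(emb, conv, kernel_size=3, padding=1)
        self.ctx_lstm = nn.LSTM(conv, ctx, batch_first=True)
        self.act_head = nn.Linear(conv + ctx, n_acts)

    def encode(self, tokens):                      # tokens: (batch, seq)
        x = self.embed(tokens).transpose(1, 2)     # -> (batch, emb, seq)
        return torch.relu(self.conv(x)).max(dim=2).values  # max-pool over time

    def forward(self, utterance, history):         # history: (batch, turns, seq)
        sent = self.encode(utterance)
        B, T, L = history.shape
        hist = self.encode(history.reshape(B * T, L)).view(B, T, -1)
        _, (h_n, _) = self.ctx_lstm(hist)          # summary of the dialogue so far
        return self.act_head(torch.cat([sent, h_n[-1]], dim=1))

model = SentenceContextModel(vocab_size=5000)
acts = model(torch.randint(0, 5000, (2, 12)), torch.randint(0, 5000, (2, 4, 12)))
```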
Robust Learning with Kernel Mean p-Power Error Loss
Title | Robust Learning with Kernel Mean p-Power Error Loss |
Authors | Badong Chen, Lei Xing, Xin Wang, Jing Qin, Nanning Zheng |
Abstract | Correntropy is a second-order statistical measure in kernel space, which has been successfully applied in robust learning and signal processing. In this paper, we define a non-second-order statistical measure in kernel space, called the kernel mean-p power error (KMPE), which includes the correntropic loss (C-Loss) as a special case. Some basic properties of KMPE are presented. In particular, we apply the KMPE to extreme learning machine (ELM) and principal component analysis (PCA), and develop two robust learning algorithms, namely ELM-KMPE and PCA-KMPE. Experimental results on synthetic and benchmark data show that the developed algorithms can achieve consistently better performance when compared with some existing methods. |
Tasks | |
Published | 2016-12-21 |
URL | http://arxiv.org/abs/1612.07019v1 |
http://arxiv.org/pdf/1612.07019v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-learning-with-kernel-mean-p-power |
Repo | |
Framework | |
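For a shift-invariant kernel with κ(0) = 1, the feature-space distance satisfies ||φ(x) − φ(y)||² = 2 − 2κ(x − y), so a mean p-power error in kernel space can be written as E[(2 − 2κ(e))^{p/2}]. The sketch below assumes a Gaussian kernel and this form; the paper's exact definition and constants may differ.

```python
import numpy as np

def kmpe_loss(e, sigma=1.0, p=1.5):
    """Kernel mean p-power error of an error vector e.

    With the Gaussian kernel k(e) = exp(-e^2 / (2 sigma^2)), the kernel-space
    distance is ||phi(x) - phi(y)||^2 = 2 - 2 k(x - y), giving the loss
    E[(2 - 2 k(e))^(p/2)]; p = 2 recovers the correntropic loss up to a
    constant factor.
    """
    k = np.exp(-e ** 2 / (2.0 * sigma ** 2))
    return np.mean((2.0 - 2.0 * k) ** (p / 2.0))

# Robustness demo: one gross outlier dominates MSE but barely moves KMPE,
# because the kernel term saturates for large errors.
e = np.concatenate([np.random.default_rng(0).normal(0, 0.1, 100), [50.0]])
print(np.mean(e ** 2), kmpe_loss(e))
```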
A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation
Title | A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation |
Authors | Amrita Saha, Mitesh M. Khapra, Sarath Chandar, Janarthanan Rajendran, Kyunghyun Cho |
Abstract | Interlingua-based Machine Translation (MT) aims to encode multiple languages into a common linguistic representation and then decode sentences in multiple target languages from this representation. In this work we explore this idea in the context of neural encoder-decoder architectures, albeit on a smaller scale and without MT as the end goal. Specifically, we consider the case of three languages or modalities X, Z and Y wherein we are interested in generating sequences in Y starting from information available in X. However, there is no parallel training data available between X and Y; training data is available only between X & Z and Z & Y (as is often the case in many real-world applications). Z thus acts as a pivot/bridge. An obvious solution, which is perhaps less elegant but works very well in practice, is to train a two-stage model which first converts from X to Z and then from Z to Y. Instead, we explore an interlingua-inspired solution which jointly learns to (i) encode X and Z to a common representation and (ii) decode Y from this common representation. We evaluate our model on two tasks: (i) bridge transliteration and (ii) bridge captioning. We report promising results in both applications and believe that this is a step in the right direction towards truly interlingua-inspired encoder-decoder architectures. |
Tasks | Machine Translation, Transliteration |
Published | 2016-06-15 |
URL | http://arxiv.org/abs/1606.04754v1 |
http://arxiv.org/pdf/1606.04754v1.pdf | |
PWC | https://paperswithcode.com/paper/a-correlational-encoder-decoder-architecture |
Repo | |
Framework | |
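The training signal combines a correlation term on (X, Z) pairs with a decoding term on (Z, Y) pairs, so that at test time Y can be decoded from an encoding of X alone. The sketch below is a drastically simplified feed-forward version, with an MSE decoder standing in for sequence generation; the CorrNet-style correlation term and all layer sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

def correlation_loss(hx, hz, eps=1e-8):
    """Negative per-dimension correlation between paired encodings of X and Z."""
    hx = hx - hx.mean(dim=0)
    hz = hz - hz.mean(dim=0)
    num = (hx * hz).sum(dim=0)
    den = torch.sqrt((hx ** 2).sum(dim=0) * (hz ** 2).sum(dim=0)) + eps
    return -(num / den).sum()

class CorrEncDec(nn.Module):
    def __init__(self, dx, dz, dy, h=128):
        super().__init__()
        self.enc_x = nn.Sequential(nn.Linear(dx, h), nn.Tanh())
        self.enc_z = nn.Sequential(nn.Linear(dz, h), nn.Tanh())
        self.dec_y = nn.Linear(h, dy)            # decodes Y from the common space

def training_step(model, x, z_paired_with_x, z_paired_with_y, y, lam=0.1):
    # (X, Z) pairs pull both encoders toward a common representation ...
    l_corr = correlation_loss(model.enc_x(x), model.enc_z(z_paired_with_x))
    # ... while (Z, Y) pairs teach the decoder to produce Y from that space.
    l_dec = nn.functional.mse_loss(model.dec_y(model.enc_z(z_paired_with_y)), y)
    return l_dec + lam * l_corr

# At test time: y_hat = model.dec_y(model.enc_x(x)) -- no X-Y parallel data used.
```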
Reasoning with Memory Augmented Neural Networks for Language Comprehension
Title | Reasoning with Memory Augmented Neural Networks for Language Comprehension |
Authors | Tsendsuren Munkhdalai, Hong Yu |
Abstract | Hypothesis testing is an important cognitive process that supports human reasoning. In this paper, we introduce a computational hypothesis testing approach based on memory augmented neural networks. Our approach involves a hypothesis testing loop that reconsiders and progressively refines a previously formed hypothesis in order to generate new hypotheses to test. We apply the proposed approach to the language comprehension task by using Neural Semantic Encoders (NSE). Our NSE models achieve state-of-the-art results, showing an absolute improvement of 1.2% to 2.6% accuracy over previous results obtained by single and ensemble systems on standard machine comprehension benchmarks such as the Children’s Book Test (CBT) and Who-Did-What (WDW) news article datasets. |
Tasks | Reading Comprehension |
Published | 2016-10-20 |
URL | http://arxiv.org/abs/1610.06454v2 |
http://arxiv.org/pdf/1610.06454v2.pdf | |
PWC | https://paperswithcode.com/paper/reasoning-with-memory-augmented-neural |
Repo | |
Framework | |
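The core mechanism, repeatedly reading from a memory of the document and refining a hypothesis vector, can be sketched as a small attention loop. This is a simplification: the paper's Neural Semantic Encoder also writes back to its memory, and answer candidates are scored against the final state; the sizes and the GRU refinement cell here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HypothesisLoop(nn.Module):
    """Refine a hypothesis vector by repeatedly attending over document memory."""
    def __init__(self, d=128, steps=3):
        super().__init__()
        self.steps = steps
        self.refine = nn.GRUCell(d, d)

    def forward(self, query, memory):            # query: (B, d), memory: (B, T, d)
        h = query                                # initial hypothesis = the question
        for _ in range(self.steps):
            scores = memory @ h.unsqueeze(2)                       # (B, T, 1)
            read = (torch.softmax(scores, dim=1) * memory).sum(dim=1)
            h = self.refine(read, h)             # revised hypothesis from what was read
        return h                                 # score answer candidates against this
```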
What do different evaluation metrics tell us about saliency models?
Title | What do different evaluation metrics tell us about saliency models? |
Authors | Zoya Bylinskii, Tilke Judd, Aude Oliva, Antonio Torralba, Frédo Durand |
Abstract | How best to evaluate a saliency model’s ability to predict where humans look in images is an open research question. The choice of evaluation metric depends on how saliency is defined and how the ground truth is represented. Metrics differ in how they rank saliency models, and this results from how false positives and false negatives are treated, whether viewing biases are accounted for, whether spatial deviations are factored in, and how the saliency maps are pre-processed. In this paper, we provide an analysis of 8 different evaluation metrics and their properties. With the help of systematic experiments and visualizations of metric computations, we add interpretability to saliency scores and more transparency to the evaluation of saliency models. Building off the differences in metric properties and behaviors, we make recommendations for metric selections under specific assumptions and for specific applications. |
Tasks | |
Published | 2016-04-12 |
URL | http://arxiv.org/abs/1604.03605v2 |
http://arxiv.org/pdf/1604.03605v2.pdf | |
PWC | https://paperswithcode.com/paper/what-do-different-evaluation-metrics-tell-us |
Repo | |
Framework | |
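Two of the metrics the paper analyzes make its central distinction concrete: NSS is location-based (it evaluates z-scored saliency at discrete fixation points), while CC is distribution-based (it correlates the saliency map with a continuous fixation map). Minimal NumPy versions, assuming the usual definitions:

```python
import numpy as np

def nss(saliency, fixations):
    """Normalized Scanpath Saliency: mean z-scored saliency at fixated pixels.
    fixations: binary map, 1 where a human fixation landed."""
    s = (saliency - saliency.mean()) / (saliency.std() + 1e-8)
    return float(s[fixations.astype(bool)].mean())

def cc(saliency, fixation_map):
    """Pearson correlation with a continuous (e.g., Gaussian-blurred) fixation map."""
    a = saliency.ravel() - saliency.mean()
    b = fixation_map.ravel() - fixation_map.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
```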
Comparative Deep Learning of Hybrid Representations for Image Recommendations
Title | Comparative Deep Learning of Hybrid Representations for Image Recommendations |
Authors | Chenyi Lei, Dong Liu, Weiping Li, Zheng-Jun Zha, Houqiang Li |
Abstract | In many image-related tasks, learning expressive and discriminative representations of images is essential, and deep learning has been studied for automating the learning of such representations. Some user-centric tasks, such as image recommendations, call for effective representations of not only images but also the preferences and intents of users over images. Such representations are termed *hybrid* and addressed via a deep learning approach in this paper. We design a dual-net deep network, in which the two sub-networks map input images and preferences of users into the same latent semantic space, and the distances between images and users in the latent space are then calculated to make decisions. We further propose a comparative deep learning (CDL) method to train the deep network, using a pair of images compared against one user to learn the pattern of their relative distances. The CDL embraces much more training data than naive deep learning, and thus achieves superior performance, at no cost of increased network complexity. Experimental results with real-world data sets for image recommendations show that the proposed dual-net network and CDL greatly outperform other state-of-the-art image recommendation solutions. |
Tasks | |
Published | 2016-04-05 |
URL | http://arxiv.org/abs/1604.01252v1 |
http://arxiv.org/pdf/1604.01252v1.pdf | |
PWC | https://paperswithcode.com/paper/comparative-deep-learning-of-hybrid |
Repo | |
Framework | |
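The comparative training signal, "a pair of images compared against one user", is naturally expressed as a hinge loss on relative distances in the shared latent space. A hedged sketch follows; the margin form is a standard choice for this kind of comparison, not necessarily the paper's exact objective.

```python
import torch

def comparative_loss(user_vec, img_pos, img_neg, margin=1.0):
    """Hinge on relative distances: the positively rated image should sit
    closer to the user than the other image in the shared latent space."""
    d_pos = (user_vec - img_pos).pow(2).sum(dim=1)
    d_neg = (user_vec - img_neg).pow(2).sum(dim=1)
    return torch.clamp(margin + d_pos - d_neg, min=0).mean()
```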
Mapping Temporal Variables into the NeuCube for Improved Pattern Recognition, Predictive Modelling and Understanding of Stream Data
Title | Mapping Temporal Variables into the NeuCube for Improved Pattern Recognition, Predictive Modelling and Understanding of Stream Data |
Authors | Enmei Tu, Nikola Kasabov, Jie Yang |
Abstract | This paper proposes a new method for an optimized mapping of temporal variables, describing temporal stream data, into the recently proposed NeuCube spiking neural network architecture. This optimized mapping extends the use of the NeuCube, which was initially designed for spatiotemporal brain data, to work on arbitrary stream data and to achieve better accuracy of temporal pattern recognition, better and earlier event prediction, and a better understanding of complex temporal stream data through visualization of the NeuCube connectivity. The effect of the new mapping is demonstrated on three benchmark problems. The first is early prediction of patient sleep stage events from temporal physiological data. The second is pattern recognition of dynamic temporal patterns of traffic in the Bay Area of California, and the last is the Challenge 2012 contest data set. In all cases the use of the proposed mapping leads to improved accuracy of pattern recognition and event prediction and a better understanding of the data when compared to traditional machine learning techniques or spiking neural network reservoirs with arbitrary mapping of the variables. |
Tasks | |
Published | 2016-03-17 |
URL | http://arxiv.org/abs/1603.05594v1 |
http://arxiv.org/pdf/1603.05594v1.pdf | |
PWC | https://paperswithcode.com/paper/mapping-temporal-variables-into-the-neucube |
Repo | |
Framework | |
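Mapping input variables onto network coordinates so that temporally similar variables land on nearby neurons is, at heart, a quadratic assignment problem. The sketch below solves it with SciPy's approximate QAP solver; this is one interpretation of the mapping idea, under the assumption of one input-neuron site per variable, and not the paper's own optimization procedure.

```python
import numpy as np
from scipy.optimize import quadratic_assignment

def map_variables_to_neurons(signals, coords):
    """signals: (n_vars, T) time series; coords: (n_vars, 3) candidate
    input-neuron positions (one site per variable in this sketch)."""
    sim = np.corrcoef(signals)                                   # temporal similarity
    dist = np.linalg.norm(coords[:, None] - coords[None], axis=-1)
    # Minimize sum_ij sim[i, j] * dist[p(i), p(j)], so that highly
    # correlated variables are assigned to nearby neuron sites.
    res = quadratic_assignment(sim, dist)
    return res.col_ind                                           # site index per variable
```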
Hypothesis Transfer Learning via Transformation Functions
Title | Hypothesis Transfer Learning via Transformation Functions |
Authors | Simon Shaolei Du, Jayanth Koushik, Aarti Singh, Barnabas Poczos |
Abstract | We consider the Hypothesis Transfer Learning (HTL) problem, where one incorporates a hypothesis trained on the source domain into the learning procedure of the target domain. Existing theoretical analyses either only study specific algorithms or only present upper bounds on the generalization error but not on the excess risk. In this paper, we propose a unified algorithm-dependent framework for HTL through a novel notion of transformation function, which characterizes the relation between the source and the target domains. We conduct a general risk analysis of this framework and, in particular, we show for the first time that, if the two domains are related, HTL enjoys faster convergence rates of excess risk for Kernel Smoothing and Kernel Ridge Regression than those of the classical non-transfer learning settings. Experiments on real-world data demonstrate the effectiveness of our framework. |
Tasks | Transfer Learning |
Published | 2016-12-03 |
URL | http://arxiv.org/abs/1612.01020v4 |
http://arxiv.org/pdf/1612.01020v4.pdf | |
PWC | https://paperswithcode.com/paper/hypothesis-transfer-learning-via |
Repo | |
Framework | |
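The simplest transformation function in this family is an additive offset: keep the source hypothesis and learn only the residual on the (small) target sample. Here is a sketch using scikit-learn's kernel ridge regression; this is one special case of the framework the paper unifies, and the RBF kernel and regularization strength are illustrative assumptions.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def offset_transfer(source_model, X_tgt, y_tgt, alpha=1.0):
    """Hypothesis transfer with an additive-offset transformation function:
    fit kernel ridge regression to the residuals of the source hypothesis
    on the target sample, then predict as source + offset."""
    residual = y_tgt - source_model.predict(X_tgt)
    offset = KernelRidge(kernel="rbf", alpha=alpha).fit(X_tgt, residual)
    return lambda X: source_model.predict(X) + offset.predict(X)
```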
A New Learning Method for Inference Accuracy, Core Occupation, and Performance Co-optimization on TrueNorth Chip
Title | A New Learning Method for Inference Accuracy, Core Occupation, and Performance Co-optimization on TrueNorth Chip |
Authors | Wei Wen, Chunpeng Wu, Yandan Wang, Kent Nixon, Qing Wu, Mark Barnell, Hai Li, Yiran Chen |
Abstract | The IBM TrueNorth chip uses digital spikes to perform neuromorphic computing and achieves ultrahigh execution parallelism and power efficiency. However, in the TrueNorth chip, the low quantization resolution of the synaptic weights and spikes significantly limits the inference (e.g., classification) accuracy of the deployed neural network model. The existing workaround, i.e., averaging the results over multiple copies instantiated in the spatial and temporal domains, rapidly exhausts the hardware resources and slows down the computation. In this work, we propose a novel learning method on the TrueNorth platform that constrains the random variance of each computation copy and reduces the number of needed copies. Compared to the existing learning method, our method can achieve up to a 68.8% reduction of the required neuro-synaptic cores or a 6.5X speedup, with even slightly improved inference accuracy. |
Tasks | Quantization |
Published | 2016-04-03 |
URL | http://arxiv.org/abs/1604.00697v3 |
http://arxiv.org/pdf/1604.00697v3.pdf | |
PWC | https://paperswithcode.com/paper/a-new-learning-method-for-inference-accuracy |
Repo | |
Framework | |
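The workaround the paper improves on is easy to sketch: with stochastic low-resolution synapses, one averages the outputs of many sampled copies to wash out quantization variance, and each copy consumes cores and time. The snippet below uses an idealized ±1 stochastic synapse model for illustration, not the chip's actual synapse semantics; the paper's contribution is a training method that constrains this variance so far fewer copies are needed.

```python
import numpy as np

def stochastic_binarize(w, rng):
    """Sample +/-1 synapses with probability tied to the real-valued weight
    (an idealized stand-in for TrueNorth's low-resolution synapses)."""
    p = np.clip((w + 1.0) / 2.0, 0.0, 1.0)
    return np.where(rng.random(w.shape) < p, 1.0, -1.0)

def average_over_copies(w, x, n_copies, seed=0):
    """The costly workaround: average the outputs of many sampled network
    copies to suppress quantization variance."""
    rng = np.random.default_rng(seed)
    return np.mean([x @ stochastic_binarize(w, rng) for _ in range(n_copies)], axis=0)
```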
Incremental Semiparametric Inverse Dynamics Learning
Title | Incremental Semiparametric Inverse Dynamics Learning |
Authors | Raffaello Camoriano, Silvio Traversaro, Lorenzo Rosasco, Giorgio Metta, Francesco Nori |
Abstract | This paper presents a novel approach for incremental semiparametric inverse dynamics learning. In particular, we consider a mixture of two approaches: parametric modeling based on rigid body dynamics equations and nonparametric modeling based on incremental kernel methods, with no prior information on the mechanical properties of the system. This yields an incremental semiparametric approach, leveraging the advantages of both the parametric and nonparametric models. We validate the proposed technique by learning the dynamics of one arm of the iCub humanoid robot. |
Tasks | |
Published | 2016-01-18 |
URL | http://arxiv.org/abs/1601.04549v1 |
http://arxiv.org/pdf/1601.04549v1.pdf | |
PWC | https://paperswithcode.com/paper/incremental-semiparametric-inverse-dynamics |
Repo | |
Framework | |
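A minimal single-joint sketch of the semiparametric idea: a parametric rigid-body term (a regressor linear in the inertial parameters) plus an incremental nonparametric residual model. Random-feature SGD stands in here for the paper's incremental kernel method, and the `regressor(q, dq, ddq)` function is a hypothetical hook assumed to be supplied by the robot's dynamics model.

```python
import numpy as np

class SemiparametricID:
    """Parametric rigid-body term + incremental nonparametric residual
    (single joint for brevity)."""
    def __init__(self, regressor, n_params, d_in, n_feat=300, lr=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        self.regressor = regressor                  # (q, dq, ddq) -> (n_params,)
        self.pi = np.zeros(n_params)                # inertial parameters (learned)
        self.W = rng.normal(size=(n_feat, d_in))    # random Fourier features
        self.b = rng.uniform(0.0, 2.0 * np.pi, n_feat)
        self.w_np = np.zeros(n_feat)                # nonparametric weights
        self.lr = lr

    def _feat(self, q, dq, ddq):
        x = np.concatenate([q, dq, ddq])
        return np.sqrt(2.0 / len(self.b)) * np.cos(self.W @ x + self.b)

    def predict(self, q, dq, ddq):
        return (self.regressor(q, dq, ddq) @ self.pi
                + self._feat(q, dq, ddq) @ self.w_np)

    def update(self, q, dq, ddq, tau):              # one incremental sample
        err = tau - self.predict(q, dq, ddq)
        self.pi += self.lr * err * self.regressor(q, dq, ddq)
        self.w_np += self.lr * err * self._feat(q, dq, ddq)
```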
Deep Recurrent Convolutional Neural Network: Improving Performance For Speech Recognition
Title | Deep Recurrent Convolutional Neural Network: Improving Performance For Speech Recognition |
Authors | Zewang Zhang, Zheng Sun, Jiaqi Liu, Jingwen Chen, Zhao Huo, Xiao Zhang |
Abstract | Deep learning approaches have been widely applied to sequence modeling problems. In automatic speech recognition (ASR), performance has been significantly improved by larger speech corpora and deeper neural networks; in particular, recurrent neural networks and deep convolutional neural networks have been applied to ASR successfully. Given the resulting problem of training speed, we build a novel deep recurrent convolutional network for acoustic modeling and then apply deep residual learning to it. Our experiments show that it has not only faster convergence but also better recognition accuracy than a traditional deep convolutional recurrent network. In the experiments, we compare the convergence speed of our novel deep recurrent convolutional networks and traditional deep convolutional recurrent networks. With faster convergence, our novel deep recurrent convolutional networks reach comparable performance. We further show that applying deep residual learning can boost the convergence speed of our novel deep recurrent convolutional networks. Finally, we evaluate all our experimental networks by phoneme error rate (PER) with our proposed bidirectional statistical n-gram language model. Our evaluation results show that our newly proposed deep recurrent convolutional network with deep residual learning reaches the best PER of 17.33%, with the fastest convergence speed, on the TIMIT database. The strong performance of our novel deep recurrent convolutional neural network with deep residual learning indicates that it can potentially be adopted for other sequential problems. |
Tasks | Language Modelling, Speech Recognition |
Published | 2016-11-22 |
URL | http://arxiv.org/abs/1611.07174v2 |
http://arxiv.org/pdf/1611.07174v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-recurrent-convolutional-neural-network |
Repo | |
Framework | |
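A compact PyTorch sketch of the combination the abstract describes: a convolutional front-end with a residual shortcut feeding a recurrent layer that emits per-frame phoneme logits. The topology and sizes are illustrative assumptions, not the paper's exact network.

```python
import torch
import torch.nn as nn

class ResidualConvRecurrent(nn.Module):
    """Conv front-end with a residual shortcut, then a bidirectional RNN."""
    def __init__(self, n_mels=40, conv=64, hidden=128, n_phones=48):
        super().__init__()
        self.inp = nn.Conv1d(n_mels, conv, 3, padding=1)
        self.block = nn.Sequential(
            nn.Conv1d(conv, conv, 3, padding=1), nn.ReLU(),
            nn.Conv1d(conv, conv, 3, padding=1),
        )
        self.rnn = nn.GRU(conv, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_phones)

    def forward(self, x):                    # x: (batch, time, n_mels)
        h = torch.relu(self.inp(x.transpose(1, 2)))
        h = torch.relu(h + self.block(h))    # residual shortcut around the convs
        h, _ = self.rnn(h.transpose(1, 2))
        return self.out(h)                   # per-frame phoneme logits
```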
Skill-Based Differences in Spatio-Temporal Team Behavior in Defence of The Ancients 2
Title | Skill-Based Differences in Spatio-Temporal Team Behavior in Defence of The Ancients 2 |
Authors | Anders Drachen, Matthew Yancey, John Maguire, Derrek Chu, Iris Yuhui Wang, Tobias Mahlmann, Matthias Schubert, Diego Klabjan |
Abstract | Multiplayer Online Battle Arena (MOBA) games are among the most played digital games in the world. In these games, teams of players fight against each other in arena environments, and the gameplay is focused on tactical combat. Mastering MOBAs requires extensive practice, as is exemplified in the popular MOBA Defence of the Ancients 2 (DotA 2). In this paper, we present three data-driven measures of spatio-temporal behavior in DotA 2: 1) zone changes; 2) distribution of team members; and 3) time series clustering via a fuzzy approach. We present a method for obtaining accurate positional data from DotA 2. We investigate how behavior varies across these measures as a function of the skill level of teams, using four tiers from novice to professional players. Results indicate that the spatio-temporal behavior of MOBA teams is related to team skill, with professional teams having smaller within-team distances and conducting more zone changes than amateur teams. The temporal distribution of the within-team distances of professional and high-skilled teams also generally follows patterns distinct from those of lower skill ranks. |
Tasks | Dota 2, Time Series, Time Series Clustering |
Published | 2016-03-24 |
URL | http://arxiv.org/abs/1603.07738v1 |
http://arxiv.org/pdf/1603.07738v1.pdf | |
PWC | https://paperswithcode.com/paper/skill-based-differences-in-spatio-temporal |
Repo | |
Framework | |
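The first two measures are straightforward to compute from positional data. Assuming a (ticks, 5, 2) array of team-member positions and an integer zone label per tick (the zone partition itself is defined in the paper), minimal NumPy versions:

```python
import numpy as np

def within_team_distance(pos):
    """Mean pairwise distance among team members at each tick.
    pos: (ticks, 5, 2) array of (x, y) map positions."""
    diff = pos[:, :, None, :] - pos[:, None, :, :]
    d = np.linalg.norm(diff, axis=-1)                    # (ticks, 5, 5)
    i, j = np.triu_indices(pos.shape[1], k=1)
    return d[:, i, j].mean(axis=1)                       # (ticks,)

def zone_changes(zone_ids):
    """Count transitions between map zones along one trajectory.
    zone_ids: (ticks,) integer zone label per tick."""
    return int(np.count_nonzero(np.diff(zone_ids)))
```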
Coin Betting and Parameter-Free Online Learning
Title | Coin Betting and Parameter-Free Online Learning |
Authors | Francesco Orabona, Dávid Pál |
Abstract | In recent years, a number of parameter-free algorithms have been developed for online linear optimization over Hilbert spaces and for learning with expert advice. These algorithms achieve optimal regret bounds that depend on the unknown competitors, without having to tune learning rates with oracle choices. We present a new intuitive framework for designing parameter-free algorithms for *both* online linear optimization over Hilbert spaces and learning with expert advice, based on reductions to betting on outcomes of adversarial coins. We instantiate it using a betting algorithm based on the Krichevsky-Trofimov estimator. The resulting algorithms are simple, with no parameters to be tuned, and they improve or match previous results in terms of regret guarantees and per-round complexity. |
Tasks | |
Published | 2016-02-12 |
URL | http://arxiv.org/abs/1602.04128v4 |
http://arxiv.org/pdf/1602.04128v4.pdf | |
PWC | https://paperswithcode.com/paper/coin-betting-and-parameter-free-online |
Repo | |
Framework | |
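The one-dimensional instantiation of the reduction is short enough to write out: the learner bets a signed fraction of its current wealth, the Krichevsky-Trofimov estimator supplies the betting fraction, and the bet itself is the prediction. A sketch assuming gradients bounded in [−1, 1]; the Hilbert-space and expert-advice versions in the paper build on this core.

```python
def kt_coin_betting(gradients, initial_wealth=1.0):
    """Parameter-free 1D online learner via Krichevsky-Trofimov betting.

    gradients: iterable of g_t in [-1, 1]; yields predictions w_t.
    The coin outcome is c_t = -g_t, and the KT betting fraction before
    round t is (sum of past coins) / t.
    """
    wealth, coin_sum = initial_wealth, 0.0
    for t, g in enumerate(gradients, start=1):
        beta = coin_sum / t          # KT betting fraction (past coins only)
        w = beta * wealth            # the bet, which is also the prediction
        yield w
        c = -g                       # adversarial coin outcome
        wealth += c * w              # win or lose proportionally to the bet
        coin_sum += c
```

Note there is no learning rate anywhere: the bet size adapts automatically through the wealth, which is the point of the reduction.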
Total variation reconstruction for compressive sensing using nonlocal Lagrangian multiplier
Title | Total variation reconstruction for compressive sensing using nonlocal Lagrangian multiplier |
Authors | Trinh Van Chien, Khanh Quoc Dinh, Viet Anh Nguyen, Byeungwoo Jeon |
Abstract | Total variation has proved its effectiveness in solving inverse problems for compressive sensing. In addition, the nonlocal means filter, used as regularization, preserves texture better in recovered images, but it is quite complex to implement. In this paper, based on the existence of both noise and image information in the Lagrangian multiplier, we propose a method that is simple in terms of implementation, called the nonlocal Lagrangian multiplier (NLLM), in order to reduce noise and boost useful image information. Experimental results show that the proposed NLLM is superior in both subjective and objective quality of the recovered image compared with other recovery algorithms. |
Tasks | Compressive Sensing |
Published | 2016-08-28 |
URL | http://arxiv.org/abs/1608.07813v1 |
http://arxiv.org/pdf/1608.07813v1.pdf | |
PWC | https://paperswithcode.com/paper/total-variation-reconstruction-for |
Repo | |
Framework | |
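The NLLM step (nonlocal-means filtering of the Lagrangian multiplier) sits on top of a standard total-variation reconstruction loop. Below is a hedged sketch of that baseline only, proximal gradient with a TV denoising step via scikit-image, so the quantities the paper filters have a concrete home; the step size, weight, and iteration count are illustrative.

```python
import numpy as np
from skimage.restoration import denoise_tv_chambolle

def tv_cs_baseline(y, A, shape, n_iter=100, step=None, weight=0.1):
    """Proximal-gradient TV reconstruction for y = A x (x is an image).

    step must satisfy step <= 1 / ||A||_2^2 for convergence; the paper's
    NLLM method additionally filters the Lagrangian multiplier with a
    nonlocal means filter, which this baseline omits.
    """
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(shape)
    for _ in range(n_iter):
        grad = A.T @ (A @ x.ravel() - y)            # data-fidelity gradient
        x = x - step * grad.reshape(shape)
        x = denoise_tv_chambolle(x, weight=weight)  # TV proximal step
    return x
```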