Paper Group ANR 1466
On the Effect of Word Order on Cross-lingual Sentiment Analysis
Title | On the Effect of Word Order on Cross-lingual Sentiment Analysis |
Authors | Àlex R. Atrio, Toni Badia, Jeremy Barnes |
Abstract | Current state-of-the-art models for sentiment analysis make use of word order either explicitly by pre-training on a language modeling objective or implicitly by using recurrent neural networks (RNNs) or convolutional networks (CNNs). This is a problem for cross-lingual models that use bilingual embeddings as features, as the difference in word order between source and target languages is not resolved. In this work, we explore reordering as a pre-processing step for sentence-level cross-lingual sentiment classification with two language combinations (English-Spanish, English-Catalan). We find that while reordering helps both models, CNNs are more sensitive to local reorderings, while global reordering benefits RNNs. |
Tasks | Language Modelling, Sentiment Analysis |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05889v1 |
https://arxiv.org/pdf/1906.05889v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-effect-of-word-order-on-cross-lingual |
Repo | |
Framework | |
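As a rough illustration of reordering as a pre-processing step for the entry above, the Python sketch below applies a single hypothetical rule (swapping English adjective-noun pairs toward Spanish/Catalan noun-adjective order); the paper's actual reordering rules are not reproduced here.

```python
# Hypothetical sketch: local reordering of source-language tokens before
# feeding a cross-lingual sentiment classifier. Only one illustrative rule
# is applied: adjective-noun pairs are swapped toward noun-adjective order.

def local_reorder(tokens, pos_tags):
    """Swap each (ADJ, NOUN) pair into (NOUN, ADJ) order."""
    tokens = list(tokens)
    i = 0
    while i < len(tokens) - 1:
        if pos_tags[i] == "ADJ" and pos_tags[i + 1] == "NOUN":
            tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
            i += 2  # skip past the swapped pair
        else:
            i += 1
    return tokens

# Example usage
tokens = ["a", "wonderful", "movie", "with", "terrible", "sound"]
pos = ["DET", "ADJ", "NOUN", "ADP", "ADJ", "NOUN"]
print(local_reorder(tokens, pos))
# ['a', 'movie', 'wonderful', 'with', 'sound', 'terrible']
```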
Design Light-weight 3D Convolutional Networks for Video Recognition Temporal Residual, Fully Separable Block, and Fast Algorithm
Title | Design Light-weight 3D Convolutional Networks for Video Recognition Temporal Residual, Fully Separable Block, and Fast Algorithm |
Authors | Haonan Wang, Jun Lin, Zhongfeng Wang |
Abstract | Deep 3-dimensional (3D) Convolutional Network (ConvNet) has shown promising performance on video recognition tasks because of its powerful spatio-temporal information fusion ability. However, the extremely intensive requirements on memory access and computing power prohibit it from being used in resource-constrained scenarios, such as portable and edge devices. So in this paper, we first propose a two-stage Fully Separable Block (FSB) to significantly compress the model sizes of 3D ConvNets. Then a feature enhancement approach named Temporal Residual Gradient (TRG) is developed to improve the performance of the compressed model on video tasks, which provides higher accuracy, faster convergence and better robustness. Moreover, in order to further decrease the computing workload, we propose a hybrid Fast Algorithm (hFA) to drastically reduce the computation complexity of convolutions. These methods are effectively combined to design a light-weight and efficient ConvNet for video recognition tasks. Experiments on the popular dataset report a 2.3x compression rate, 3.6x workload reduction, and 6.3% top-1 accuracy gain over the state-of-the-art SlowFast model, which is already a highly compact model. The proposed methods also show good adaptability on a traditional 3D ConvNet, demonstrating a 7.4x more compact model, 11.0x less workload, and 3.0% higher accuracy. |
Tasks | Video Recognition |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13388v1 |
https://arxiv.org/pdf/1905.13388v1.pdf | |
PWC | https://paperswithcode.com/paper/design-light-weight-3d-convolutional-networks |
Repo | |
Framework | |
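The two-stage Fully Separable Block is not specified in the abstract above; the PyTorch sketch below shows one plausible reading, factorizing a dense 3D convolution into a depthwise temporal convolution, a depthwise spatial convolution, and a pointwise 1x1x1 convolution. The module names and the exact factorization are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class FullySeparableBlock3D(nn.Module):
    """Hypothetical fully separable 3D block: depthwise temporal conv,
    depthwise spatial conv, then a pointwise 1x1x1 conv to mix channels."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.temporal = nn.Conv3d(in_ch, in_ch, kernel_size=(3, 1, 1),
                                  padding=(1, 0, 0), groups=in_ch, bias=False)
        self.spatial = nn.Conv3d(in_ch, in_ch, kernel_size=(1, 3, 3),
                                 padding=(0, 1, 1), groups=in_ch, bias=False)
        self.pointwise = nn.Conv3d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm3d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):  # x: (batch, channels, time, height, width)
        x = self.temporal(x)
        x = self.spatial(x)
        x = self.pointwise(x)
        return self.act(self.bn(x))

# The depthwise + pointwise factorization uses far fewer parameters than a
# dense k x k x k kernel, which is the general source of the compression.
block = FullySeparableBlock3D(16, 32)
out = block(torch.randn(2, 16, 8, 32, 32))  # -> (2, 32, 8, 32, 32)
```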
Decomposing information into copying versus transformation
Title | Decomposing information into copying versus transformation |
Authors | Artemy Kolchinsky, Bernat Corominas-Murtra |
Abstract | In many real-world systems, information can be transmitted in two qualitatively different ways: by copying or by transformation. Copying occurs when messages are transmitted without modification, e.g., when an offspring receives an unaltered copy of a gene from its parent. Transformation occurs when messages are modified systematically during transmission, e.g., when mutational biases occur during genetic replication. Standard information-theoretic measures do not distinguish these two modes of information transfer, although they may reflect different mechanisms and have different functional consequences. Starting from a few simple axioms, we derive a decomposition of mutual information into the information transmitted by copying versus the information transmitted by transformation. We begin with a decomposition that applies when the source and destination of the channel have the same set of messages and a notion of message identity exists. We then generalize our decomposition to other kinds of channels, which can involve different source and destination sets and broader notions of similarity. In addition, we show that copy information can be interpreted as the minimal work needed by a physical copying process, which is relevant for understanding the physics of replication. We use the proposed decomposition to explore a model of amino acid substitution rates. Our results apply to any system in which the fidelity of copying, rather than simple predictability, is of critical relevance. |
Tasks | |
Published | 2019-03-21 |
URL | https://arxiv.org/abs/1903.10693v3 |
https://arxiv.org/pdf/1903.10693v3.pdf | |
PWC | https://paperswithcode.com/paper/decomposing-information-into-copying-versus |
Repo | |
Framework | |
Single Training Dimension Selection for Word Embedding with PCA
Title | Single Training Dimension Selection for Word Embedding with PCA |
Authors | Yu Wang |
Abstract | In this paper, we present a fast and reliable method based on PCA to select the number of dimensions for word embeddings. First, we train one embedding with a generous upper bound (e.g. 1,000) of dimensions. Then we transform the embeddings using PCA and incrementally remove the lesser dimensions one at a time while recording the embeddings’ performance on language tasks. Lastly, we select the number of dimensions while balancing model size and accuracy. Experiments using various datasets and language tasks demonstrate that we are able to train 10 times fewer sets of embeddings while retaining optimal performance. Researchers interested in training the best-performing embeddings for downstream tasks, such as sentiment analysis, question answering and hypernym extraction, as well as those interested in embedding compression should find the method helpful. |
Tasks | Question Answering, Sentiment Analysis, Word Embeddings |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1909.01761v1 |
https://arxiv.org/pdf/1909.01761v1.pdf | |
PWC | https://paperswithcode.com/paper/single-training-dimension-selection-for-word |
Repo | |
Framework | |
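A minimal sketch of the selection procedure described in the abstract above: project a single generously sized embedding once with PCA, then score progressively truncated versions. The embedding matrix and the evaluation function below are placeholders, not the paper's training setup or tasks.

```python
import numpy as np
from sklearn.decomposition import PCA

def select_embedding_dim(embeddings, evaluate, min_dim=10):
    """Train-once dimension selection via PCA truncation.

    embeddings : (vocab_size, max_dim) matrix from one generous training run
    evaluate   : callable mapping a (vocab_size, d) matrix to a task score
    """
    pca = PCA(n_components=embeddings.shape[1])
    projected = pca.fit_transform(embeddings)  # columns ordered by variance
    scores = {}
    for d in range(embeddings.shape[1], min_dim - 1, -1):
        scores[d] = evaluate(projected[:, :d])  # drop trailing dimensions
    return scores  # caller picks the dimension balancing size vs. accuracy

# Placeholder usage with a random matrix and a dummy scorer.
emb = np.random.randn(5000, 100)  # stands in for a 1,000-dimension embedding
scores = select_embedding_dim(emb, evaluate=lambda m: float(m.var()))
```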
The Communication Complexity of Optimization
Title | The Communication Complexity of Optimization |
Authors | Santosh S. Vempala, Ruosong Wang, David P. Woodruff |
Abstract | We consider the communication complexity of a number of distributed optimization problems. We start with the problem of solving a linear system. Suppose there is a coordinator together with $s$ servers $P_1, \ldots, P_s$, the $i$-th of which holds a subset $A^{(i)} x = b^{(i)}$ of $n_i$ constraints of a linear system in $d$ variables, and the coordinator would like to output $x \in \mathbb{R}^d$ for which $A^{(i)} x = b^{(i)}$ for $i = 1, \ldots, s$. We assume each coefficient of each constraint is specified using $L$ bits. We first resolve the randomized and deterministic communication complexity in the point-to-point model of communication, showing it is $\tilde{\Theta}(d^2L + sd)$ and $\tilde{\Theta}(sd^2L)$, respectively. We obtain similar results for the blackboard model. When there is no solution to the linear system, a natural alternative is to find the solution minimizing the $\ell_p$ loss. While this problem has been studied, we give improved upper or lower bounds for every value of $p \ge 1$. One takeaway message is that sampling and sketching techniques, which are commonly used in earlier work on distributed optimization, are neither optimal in the dependence on $d$ nor on the dependence on the approximation $\epsilon$, thus motivating new techniques from optimization to solve these problems. Towards this end, we consider the communication complexity of optimization tasks which generalize linear systems. For linear programming, we first resolve the communication complexity when $d$ is constant, showing it is $\tilde{\Theta}(sL)$ in the point-to-point model. For general $d$ and in the point-to-point model, we show an $\tilde{O}(sd^3 L)$ upper bound and an $\tilde{\Omega}(d^2 L + sd)$ lower bound. We also show if one perturbs the coefficients randomly by numbers as small as $2^{-\Theta(L)}$, then the upper bound is $\tilde{O}(sd^2 L) + \textrm{poly}(dL)$. |
Tasks | Distributed Optimization |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05832v2 |
https://arxiv.org/pdf/1906.05832v2.pdf | |
PWC | https://paperswithcode.com/paper/the-communication-complexity-of-optimization |
Repo | |
Framework | |
Joint Entity Linking with Deep Reinforcement Learning
Title | Joint Entity Linking with Deep Reinforcement Learning |
Authors | Zheng Fang, Yanan Cao, Dongjie Zhang, Qian Li, Zhenyu Zhang, Yanbing Liu |
Abstract | Entity linking is the task of aligning mentions to corresponding entities in a given knowledge base. Previous studies have highlighted the necessity for entity linking systems to capture global coherence. However, there are two common weaknesses in previous global models. First, most of them calculate the pairwise scores between all candidate entities and select the most relevant group of entities as the final result. In this process, the consistency among wrong entities as well as that among right ones is involved, which may introduce noisy data and increase the model complexity. Second, the cues of previously disambiguated entities, which could contribute to the disambiguation of subsequent mentions, are usually ignored by previous models. To address these problems, we convert global linking into a sequential decision problem and propose a reinforcement learning model which makes decisions from a global perspective. Our model makes full use of previously referred entities and explores the long-term influence of the current selection on subsequent decisions. We conduct experiments on different types of datasets, and the results show that our model outperforms state-of-the-art systems and has better generalization performance. |
Tasks | Entity Linking |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.00330v1 |
http://arxiv.org/pdf/1902.00330v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-entity-linking-with-deep-reinforcement |
Repo | |
Framework | |
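A schematic of the sequential-decision view of global linking described above: each mention is resolved in document order, conditioning on the entities already linked. The candidate scorer, the exploration scheme, and the state representation are hypothetical stand-ins; the paper's policy network and training procedure are not reproduced.

```python
import random

def link_document(mentions, candidates, score, epsilon=0.1):
    """Greedy sequential linking sketch with epsilon-greedy exploration.

    mentions   : list of mention strings, in document order
    candidates : dict mention -> list of candidate entity ids
    score      : callable (mention, candidate, linked_so_far) -> float
    """
    linked = []
    for mention in mentions:
        cands = candidates[mention]
        if random.random() < epsilon:  # exploration during training episodes
            choice = random.choice(cands)
        else:                          # exploit the current scorer
            choice = max(cands, key=lambda c: score(mention, c, linked))
        linked.append((mention, choice))
    return linked
```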
Learning Deep Multi-Level Similarity for Thermal Infrared Object Tracking
Title | Learning Deep Multi-Level Similarity for Thermal Infrared Object Tracking |
Authors | Qiao Liu, Xin Li, Zhenyu He, Nana Fan, Di Yuan, Hongpeng Wang |
Abstract | Existing deep Thermal InfraRed (TIR) trackers only use semantic features to describe the TIR object, which lack sufficient discriminative capacity for handling distractors. This becomes worse when the feature extraction network is only trained on RGB images. To address this issue, we propose a multi-level similarity model under a Siamese framework for robust TIR object tracking. Specifically, we compute different pattern similarities on two convolutional layers using the proposed multi-level similarity network. One of them focuses on the global semantic similarity and the other computes the local structural similarity of the TIR object. These two similarities complement each other and hence enhance the discriminative capacity of the network for handling distractors. In addition, we design a simple yet effective relative-entropy-based ensemble subnetwork to integrate the semantic and structural similarities. This subnetwork can adaptively learn the weights of the semantic and structural similarities at the training stage. To further enhance the discriminative capacity of the tracker, we construct the first large-scale TIR video sequence dataset for training the proposed model. The proposed TIR dataset not only benefits the training for TIR tracking but also can be applied to numerous TIR vision tasks. Extensive experimental results on the VOT-TIR2015 and VOT-TIR2017 benchmarks demonstrate that the proposed algorithm performs favorably against the state-of-the-art methods. |
Tasks | Object Tracking, Semantic Similarity, Thermal Infrared Object Tracking |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.03568v1 |
https://arxiv.org/pdf/1906.03568v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-deep-multi-level-similarity-for |
Repo | |
Framework | |
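A compact sketch of fusing two cross-correlation response maps from different convolutional levels, in the spirit of the multi-level similarity described above. The backbone, the chosen layers, and the fixed mixing weights are assumptions; the paper learns the ensemble weights with a relative-entropy-based subnetwork.

```python
import torch
import torch.nn.functional as F

def multilevel_response(template_feats, search_feats, weights=(0.5, 0.5)):
    """Fuse similarity maps computed at two feature levels.

    template_feats, search_feats : lists of two tensors, one per level,
        each shaped (1, C, H, W); the template acts as the correlation kernel.
    """
    responses = []
    for t, s in zip(template_feats, search_feats):
        responses.append(F.conv2d(s, t))  # cross-correlation, as in SiamFC
    # Resize the second map to match the first before the weighted sum.
    target = responses[0].shape[-2:]
    responses[1] = F.interpolate(responses[1], size=target, mode="bilinear",
                                 align_corners=False)
    return weights[0] * responses[0] + weights[1] * responses[1]
```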
Un duel probabiliste pour départager deux présidents (LIA @ DEFT’2005)
Title | Un duel probabiliste pour départager deux présidents (LIA @ DEFT’2005) |
Authors | Marc El-Bèze, Juan-Manuel Torres-Moreno, Frédéric Béchet |
Abstract | We present a set of probabilistic models applied to binary classification as defined in the DEFT’05 challenge. The challenge consisted of a mixture of two different problems in Natural Language Processing: author identification (a sequence of François Mitterrand’s sentences might have been inserted into a speech of Jacques Chirac) and thematic break detection (the subjects addressed by the two authors are supposed to be different). Markov chains, Bayes models and an adaptive process have been used to identify the paternity of these sequences. A probabilistic model of the internal coherence of speeches has been employed to identify thematic breaks. Adding this model has been shown to improve the quality of the results. A comparison with different approaches demonstrates the superiority of a strategy that combines learning, coherence and adaptation. Applied to the DEFT’05 test data, the results in terms of precision (0.890), recall (0.955) and F-score (0.925) are very promising. |
Tasks | |
Published | 2019-03-11 |
URL | http://arxiv.org/abs/1903.07397v1 |
http://arxiv.org/pdf/1903.07397v1.pdf | |
PWC | https://paperswithcode.com/paper/un-duel-probabiliste-pour-departager-deux |
Repo | |
Framework | |
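The Markov-chain scoring described above can be illustrated with a small bigram language-model classifier: one chain per author, and each sentence is attributed to the author whose chain gives it the higher log-probability. The smoothing constant, tokenization, and vocabulary size below are placeholder choices.

```python
import math
from collections import defaultdict

def train_bigram_model(sentences):
    """Count bigrams and unigram contexts per author (one model per author)."""
    bigrams, unigrams = defaultdict(int), defaultdict(int)
    for tokens in sentences:
        for a, b in zip(["<s>"] + tokens, tokens + ["</s>"]):
            bigrams[(a, b)] += 1
            unigrams[a] += 1
    return bigrams, unigrams

def log_prob(tokens, model, vocab_size=50000):
    """Add-one-smoothed log-probability of a token sequence under the chain."""
    bigrams, unigrams = model
    lp = 0.0
    for a, b in zip(["<s>"] + tokens, tokens + ["</s>"]):
        lp += math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
    return lp

def attribute(sentence, model_chirac, model_mitterrand):
    """Assign the sentence to the author whose Markov chain scores it higher."""
    t = sentence.lower().split()
    if log_prob(t, model_chirac) > log_prob(t, model_mitterrand):
        return "Chirac"
    return "Mitterrand"
```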
Heterogeneous network approach to predict individuals’ mental health
Title | Heterogeneous network approach to predict individuals’ mental health |
Authors | Shikang Liu, Fatemeh Vahedian, David Hachen, Omar Lizardo, Christian Poellabauer, Aaron Striegel, Tijana Milenkovic |
Abstract | Depression and anxiety are critical public health issues affecting millions of people around the world. To identify individuals who are vulnerable to depression and anxiety, predictive models have been built that typically utilize data from one source. Unlike these traditional models, in this study, we leverage a rich heterogeneous data set from the University of Notre Dame’s NetHealth study that collected individuals’ (student participants’) social interaction data via smartphones, health-related behavioral data via wearables (Fitbit), and trait data from surveys. To integrate the different types of information, we model the NetHealth data as a heterogeneous information network (HIN). Then, we redefine the problem of predicting individuals’ mental health conditions (depression or anxiety) in a novel manner: applying to our HIN the popular paradigm of a recommender system (RS), which is typically used to predict the preference that a person would give to an item (e.g., a movie or book). In our case, the items are the individuals’ different mental health states. We evaluate four state-of-the-art RS approaches. Also, we model the prediction of individuals’ mental health as another problem type - that of node classification (NC) in our HIN, evaluating in the process four node features under logistic regression as a proof-of-concept classifier. We find that our RS and NC network methods produce more accurate predictions than a logistic regression model using the same NetHealth data in the traditional non-network fashion, as well as a random approach. Also, we find that the best of the considered RS approaches outperforms all considered NC approaches. This is the first study to integrate smartphone, wearable sensor, and survey data in an HIN manner and use RS or NC on the HIN to predict individuals’ mental health conditions. |
Tasks | Node Classification, Recommendation Systems |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04346v2 |
https://arxiv.org/pdf/1906.04346v2.pdf | |
PWC | https://paperswithcode.com/paper/heterogeneous-network-approach-to-predict |
Repo | |
Framework | |
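A toy sketch of the node-classification (NC) variant described above, with logistic regression as the proof-of-concept classifier. The features and labels below are synthetic stand-ins for the NetHealth-derived node features; the RS approaches evaluated in the paper are not shown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-ins: one row of HIN-derived features per participant
# (e.g., social-layer degree, aggregated Fitbit activity, survey traits)
# and a binary mental-health label.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 4))
labels = (features[:, 0] + 0.5 * features[:, 2] + rng.normal(size=200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```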
Graph-aware Modeling of Brain Connectivity Networks
Title | Graph-aware Modeling of Brain Connectivity Networks |
Authors | Yura Kim, Elizaveta Levina |
Abstract | Functional connections in the brain are frequently represented by weighted networks, with nodes representing locations in the brain, and edges representing the strength of connectivity between these locations. One challenge in analyzing such data is that inference at the individual edge level is not particularly biologically meaningful; interpretation is more useful at the level of so-called functional regions, or groups of nodes and connections between them; this is often called “graph-aware” inference in the neuroimaging literature. However, pooling over functional regions leads to significant loss of information and lower accuracy. Another challenge is correlation among edge weights within a subject, which makes inference based on independence assumptions unreliable. We address both these challenges with a linear mixed effects model, which accounts for functional regions and for edge dependence, while still modeling individual edge weights to avoid loss of information. The model allows for comparing two populations, such as patients and healthy controls, both at the functional region level and at the individual edge level, leading to biologically meaningful interpretations. We fit this model to resting-state fMRI data from schizophrenic patients and healthy controls, obtaining interpretable results consistent with the schizophrenia literature. |
Tasks | |
Published | 2019-03-06 |
URL | http://arxiv.org/abs/1903.02129v2 |
http://arxiv.org/pdf/1903.02129v2.pdf | |
PWC | https://paperswithcode.com/paper/graph-aware-linear-mixed-effects-models-for |
Repo | |
Framework | |
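A minimal sketch of the graph-aware mixed-effects idea with statsmodels: a random intercept per subject accounts for within-subject correlation of edge weights, while a group-by-region interaction gives region-level contrasts without pooling away individual edges. The formula, column names, and input file are assumptions; the paper's model includes edge-dependence structure beyond this simple random intercept.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Expected long-format data: one row per (subject, edge), with the edge's
# functional-region pair, the subject's group, and the edge weight.
df = pd.read_csv("edge_weights.csv")  # columns: subject, group, region_pair, weight

# Random intercept per subject; fixed effects for group, region pair, and
# their interaction, which contrasts patients vs. controls per region pair.
model = smf.mixedlm("weight ~ group * region_pair", df, groups=df["subject"])
result = model.fit()
print(result.summary())
```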
Forecasting Drought Using Multilayer Perceptron Artificial Neural Network Model
Title | Forecasting Drought Using Multilayer Perceptron Artificial Neural Network Model |
Authors | Zulifqar Ali, Ijaz Hussain, Muhammad Faisal, Hafiza Mamona Nazir, Tajammal Hussain, Muhammad Yousaf Shad, Alaa Mohamd Shoukry, Showkat Hussain Gani |
Abstract | These days human beings are facing many environmental challenges due to frequently occurring drought hazards. These hazards may have an effect on a country’s environment, communities, and industries. Several adverse impacts of drought hazards persist in Pakistan, alongside other hazards. However, early measurement and detection of drought can provide guidance to water resources management for employing drought mitigation policies. In this paper, we used a multilayer perceptron neural network (MLPNN) algorithm for drought forecasting. We applied and tested the MLPNN algorithm on monthly time series data of the Standardized Precipitation Evapotranspiration Index (SPEI) for seventeen climatological stations located in the Northern Area and KPK (Pakistan). We found that MLPNN has potential capability for SPEI drought forecasting based on performance measures (i.e., Mean Absolute Error (MAE), the coefficient of correlation R, and Root Mean Square Error (RMSE)). Water resources and management planners can take necessary action in advance (e.g., in water scarcity areas) by using the MLPNN model as part of their decision making. |
Tasks | Decision Making, Time Series |
Published | 2019-04-17 |
URL | http://arxiv.org/abs/1904.11576v1 |
http://arxiv.org/pdf/1904.11576v1.pdf | |
PWC | https://paperswithcode.com/paper/190411576 |
Repo | |
Framework | |
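A sketch of one-step-ahead SPEI forecasting with a multilayer perceptron, in the spirit of the entry above. The lag order, network size, and the synthetic series are illustrative choices, not the paper's configuration or data.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error

def make_lagged(series, lags=12):
    """Turn a monthly SPEI series into (X, y) pairs: past lags -> next value."""
    X = np.array([series[i:i + lags] for i in range(len(series) - lags)])
    y = series[lags:]
    return X, y

spei = np.sin(np.linspace(0, 20, 400)) + 0.3 * np.random.randn(400)  # synthetic SPEI
X, y = make_lagged(spei)
split = int(0.8 * len(X))  # chronological train/test split

mlp = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
mlp.fit(X[:split], y[:split])
pred = mlp.predict(X[split:])
print("MAE:", mean_absolute_error(y[split:], pred))
```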
Time Series Anomaly Detection with Variational Autoencoders
Title | Time Series Anomaly Detection with Variational Autoencoders |
Authors | Chunkai Zhang, Yingyang Chen |
Abstract | Anomaly detection is a very worthwhile question. However, anomalies do not form a simple binary category in reality, so it is difficult to give accurate results through the comparison of similarities. There are already some GAN-based deep learning models for anomaly detection that demonstrate validity and accuracy on time series data sets. In this paper, we propose an unsupervised model-based anomaly detection method named LVEAD, which assumes that anomalies are objects that do not fit perfectly with the model. To better handle the time series, we use the LSTM model as the encoder and decoder part of the VAE model. To better distinguish normal from anomalous data, we train a re-encoder model on the latent space to generate new data. Experimental results on several benchmarks show that our method outperforms state-of-the-art anomaly detection techniques. |
Tasks | Anomaly Detection, Time Series |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.01702v1 |
https://arxiv.org/pdf/1907.01702v1.pdf | |
PWC | https://paperswithcode.com/paper/time-series-anomaly-detection-with |
Repo | |
Framework | |
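A compressed PyTorch sketch of an LSTM encoder/decoder VAE over windows of a time series, where high reconstruction error flags a window as anomalous. The re-encoder stage and the other LVEAD specifics from the abstract are omitted, and the layer sizes are placeholders.

```python
import torch
import torch.nn as nn

class LSTMVAE(nn.Module):
    """Encode a window with an LSTM, sample a latent code, decode it back."""
    def __init__(self, n_features, hidden=64, latent=16):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, latent)
        self.to_logvar = nn.Linear(hidden, latent)
        self.from_latent = nn.Linear(latent, hidden)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x):  # x: (batch, time, features)
        _, (h, _) = self.encoder(x)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        dec_in = self.from_latent(z).unsqueeze(1).repeat(1, x.size(1), 1)
        dec_out, _ = self.decoder(dec_in)
        return self.out(dec_out), mu, logvar

model = LSTMVAE(n_features=1)
x = torch.randn(8, 30, 1)                 # 8 windows of 30 time steps
recon, mu, logvar = model(x)
anomaly_score = ((recon - x) ** 2).mean(dim=(1, 2))  # per-window error
```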
Reinforcement Learning for Robotics and Control with Active Uncertainty Reduction
Title | Reinforcement Learning for Robotics and Control with Active Uncertainty Reduction |
Authors | Narendra Patwardhan, Zequn Wang |
Abstract | Model-free reinforcement learning methods such as Proximal Policy Optimization or Q-learning typically require thousands of interactions with the environment to approximate the optimum controller, which may not always be feasible in robotics due to safety and time consumption. Model-based methods such as PILCO or BlackDrops, while data-efficient, provide solutions with limited robustness and complexity. To address this tradeoff, we introduce active uncertainty reduction-based virtual environments, which are formed through limited trials conducted in the original environment. We provide an efficient method for uncertainty management, which is used as a metric for self-improvement by identification of the points with maximum expected improvement through adaptive sampling. Capturing the uncertainty also allows for better mimicking of the reward responses of the original system. Our approach enables the use of complex policy structures and reward functions through a unique combination of model-based and model-free methods, while still retaining data efficiency. We demonstrate the validity of our method on several classic reinforcement learning problems in OpenAI Gym. We prove that our approach offers a better modeling capacity for complex system dynamics as compared to established methods. |
Tasks | Q-Learning |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06274v1 |
https://arxiv.org/pdf/1905.06274v1.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-for-robotics-and |
Repo | |
Framework | |
Policies Modulating Trajectory Generators
Title | Policies Modulating Trajectory Generators |
Authors | Atil Iscen, Ken Caluwaerts, Jie Tan, Tingnan Zhang, Erwin Coumans, Vikas Sindhwani, Vincent Vanhoucke |
Abstract | We propose an architecture for learning complex controllable behaviors by having simple Policies Modulate Trajectory Generators (PMTG), a powerful combination that can provide both memory and prior knowledge to the controller. The result is a flexible architecture that is applicable to a class of problems with periodic motion for which one has an insight into the class of trajectories that might lead to a desired behavior. We illustrate the basics of our architecture using a synthetic control problem, then go on to learn speed-controlled locomotion for a quadrupedal robot by using Deep Reinforcement Learning and Evolutionary Strategies. We demonstrate that a simple linear policy, when paired with a parametric Trajectory Generator for quadrupedal gaits, can induce walking behaviors with controllable speed from 4-dimensional IMU observations alone, and can be learned in under 1000 rollouts. We also transfer these policies to a real robot and show locomotion with controllable forward velocity. |
Tasks | |
Published | 2019-10-07 |
URL | https://arxiv.org/abs/1910.02812v1 |
https://arxiv.org/pdf/1910.02812v1.pdf | |
PWC | https://paperswithcode.com/paper/policies-modulating-trajectory-generators |
Repo | |
Framework | |
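A toy numpy sketch of the PMTG idea from the entry above: a fixed sinusoidal trajectory generator produces leg targets, and a small policy modulates the generator's frequency and amplitude (plus residuals) each step. The observation, policy weights, and trajectory-generator form are placeholder choices, not the paper's gait generator.

```python
import numpy as np

class SineTrajectoryGenerator:
    """Open-loop gait prior: four phase-shifted sine waves, one per leg."""
    def __init__(self):
        self.phase = 0.0
        self.offsets = np.array([0.0, np.pi, np.pi, 0.0])  # trot-like phasing

    def step(self, frequency, amplitude, dt=0.02):
        self.phase += 2 * np.pi * frequency * dt
        return amplitude * np.sin(self.phase + self.offsets)  # 4 leg targets

def policy(observation, weights):
    """Linear policy: outputs TG modulation (frequency, amplitude) plus
    small residual corrections added on top of the TG targets."""
    out = weights @ observation
    frequency, amplitude = 1.0 + out[0], 0.1 + out[1]
    residuals = out[2:6]
    return frequency, amplitude, residuals

tg = SineTrajectoryGenerator()
weights = np.zeros((6, 4))              # would be learned (ES / deep RL)
obs = np.zeros(4)                       # e.g., a 4-dimensional IMU reading
freq, amp, res = policy(obs, weights)
leg_targets = tg.step(freq, amp) + res  # actions sent to the robot
```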
Automatic Co-Registration of Aerial Imagery and Untextured Model Data Utilizing Average Shading Gradients
Title | Automatic Co-Registration of Aerial Imagery and Untextured Model Data Utilizing Average Shading Gradients |
Authors | Sylvia Schmitz, Martin Weinmann, Boitumelo Ruf |
Abstract | The comparison of current image data with existing 3D model data of a scene provides an efficient method to keep models up to date. In order to transfer information between 2D and 3D data, a preliminary co-registration is necessary. In this paper, we present a concept to automatically co-register aerial imagery and untextured 3D model data. To refine a given initial camera pose, our algorithm computes dense correspondence fields using SIFT flow between gradient representations of the model and camera image, from which 2D-3D correspondences are obtained. These correspondences are then used in an iterative optimization scheme to refine the initial camera pose by minimizing the reprojection error. Since it is assumed that the model does not contain texture information, our algorithm is built upon an existing method based on Average Shading Gradients (ASG) to generate gradient images based on raw geometry information only. We apply our algorithm to the co-registration of aerial photographs with an untextured, noisy mesh model. We have investigated different magnitudes of input error and show that the proposed approach can reduce the final reprojection error to a minimum of 1.27 ± 0.54 pixels, which is less than 10% of its initial value. Furthermore, our evaluation shows that our approach outperforms the accuracy of a standard Iterative Closest Point (ICP) implementation. |
Tasks | |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.10882v2 |
https://arxiv.org/pdf/1906.10882v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-co-registration-of-aerial-imagery |
Repo | |
Framework | |
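A sketch of the pose-refinement step once 2D-3D correspondences are available, using OpenCV's iterative PnP solver as a stand-in for the paper's optimization scheme. The SIFT-flow matching and the Average Shading Gradients rendering are not shown.

```python
import numpy as np
import cv2

def refine_pose(points_3d, points_2d, camera_matrix, rvec_init, tvec_init):
    """Refine an initial camera pose by minimizing the reprojection error of
    2D-3D correspondences (stand-in for the paper's iterative refinement)."""
    ok, rvec, tvec = cv2.solvePnP(points_3d, points_2d, camera_matrix, None,
                                  rvec=rvec_init, tvec=tvec_init,
                                  useExtrinsicGuess=True,
                                  flags=cv2.SOLVEPNP_ITERATIVE)
    projected, _ = cv2.projectPoints(points_3d, rvec, tvec, camera_matrix, None)
    error = np.linalg.norm(projected.reshape(-1, 2) - points_2d, axis=1).mean()
    return rvec, tvec, error  # mean reprojection error in pixels
```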