January 26, 2020

3211 words 16 mins read

Paper Group ANR 1460

Modeling Interpersonal Linguistic Coordination in Conversations using Word Mover’s Distance. HRL4IN: Hierarchical Reinforcement Learning for Interactive Navigation with Mobile Manipulators. MSTDP: A More Biologically Plausible Learning. VecHGrad for Solving Accurately Complex Tensor Decomposition. Adversarial Defense Via Local Flatness Regularizati …

Modeling Interpersonal Linguistic Coordination in Conversations using Word Mover’s Distance

Title Modeling Interpersonal Linguistic Coordination in Conversations using Word Mover’s Distance
Authors Md Nasir, Sandeep Nallan Chakravarthula, Brian Baucom, David C. Atkins, Panayiotis Georgiou, Shrikanth Narayanan
Abstract Linguistic coordination is a well-established phenomenon in spoken conversations and is often associated with positive social behaviors and outcomes. While there have been many attempts to measure lexical coordination or entrainment in the literature, only a few have explored coordination in syntactic or semantic space. In this work, we attempt to combine these different aspects of coordination into a single measure by leveraging distances in a neural word representation space. In particular, we adopt the recently proposed Word Mover’s Distance with word2vec embeddings and extend it to measure the dissimilarity in language used across multiple consecutive speaker turns. To validate our approach, we apply this measure to two case studies in the clinical psychology domain. We find that our proposed measure is correlated with the therapist’s empathy towards their patient in Motivational Interviewing and with affective behaviors in Couples Therapy. In both case studies, our proposed metric exhibits higher correlation than previously proposed measures. When applied to couples showing relationship improvement, we also observe a significant decrease in the proposed measure over the course of therapy, indicating higher linguistic coordination.
Tasks
Published 2019-04-12
URL http://arxiv.org/abs/1904.06002v1
PDF http://arxiv.org/pdf/1904.06002v1.pdf
PWC https://paperswithcode.com/paper/modeling-interpersonal-linguistic
Repo
Framework
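
As a rough illustration of the measure described above, here is a minimal sketch of computing Word Mover’s Distance between two consecutive speaker turns with gensim and pre-trained word2vec embeddings; the embedding file path and the example turns are placeholders, and this is not the authors’ code.

```python
# Minimal sketch: WMD between consecutive speaker turns. Requires gensim and its optional
# POT/pyemd dependency for wmdistance; "word2vec.bin" is a hypothetical embedding file.
from gensim.models import KeyedVectors
from gensim.utils import simple_preprocess

wv = KeyedVectors.load_word2vec_format("word2vec.bin", binary=True)

def turn_distance(turn_a, turn_b):
    """Dissimilarity of the language used in two consecutive turns (higher = less coordination)."""
    return wv.wmdistance(simple_preprocess(turn_a), simple_preprocess(turn_b))

print(turn_distance("I feel like we never talk anymore",
                    "we really should talk more often"))
```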

HRL4IN: Hierarchical Reinforcement Learning for Interactive Navigation with Mobile Manipulators

Title HRL4IN: Hierarchical Reinforcement Learning for Interactive Navigation with Mobile Manipulators
Authors Chengshu Li, Fei Xia, Roberto Martin-Martin, Silvio Savarese
Abstract Most common navigation tasks in human environments require auxiliary arm interactions, e.g. opening doors, pressing buttons and pushing obstacles away. This type of navigation task, which we call Interactive Navigation, requires the use of mobile manipulators: mobile bases with manipulation capabilities. Interactive Navigation tasks are usually long-horizon and composed of heterogeneous phases of pure navigation, pure manipulation, and their combination. Using the wrong part of the embodiment is inefficient and hinders progress. We propose HRL4IN, a novel Hierarchical RL architecture for Interactive Navigation tasks. HRL4IN exploits the exploration benefits of HRL over flat RL for long-horizon tasks thanks to temporally extended commitments towards subgoals. Unlike other HRL solutions, HRL4IN handles the heterogeneous nature of the Interactive Navigation task by creating subgoals in different spaces in different phases of the task. Moreover, HRL4IN selects different parts of the embodiment to use for each phase, improving energy efficiency. We evaluate HRL4IN against flat PPO and HAC, a state-of-the-art HRL algorithm, on Interactive Navigation in two environments: a 2D grid-world environment and a 3D environment with physics simulation. We show that HRL4IN significantly outperforms its baselines in terms of task performance and energy efficiency. More information is available at https://sites.google.com/view/hrl4in.
Tasks Hierarchical Reinforcement Learning
Published 2019-10-24
URL https://arxiv.org/abs/1910.11432v1
PDF https://arxiv.org/pdf/1910.11432v1.pdf
PWC https://paperswithcode.com/paper/hrl4in-hierarchical-reinforcement-learning
Repo
Framework
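
The two-level control loop can be pictured roughly as below. The policies are hypothetical stand-ins for trained networks, and the state, subgoal, and embodiment mask are reduced to a toy 2-D setting rather than the paper’s actual observation and action spaces.

```python
# Schematic sketch of a hierarchical control loop with temporally extended subgoals and an
# embodiment mask, in the spirit of HRL4IN (not the released implementation).
import numpy as np

def high_level_policy(obs):
    # Emits a subgoal and an embodiment mask (which degrees of freedom the low level may use).
    subgoal = obs + np.random.uniform(-1.0, 1.0, size=obs.shape)
    mask = np.array([1.0, 0.0])          # e.g. pure-navigation phase: move the base, freeze the arm
    return subgoal, mask

def low_level_policy(obs, subgoal, mask):
    # One step toward the subgoal, restricted to the allowed degrees of freedom.
    return np.clip(subgoal - obs, -0.1, 0.1) * mask

obs, horizon, k = np.zeros(2), 100, 10   # toy state: [base position, arm position]
for t in range(horizon):
    if t % k == 0:                       # the high level commits to a subgoal for k steps
        subgoal, mask = high_level_policy(obs)
    obs = obs + low_level_policy(obs, subgoal, mask)
print(obs)
```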

MSTDP: A More Biologically Plausible Learning

Title MSTDP: A More Biologically Plausible Learning
Authors Shiyuan Li
Abstract Spike-timing-dependent plasticity (STDP), which is observed in the brain, has proven to be important in biological learning. Artificial neural networks, on the other hand, learn in a different way, for example via Back-Propagation or Contrastive Hebbian Learning. In this work, we propose MSTDP, a new framework that uses only STDP rules for supervised and unsupervised learning. The framework works like an auto-encoder by making each input neuron also an output neuron. It can make predictions or generate patterns in one model without additional configuration. We also introduce a new momentum-based iterative inference method that makes the framework more efficient and can be used in both the training and testing phases. Finally, we verify our framework on the MNIST dataset for classification and generation tasks.
Tasks
Published 2019-11-29
URL https://arxiv.org/abs/1912.00009v1
PDF https://arxiv.org/pdf/1912.00009v1.pdf
PWC https://paperswithcode.com/paper/mstdp-a-more-biologically-plausible-learning
Repo
Framework
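
For readers unfamiliar with the underlying rule, here is a toy pair-based STDP weight update of the kind the framework builds on; the amplitudes and time constant are illustrative, and the full MSTDP framework (auto-encoder wiring, momentum-based inference) is not reproduced here.

```python
# Toy pair-based STDP update: potentiate when the pre-synaptic spike precedes the post-synaptic
# spike, depress otherwise. Constants are illustrative, not taken from the paper.
import numpy as np

def stdp_delta_w(t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change for one pre/post spike pair, given spike times in milliseconds."""
    dt = t_post - t_pre
    if dt > 0:                                   # pre before post -> strengthen the synapse
        return a_plus * np.exp(-dt / tau)
    return -a_minus * np.exp(dt / tau)           # post before pre -> weaken the synapse

print(stdp_delta_w(t_pre=10.0, t_post=15.0))    # positive update
print(stdp_delta_w(t_pre=15.0, t_post=10.0))    # negative update
```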

VecHGrad for Solving Accurately Complex Tensor Decomposition

Title VecHGrad for Solving Accurately Complex Tensor Decomposition
Authors Jeremy Charlier, Vladimir Makarenkov
Abstract Tensor decomposition, a collection of factorization techniques for multidimensional arrays, is among the most general and powerful tools for scientific analysis. However, because of their increasing size, today’s data sets require more complex tensor decompositions involving factorization with multiple matrices and diagonal tensors, such as DEDICOM or PARATUCK2. Traditional tensor resolution algorithms such as Stochastic Gradient Descent (SGD), Non-linear Conjugate Gradient descent (NCG) or Alternating Least Squares (ALS) cannot be easily applied to complex tensor decompositions or often lead to poor accuracy at convergence. We propose a new resolution algorithm, called VecHGrad, for accurate and efficient stochastic resolution over all existing tensor decompositions, specifically designed for complex decompositions. VecHGrad relies on the gradient, a Hessian-vector product and an adaptive line search to ensure convergence during optimization. Our experiments on five real-world data sets with state-of-the-art deep learning gradient optimization models show that VecHGrad converges considerably faster because of its superior theoretical convergence rate per step. Therefore, VecHGrad is also relevant to deep learning optimizers. The experiments are performed for various tensor decompositions including CP, DEDICOM and PARATUCK2. Although it involves a slightly more complex update rule, VecHGrad’s runtime is similar in practice to that of gradient methods such as SGD, Adam or RMSProp.
Tasks
Published 2019-05-24
URL https://arxiv.org/abs/1905.12413v2
PDF https://arxiv.org/pdf/1905.12413v2.pdf
PWC https://paperswithcode.com/paper/190512413
Repo
Framework
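
The two ingredients named above, a Hessian-vector product and an adaptive line search, can be sketched numerically on a toy objective; this is an illustration of those building blocks only, not the authors’ VecHGrad implementation, and the quadratic objective stands in for a real decomposition loss.

```python
# Sketch of a Hessian-vector product (via finite differences of the gradient, so the Hessian
# is never formed) and a backtracking (Armijo) line search on a toy objective.
import numpy as np

def grad(f, x, eps=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

def hessian_vector_product(f, x, v, eps=1e-5):
    """Hv from two gradient evaluations."""
    return (grad(f, x + eps * v) - grad(f, x - eps * v)) / (2 * eps)

def armijo_step(f, x, g, alpha=1.0, beta=0.5, c=1e-4):
    """Backtracking line search along the steepest-descent direction."""
    d = -g
    while f(x + alpha * d) > f(x) + c * alpha * g.dot(d) and alpha > 1e-8:
        alpha *= beta
    return x + alpha * d

f = lambda x: np.sum((x - 1.0) ** 2)              # toy stand-in for a decomposition loss
x = np.zeros(3)
for _ in range(20):
    x = armijo_step(f, x, grad(f, x))
print(x)                                          # converges to the minimizer [1, 1, 1]
print(hessian_vector_product(f, x, np.ones(3)))   # ~= 2 * v for this quadratic
```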

Adversarial Defense Via Local Flatness Regularization

Title Adversarial Defense Via Local Flatness Regularization
Authors Jia Xu, Yiming Li, Yong Jiang, Shu-Tao Xia
Abstract Adversarial defense is a popular and important research area. Due to its intrinsic mechanism, one of the most straightforward and effective ways of defending against attacks is to analyze the properties of the loss surface in the input space. In this paper, we define the local flatness of the loss surface as the maximum value of the chosen norm of the gradient with respect to the input within a neighborhood centered on the benign sample, and discuss the relationship between local flatness and adversarial vulnerability. Based on this analysis, we propose a novel defense approach that regularizes the local flatness, dubbed local flatness regularization (LFR). We also demonstrate the effectiveness of the proposed method from other perspectives, such as the human visual mechanism, and theoretically analyze the relationship between LFR and other related methods. Experiments are conducted to verify our theory and demonstrate the superiority of the proposed method.
Tasks Adversarial Defense
Published 2019-10-27
URL https://arxiv.org/abs/1910.12165v2
PDF https://arxiv.org/pdf/1910.12165v2.pdf
PWC https://paperswithcode.com/paper/adversarial-defense-via-local-flatness
Repo
Framework
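
A minimal sketch of an input-gradient-norm penalty in PyTorch, in the spirit of the regularizer described above: the inner maximization over the neighborhood is simplified to the gradient norm at the benign sample itself, and the norm choice and weight lam are assumptions, not the paper’s settings.

```python
# Gradient-norm ("local flatness") penalty added to the usual cross-entropy training loss.
import torch
import torch.nn.functional as F

def lfr_loss(model, x, y, lam=1.0):
    x = x.clone().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    grad_x, = torch.autograd.grad(ce, x, create_graph=True)   # gradient w.r.t. the input
    flatness = grad_x.flatten(1).norm(p=1, dim=1).mean()      # chosen norm of the input gradient
    return ce + lam * flatness
```

In a training loop this would simply replace the plain cross-entropy: `loss = lfr_loss(model, images, labels); loss.backward()`.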

Subjectivity Learning Theory towards Artificial General Intelligence

Title Subjectivity Learning Theory towards Artificial General Intelligence
Authors Xin Su, Shangqi Guo, Feng Chen
Abstract The construction of artificial general intelligence (AGI) has been a long-term goal of AI research, aiming to deal with the complex data of the real world and to make reasonable judgments in various cases like a human. However, current AI creations, referred to as “Narrow AI”, are limited to a specific problem. The constraints come from two basic assumptions about the data: independent and identically distributed samples, and a single-valued mapping between inputs and outputs. We completely break these constraints and develop a subjectivity learning theory for general intelligence. We assign a mathematical meaning to the philosophical concept of subjectivity and build the data representation of general intelligence. Under the subjectivity representation, the global risk is then constructed as the new learning goal. We prove that subjectivity learning holds a lower risk bound than traditional machine learning. Moreover, we propose the principle of empirical global risk minimization (EGRM) as the subjectivity learning process in practice, establish a condition for consistency, and present three variables for controlling the total risk bound. Subjectivity learning is a novel learning theory for unconstrained real data and provides a path to developing AGI.
Tasks
Published 2019-09-09
URL https://arxiv.org/abs/1909.03798v2
PDF https://arxiv.org/pdf/1909.03798v2.pdf
PWC https://paperswithcode.com/paper/subjectivity-learning-theory-towards
Repo
Framework

Cumulative Sum Ranking

Title Cumulative Sum Ranking
Authors Ruy Luiz Milidiú, Rafael Henrique Santos Rocha
Abstract The goal of Ordinal Regression is to find a rule that ranks items from a given set. Several learning algorithms for this prediction problem build an ensemble of binary classifiers. Ranking by Projecting uses interdependent binary perceptrons that share the same direction vector but use different bias values. Similar approaches use independent direction vectors and biases. To combine the binary predictions, most of them adopt a simple counting heuristic. Here, we introduce a novel cumulative sum scoring function to combine the binary predictions. The proposed score aggregates the strength of each relevant binary classification about how large the item’s rank is. We show that our modeling casts ordinal regression as a Structured Perceptron problem. As a consequence, we simplify its formulation and description, which results in two simple online learning algorithms; the second is a Passive-Aggressive version of the first. We show that under a rank separability condition both algorithms converge. Furthermore, we provide mistake bounds for each of the two online algorithms. For the Passive-Aggressive version, we assume knowledge of a separation margin, which significantly improves the corresponding mistake bound. Additionally, we show that Ranking by Projecting is a special case of our prediction algorithm. From a neural network architecture point of view, our empirical findings suggest a layer of cusum units for ordinal regression, instead of the usual softmax layer used for multiclass problems.
Tasks
Published 2019-11-25
URL https://arxiv.org/abs/1911.11255v1
PDF https://arxiv.org/pdf/1911.11255v1.pdf
PWC https://paperswithcode.com/paper/cumulative-sum-ranking
Repo
Framework
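
One way to picture a cumulative-sum score is sketched below; this is only a guessed illustration of the general idea (shared direction vector, per-rank biases, accumulated signed margins), not the paper’s exact scoring function, and the weights and thresholds are made up.

```python
# Toy cumulative-sum scoring for ordinal prediction: K-1 binary classifiers share a direction
# vector w with different biases; their signed margins are accumulated to score each rank.
import numpy as np

def cusum_rank(x, w, biases):
    """Predict a rank in {0, ..., K-1} from K-1 shared-direction binary scores."""
    margins = x @ w - biases                               # signed strength of each "rank > k" decision
    scores = np.concatenate(([0.0], np.cumsum(margins)))   # cumulative-sum score for ranks 0..K-1
    return int(np.argmax(scores))

w = np.array([1.0, 0.5])
biases = np.array([0.5, 1.5, 2.5])                         # increasing thresholds for K = 4 ranks
print(cusum_rank(np.array([1.2, 1.0]), w, biases))         # projection is 1.7 -> rank 2
```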

ner and pos when nothing is capitalized

Title ner and pos when nothing is capitalized
Authors Stephen Mayhew, Tatiana Tsygankova, Dan Roth
Abstract For those languages which use it, capitalization is an important signal for the fundamental NLP tasks of Named Entity Recognition (NER) and Part of Speech (POS) tagging. In fact, it is such a strong signal that model performance on these tasks drops sharply in common lowercased scenarios, such as noisy web text or machine translation outputs. In this work, we perform a systematic analysis of solutions to this problem, modifying only the casing of the train or test data using lowercasing and truecasing methods. While prior work and first impressions might suggest training a caseless model, or using a truecaser at test time, we show that the most effective strategy is a concatenation of cased and lowercased training data, producing a single model with high performance on both cased and uncased text. As shown in our experiments, this result holds across tasks and input representations. Finally, we show that our proposed solution gives an 8% F1 improvement in mention detection on noisy out-of-domain Twitter data.
Tasks Machine Translation, Named Entity Recognition, Part-Of-Speech Tagging
Published 2019-03-27
URL https://arxiv.org/abs/1903.11222v2
PDF https://arxiv.org/pdf/1903.11222v2.pdf
PWC https://paperswithcode.com/paper/ner-and-pos-when-nothing-is-capitalized
Repo
Framework
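
The winning strategy from the abstract above, concatenating cased and lowercased training data, is easy to sketch; the (token, tag) data format below is an assumption for illustration, not tied to any particular tagging library.

```python
# Minimal sketch: train one model on the original (cased) data plus a lowercased copy.
def augment_with_lowercase(sentences):
    """sentences: list of [(token, tag), ...]; returns cased + lowercased training data."""
    lowercased = [[(tok.lower(), tag) for tok, tag in sent] for sent in sentences]
    return sentences + lowercased

train = [[("Barack", "B-PER"), ("Obama", "I-PER"), ("spoke", "O")]]
print(augment_with_lowercase(train))
```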

Training Robust Tree Ensembles for Security

Title Training Robust Tree Ensembles for Security
Authors Yizheng Chen, Shiqi Wang, Weifan Jiang, Asaf Cidon, Suman Jana
Abstract Tree ensemble models, including random forests and gradient boosted decision trees, are widely used as security classifiers to detect malware, phishing, scams, social engineering, etc. However, the robustness of tree ensembles has not been thoroughly studied. Existing approaches mainly focus on adding more robust features and conducting feature ablation studies, which do not provide robustness guarantees against strong adversaries. In this paper, we propose a new algorithm to train robust tree ensembles. Robust training maximizes the defender’s gain as if the adversary were trying to minimize it. We design a general algorithm based on a greedy heuristic to find better solutions to this minimization problem than previous work. We implement the algorithm for gradient boosted decision trees in xgboost and for random forests in scikit-learn. Our evaluation over benchmark datasets shows that we can train more robust models than the state-of-the-art robust training algorithm for gradient boosted decision trees, with a 1.26X increase in the $L_\infty$ evasion distance required for the strongest whitebox attacker. In addition, our algorithm is general across different gain metrics and types of tree ensembles; we achieve a 3.32X increase in $L_\infty$ robustness distance compared to the baseline random forest training method. Furthermore, to make the robustness increase meaningful in security applications, we propose attack-cost-driven constraints for the robust training process. Our training algorithm maximizes the attacker’s evasion cost by integrating domain knowledge about feature manipulation costs. We use Twitter spam detection as a case study to analyze the attacker’s cost increase to evade our robust model. Our technique trains the robust model to rank robust features as the most important ones, and our robust model requires about an 8.4X increase in the attacker’s economic cost to be evaded compared to the baseline.
Tasks
Published 2019-12-03
URL https://arxiv.org/abs/1912.01149v1
PDF https://arxiv.org/pdf/1912.01149v1.pdf
PWC https://paperswithcode.com/paper/training-robust-tree-ensembles-for-security
Repo
Framework
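
The underlying idea of scoring a split under the attacker’s best response can be pictured with the toy sketch below. It uses a single feature and a pessimistic 0/1-error bound, whereas the paper’s algorithm works with general gain metrics and attack-cost-driven constraints; treat it as an illustration of the principle only.

```python
# Toy "robust split" evaluation: points within eps of the threshold can be pushed to either
# side by an L_inf-bounded attacker, so the split is scored under a worst-case assignment.
import numpy as np

def robust_split_error(x, y, threshold, eps):
    """Pessimistic worst-case 0/1 error of one split when each x can move by at most eps."""
    left = y[x <= threshold - eps]                   # always end up on the left
    right = y[x > threshold + eps]                   # always end up on the right
    movable = y[np.abs(x - threshold) <= eps]        # the attacker chooses their side
    majority_error = lambda s: min((s == 0).sum(), (s == 1).sum()) if s.size else 0
    # bound: the attacker routes every movable point to the side that misclassifies it
    return majority_error(left) + majority_error(right) + movable.size

x = np.array([0.10, 0.20, 0.45, 0.55, 0.80, 0.90])
y = np.array([0, 0, 0, 1, 1, 1])
print(robust_split_error(x, y, threshold=0.5, eps=0.0))   # clean split error: 0
print(robust_split_error(x, y, threshold=0.5, eps=0.1))   # two points become attackable: 2
```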

Ex-Twit: Explainable Twitter Mining on Health Data

Title Ex-Twit: Explainable Twitter Mining on Health Data
Authors Tunazzina Islam
Abstract Since most machine learning models provide no explanations for their predictions, those predictions remain opaque to humans. The ability to explain a model’s predictions has become a necessity in many applications, including Twitter mining. In this work, we propose a method called Explainable Twitter Mining (Ex-Twit) that combines topic modeling and Local Interpretable Model-agnostic Explanations (LIME) to predict the topic and explain the model’s predictions. We demonstrate the effectiveness of Ex-Twit on Twitter health-related data.
Tasks
Published 2019-05-24
URL https://arxiv.org/abs/1906.02132v2
PDF https://arxiv.org/pdf/1906.02132v2.pdf
PWC https://paperswithcode.com/paper/190602132
Repo
Framework
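
A minimal sketch of the LIME half of such a pipeline is shown below; the TF-IDF plus logistic-regression classifier is only a stand-in for the paper’s topic-modeling component, and the tweets, labels, and class names are made up.

```python
# Explaining a text classifier's prediction with LIME (requires scikit-learn and lime).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

tweets = ["went for a long run this morning", "pizza and soda again tonight",
          "yoga before work keeps me calm", "skipped the gym for burgers"]
labels = [1, 0, 1, 0]                                    # 1 = fitness-related, 0 = food-related

pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(tweets, labels)

explainer = LimeTextExplainer(class_names=["food", "fitness"])
explanation = explainer.explain_instance(tweets[0], pipeline.predict_proba, num_features=3)
print(explanation.as_list())                             # word-level contributions to the prediction
```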

Self-Supervised Learning of Physics-Based Reconstruction Neural Networks without Fully-Sampled Reference Data

Title Self-Supervised Learning of Physics-Based Reconstruction Neural Networks without Fully-Sampled Reference Data
Authors Burhaneddin Yaman, Seyed Amir Hossein Hosseini, Steen Moeller, Jutta Ellermann, Kâmil Uğurbil, Mehmet Akçakaya
Abstract Purpose: To develop a strategy for training a physics-driven MRI reconstruction neural network without a database of fully-sampled datasets. Theory and Methods: Self-supervised learning via data under-sampling (SSDU) for physics-based deep learning (DL) reconstruction partitions the available measurements into two sets, one of which is used in the data consistency units of the unrolled network while the other is used to define the training loss. The proposed training without fully-sampled data is compared to fully-supervised training with ground-truth data, as well as to conventional compressed sensing and parallel imaging methods, using the publicly available fastMRI knee database. The same physics-based neural network is used for both the proposed SSDU and supervised training. SSDU training is also applied to prospectively 2-fold accelerated high-resolution brain datasets at different acceleration rates and compared to parallel imaging. Results: Results on five different knee sequences at an acceleration rate of 4 show that the proposed self-supervised approach performs comparably to supervised learning, while significantly outperforming conventional compressed sensing and parallel imaging, as characterized by quantitative metrics and a clinical reader study. Results on prospectively sub-sampled brain datasets, where supervised learning cannot be employed due to the lack of a ground-truth reference, show that the proposed self-supervised approach successfully performs reconstruction at high acceleration rates (4, 6 and 8). Image readings indicate improved visual reconstruction quality with the proposed approach compared to parallel imaging at the acquisition acceleration. Conclusion: The proposed SSDU approach allows training of physics-based DL-MRI reconstruction without fully-sampled data, while achieving results comparable to supervised DL-MRI trained on fully-sampled data.
Tasks
Published 2019-12-16
URL https://arxiv.org/abs/1912.07669v1
PDF https://arxiv.org/pdf/1912.07669v1.pdf
PWC https://paperswithcode.com/paper/self-supervised-learning-of-physics-based
Repo
Framework
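
The key data split can be sketched in a few lines of NumPy; the sampling pattern, matrix size, and split ratio below are illustrative, not the values used in the paper.

```python
# Partition acquired k-space locations into a set that feeds the data-consistency units and a
# disjoint set that defines the training loss, in the spirit of SSDU.
import numpy as np

rng = np.random.default_rng(0)
acquired_mask = rng.random((256, 256)) < 0.25            # toy under-sampled acquisition pattern

acquired_idx = np.flatnonzero(acquired_mask)
loss_idx = rng.choice(acquired_idx, size=int(0.4 * acquired_idx.size), replace=False)

loss_mask = np.zeros_like(acquired_mask)
loss_mask.flat[loss_idx] = True                          # used only to compute the training loss
dc_mask = acquired_mask & ~loss_mask                     # used inside the data-consistency units

assert not np.any(dc_mask & loss_mask)
assert np.all((dc_mask | loss_mask) == acquired_mask)    # the two sets partition the acquired data
```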

Machine Translation Evaluation Meets Community Question Answering

Title Machine Translation Evaluation Meets Community Question Answering
Authors Francisco Guzmán, Lluís Màrquez, Preslav Nakov
Abstract We explore the applicability of machine translation evaluation (MTE) methods to a very different problem: answer ranking in community Question Answering. In particular, we adopt a pairwise neural network (NN) architecture, which incorporates MTE features, as well as rich syntactic and semantic embeddings, and which efficiently models complex non-linear interactions. The evaluation results show state-of-the-art performance, with sizeable contribution from both the MTE features and from the pairwise NN architecture.
Tasks Community Question Answering, Machine Translation, Question Answering
Published 2019-12-06
URL https://arxiv.org/abs/1912.02998v1
PDF https://arxiv.org/pdf/1912.02998v1.pdf
PWC https://paperswithcode.com/paper/machine-translation-evaluation-meets-1
Repo
Framework
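
A minimal pairwise ranking setup of the kind described above can be sketched in PyTorch; the feature dimension, network size, and margin loss are assumptions for illustration, not the paper’s architecture or features.

```python
# Toy pairwise ranking: a small feed-forward scorer is trained so that the features of a good
# answer score higher than those of a worse answer for the same question.
import torch
import torch.nn as nn

scorer = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.MarginRankingLoss(margin=1.0)

feats_good = torch.randn(8, 32)     # MTE + syntactic/semantic features of good answers (toy)
feats_bad = torch.randn(8, 32)      # the same features for worse answers (toy)

loss = loss_fn(scorer(feats_good).squeeze(1), scorer(feats_bad).squeeze(1),
               torch.ones(8))       # target +1: the first input should rank higher
loss.backward()
```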

Abnormality Detection in Musculoskeletal Radiographs with Convolutional Neural Networks(Ensembles) and Performance Optimization

Title Abnormality Detection in Musculoskeletal Radiographs with Convolutional Neural Networks(Ensembles) and Performance Optimization
Authors Dennis Banga, Peter Waiganjo
Abstract Musculoskeletal conditions affect more than 1.7 billion people worldwide according to the Global Burden of Disease study and are the second greatest cause of disability [1,2]. These conditions are diagnosed in various ways, mostly through physical exams and imaging tests. There are few imaging and diagnostic experts, while the workload of radiograph examinations is huge, which might affect diagnostic accuracy. We built machine learning models to perform abnormality detection using the available public musculoskeletal dataset [3]. Convolutional Neural Networks (CNNs) were used, as they are the most successful models for tasks such as classification and object detection [4]. The development of the models involved theoretical study, iterative prototyping, and empirical evaluation of the results. The current reference model on the abnormality detection task, the 169-layer DenseNet by Pranav et al. (2018), performs worse than the worst radiologist in 5 out of the 7 studies, and its overall performance is lower than that of the best radiologist. We developed the ensemble200 model, which scored a Cohen’s kappa of 0.66, lower than the DenseNet model (Pranav et al., 2018); however, it outperforms the DenseNet model on the F1 score, and its Cohen’s kappa varies less across studies: the best kappa on the upper-extremity studies is 0.7408 (wrist) and the lowest is 0.5844 (hand). The ensemble200 model also outperformed the DenseNet model on the finger studies, with a Cohen’s kappa of 0.653, again showing reduced variability in model performance.
Tasks Anomaly Detection, Object Detection
Published 2019-08-06
URL https://arxiv.org/abs/1908.02170v1
PDF https://arxiv.org/pdf/1908.02170v1.pdf
PWC https://paperswithcode.com/paper/abnormality-detection-in-musculoskeletal
Repo
Framework
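
Two of the evaluation steps discussed above, averaging ensemble probabilities and scoring agreement with Cohen’s kappa, are easy to sketch; the predictions and labels below are made up.

```python
# Ensemble averaging of two members' abnormality probabilities, scored with Cohen's kappa.
import numpy as np
from sklearn.metrics import cohen_kappa_score

probs_model_a = np.array([0.9, 0.2, 0.7, 0.4])           # P(abnormal) from two ensemble members
probs_model_b = np.array([0.8, 0.3, 0.6, 0.6])
ensemble_pred = ((probs_model_a + probs_model_b) / 2 > 0.5).astype(int)

labels = np.array([1, 0, 1, 0])
print(cohen_kappa_score(labels, ensemble_pred))
```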

Improving image classifiers for small datasets by learning rate adaptations

Title Improving image classifiers for small datasets by learning rate adaptations
Authors Sourav Mishra, Toshihiko Yamasaki, Hideaki Imaizumi
Abstract Our paper introduces an efficient combination of established techniques to improve classifier performance in terms of accuracy and training time. We achieve a two-fold to ten-fold speedup in reaching near state-of-the-art accuracy across different model architectures by dynamically tuning the learning rate. We find this especially beneficial for small datasets, where the reliability of machine reasoning is lower. We validate our approach by comparing our method against vanilla training on CIFAR-10. We also demonstrate its practical viability by applying it to an unbalanced corpus of diagnostic images.
Tasks
Published 2019-03-26
URL http://arxiv.org/abs/1903.10726v2
PDF http://arxiv.org/pdf/1903.10726v2.pdf
PWC https://paperswithcode.com/paper/improving-image-classifiers-for-small
Repo
Framework
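
As a hedged illustration of dynamic learning-rate tuning, here is a short PyTorch sketch using ReduceLROnPlateau as a stand-in scheduler; the paper’s exact adaptation rule is not reproduced, and the model and validation loss below are placeholders.

```python
# Shrink the learning rate whenever the validation loss stops improving.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min",
                                                       factor=0.5, patience=2)

for epoch in range(20):
    val_loss = 1.0 / (epoch + 1)     # placeholder for the real validation loss of this epoch
    scheduler.step(val_loss)         # the scheduler adapts the LR from the observed metric
```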

PoreNet: CNN-based Pore Descriptor for High-resolution Fingerprint Recognition

Title PoreNet: CNN-based Pore Descriptor for High-resolution Fingerprint Recognition
Authors Vijay Anand, Vivek Kanhangad
Abstract With the development of high-resolution fingerprint scanners, high-resolution fingerprint-based biometric recognition has received increasing attention in recent years. This paper presents a pore feature-based approach for biometric recognition. Our approach employs a convolutional neural network (CNN) model, DeepResPore, to detect pores in the input fingerprint image. Thereafter, a CNN-based descriptor is computed for a patch around each detected pore. Specifically, we have designed a residual learning-based CNN, referred to as PoreNet, that learns a distinctive feature representation from pore patches. For verification, the match score is generated by comparing pore descriptors obtained from a pair of fingerprint images in a bi-directional manner using the Euclidean distance. The proposed approach for high-resolution fingerprint recognition achieves 2.91% and 0.57% equal error rates (EERs) on partial (DBI) and complete (DBII) fingerprints of the benchmark PolyU HRF dataset. Most importantly, it achieves lower FMR1000 and FMR10000 values than the current state-of-the-art approach on both datasets.
Tasks
Published 2019-05-16
URL https://arxiv.org/abs/1905.06981v2
PDF https://arxiv.org/pdf/1905.06981v2.pdf
PWC https://paperswithcode.com/paper/porenet-cnn-based-pore-descriptor-for-high
Repo
Framework
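
The bi-directional Euclidean matching step can be pictured with the sketch below; descriptor extraction with PoreNet itself is not reproduced, and the descriptor counts and dimensionality are illustrative.

```python
# Bi-directional nearest-neighbour matching of pore descriptors by Euclidean distance:
# a pair counts as a match only if each descriptor is the other's nearest neighbour.
import numpy as np

def mutual_matches(desc_a, desc_b):
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=-1)   # pairwise distances
    nn_ab = d.argmin(axis=1)                     # nearest neighbour in B of each pore in A
    nn_ba = d.argmin(axis=0)                     # nearest neighbour in A of each pore in B
    return [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]

a = np.random.rand(50, 128)                      # 50 pore descriptors of dimension 128 (toy)
b = np.random.rand(40, 128)
score = len(mutual_matches(a, b)) / min(len(a), len(b))
print(score)                                     # higher = more mutually matched pores
```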