February 1, 2020

Paper Group AWR 234

FoodX-251: A Dataset for Fine-grained Food Classification


Title	FoodX-251: A Dataset for Fine-grained Food Classification
Authors	Parneet Kaur, Karan Sikka, Weijun Wang, Serge Belongie, Ajay Divakaran
Abstract	Food classification is a challenging problem due to the large number of categories, the high visual similarity between different foods, and the lack of datasets for training state-of-the-art deep models. Solving this problem will require advances in both computer vision models and the datasets used to evaluate them. In this paper we focus on the second aspect and introduce FoodX-251, a dataset of 251 fine-grained food categories with 158k images collected from the web. We use 118k images as a training set and provide human-verified labels for 40k images that can be used for validation and testing. In this work, we outline the procedure for creating this dataset and provide relevant baselines with deep learning models. The FoodX-251 dataset was used to organize the iFood-2019 challenge in the Fine-Grained Visual Categorization workshop (FGVC6 at CVPR 2019) and is available for download.
Tasks	Fine-Grained Visual Categorization
Published	2019-07-14
URL	https://arxiv.org/abs/1907.06167v1
PDF	https://arxiv.org/pdf/1907.06167v1.pdf
PWC	https://paperswithcode.com/paper/foodx-251-a-dataset-for-fine-grained-food
Repo	https://github.com/karansikka1/iFood_2019
Framework	none

On Dimensional Linguistic Properties of the Word Embedding Space


Title	On Dimensional Linguistic Properties of the Word Embedding Space
Authors	Vikas Raunak, Vaibhav Kumar, Vivek Gupta, Florian Metze
Abstract	Word embeddings have become a staple of several natural language processing tasks, yet much remains to be understood about their properties. In this work, we analyze word embeddings in terms of their principal components and arrive at a number of novel conclusions. In particular, we characterize the utility of variance explained by the principal components (widely used as a fundamental tool to assess the quality of the resulting representations) as a proxy for downstream performance. Further, through dimensional linguistic probing of the embedding space, we show that the syntactic information captured by a principal component does not depend on the amount of variance it explains. Consequently, we investigate the limitations of variance based embedding post-processing techniques and demonstrate that such post-processing is counter-productive in a number of scenarios such as sentence classification and machine translation tasks. Finally, we offer a few guidelines on variance based embedding post-processing. We have released the source code along with the paper.
Tasks	Machine Translation, Sentence Classification, Word Embeddings
Published	2019-10-05
URL	https://arxiv.org/abs/1910.02211v1
PDF	https://arxiv.org/pdf/1910.02211v1.pdf
PWC	https://paperswithcode.com/paper/on-dimensional-linguistic-properties-of-the
Repo	https://github.com/vyraun/Half-Size
Framework	none
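
To make "variance explained by the principal components" concrete, here is a minimal pure-Python sketch. It assumes toy random vectors in place of real word embeddings, finds the top principal component by power iteration, and then projects that component out as one example of a variance-based post-processing step. The function names and data are hypothetical and are not taken from the paper or its repository.

```python
import math
import random


def center(rows):
    """Subtract the per-dimension mean from every vector."""
    dim = len(rows[0])
    mean = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
    return [[r[j] - mean[j] for j in range(dim)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered vectors."""
    n, dim = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(dim)]
            for i in range(dim)]


def top_component(cov, iters=200):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    dim = len(cov)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cov_v = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
    eigenvalue = sum(a * b for a, b in zip(v, cov_v))
    return eigenvalue, v


def remove_component(rows, direction):
    """Project each vector onto the subspace orthogonal to `direction`."""
    cleaned = []
    for r in rows:
        coeff = sum(a * b for a, b in zip(r, direction))
        cleaned.append([a - coeff * b for a, b in zip(r, direction)])
    return cleaned


random.seed(0)
# Toy stand-in for embeddings: 200 vectors in 3-D, first axis dominates.
vectors = [[10 * random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)]
           for _ in range(200)]
centered = center(vectors)
cov = covariance(centered)
eigenvalue, direction = top_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained_ratio = eigenvalue / total_variance
cleaned = remove_component(centered, direction)
print(f"variance explained by top component: {explained_ratio:.3f}")
```

The ratio printed here is the quantity the abstract calls into question as a proxy for downstream performance; a high value says the component is dominant, not that it is linguistically useful.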

A Full Probabilistic Model for Yes/No Type Crowdsourcing in Multi-Class Classification


Title	A Full Probabilistic Model for Yes/No Type Crowdsourcing in Multi-Class Classification
Authors	Belen Saldias, Pavlos Protopapas, Karim Pichara
Abstract	Crowdsourcing has become widely used in supervised scenarios where training sets are scarce and difficult to obtain. Most crowdsourcing models in the literature assume labelers can provide answers to full questions. In classification contexts, full questions require a labeler to discern among all possible classes. Unfortunately, discernment is not always easy in realistic scenarios. Labelers may not be experts in differentiating all classes. In this work, we provide a full probabilistic model for a shorter type of queries. Our shorter queries only require “yes” or “no” responses. Our model estimates a joint posterior distribution of matrices related to labelers’ confusions and the posterior probability of the class of every object. We developed an approximate inference approach, using Monte Carlo Sampling and Black Box Variational Inference, which provides the derivation of the necessary gradients. We built two realistic crowdsourcing scenarios to test our model. The first scenario queries for irregular astronomical time-series. The second scenario relies on the image classification of animals. We achieved results that are comparable with those of full query crowdsourcing. Furthermore, we show that modeling labelers’ failures plays an important role in estimating true classes. Finally, we provide the community with two real datasets obtained from our crowdsourcing experiments. All our code is publicly available.
Tasks	Image Classification, Time Series
Published	2019-01-02
URL	https://arxiv.org/abs/1901.00397v3
PDF	https://arxiv.org/pdf/1901.00397v3.pdf
PWC	https://paperswithcode.com/paper/a-full-probabilistic-model-for-yesno-type
Repo	https://github.com/bcsaldias/yes-no-crowdsourcing
Framework	none

Tracing Network Evolution Using the PARAFAC2 Model


Title	Tracing Network Evolution Using the PARAFAC2 Model
Authors	Marie Roald, Suchita Bhinge, Chunying Jia, Vince Calhoun, Tülay Adalı, Evrim Acar
Abstract	Characterizing time-evolving networks is a challenging task, but it is crucial for understanding the dynamic behavior of complex systems such as the brain. For instance, how spatial networks of functional connectivity in the brain evolve during a task is not well-understood. A traditional approach in neuroimaging data analysis is to make simplifications through the assumption of static spatial networks. In this paper, without assuming static networks in time and/or space, we arrange the temporal data as a higher-order tensor and use a tensor factorization model called PARAFAC2 to capture underlying patterns (spatial networks) in time-evolving data and their evolution. Numerical experiments on simulated data demonstrate that PARAFAC2 can successfully reveal the underlying networks and their dynamics. We also show the promising performance of the model in terms of tracing the evolution of task-related functional connectivity in the brain through the analysis of functional magnetic resonance imaging data.
Tasks
Published	2019-10-23
URL	https://arxiv.org/abs/1911.02926v1
PDF	https://arxiv.org/pdf/1911.02926v1.pdf
PWC	https://paperswithcode.com/paper/tracing-network-evolution-using-the-parafac2
Repo	https://github.com/marieroald/ICASSP20
Framework	none
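
For readers unfamiliar with the model named in the abstract, PARAFAC2 is commonly written as below. The symbols are a standard-notation assumption and do not appear in the abstract itself: $X_k$ is the $k$-th slice of the data (for example, one time window), $A$ is a factor matrix shared by all slices, $D_k$ is a diagonal matrix of slice-specific weights, and $B_k$ is a slice-specific factor matrix.

```latex
X_k \approx A \, D_k \, B_k^{\top},
\qquad
B_k^{\top} B_k = \Phi
\quad \text{for all } k = 1, \dots, K
```

Because $B_k$ may differ from slice to slice, subject only to the constant cross-product $\Phi$, the model can represent patterns that change across slices, which is what allows it to drop the static-network assumption.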

Tensor Canonical Correlation Analysis


Title	Tensor Canonical Correlation Analysis
Authors	You-Lin Chen, Mladen Kolar, Ruey S. Tsay
Abstract	In many applications, such as classification of images or videos, it is of interest to develop a framework for tensor data instead of an ad hoc way of transforming data to vectors, due to the computational and under-sampling issues. In this paper, we study canonical correlation analysis by extending the framework of two-dimensional analysis (Lee and Choi, 2007) to tensor-valued data. Instead of adopting the iterative algorithm provided in Lee and Choi (2007), we propose an efficient algorithm, called the higher-order power method, which is commonly used in tensor decomposition and is more efficient in large-scale settings. Moreover, we carefully examine theoretical properties of our algorithm and establish a local convergence property via the theory of Lojasiewicz’s inequalities. Our results fill a missing, but crucial, part in the literature on tensor data. For practical applications, we further develop (a) an inexact updating scheme which allows us to use the state-of-the-art stochastic gradient descent algorithm, (b) an effective initialization scheme which alleviates the problem of local optima in non-convex optimization, and (c) an extension for extracting several canonical components. Empirical analyses on challenging data including gene expression, air pollution indexes in Taiwan, and electricity demand in Australia, show the effectiveness and efficiency of the proposed methodology.
Tasks
Published	2019-06-12
URL	https://arxiv.org/abs/1906.05358v2
PDF	https://arxiv.org/pdf/1906.05358v2.pdf
PWC	https://paperswithcode.com/paper/tensor-canonical-correlation-analysis
Repo	https://github.com/youlinchen/TCCA
Framework	none

Automatic Detection of Protective Behavior in Chronic Pain Physical Rehabilitation: A Recurrent Neural Network Approach


Title	Automatic Detection of Protective Behavior in Chronic Pain Physical Rehabilitation: A Recurrent Neural Network Approach
Authors	Chongyang Wang, Temitayo A. Olugbade, Akhil Mathur, Amanda C. De C. Williams, Nicholas D. Lane, Nadia Bianchi-Berthouze
Abstract	In chronic pain physical rehabilitation, physiotherapists adapt movement to the current performance of patients, especially based on their expression of protective behavior, gradually exposing them to feared but harmless and essential everyday movements. As physical rehabilitation moves outside the clinic, physical rehabilitation technology needs to automatically detect such behaviors so as to provide similar personalized support. In this paper, we investigate the use of a Long Short-Term Memory (LSTM) network, which we call Protect-LSTM, to detect events of protective behavior, based on motion capture and electromyography data of healthy people and people with chronic low back pain engaged in five everyday movements. Unlike previous work on the same dataset, we aim to continuously detect protective behavior within a movement rather than estimate the overall presence of such behavior. The Protect-LSTM reaches a best average F1 score of 0.815 with leave-one-subject-out (LOSO) validation, using low-level features, better than other algorithms. Performance increases for some movements when modelled separately (mean F1 scores: bending=0.77, standing on one leg=0.81, sit-to-stand=0.72, stand-to-sit=0.83, reaching forward=0.67). These results reach an excellent level of agreement with the average ratings of physiotherapists. As such, the results show clear potential for in-home, technology-supported, affect-based personalized physical rehabilitation.
Tasks	Motion Capture
Published	2019-02-24
URL	http://arxiv.org/abs/1902.08990v1
PDF	http://arxiv.org/pdf/1902.08990v1.pdf
PWC	https://paperswithcode.com/paper/automatic-detection-of-protective-behavior-in
Repo	https://github.com/CodeShareBot/BodyAttentionNetwork
Framework	none
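
The evaluation protocol named in the abstract, leave-one-subject-out (LOSO) validation scored with F1, can be sketched as follows. This is an illustration only: the subject IDs, labels, and the majority-label "model" are hypothetical placeholders and stand in for the Protect-LSTM and the real dataset.

```python
def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical toy data: one label per frame, two frames per subject.
subjects = ["s1", "s1", "s2", "s2", "s3", "s3"]
labels = [1, 0, 1, 1, 0, 1]

fold_scores = []
for held_out, train_idx, test_idx in loso_splits(subjects):
    train_labels = [labels[i] for i in train_idx]
    # Placeholder model: always predict the majority training label.
    majority = 1 if 2 * sum(train_labels) >= len(train_labels) else 0
    predictions = [majority for _ in test_idx]
    truth = [labels[i] for i in test_idx]
    fold_scores.append(f1_score(truth, predictions))

mean_f1 = sum(fold_scores) / len(fold_scores)
print(f"mean LOSO F1: {mean_f1:.3f}")
```

Holding out all data from one subject per fold keeps a person's movement style from leaking between training and test sets, which matters when the goal is to generalize to unseen patients.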

Unsupervised Recurrent Neural Network Grammars


Title	Unsupervised Recurrent Neural Network Grammars
Authors	Yoon Kim, Alexander M. Rush, Lei Yu, Adhiguna Kuncoro, Chris Dyer, Gábor Melis
Abstract	Recurrent neural network grammars (RNNG) are generative models of language which jointly model syntax and surface structure by incrementally generating a syntax tree and sentence in a top-down, left-to-right order. Supervised RNNGs achieve strong language modeling and parsing performance, but require an annotated corpus of parse trees. In this work, we experiment with unsupervised learning of RNNGs. Since directly marginalizing over the space of latent trees is intractable, we instead apply amortized variational inference. To maximize the evidence lower bound, we develop an inference network parameterized as a neural CRF constituency parser. On language modeling, unsupervised RNNGs perform as well as their supervised counterparts on benchmarks in English and Chinese. On constituency grammar induction, they are competitive with recent neural language models that induce tree structures from words through attention mechanisms.
Tasks	Constituency Grammar Induction, Language Modelling
Published	2019-04-07
URL	https://arxiv.org/abs/1904.03746v6
PDF	https://arxiv.org/pdf/1904.03746v6.pdf
PWC	https://paperswithcode.com/paper/unsupervised-recurrent-neural-network
Repo	https://github.com/harvardnlp/urnng
Framework	pytorch
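
The evidence lower bound mentioned in the abstract has the standard form below. The notation is an assumption added for illustration: $x$ is a sentence, $z$ a latent parse tree, $p_\theta$ the generative model (here the RNNG), and $q_\phi$ the inference network (here the CRF parser).

```latex
\log p_\theta(x)
\;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}
\bigl[ \log p_\theta(x, z) - \log q_\phi(z \mid x) \bigr]
```

Maximizing the right-hand side over both $\theta$ and $\phi$ avoids the intractable sum over all trees that computing $\log p_\theta(x)$ directly would require.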

DeepMoD: Deep learning for Model Discovery in noisy data


Title	DeepMoD: Deep learning for Model Discovery in noisy data
Authors	Gert-Jan Both, Subham Choudhury, Pierre Sens, Remy Kusters
Abstract	We introduce DeepMoD, a Deep learning based Model Discovery algorithm. DeepMoD discovers the partial differential equation underlying a spatio-temporal data set using sparse regression on a library of possible functions and their derivatives. A neural network approximates the data and constructs the function library, but it also performs the sparse regression. This construction makes it extremely robust to noise, applicable to small data sets, and, contrary to other deep learning methods, does not require a training set. We benchmark our approach on several physical problems such as the Burgers’, Korteweg-de Vries and Keller-Segel equations, and find that it requires as few as $\mathcal{O}(10^2)$ samples and works at noise levels up to $75\%$. Motivated by these results, we apply DeepMoD directly on noisy experimental time-series data from a gel electrophoresis experiment and find that it discovers the advection-diffusion equation describing this system.
Tasks	Time Series
Published	2019-04-20
URL	https://arxiv.org/abs/1904.09406v2
PDF	https://arxiv.org/pdf/1904.09406v2.pdf
PWC	https://paperswithcode.com/paper/190409406
Repo	https://github.com/PhIMaL/DeePyMoD_torch
Framework	pytorch
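
The idea of "sparse regression on a library of possible functions and their derivatives" can be written out as below. The particular library terms are an illustrative assumption, not the library used in the paper: the time derivative of the field $u$ is modeled as a linear combination of candidate terms with a coefficient vector $\xi$ that is encouraged to be sparse.

```latex
\partial_t u = \Theta(u)\, \xi,
\qquad
\Theta(u) = \bigl[\, 1,\; u,\; u_x,\; u_{xx},\; u\,u_x,\; u\,u_{xx},\; \dots \bigr]
```

For Burgers' equation, $\partial_t u = -u\,u_x + \nu\,u_{xx}$, the correct $\xi$ has only two nonzero entries: $-1$ for the $u\,u_x$ column and $\nu$ for the $u_{xx}$ column. Discovering the equation amounts to recovering which entries of $\xi$ are nonzero and what their values are.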

Formal Verification of Input-Output Mappings of Tree Ensembles


Title	Formal Verification of Input-Output Mappings of Tree Ensembles
Authors	John Törnblom, Simin Nadjm-Tehrani
Abstract	Recent advances in machine learning and artificial intelligence are now being considered in safety-critical autonomous systems where software defects may cause severe harm to humans and the environment. Design organizations in these domains are currently unable to provide convincing arguments that their systems are safe to operate when machine learning algorithms are used to implement their software. In this paper, we present an efficient method to extract equivalence classes from decision trees and tree ensembles, and to formally verify that their input-output mappings comply with requirements. The idea is that, given that safety requirements can be traced to desirable properties on system input-output patterns, we can use positive verification outcomes in safety arguments. This paper presents the implementation of the method in the tool VoTE (Verifier of Tree Ensembles), and evaluates its scalability on two case studies presented in current literature. We demonstrate that our method is practical for tree ensembles trained on low-dimensional data with up to 25 decision trees and tree depths of up to 20. Our work also studies the limitations of the method with high-dimensional data and preliminarily investigates the trade-off between a large number of trees and the time taken for verification.
Tasks
Published	2019-05-10
URL	https://arxiv.org/abs/1905.04194v2
PDF	https://arxiv.org/pdf/1905.04194v2.pdf
PWC	https://paperswithcode.com/paper/formal-verification-of-input-output-mappings
Repo	https://github.com/john-tornblom/vote
Framework	none
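
A rough sketch of the equivalence-class idea for a single decision tree: every leaf corresponds to a box of inputs that all map to the same output, so a requirement on the output can be checked once per reachable leaf instead of once per input. The tree encoding, function names, and the requirement below are hypothetical and do not reflect VoTE's actual data structures or API.

```python
def leaf_boxes(node, box):
    """Yield (box, value) for every leaf reachable from inputs inside `box`.

    `box` is a list of (low, high) intervals, one per feature.
    Inputs with x[feature] <= threshold go left, the others go right.
    Interval ends are treated as closed, a slight over-approximation.
    """
    if "value" in node:
        yield box, node["value"]
        return
    feature, threshold = node["feature"], node["threshold"]
    low, high = box[feature]
    if low <= threshold:  # part of the box goes left
        left_box = list(box)
        left_box[feature] = (low, min(high, threshold))
        yield from leaf_boxes(node["left"], left_box)
    if high > threshold:  # part of the box goes right
        right_box = list(box)
        right_box[feature] = (max(low, threshold), high)
        yield from leaf_boxes(node["right"], right_box)


def verify(tree, region, out_low, out_high):
    """True if every input in `region` yields an output in [out_low, out_high]."""
    return all(out_low <= value <= out_high
               for _, value in leaf_boxes(tree, region))


# Hypothetical two-feature tree.
tree = {
    "feature": 0, "threshold": 0.5,
    "left": {"value": 0.1},
    "right": {
        "feature": 1, "threshold": 2.0,
        "left": {"value": 0.4},
        "right": {"value": 0.9},
    },
}

# Requirement: the output stays within [0.0, 0.5] on the given input region.
small_region_ok = verify(tree, [(0.0, 1.0), (0.0, 1.0)], 0.0, 0.5)
large_region_ok = verify(tree, [(0.0, 1.0), (0.0, 3.0)], 0.0, 0.5)
print(small_region_ok, large_region_ok)
```

For an ensemble, the boxes of the individual trees would have to be intersected and their outputs combined, which is where the scalability limits described in the abstract come from.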

An Open-World Extension to Knowledge Graph Completion Models


Title	An Open-World Extension to Knowledge Graph Completion Models
Authors	Haseeb Shah, Johannes Villmow, Adrian Ulges, Ulrich Schwanecke, Faisal Shafait
Abstract	We present a novel extension to embedding-based knowledge graph completion models which enables them to perform open-world link prediction, i.e. to predict facts for entities unseen in training based on their textual description. Our model combines a regular link prediction model learned from a knowledge graph with word embeddings learned from a textual corpus. After training both independently, we learn a transformation to map the embeddings of an entity’s name and description to the graph-based embedding space. In experiments on several datasets including FB20k, DBPedia50k and our new dataset FB15k-237-OWE, we demonstrate competitive results. Particularly, our approach exploits the full knowledge graph structure even when textual descriptions are scarce, does not require a joint training on graph and text, and can be applied to any embedding-based link prediction model, such as TransE, ComplEx and DistMult.
Tasks	Knowledge Graph Completion, Link Prediction, Word Embeddings
Published	2019-06-19
URL	https://arxiv.org/abs/1906.08382v1
PDF	https://arxiv.org/pdf/1906.08382v1.pdf
PWC	https://paperswithcode.com/paper/an-open-world-extension-to-knowledge-graph
Repo	https://github.com/haseebs/OWE
Framework	pytorch
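
The pipeline in the abstract (a graph-trained link prediction model, text embeddings, and a learned transformation between the two spaces) can be illustrated with a toy example. Everything below is an assumption made for illustration: TransE is used as the link prediction model because the abstract names it as one option, the transformation is taken to be a plain linear map, and all vectors and the matrix are made up rather than learned.

```python
import math


def transe_score(head, relation, tail):
    """TransE plausibility: negative L2 distance between head + relation and tail."""
    return -math.sqrt(sum((h + r - t) ** 2
                          for h, r, t in zip(head, relation, tail)))


def apply_linear_map(matrix, vector):
    """Map a text-space vector into graph-embedding space."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Hypothetical graph-space embeddings (2-D) of entities seen in training.
entity_embeddings = {
    "a": [1.0, 1.0],
    "b": [0.0, 0.0],
    "c": [2.0, -1.0],
}
relation_embedding = [0.0, 1.0]

# Hypothetical text embedding (3-D) of an entity unseen in training,
# e.g. an average of the word vectors in its name and description.
text_embedding = [1.0, 0.0, 1.0]

# Hypothetical transformation from text space (3-D) to graph space (2-D).
text_to_graph = [
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]

mapped_head = apply_linear_map(text_to_graph, text_embedding)
scores = {name: transe_score(mapped_head, relation_embedding, emb)
          for name, emb in entity_embeddings.items()}
best_tail = max(scores, key=scores.get)
print(mapped_head, best_tail)
```

Because only the map is trained on text, the link prediction model itself is left unchanged, which is why the approach can sit on top of different embedding-based models.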

Jacobian Adversarially Regularized Networks for Robustness


Title	Jacobian Adversarially Regularized Networks for Robustness
Authors	Alvin Chan, Yi Tay, Yew Soon Ong, Jie Fu
Abstract	Adversarial examples are crafted with imperceptible perturbations with the intent to fool neural networks. Against such attacks, adversarial training and its variants stand as the strongest defense to date. Previous studies have pointed out that robust models that have undergone adversarial training tend to produce more salient and interpretable Jacobian matrices than their non-robust counterparts. A natural question is whether a model trained with an objective to produce a salient Jacobian can result in better robustness. This paper answers this question with affirmative empirical results. We propose Jacobian Adversarially Regularized Networks (JARN) as a method to optimize the saliency of a classifier’s Jacobian by adversarially regularizing the model’s Jacobian to resemble natural training images. Image classifiers trained with JARN show improved robust accuracy compared to standard models on the MNIST, SVHN and CIFAR-10 datasets, uncovering a new angle to boost robustness without using adversarial training examples.
Tasks
Published	2019-12-21
URL	https://arxiv.org/abs/1912.10185v2
PDF	https://arxiv.org/pdf/1912.10185v2.pdf
PWC	https://paperswithcode.com/paper/jacobian-adversarially-regularized-networks-1
Repo	https://github.com/alvinchangw/JARN_ICLR2020
Framework	tf

GRU-ODE-Bayes: Continuous modeling of sporadically-observed time series


Title	GRU-ODE-Bayes: Continuous modeling of sporadically-observed time series
Authors	Edward De Brouwer, Jaak Simm, Adam Arany, Yves Moreau
Abstract	Modeling real-world multidimensional time series can be particularly challenging when these are sporadically observed (i.e., sampling is irregular both in time and across dimensions), such as in the case of clinical patient data. To address these challenges, we propose (1) a continuous-time version of the Gated Recurrent Unit, building upon the recent Neural Ordinary Differential Equations (Chen et al., 2018), and (2) a Bayesian update network that processes the sporadic observations. We bring these two ideas together in our GRU-ODE-Bayes method. We then demonstrate that the proposed method encodes a continuity prior for the latent process and that it can exactly represent the Fokker-Planck dynamics of complex processes driven by a multidimensional stochastic differential equation. Additionally, empirical evaluation shows that our method outperforms the state of the art on both synthetic data and real-world data with applications in healthcare and climate forecasting. What is more, the continuity prior is shown to be well suited for settings with a low number of samples.
Tasks	Irregular Time Series, Multivariate Time Series Forecasting, Time Series
Published	2019-05-29
URL	https://arxiv.org/abs/1905.12374v2
PDF	https://arxiv.org/pdf/1905.12374v2.pdf
PWC	https://paperswithcode.com/paper/gru-ode-bayes-continuous-modeling-of
Repo	https://github.com/edebrouwer/gru_ode_bayes
Framework	pytorch
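
A continuous-time GRU can be obtained by treating the discrete GRU update as a difference equation and taking the step size to zero, which gives dh/dt = (1 - z) * (g - h), with z the update gate and g the candidate state. The scalar sketch below integrates that equation with explicit Euler steps. It is an illustration under stated assumptions: the weights are arbitrary made-up numbers, the state is one-dimensional, the input is held constant, and the Bayesian update at observation times is omitted entirely.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gru_ode_derivative(h, x, p):
    """dh/dt = (1 - z) * (g - h) for a scalar hidden state h and input x."""
    r = sigmoid(p["w_r"] * x + p["u_r"] * h + p["b_r"])          # reset gate
    z = sigmoid(p["w_z"] * x + p["u_z"] * h + p["b_z"])          # update gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * (r * h) + p["b_g"])  # candidate state
    return (1.0 - z) * (g - h)


def integrate(h0, x, p, dt, steps):
    """Explicit Euler integration; returns the trajectory including h0."""
    trajectory = [h0]
    h = h0
    for _ in range(steps):
        h = h + dt * gru_ode_derivative(h, x, p)
        trajectory.append(h)
    return trajectory


# Hypothetical parameters, chosen arbitrarily for the demonstration.
params = {
    "w_r": 0.5, "u_r": -0.3, "b_r": 0.0,
    "w_z": 0.8, "u_z": 0.4, "b_z": -0.2,
    "w_g": 1.0, "u_g": 0.7, "b_g": 0.1,
}
dt = 0.01
trajectory = integrate(h0=0.9, x=0.0, p=params, dt=dt, steps=200)
print(f"h(0) = {trajectory[0]:.3f}, h(2) = {trajectory[-1]:.3f}")
```

Since g lies in (-1, 1) and each Euler step is a convex combination of h and g when dt is small, a state that starts in [-1, 1] stays there, and consecutive states differ by at most 2 * dt. This bounded rate of change is one way to read the "continuity prior" mentioned in the abstract.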

KaWAT: A Word Analogy Task Dataset for Indonesian


Title	KaWAT: A Word Analogy Task Dataset for Indonesian
Authors	Kemal Kurniawan
Abstract	We introduced KaWAT (Kata Word Analogy Task), a new word analogy task dataset for Indonesian. We used it to evaluate several existing pretrained Indonesian word embeddings as well as embeddings trained on an Indonesian online news corpus. We also tested them on two downstream tasks and found that pretrained word embeddings helped, either by reducing the number of training epochs or by yielding significant performance gains.
Tasks	Word Embeddings
Published	2019-06-17
URL	https://arxiv.org/abs/1906.09912v1
PDF	https://arxiv.org/pdf/1906.09912v1.pdf
PWC	https://paperswithcode.com/paper/kawat-a-word-analogy-task-dataset-for
Repo	https://github.com/kata-ai/kawat
Framework	none
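
Word analogy datasets are commonly scored with the vector offset method (often called 3CosAdd): for a question "a is to b as c is to ?", pick the vocabulary word whose vector is closest in cosine similarity to b - a + c, excluding the three query words. The abstract does not state which scoring rule was used with KaWAT, so the sketch below is an assumption, and the tiny hand-made vectors are hypothetical rather than real Indonesian embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def solve_analogy(embeddings, a, b, c):
    """Return the word d maximizing cos(d, b - a + c), excluding a, b and c."""
    target = [vb - va + vc for va, vb, vc in
              zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


def analogy_accuracy(embeddings, questions):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    correct = sum(1 for a, b, c, expected in questions
                  if solve_analogy(embeddings, a, b, c) == expected)
    return correct / len(questions)


# Hypothetical 3-D vectors: pria (man), wanita (woman), raja (king),
# ratu (queen), apel (apple).
toy_embeddings = {
    "pria": [1.0, 0.0, 0.1],
    "wanita": [1.0, 1.0, 0.1],
    "raja": [1.0, 0.0, 1.0],
    "ratu": [1.0, 1.0, 1.0],
    "apel": [0.0, 0.2, -1.0],
}
questions = [("pria", "raja", "wanita", "ratu")]
answer = solve_analogy(toy_embeddings, "pria", "raja", "wanita")
accuracy = analogy_accuracy(toy_embeddings, questions)
print(answer, accuracy)
```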

Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings


Title	Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings
Authors	Yadollah Yaghoobzadeh, Katharina Kann, Timothy J. Hazen, Eneko Agirre, Hinrich Schütze
Abstract	Word embeddings typically represent different meanings of a word in a single conflated vector. Empirical analysis of embeddings of ambiguous words is currently limited by the small size of manually annotated resources and by the fact that word senses are treated as unrelated individual concepts. We present a large dataset based on manual Wikipedia annotations and word senses, where word senses from different words are related by semantic classes. This is the basis for novel diagnostic tests for an embedding’s content: we probe word embeddings for semantic classes and analyze the embedding space by classifying embeddings into semantic classes. Our main findings are: (i) Information about a sense is generally represented well in a single-vector embedding - if the sense is frequent. (ii) A classifier can accurately predict whether a word is single-sense or multi-sense, based only on its embedding. (iii) Although rare senses are not well represented in single-vector embeddings, this does not have a negative impact on an NLP application whose performance depends on frequent senses.
Tasks	Word Embeddings
Published	2019-06-09
URL	https://arxiv.org/abs/1906.03608v1
PDF	https://arxiv.org/pdf/1906.03608v1.pdf
PWC	https://paperswithcode.com/paper/probing-for-semantic-classes-diagnosing-the
Repo	https://github.com/yyaghoobzadeh/WIKI-PSE
Framework	none

Enhancing Gradient-based Attacks with Symbolic Intervals


Title	Enhancing Gradient-based Attacks with Symbolic Intervals
Authors	Shiqi Wang, Yizheng Chen, Ahmed Abdou, Suman Jana
Abstract	Recent breakthroughs in defenses against adversarial examples, like adversarial training, make neural networks robust against various classes of attackers (e.g., first-order gradient-based attacks). However, it is an open question whether the adversarially trained networks are truly robust under unknown attacks. In this paper, we present interval attacks, a new technique to find adversarial examples to evaluate the robustness of neural networks. Interval attacks leverage symbolic interval propagation, a bound propagation technique that can exploit a broader view around the current input to locate promising areas containing adversarial instances, which in turn can be searched with existing gradient-guided attacks. We can obtain such a broader view using sound bound propagation methods to track and over-approximate the errors of the network within given input ranges. Our results show that, on state-of-the-art adversarially trained networks, the interval attack finds on average 47% more violations, in relative terms, than the state-of-the-art gradient-guided PGD attack.
Tasks
Published	2019-06-05
URL	https://arxiv.org/abs/1906.02282v1
PDF	https://arxiv.org/pdf/1906.02282v1.pdf
PWC	https://paperswithcode.com/paper/enhancing-gradient-based-attacks-with
Repo	https://github.com/tcwangshiqi-columbia/Interval-Attack
Framework	tf
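
To illustrate what "sound bound propagation" over an input range means, the sketch below pushes an input box through a small ReLU network using plain interval arithmetic. This is a simpler stand-in for the symbolic interval propagation the abstract refers to, which tracks dependencies between neurons and gives tighter bounds. The network weights are hypothetical, and the attack itself (using the bounds to guide a gradient-based search) is not shown.

```python
def affine_bounds(lower, upper, weights, bias):
    """Sound output bounds of W x + b when lower <= x <= upper elementwise."""
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to both ends of the interval."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def output_bounds(lower, upper, layers):
    """Propagate an input box through affine layers with ReLU in between."""
    for i, (weights, bias) in enumerate(layers):
        lower, upper = affine_bounds(lower, upper, weights, bias)
        if i < len(layers) - 1:
            lower, upper = relu_bounds(lower, upper)
    return lower, upper


def forward(x, layers):
    """Ordinary forward pass for a single concrete input."""
    for i, (weights, bias) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, bias)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


# Hypothetical 2-2-1 network and an input box around the point (1.0, 0.0).
layers = [
    ([[1.0, -1.0], [2.0, 1.0]], [0.0, -1.0]),
    ([[1.0, -1.0]], [0.0]),
]
box_lower, box_upper = [0.9, -0.1], [1.1, 0.1]
out_lower, out_upper = output_bounds(box_lower, box_upper, layers)
print(out_lower, out_upper)
```

The bounds are sound but loose: every concrete input in the box produces an output inside them, yet not every value inside them is reachable. Tighter bounds narrow down which regions can actually contain a violation, which is the role symbolic intervals play in the attack.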