January 30, 2020

3105 words · 15 min read

Paper Group ANR 338


Classification des Séries Temporelles Incertaines par Transformation Shapelet. COSTRA 1.0: A Dataset of Complex Sentence Transformations. Data Efficient Direct Speech-to-Text Translation with Modality Agnostic Meta-Learning. Deep Efficient End-to-end Reconstruction (DEER) Network for Low-dose Few-view Breast CT from Projection Data. The Intrinsic …

Classification des Séries Temporelles Incertaines par Transformation Shapelet

Title Classification des Séries Temporelles Incertaines par Transformation Shapelet (Classification of Uncertain Time Series by Shapelet Transformation)
Authors Michael Mbouopda, Engelbert Mephu Nguifo
Abstract Time series classification is used in a diverse range of domains such as meteorology, medicine and physics. It aims to classify chronological data. Many accurate approaches have been built during the last decade, and the shapelet transform is one of them. However, none of these approaches takes data uncertainty into account. Using uncertainty propagation techniques, we propose a new dissimilarity measure based on the Euclidean distance. We also show how to use this new measure to adapt the shapelet transform to uncertain time series classification. An experimental assessment of our contribution is done on state-of-the-art datasets. (A minimal sketch of the uncertain distance appears after this entry.)
Tasks Time Series, Time Series Classification
Published 2019-12-11
URL https://arxiv.org/abs/1912.08919v1
PDF https://arxiv.org/pdf/1912.08919v1.pdf
PWC https://paperswithcode.com/paper/classification-des-series-temporelles
Repo
Framework
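
As a complement to the abstract above, here is a minimal sketch of the core idea: a Euclidean distance whose uncertainty is obtained by first-order propagation, used to score shapelet alignments. The representation (per-point standard errors) and the propagation rule are generic assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def uncertain_euclidean_distance(x, dx, y, dy):
    """Euclidean distance between two uncertain (sub)series.

    x, y   -- point estimates
    dx, dy -- per-point uncertainties (assumed standard errors)

    Returns (distance, propagated uncertainty), using first-order
    propagation through d = sqrt(sum((x - y)**2)).
    """
    diff = x - y
    d = np.sqrt(np.sum(diff ** 2))
    # var(d) ~ sum_i (dd/dx_i)^2 var(x_i) + (dd/dy_i)^2 var(y_i),
    # with dd/dx_i = diff_i / d.
    var = np.sum((diff / max(d, 1e-12)) ** 2 * (dx ** 2 + dy ** 2))
    return d, np.sqrt(var)

def shapelet_transform(series, uncert, shapelets, shapelet_uncerts):
    """Represent a series by its (uncertain) best-match distance to
    each shapelet candidate, as in the shapelet transform."""
    feats = []
    for s, ds in zip(shapelets, shapelet_uncerts):
        L = len(s)
        cands = [uncertain_euclidean_distance(series[i:i + L], uncert[i:i + L], s, ds)
                 for i in range(len(series) - L + 1)]
        feats.append(min(cands, key=lambda t: t[0]))  # minimum over alignments
    return feats
```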

COSTRA 1.0: A Dataset of Complex Sentence Transformations

Title COSTRA 1.0: A Dataset of Complex Sentence Transformations
Authors Petra Barančíková, Ondřej Bojar
Abstract We present COSTRA 1.0, a dataset of complex sentence transformations. The dataset is intended for the study of sentence-level embeddings beyond simple word alternations or standard paraphrasing. This first version of the dataset is limited to sentences in Czech, but the construction method is universal and we plan to apply it to other languages as well. The dataset consists of 4,262 unique sentences with an average length of 10 words, illustrating 15 types of modifications such as simplification, generalization, or formal and informal language variation. The hope is that with this dataset, we should be able to test semantic properties of sentence embeddings and perhaps even find some topologically interesting ‘skeleton’ in the sentence embedding space. A preliminary analysis using LASER, a multi-purpose multilingual sentence embedding, suggests that the LASER space does not exhibit the desired properties.
Tasks Sentence Embedding, Sentence Embeddings
Published 2019-12-03
URL https://arxiv.org/abs/1912.01673v1
PDF https://arxiv.org/pdf/1912.01673v1.pdf
PWC https://paperswithcode.com/paper/costra-10-a-dataset-of-complex-sentence
Repo
Framework

Data Efficient Direct Speech-to-Text Translation with Modality Agnostic Meta-Learning

Title Data Efficient Direct Speech-to-Text Translation with Modality Agnostic Meta-Learning
Authors Sathish Indurthi, Houjeung Han, Nikhil Kumar Lakumarapu, Beomseok Lee, Insoo Chung, Sangha Kim, Chanwoo Kim
Abstract End-to-end Speech Translation (ST) models have several advantages over conventional pipelines that combine Automatic Speech Recognition (ASR) and text Machine Translation (MT) models, such as lower latency, smaller model size, and less error compounding. However, collecting large amounts of parallel data for the ST task is more difficult than for the ASR and MT tasks. Previous studies have proposed transfer learning approaches to overcome this difficulty, benefiting from weakly supervised training data such as ASR speech-to-transcript or MT text-to-text translation pairs. However, the parameters in these models are updated independently for each task, which may lead to sub-optimal solutions. In this work, we adopt a meta-learning algorithm to train a modality-agnostic multi-task model that transfers knowledge from the source tasks (ASR and MT) to the target task (ST), where the ST task severely lacks data. In the meta-learning phase, the parameters of the model are exposed to vast amounts of speech transcripts (e.g., English ASR) and text translations (e.g., English-German MT). During this phase, parameters are updated so as to understand speech and text representations and the relation between them, as well as to act as a good initialization point for the target ST task. We evaluate the proposed meta-learning approach for ST tasks on English-German (En-De) and English-French (En-Fr) language pairs from the Multilingual Speech Translation Corpus (MuST-C). Our method outperforms the previous transfer learning approaches and sets new state-of-the-art results for the En-De and En-Fr ST tasks, obtaining 9.18 and 11.76 BLEU point improvements, respectively. (A first-order meta-update sketch follows this entry.)
Tasks Machine Translation, Meta-Learning, Speech Recognition, Transfer Learning
Published 2019-11-11
URL https://arxiv.org/abs/1911.04283v1
PDF https://arxiv.org/pdf/1911.04283v1.pdf
PWC https://paperswithcode.com/paper/data-efficient-direct-speech-to-text
Repo
Framework
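
The following is a hedged sketch of the meta-learning loop described above, written as a Reptile-style first-order update over the source tasks. The paper's actual algorithm, model, and gradient computation differ; everything here (shapes, learning rates, and the random `task_grad` stand-in) is illustrative only.

```python
import numpy as np

def task_grad(task, params):
    """Hypothetical stand-in for a minibatch gradient of the shared
    encoder-decoder on `task`; a random direction for illustration."""
    rng = np.random.default_rng(abs(hash(task)) % 2**32)
    return rng.normal(size=params.shape) * 1e-2

params = np.zeros(1000)              # shared, modality-agnostic parameters
inner_lr, outer_lr, inner_steps = 0.01, 0.1, 5

# Reptile-style meta-training: adapt to each sampled source task, then
# move the shared initialization toward the adapted parameters.
for step in range(100):
    task = np.random.choice(["ASR_en", "MT_en_de"])
    adapted = params.copy()
    for _ in range(inner_steps):
        adapted -= inner_lr * task_grad(task, adapted)
    params += outer_lr * (adapted - params)

# The meta-learned `params` then initialize fine-tuning on the
# low-resource target task (ST) with the same inner loop.
```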

Deep Efficient End-to-end Reconstruction (DEER) Network for Low-dose Few-view Breast CT from Projection Data

Title Deep Efficient End-to-end Reconstruction (DEER) Network for Low-dose Few-view Breast CT from Projection Data
Authors Huidong Xie, Hongming Shan, Wenxiang Cong, Xiaohua Zhang, Shaohua Liu, Ruola Ning, Ge Wang
Abstract Breast CT provides image volumes with isotropic resolution in high contrast, enabling detection of calcifications (down to a few hundred microns in size) and subtle density differences. Since the breast is sensitive to X-ray radiation, dose reduction is an important topic in breast CT, and low-dose few-view scanning is a main approach for this purpose. In this article, we propose a Deep Efficient End-to-end Reconstruction (DEER) network for low-dose few-view breast CT. The major merits of our network include high dose efficiency, excellent image quality, and low model complexity. By design, the proposed network can learn the reconstruction process with as few as O(N) parameters, where N is the size of an image to be reconstructed, an orders-of-magnitude improvement over state-of-the-art deep-learning-based reconstruction methods that map projection data directly to tomographic images. As a result, our method does not require expensive GPUs to train and run. Validated on a cone-beam breast CT dataset prepared by Koning Corporation on a commercial scanner, our method also demonstrates image quality competitive with state-of-the-art reconstruction networks.
Tasks
Published 2019-12-09
URL https://arxiv.org/abs/1912.04278v2
PDF https://arxiv.org/pdf/1912.04278v2.pdf
PWC https://paperswithcode.com/paper/deep-efficient-end-to-end-reconstruction-deer
Repo
Framework

The Intrinsic Properties of Brain Based on the Network Structure

Title The Intrinsic Properties of Brain Based on the Network Structure
Authors Xiang Zou, Lie Yao, Donghua Zhao, Liang Chen, Ying Mao
Abstract Objective: The brain is a remarkable organ that helps a creature adapt to its environment. The network is the most essential structure of the brain, but what even a simple network is capable of remains unclear. In this study, we try to explain some brain functions through network properties alone. Methods: Every network can be reduced to an equivalent simplified network, expressed by an equation set. The dynamics of the equation set can be described by some basic equations obtained through mathematical derivation. Results: (1) In a closed network, stability depends on the proportion of excitatory to inhibitory synapses. Spike probabilities in the assembly satisfy the solution of a nonlinear equation set. (2) Network activity can spontaneously evolve into a certain distribution under different stimulation, which is closely related to decision making. (3) Short-term memory can be formed by coupling of network assemblies. Conclusion: The essential properties of a network may underlie some important brain functions.
Tasks Decision Making
Published 2019-11-02
URL https://arxiv.org/abs/1911.00640v1
PDF https://arxiv.org/pdf/1911.00640v1.pdf
PWC https://paperswithcode.com/paper/the-intrinsic-properties-of-brain-based-on
Repo
Framework

Fast and Robust Shortest Paths on Manifolds Learned from Data

Title Fast and Robust Shortest Paths on Manifolds Learned from Data
Authors Georgios Arvanitidis, Søren Hauberg, Philipp Hennig, Michael Schober
Abstract We propose a fast, simple and robust algorithm for computing shortest paths and distances on Riemannian manifolds learned from data. This amounts to solving a system of ordinary differential equations (ODEs) subject to boundary conditions. Standard solvers perform poorly here because they require well-behaved Jacobians of the ODE, and manifolds learned from data usually imply unstable and ill-conditioned Jacobians. Instead, we propose a fixed-point iteration scheme for solving the ODE that avoids Jacobians altogether. This enhances the stability of the solver while reducing the computational cost. In experiments involving both Riemannian metric learning and deep generative models, we demonstrate significant improvements in speed and stability over general-purpose state-of-the-art solvers as well as over specialized solvers. (A simplified fixed-point scheme is sketched after this entry.)
Tasks Metric Learning
Published 2019-01-22
URL http://arxiv.org/abs/1901.07229v1
PDF http://arxiv.org/pdf/1901.07229v1.pdf
PWC https://paperswithcode.com/paper/fast-and-robust-shortest-paths-on-manifolds
Repo
Framework
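
Below is a simplified, dense-linear-algebra illustration of a Jacobian-free fixed-point scheme for the geodesic boundary value problem c''(t) = f(c, c'), c(0) = a, c(1) = b: discretize with second differences and repeatedly solve the linear part against the previous iterate's right-hand side. The paper's scheme is more refined; this sketch only conveys the mechanism.

```python
import numpy as np

def solve_geodesic(f, a, b, n=50, iters=100, tol=1e-8):
    """Fixed-point iteration for c''(t) = f(c, c'), c(0)=a, c(1)=b,
    avoiding any Jacobians of f.  f maps (n, d) arrays (c, dc) to the
    second derivative, also (n, d)."""
    t = np.linspace(0.0, 1.0, n)
    h = t[1] - t[0]
    c = np.outer(1 - t, a) + np.outer(t, b)   # straight-line initialization

    # second-difference matrix over interior points (size n-2)
    A = (np.diag(-2.0 * np.ones(n - 2))
         + np.diag(np.ones(n - 3), 1)
         + np.diag(np.ones(n - 3), -1))
    A_inv = np.linalg.inv(A)

    for _ in range(iters):
        dc = np.gradient(c, h, axis=0)
        rhs = h ** 2 * f(c, dc)[1:-1]
        rhs[0] -= a                   # fold boundary values into the system
        rhs[-1] -= b
        c_new = c.copy()
        c_new[1:-1] = A_inv @ rhs     # linear solve, no Jacobian of f
        if np.max(np.abs(c_new - c)) < tol:
            return c_new
        c = c_new
    return c

# e.g. on a flat manifold (f = 0) the geodesic is the straight line:
path = solve_geodesic(lambda c, dc: np.zeros_like(c),
                      np.array([0.0, 0.0]), np.array([1.0, 1.0]))
```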

Unity in Diversity: Learning Distributed Heterogeneous Sentence Representation for Extractive Summarization

Title Unity in Diversity: Learning Distributed Heterogeneous Sentence Representation for Extractive Summarization
Authors Abhishek Kumar Singh, Manish Gupta, Vasudeva Varma
Abstract Automated multi-document extractive text summarization is a widely studied research problem in natural language understanding. Such extractive mechanisms compute, in some form, the worthiness of a sentence for inclusion in the summary. While conventional approaches rely on hand-crafted, document-independent features to generate a summary, we develop a novel data-driven summarization system called HNet, which exploits the semantic and compositional aspects latent in a sentence to capture document-independent features. The network learns sentence representations such that salient sentences are closer in the vector space than non-salient ones. This semantic and compositional feature vector is then concatenated with document-dependent features for sentence ranking. Experiments on the DUC benchmark datasets (DUC-2001, DUC-2002 and DUC-2004) indicate that our model achieves a significant performance gain of around 1.5-2 ROUGE points over state-of-the-art baselines. (A toy version of the training signal is sketched after this entry.)
Tasks Text Summarization
Published 2019-12-25
URL https://arxiv.org/abs/1912.11688v1
PDF https://arxiv.org/pdf/1912.11688v1.pdf
PWC https://paperswithcode.com/paper/unity-in-diversity-learning-distributed
Repo
Framework
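
As a rough illustration of the training signal described above (salient sentences pulled closer than non-salient ones, then concatenation with document-dependent features), here is a triplet-style stand-in. The actual HNet objective and architecture are not specified in the abstract, so this is purely an assumed analogue.

```python
import numpy as np

def hinge_saliency_loss(anchor, salient, non_salient, margin=1.0):
    """Pull salient sentence vectors toward the anchor and push
    non-salient ones away -- a generic triplet-style stand-in."""
    d_pos = np.linalg.norm(anchor - salient)
    d_neg = np.linalg.norm(anchor - non_salient)
    return max(0.0, margin + d_pos - d_neg)

def rank_features(sent_vec, doc_features):
    """Concatenate the learned representation with document-dependent
    features (e.g. position, length) before the ranking layer."""
    return np.concatenate([sent_vec, doc_features])
```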

Learning to Caption Images with Two-Stream Attention and Sentence Auto-Encoder

Title Learning to Caption Images with Two-Stream Attention and Sentence Auto-Encoder
Authors Arushi Goel, Basura Fernando, Thanh-Son Nguyen, Hakan Bilen
Abstract Automatically generating natural language descriptions from an image is a challenging problem in artificial intelligence that requires a good understanding of the correlations between visual and textual cues. To bridge these two modalities, state-of-the-art methods commonly use a dynamic interface between image and text, called attention, that learns to identify related image parts to estimate the next word conditioned on the previous steps. While this mechanism is effective, it fails to find the right associations between visual and textual cues when they are noisy. In this paper we propose two novel approaches to address this issue: (i) a two-stream attention mechanism that can automatically discover latent categories and relate them to image regions based on the previously generated words, and (ii) a regularization technique that encapsulates the syntactic and semantic structure of captions and improves the optimization of the image captioning model. Our qualitative and quantitative results demonstrate remarkable improvements on the MSCOCO dataset and lead to new state-of-the-art performance for image captioning.
Tasks Image Captioning
Published 2019-11-22
URL https://arxiv.org/abs/1911.10082v1
PDF https://arxiv.org/pdf/1911.10082v1.pdf
PWC https://paperswithcode.com/paper/learning-to-caption-images-with-two-stream
Repo
Framework

A Simple Deep Personalized Recommendation System

Title A Simple Deep Personalized Recommendation System
Authors Pavlos Mitsoulis-Ntompos, Meisam Hejazinia, Serena Zhang, Travis Brady
Abstract Recommender systems are critical tools for matching listings and travelers in two-sided vacation rental marketplaces. Such systems require high capacity to extract user preferences for items from implicit signals at scale. To learn those preferences, we propose a Simple Deep Personalized Recommendation System that computes travelers’ conditional embeddings. Our method combines listing embeddings in a supervised structure to build short-term historical context and personalize recommendations for travelers. Deployed in the production environment, this approach is computationally efficient and scalable, and allows us to capture non-linear dependencies. Our offline evaluation indicates that traveler embeddings created using a Deep Average Network can improve the precision of a downstream conversion prediction model by seven percent, outperforming more complex benchmark methods for online shopping experience personalization. (A minimal sketch appears after this entry.)
Tasks Recommendation Systems
Published 2019-06-26
URL https://arxiv.org/abs/1906.11336v2
PDF https://arxiv.org/pdf/1906.11336v2.pdf
PWC https://paperswithcode.com/paper/a-simple-deep-personalized-recommendation
Repo
Framework
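
A minimal sketch of the Deep Average Network idea mentioned above: average the listing embeddings in a traveler's short-term history and pass the result through a small feed-forward network to obtain a conditional traveler embedding. All shapes and parameters here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical pretrained listing embeddings (one row per listing) and a
# small two-layer Deep Average Network.
listing_emb = rng.normal(size=(10_000, 32))
W1, b1 = rng.normal(size=(32, 64)) * 0.1, np.zeros(64)
W2, b2 = rng.normal(size=(64, 32)) * 0.1, np.zeros(32)

def traveler_embedding(history_ids):
    """Average recently viewed listing embeddings, then apply the DAN."""
    h = listing_emb[history_ids].mean(axis=0)
    return relu(h @ W1 + b1) @ W2 + b2

# The resulting embedding feeds a downstream conversion-prediction model.
user_vec = traveler_embedding([17, 42, 256])
```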

Video Modeling with Correlation Networks

Title Video Modeling with Correlation Networks
Authors Heng Wang, Du Tran, Lorenzo Torresani, Matt Feiszli
Abstract Motion is a salient cue for recognizing actions in video. Modern action recognition models leverage motion information either explicitly, by using optical flow as input, or implicitly, by means of 3D convolutional filters that simultaneously capture appearance and motion information. This paper proposes an alternative approach based on a learnable correlation operator that can be used to establish frame-to-frame matches over convolutional feature maps in the different layers of the network (see the sketch after this entry). The proposed architecture enables the fusion of this explicit temporal matching information with traditional appearance cues captured by 2D convolution. Our correlation network compares favorably with widely used 3D CNNs for video modeling, and achieves competitive results over the prominent two-stream network while being much faster to train. We empirically demonstrate that correlation networks produce strong results on a variety of video datasets, and outperform the state of the art on three popular benchmarks for action recognition: Kinetics, Something-Something and Diving48.
Tasks Optical Flow Estimation
Published 2019-06-07
URL https://arxiv.org/abs/1906.03349v1
PDF https://arxiv.org/pdf/1906.03349v1.pdf
PWC https://paperswithcode.com/paper/video-modeling-with-correlation-networks
Repo
Framework
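
The sketch below shows the non-learnable core of a frame-to-frame correlation operator: a local cost volume between consecutive feature maps. The paper's operator is additionally learnable and applied at multiple layers, which this illustration omits.

```python
import numpy as np

def local_correlation(feat_t, feat_t1, max_disp=3):
    """Frame-to-frame matching costs between two (C, H, W) feature maps.

    Returns a (K*K, H, W) correlation volume, one channel per displacement
    in a (2*max_disp+1)^2 local search window.
    """
    C, H, W = feat_t.shape
    k = 2 * max_disp + 1
    pad = np.pad(feat_t1, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    out = np.empty((k * k, H, W))
    for i, dy in enumerate(range(-max_disp, max_disp + 1)):
        for j, dx in enumerate(range(-max_disp, max_disp + 1)):
            shifted = pad[:, max_disp + dy: max_disp + dy + H,
                             max_disp + dx: max_disp + dx + W]
            # channel-wise dot product, normalized by channel count
            out[i * k + j] = (feat_t * shifted).sum(axis=0) / C
    return out
```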

Recursive Subtree Composition in LSTM-Based Dependency Parsing

Title Recursive Subtree Composition in LSTM-Based Dependency Parsing
Authors Miryam de Lhoneux, Miguel Ballesteros, Joakim Nivre
Abstract The need for tree structure modelling on top of sequence modelling is an open issue in neural dependency parsing. We investigate the impact of adding a tree layer on top of a sequential model by recursively composing subtree representations (composition) in a transition-based parser that uses features extracted by a BiLSTM. Composition seems superfluous with such a model, suggesting that BiLSTMs capture information about subtrees. We perform model ablations to tease out the conditions under which composition helps. When ablating the backward LSTM, performance drops and composition does not recover much of the gap. When ablating the forward LSTM, performance drops less dramatically and composition recovers a substantial part of the gap, indicating that a forward LSTM and composition capture similar information. We take the backward LSTM to be related to lookahead features and the forward LSTM to the rich history-based features, both crucial for transition-based parsers. To capture history-based information, composition is better than a forward LSTM on its own, but it is even better to have a forward LSTM as part of a BiLSTM. We correlate results with language properties, showing that the improved lookahead of a backward LSTM is especially important for head-final languages. (A toy composition function is sketched after this entry.)
Tasks Dependency Parsing
Published 2019-02-26
URL http://arxiv.org/abs/1902.09781v1
PDF http://arxiv.org/pdf/1902.09781v1.pdf
PWC https://paperswithcode.com/paper/recursive-subtree-composition-in-lstm-based
Repo
Framework
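
A toy version of the composition step discussed above: when a dependent is attached, the head's vector is replaced by a composed subtree vector. The parametrization (a single affine map plus tanh) is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64
# hypothetical composition parameters
W = rng.normal(size=(DIM, 2 * DIM)) * 0.1
b = np.zeros(DIM)

def compose(head_vec, dep_vec):
    """Update the head's representation when a dependent is attached,
    as in recursive subtree composition for a transition-based parser."""
    return np.tanh(W @ np.concatenate([head_vec, dep_vec]) + b)

# e.g. attaching the dependent "brown" to the head "dog":
head = rng.normal(size=DIM)   # stand-in for a BiLSTM word feature
dep = rng.normal(size=DIM)
head = compose(head, dep)     # the subtree vector replaces the head vector
```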

Extracting Conceptual Knowledge from Natural Language Text Using Maximum Likelihood Principle

Title Extracting Conceptual Knowledge from Natural Language Text Using Maximum Likelihood Principle
Authors Shipra Sharma, Balwinder Sodhi
Abstract Domain-specific knowledge graphs constructed from natural language text are ubiquitous in today’s world. In many such scenarios the base text, from which the knowledge graph is constructed, concerns itself with practical, on-hand, actual or ground-reality information about the domain. Product documentation in the software engineering domain is one example of such base texts. Other examples include blogs and texts related to digital artifacts, reports on emerging markets and business models, patient medical records, etc. Though the above sources contain a wealth of knowledge about their respective domains, the conceptual knowledge on which they are based is often missing or unclear. Access to this conceptual knowledge can enormously increase the utility of the available data and assist in several tasks such as knowledge graph completion, grounding, and querying. Our contributions in this paper are twofold. First, we propose a novel Markovian stochastic model for document generation from conceptual knowledge. The uniqueness of our approach lies in the fact that the conceptual knowledge in the writer’s mind forms a component of the parameter set of our stochastic model. Second, we solve the inverse problem of learning the best conceptual knowledge from a given document by finding the model parameters that maximize the likelihood of generating that document over all possible parameter values. This likelihood maximization is done using an application of the Baum-Welch algorithm, a known special case of the Expectation-Maximization (EM) algorithm (illustrated after this entry). We run our conceptualization algorithm on several well-known natural language sources and obtain very encouraging results. The results of our extensive experiments concur with the hypothesis that the information contained in these sources has a well-defined and rigorous underlying conceptual structure, which can be discovered using our method.
Tasks Knowledge Graph Completion, Knowledge Graphs
Published 2019-09-19
URL https://arxiv.org/abs/1909.08927v1
PDF https://arxiv.org/pdf/1909.08927v1.pdf
PWC https://paperswithcode.com/paper/extracting-conceptual-knowledge-from-natural
Repo
Framework
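
Since the paper's inverse problem is solved with Baum-Welch, the sketch below shows that machinery on a plain discrete HMM in numpy. The paper's Markovian document-generation model has its own parametrization, which this generic version does not reproduce.

```python
import numpy as np

def baum_welch(obs, n_states, n_symbols, iters=50, seed=0):
    """Baum-Welch (EM) for a discrete HMM.

    obs : 1-D int array of observed symbols.
    Returns (A, B, pi): transition, emission, and initial distributions.
    """
    rng = np.random.default_rng(seed)
    A = rng.dirichlet(np.ones(n_states), size=n_states)
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)
    pi = np.ones(n_states) / n_states
    T = len(obs)

    for _ in range(iters):
        # E-step: scaled forward-backward
        alpha = np.zeros((T, n_states))
        beta = np.zeros((T, n_states))
        scale = np.zeros(T)
        alpha[0] = pi * B[:, obs[0]]
        scale[0] = alpha[0].sum()
        alpha[0] /= scale[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
            scale[t] = alpha[t].sum()
            alpha[t] /= scale[t]
        beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = (alpha[:-1, :, None] * A[None] *
              (B[:, obs[1:]].T * beta[1:])[:, None, :])
        xi /= xi.sum(axis=(1, 2), keepdims=True)
        # M-step: re-estimate parameters from expected counts
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        for k in range(n_symbols):
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= B.sum(axis=1, keepdims=True)
    return A, B, pi
```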

Adversarial Example Generation using Evolutionary Multi-objective Optimization

Title Adversarial Example Generation using Evolutionary Multi-objective Optimization
Authors Takahiro Suzuki, Shingo Takeshita, Satoshi Ono
Abstract This paper proposes an Evolutionary Multi-objective Optimization (EMO)-based Adversarial Example (AE) design method that operates in a black-box setting. Previous gradient-based methods produce AEs by changing all pixels of a target image, while previous EC-based methods change a small number of pixels. Thanks to EMO’s population-based search, the proposed method produces various types of AEs, including ones lying between the AEs generated by the previous two approaches, which helps reveal the characteristics of a target model and uncover unknown attack patterns. Experimental results showed the potential of the proposed method: it can generate robust AEs and, with the aid of DCT-based perturbation pattern generation, AEs for high-resolution images. (The DCT-based perturbation idea is sketched after this entry.)
Tasks
Published 2019-12-30
URL https://arxiv.org/abs/2001.05844v1
PDF https://arxiv.org/pdf/2001.05844v1.pdf
PWC https://paperswithcode.com/paper/adversarial-example-generation-using
Repo
Framework
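
A sketch of the DCT-based perturbation pattern generation mentioned above: evolve a small block of low-frequency DCT coefficients and expand it to an image-sized perturbation, evaluated under two conflicting objectives. The genome layout and objectives here are assumptions; the paper's exact EMO setup (e.g., its algorithm and encoding) may differ.

```python
import numpy as np
from scipy.fft import idctn

def dct_perturbation(coeffs, shape, k=8):
    """Build an image-sized perturbation from a k x k block of
    low-frequency DCT coefficients -- the genome an EMO search
    (e.g. NSGA-II) would evolve for high-resolution images."""
    full = np.zeros(shape)
    full[:k, :k] = coeffs.reshape(k, k)
    return idctn(full, norm="ortho")

def objectives(model_confidence, perturbation):
    """Two conflicting objectives for the multi-objective search:
    (1) the target model's confidence in the true class (minimize),
    (2) the perturbation magnitude (minimize)."""
    return model_confidence, np.linalg.norm(perturbation)
```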

Broad Neural Network for Change Detection in Aerial Images

Title Broad Neural Network for Change Detection in Aerial Images
Authors Shailesh Shrivastava, Alakh Aggarwal, Pratik Chattopadhyay
Abstract A change detection system takes as input two images of a region captured at two different times and predicts which pixels in the region have undergone change over the time period. Since pixel-based analysis can be erroneous due to noise, illumination differences and other factors, contextual information is usually used to determine the class of a pixel (changed or not). This contextual information is taken into account by considering a pixel of the difference image along with its neighborhood. With the help of ground-truth information, labeled patterns are generated (see the sketch after this entry). Finally, a Broad Learning classifier is used to predict the class of each pixel. Results show that Broad Learning classifies the dataset with a significantly higher F-score than a Multilayer Perceptron; performance comparisons have also been made with other popular classifiers such as Random Forest.
Tasks
Published 2019-02-28
URL https://arxiv.org/abs/1903.00087v2
PDF https://arxiv.org/pdf/1903.00087v2.pdf
PWC https://paperswithcode.com/paper/broad-neural-network-for-change-detection-in
Repo
Framework
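
As a sketch of the pattern-generation step described above: each pixel of the difference image, together with its neighborhood, becomes one labeled feature vector. The patch size and labeling conventions here are assumptions.

```python
import numpy as np

def make_patterns(img_a, img_b, gt, half=2):
    """Turn a pair of co-registered images into labeled patterns:
    each pixel of the absolute difference image plus its
    (2*half+1)^2 neighborhood becomes one feature vector; the label
    comes from the ground-truth change map."""
    diff = np.abs(img_a.astype(float) - img_b.astype(float))
    pad = np.pad(diff, half, mode="reflect")
    H, W = diff.shape
    feats = np.stack([
        pad[i:i + 2 * half + 1, j:j + 2 * half + 1].ravel()
        for i in range(H) for j in range(W)
    ])
    labels = gt.ravel()          # 1 = changed, 0 = unchanged
    return feats, labels

# `feats`/`labels` then train the Broad Learning classifier.
```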

Research Commentary on Recommendations with Side Information: A Survey and Research Directions

Title Research Commentary on Recommendations with Side Information: A Survey and Research Directions
Authors Zhu Sun, Qing Guo, Jie Yang, Hui Fang, Guibing Guo, Jie Zhang, Robin Burke
Abstract Recommender systems have become an essential tool for resolving the information overload problem in recent decades. Traditional recommender systems, however, suffer from data sparsity and cold-start problems. To address these issues, a great number of recommendation algorithms have been proposed to leverage side information about users or items (e.g., social networks and item categories), demonstrating a high degree of effectiveness in improving recommendation performance. This Research Commentary aims to provide a comprehensive and systematic survey of recent research on recommender systems with side information. Specifically, we provide an overview of state-of-the-art recommendation algorithms with side information from two orthogonal perspectives. The first covers the different methodologies of recommendation: memory-based methods, latent factor models, representation learning, and deep learning models. The second covers different representations of side information, including structural data (flat, network, and hierarchical features, and knowledge graphs) and non-structural data (text, image and video features). Finally, we discuss challenges and new potential directions in recommendation, and conclude the survey.
Tasks Knowledge Graphs, Recommendation Systems, Representation Learning
Published 2019-09-19
URL https://arxiv.org/abs/1909.12807v2
PDF https://arxiv.org/pdf/1909.12807v2.pdf
PWC https://paperswithcode.com/paper/research-commentary-on-recommendations-with
Repo
Framework