Paper Group AWR 285
Unsupervised Deep Clustering for Source Separation: Direct Learning from Mixtures using Spatial Information
Title | Unsupervised Deep Clustering for Source Separation: Direct Learning from Mixtures using Spatial Information |
Authors | Efthymios Tzinis, Shrikant Venkataramani, Paris Smaragdis |
Abstract | We present a monophonic source separation system that is trained by only observing mixtures with no ground truth separation information. We use a deep clustering approach which trains on multi-channel mixtures and learns to project spectrogram bins to source clusters that correlate with various spatial features. We show that using such a training process we can obtain separation performance that is as good as making use of ground truth separation information. Once trained, this system is capable of performing sound separation on monophonic inputs, despite having learned how to do so using multi-channel recordings. |
Tasks | Multi-Speaker Source Separation, Speech Separation |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.01531v2 |
PDF | http://arxiv.org/pdf/1811.01531v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-deep-clustering-for-source |
Repo | https://github.com/etzinis/unsupervised_spatial_dc |
Framework | none |
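A minimal sketch of the inference step this abstract describes: a trained network maps each time-frequency bin of the mixture spectrogram to an embedding, and clustering those embeddings yields per-source masks. `embed` is a hypothetical stand-in for the trained network; in the paper, its training targets come from spatial features of multi-channel recordings rather than ground-truth labels.

```python
import numpy as np
from sklearn.cluster import KMeans

def separate(mixture_spec, embed, n_sources=2):
    """Cluster per-bin embeddings and mask the mixture, one mask per source."""
    T, F = mixture_spec.shape
    emb = embed(mixture_spec).reshape(T * F, -1)        # (T*F, D) bin embeddings
    labels = KMeans(n_clusters=n_sources, n_init=10).fit_predict(emb)
    masks = [(labels == k).reshape(T, F) for k in range(n_sources)]
    return [mixture_spec * m for m in masks]            # masked spectrograms
```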
Unsupervised Detection of Lesions in Brain MRI using constrained adversarial auto-encoders
Title | Unsupervised Detection of Lesions in Brain MRI using constrained adversarial auto-encoders |
Authors | Xiaoran Chen, Ender Konukoglu |
Abstract | Lesion detection in brain Magnetic Resonance Images (MRI) remains a challenging task. State-of-the-art approaches are mostly based on supervised learning making use of large annotated datasets. Human beings, on the other hand, even non-experts, can detect most abnormal lesions after seeing a handful of healthy brain images. Replicating this capability of using prior information on the appearance of healthy brain structure to detect lesions can help computers achieve human-level abnormality detection, specifically reducing the need for numerous labeled examples and improving generalization to previously unseen lesions. To this end, we study detection of lesion regions in an unsupervised manner by learning the data distribution of brain MRI of healthy subjects using auto-encoder based methods. We hypothesize that one of the main limitations of the current models is the lack of consistency in the latent representation. We propose a simple yet effective constraint that helps map an image bearing a lesion close to its corresponding healthy image in the latent space. We use the Human Connectome Project dataset to learn the distribution of healthy-appearing brain MRI and report improved detection, in terms of AUC, of the lesions in the BRATS challenge dataset. |
Tasks | Anomaly Detection |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.04972v1 |
PDF | http://arxiv.org/pdf/1806.04972v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-detection-of-lesions-in-brain |
Repo | https://github.com/aubreychen9012/cAAE |
Framework | tf |
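A hedged sketch of the underlying detection recipe: train an auto-encoder on healthy scans only, then score voxels by reconstruction error. The adversarial and latent-consistency terms that constitute the paper's contribution are omitted; `HealthyAE` is a toy stand-in for the authors' architecture.

```python
import torch
import torch.nn as nn

class HealthyAE(nn.Module):
    """Toy auto-encoder over flattened 64x64 slices."""
    def __init__(self, dim=64 * 64, latent=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 512), nn.ReLU(), nn.Linear(512, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 512), nn.ReLU(), nn.Linear(512, dim))

    def forward(self, x):
        return self.dec(self.enc(x))

def anomaly_map(model, scans):                    # scans: (N, 64*64) tensor
    with torch.no_grad():
        return (scans - model(scans)).abs()       # per-pixel residual as lesion score
```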
Improved Image Segmentation via Cost Minimization of Multiple Hypotheses
Title | Improved Image Segmentation via Cost Minimization of Multiple Hypotheses |
Authors | Marc Bosch, Christopher M. Gifford, Austin G. Dress, Clare W. Lau, Jeffrey G. Skibo, Gordon A. Christie |
Abstract | Image segmentation is an important component of many image understanding systems. It aims to group pixels in a spatially and perceptually coherent manner. Typically, these algorithms have a collection of parameters that control the degree of over-segmentation produced. It still remains a challenge to properly select such parameters for human-like perceptual grouping. In this work, we exploit the diversity of segments produced by different choices of parameters. We scan the segmentation parameter space and generate a collection of image segmentation hypotheses (from highly over-segmented to under-segmented). These are fed into a cost minimization framework that produces the final segmentation by selecting segments that: (1) better describe the natural contours of the image, and (2) are more stable and persistent among all the segmentation hypotheses. We compare our algorithm’s performance with state-of-the-art algorithms, showing that we can achieve improved results. We also show that our framework is robust to the choice of segmentation kernel that produces the initial set of hypotheses. |
Tasks | Semantic Segmentation |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1802.00088v1 |
PDF | http://arxiv.org/pdf/1802.00088v1.pdf |
PWC | https://paperswithcode.com/paper/improved-image-segmentation-via-cost |
Repo | https://github.com/pubgeo/cmmh_segmentation |
Framework | none |
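A rough sketch, assuming scikit-image's Felzenszwalb segmenter as the kernel: sweep the scale parameter to generate hypotheses, then score each segment by how often it stays intact across them. The paper's full cost minimization (which also rewards contour fit) is reduced here to this simple persistence score.

```python
import numpy as np
from skimage.segmentation import felzenszwalb

def hypotheses(image, scales=(50, 100, 200, 400, 800)):
    """Scan the scale parameter from over- to under-segmentation."""
    return [felzenszwalb(image, scale=s) for s in scales]

def stability(seg, hyps):
    """Fraction of hypotheses in which each segment of `seg` stays in one piece."""
    scores = {}
    for label in np.unique(seg):
        pixels = seg == label
        intact = sum(len(np.unique(h[pixels])) == 1 for h in hyps)
        scores[label] = intact / len(hyps)
    return scores
```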
Dual Memory Neural Computer for Asynchronous Two-view Sequential Learning
Title | Dual Memory Neural Computer for Asynchronous Two-view Sequential Learning |
Authors | Hung Le, Truyen Tran, Svetha Venkatesh |
Abstract | One of the core tasks in multi-view learning is to capture relations among views. For sequential data, the relations not only span across views, but also extend throughout the view length to form long-term intra-view and inter-view interactions. In this paper, we present a new memory augmented neural network model that aims to model these complex interactions between two asynchronous sequential views. Our model uses two encoders for reading from and writing to two external memories for encoding input views. The intra-view interactions and the long-term dependencies are captured by the use of memories during this encoding process. There are two modes of memory accessing in our system: late-fusion and early-fusion, corresponding to late and early inter-view interactions. In the late-fusion mode, the two memories are separated, containing only view-specific contents. In the early-fusion mode, the two memories share the same addressing space, allowing cross-memory accessing. In both cases, the knowledge from the memories will be combined by a decoder to make predictions over the output space. The resulting dual memory neural computer is demonstrated on a comprehensive set of experiments, including a synthetic task of summing two sequences and the tasks of drug prescription and disease progression in healthcare. The results demonstrate competitive performance over both traditional algorithms and deep learning methods designed for multi-view problems. |
Tasks | Multi-View Learning |
Published | 2018-02-02 |
URL | http://arxiv.org/abs/1802.00662v2 |
PDF | http://arxiv.org/pdf/1802.00662v2.pdf |
PWC | https://paperswithcode.com/paper/dual-memory-neural-computer-for-asynchronous |
Repo | https://github.com/thaihungle/DMNC |
Framework | tf |
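An illustrative numpy sketch (not the authors' DNC machinery) of the late- vs early-fusion distinction: in late fusion each view reads its own memory and the results are combined afterwards; in early fusion both views address one shared memory.

```python
import numpy as np

def read(memory, key):
    """Content-based read: softmax-weighted sum over memory rows."""
    scores = memory @ key
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ memory

mem_a, mem_b = np.random.randn(16, 8), np.random.randn(16, 8)  # two view memories
key = np.random.randn(8)
late = np.concatenate([read(mem_a, key), read(mem_b, key)])    # separate memories
early = read(np.vstack([mem_a, mem_b]), key)                   # shared address space
```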
Geometric Constellation Shaping for Fiber Optic Communication Systems via End-to-end Learning
Title | Geometric Constellation Shaping for Fiber Optic Communication Systems via End-to-end Learning |
Authors | Rasmus T. Jones, Tobias A. Eriksson, Metodi P. Yankov, Benjamin J. Puttnam, Georg Rademacher, Ruben S. Luis, Darko Zibar |
Abstract | In this paper, an unsupervised machine learning method for geometric constellation shaping is investigated. By embedding a differentiable fiber channel model within two neural networks, the learning algorithm optimizes the geometric constellation shape. The learned constellations yield improved performance compared to state-of-the-art geometrically shaped constellations, and include an implicit trade-off between amplification noise and nonlinear effects. Further, the method allows joint optimization of system parameters, such as the optimal launch power, simultaneously with the constellation shape. An experimental demonstration validates the findings. Improved performance is reported, up to 0.13 bit/4D in simulation and up to 0.12 bit/4D experimentally. |
Tasks | |
Published | 2018-10-01 |
URL | http://arxiv.org/abs/1810.00774v1 |
PDF | http://arxiv.org/pdf/1810.00774v1.pdf |
PWC | https://paperswithcode.com/paper/geometric-constellation-shaping-for-fiber |
Repo | https://github.com/Rassibassi/claude |
Framework | tf |
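A minimal end-to-end training loop in the spirit of the abstract, with one large simplification: an AWGN channel stands in for the paper's differentiable fiber model. An encoder places M messages in the 2D (I/Q) plane, the channel perturbs them, and a decoder classifies; gradients through the channel shape the constellation.

```python
import torch
import torch.nn as nn

M = 16                                                  # constellation size
enc = nn.Sequential(nn.Linear(M, 32), nn.ReLU(), nn.Linear(32, 2))
dec = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, M))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

for _ in range(1000):
    bits = torch.eye(M)[torch.randint(0, M, (256,))]    # one-hot messages
    x = enc(bits)
    x = x / x.pow(2).sum(1).mean().sqrt()               # average-power constraint
    y = x + 0.1 * torch.randn_like(x)                   # AWGN stand-in channel
    loss = nn.functional.cross_entropy(dec(y), bits.argmax(1))
    opt.zero_grad(); loss.backward(); opt.step()
```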
Easing Embedding Learning by Comprehensive Transcription of Heterogeneous Information Networks
Title | Easing Embedding Learning by Comprehensive Transcription of Heterogeneous Information Networks |
Authors | Yu Shi, Qi Zhu, Fang Guo, Chao Zhang, Jiawei Han |
Abstract | Heterogeneous information networks (HINs) are ubiquitous in real-world applications. In the meantime, network embedding has emerged as a convenient tool to mine and learn from networked data. As a result, it is of interest to develop HIN embedding methods. However, the heterogeneity in HINs introduces not only rich information but also potentially incompatible semantics, which poses special challenges to embedding learning in HINs. With the intention to preserve the rich yet potentially incompatible information in HIN embedding, we propose to study the problem of comprehensive transcription of heterogeneous information networks. The comprehensive transcription of HINs also provides an easy-to-use approach to unleash the power of HINs, since it requires no additional supervision, expertise, or feature engineering. To cope with the challenges in the comprehensive transcription of HINs, we propose the HEER algorithm, which embeds HINs via edge representations that are further coupled with properly-learned heterogeneous metrics. To corroborate the efficacy of HEER, we conducted experiments on two large-scale real-world datasets with an edge reconstruction task and multiple case studies. Experiment results demonstrate the effectiveness of the proposed HEER model and the utility of edge representations and heterogeneous metrics. The code and data are available at https://github.com/GentleZhu/HEER. |
Tasks | Feature Engineering, Network Embedding |
Published | 2018-07-10 |
URL | http://arxiv.org/abs/1807.03490v1 |
PDF | http://arxiv.org/pdf/1807.03490v1.pdf |
PWC | https://paperswithcode.com/paper/easing-embedding-learning-by-comprehensive |
Repo | https://github.com/GentleZhu/HEER |
Framework | pytorch |
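A loose sketch of the edge-representation idea: score an edge by combining the two node embeddings with a learned per-edge-type metric vector, so that different semantics get different metrics. The class below is a simplified stand-in for HEER's actual formulation and loss.

```python
import torch
import torch.nn as nn

class EdgeScorer(nn.Module):
    def __init__(self, n_nodes, n_edge_types, dim=64):
        super().__init__()
        self.node = nn.Embedding(n_nodes, dim)
        self.metric = nn.Embedding(n_edge_types, dim)    # one metric per edge type

    def forward(self, u, v, etype):
        edge = self.node(u) * self.node(v)               # Hadamard edge embedding
        return (edge * self.metric(etype)).sum(-1)       # metric-weighted score

scorer = EdgeScorer(n_nodes=1000, n_edge_types=5)
score = scorer(torch.tensor([0]), torch.tensor([42]), torch.tensor([2]))
```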
Investigating Rumor News Using Agreement-Aware Search
Title | Investigating Rumor News Using Agreement-Aware Search |
Authors | Jingbo Shang, Tianhang Sun, Jiaming Shen, Xingbang Liu, Anja Gruenheid, Flip Korn, Adam Lelkes, Cong Yu, Jiawei Han |
Abstract | Recent years have witnessed a widespread increase of rumor news generated by humans and machines. Therefore, tools for investigating rumor news have become an urgent necessity. One useful function of such tools is to show how a specific topic or event is represented, by presenting different points of view from multiple sources. In this paper, we propose Maester, a novel agreement-aware search framework for investigating rumor news. Given an investigative question, Maester will retrieve articles related to that question, and assign and display top articles from agree, disagree, and discuss categories to users. Splitting the results into these three categories provides the user with a holistic view of the investigative question. We build Maester based on the following two key observations: (1) relatedness can commonly be determined by keywords and entities occurring in both questions and articles, and (2) the level of agreement between the investigative question and the related news article can often be decided by a few key sentences. Accordingly, we use gradient boosting tree models with keyword/entity matching features for relatedness detection, and leverage a recurrent neural network to infer the level of agreement. Our experiments on the Fake News Challenge (FNC) dataset demonstrate up to an order of magnitude improvement of Maester over the original FNC winning solution, for agreement-aware search. |
Tasks | |
Published | 2018-02-21 |
URL | https://arxiv.org/abs/1802.07398v2 |
PDF | https://arxiv.org/pdf/1802.07398v2.pdf |
PWC | https://paperswithcode.com/paper/investigating-rumor-news-using-agreement |
Repo | https://github.com/shangjingbo1226/Maester |
Framework | none |
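A hedged sketch of the relatedness stage only: keyword/entity-overlap features fed to a gradient-boosted tree classifier, as the abstract describes. The actual feature set, and the recurrent network that infers agreement, are not reproduced here.

```python
from sklearn.ensemble import GradientBoostingClassifier

def overlap_features(question_tokens, article_tokens):
    """Simple overlap statistics between a question and an article."""
    q, a = set(question_tokens), set(article_tokens)
    shared = q & a
    return [len(shared), len(shared) / max(len(q), 1), len(shared) / max(len(a), 1)]

clf = GradientBoostingClassifier()
# X = [overlap_features(q, a) for q, a in pairs]; y = related/unrelated labels
# clf.fit(X, y)
```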
Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning
Title | Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning |
Authors | Charles Schaff, David Yunis, Ayan Chakrabarti, Matthew R. Walter |
Abstract | The physical design of a robot and the policy that controls its motion are inherently coupled, and should be determined according to the task and environment. In an increasing number of applications, data-driven and learning-based approaches, such as deep reinforcement learning, have proven effective at designing control policies. For most tasks, the only way to evaluate a physical design with respect to such control policies is empirical, i.e., by picking a design and training a control policy for it. Since training these policies is time-consuming, it is computationally infeasible to train separate policies for all possible designs as a means to identify the best one. In this work, we address this limitation by introducing a method that performs simultaneous joint optimization of the physical design and control network. Our approach maintains a distribution over designs and uses reinforcement learning to optimize a control policy to maximize expected reward over the design distribution. We give the controller access to design parameters to allow it to tailor its policy to each design in the distribution. Throughout training, we shift the distribution towards higher-performing designs, eventually converging to a design and control policy that are jointly optimal. We evaluate our approach in the context of legged locomotion, and demonstrate that it discovers novel designs and walking gaits, outperforming baselines in both performance and efficiency. |
Tasks | |
Published | 2018-01-04 |
URL | http://arxiv.org/abs/1801.01432v3 |
PDF | http://arxiv.org/pdf/1801.01432v3.pdf |
PWC | https://paperswithcode.com/paper/jointly-learning-to-construct-and-control |
Repo | https://github.com/wangzizhao/Jointly-Learning-to-Construct-and-Control-Agents-using-Deep-Reinforcement-Learning |
Framework | tf |
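A simplified sketch of maintaining a distribution over designs. The paper optimizes a design-conditioned policy with reinforcement learning; here a cross-entropy-method stand-in (with a toy reward in place of policy training) illustrates the outer loop of shifting the distribution toward higher-performing designs.

```python
import numpy as np

mu, sigma = np.zeros(4), np.ones(4)            # Gaussian over 4 design parameters

def rollout_reward(design):
    """Placeholder for training/evaluating a policy on this design."""
    return -np.sum((design - 1.0) ** 2)        # toy objective peaking at design=1

for _ in range(200):
    designs = mu + sigma * np.random.randn(32, 4)          # sample designs
    rewards = np.array([rollout_reward(d) for d in designs])
    elite = designs[np.argsort(rewards)[-8:]]              # keep the best designs
    mu, sigma = elite.mean(0), elite.std(0) + 1e-3         # shift the distribution
```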
HATS: Histograms of Averaged Time Surfaces for Robust Event-based Object Classification
Title | HATS: Histograms of Averaged Time Surfaces for Robust Event-based Object Classification |
Authors | Amos Sironi, Manuele Brambilla, Nicolas Bourdis, Xavier Lagorce, Ryad Benosman |
Abstract | Event-based cameras have recently drawn the attention of the Computer Vision community thanks to their advantages in terms of high temporal resolution, low power consumption and high dynamic range, compared to traditional frame-based cameras. These properties make event-based cameras an ideal choice for autonomous vehicles, robot navigation or UAV vision, among others. However, the accuracy of event-based object classification algorithms, which is of crucial importance for any reliable system working in real-world conditions, is still far behind their frame-based counterparts. Two main reasons for this performance gap are: 1. The lack of effective low-level representations and architectures for event-based object classification and 2. The absence of large real-world event-based datasets. In this paper we address both problems. First, we introduce a novel event-based feature representation together with a new machine learning architecture. Compared to previous approaches, we use local memory units to efficiently leverage past temporal information and build a robust event-based representation. Second, we release the first large real-world event-based dataset for object classification. We compare our method to the state-of-the-art with extensive experiments, showing better classification performance and real-time computation. |
Tasks | Autonomous Vehicles, Object Classification, Robot Navigation |
Published | 2018-03-21 |
URL | http://arxiv.org/abs/1803.07913v1 |
PDF | http://arxiv.org/pdf/1803.07913v1.pdf |
PWC | https://paperswithcode.com/paper/hats-histograms-of-averaged-time-surfaces-for |
Repo | https://github.com/muzishen/awesome-vehicle_reid-dataset |
Framework | none |
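A condensed sketch of the descriptor's flavor: exponentially decayed event activity ("time surfaces") averaged into per-cell histograms. The real HATS feature also uses polarity channels and a spatial neighbourhood around each event, which are simplified away here.

```python
import numpy as np

def hats_like(events, width, height, cell=8, tau=0.5):
    """events: (N, 3) array of (x, y, t) rows with t ascending, in seconds."""
    n_cy, n_cx = height // cell, width // cell
    hist = np.zeros((n_cy, n_cx))
    count = np.zeros((n_cy, n_cx))
    last_t = np.full((height, width), -np.inf)        # last event time per pixel
    for x, y, t in events:
        x, y = int(x), int(y)
        surface = np.exp((last_t[y, x] - t) / tau)    # decayed previous activity
        cy, cx = y // cell, x // cell
        hist[cy, cx] += surface
        count[cy, cx] += 1
        last_t[y, x] = t
    return hist / np.maximum(count, 1)                # average per cell
```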
Targeted Syntactic Evaluation of Language Models
Title | Targeted Syntactic Evaluation of Language Models |
Authors | Rebecca Marvin, Tal Linzen |
Abstract | We present a dataset for evaluating the grammaticality of the predictions of a language model. We automatically construct a large number of minimally different pairs of English sentences, each consisting of a grammatical and an ungrammatical sentence. The sentence pairs represent different variations of structure-sensitive phenomena: subject-verb agreement, reflexive anaphora and negative polarity items. We expect a language model to assign a higher probability to the grammatical sentence than the ungrammatical one. In an experiment using this data set, an LSTM language model performed poorly on many of the constructions. Multi-task training with a syntactic objective (CCG supertagging) improved the LSTM’s accuracy, but a large gap remained between its performance and the accuracy of human participants recruited online. This suggests that there is considerable room for improvement over LSTMs in capturing syntax in a language model. |
Tasks | CCG Supertagging, Language Modelling |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.09031v1 |
PDF | http://arxiv.org/pdf/1808.09031v1.pdf |
PWC | https://paperswithcode.com/paper/targeted-syntactic-evaluation-of-language |
Repo | https://github.com/icewing1996/bert-syntax |
Framework | none |
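The evaluation protocol itself fits in a few lines: a model passes a minimal pair if it assigns a higher probability to the grammatical sentence. `sentence_logprob` is a hypothetical hook into whatever language model is being tested.

```python
def accuracy(pairs, sentence_logprob):
    """pairs: list of (grammatical, ungrammatical) sentence strings."""
    hits = sum(sentence_logprob(good) > sentence_logprob(bad) for good, bad in pairs)
    return hits / len(pairs)

# Example minimal pair for subject-verb agreement:
pairs = [("the author laughs", "the author laugh")]
```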
A Unified Model with Structured Output for Fashion Images Classification
Title | A Unified Model with Structured Output for Fashion Images Classification |
Authors | Beatriz Quintino Ferreira, Luís Baía, João Faria, Ricardo Gamelas Sousa |
Abstract | A picture is worth a thousand words. Albeit a cliché, for the fashion industry, an image of a clothing piece allows one to perceive its category (e.g., dress), sub-category (e.g., day dress) and properties (e.g., white colour with floral patterns). The seasonal nature of the fashion industry creates a highly dynamic and creative domain with evermore data, making it impractical to manually describe a large set of images (of products). In this paper, we explore the concept of visual recognition for fashion images through an end-to-end architecture embedding the hierarchical nature of the annotations directly into the model. Towards that goal, and inspired by the work of [7], we have modified and adapted the original architecture proposal. Namely, we have removed the message passing layer symmetry to cope with the Farfetch category tree, added extra layers for hierarchy level specificity, and moved the message passing layer into an enriched latent space. We compare the proposed unified architecture against state-of-the-art models and demonstrate the performance advantage of our model for structured multi-level categorization on a dataset of about 350k fashion product images. |
Tasks | |
Published | 2018-06-25 |
URL | http://arxiv.org/abs/1806.09445v1 |
PDF | http://arxiv.org/pdf/1806.09445v1.pdf |
PWC | https://paperswithcode.com/paper/a-unified-model-with-structured-output-for |
Repo | https://github.com/dpaddon/product_image_categorisation |
Framework | none |
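A generic sketch of hierarchy-aware structured output: a shared image encoder feeds one head per level, with each level conditioned on its parent's logits. The paper's message-passing layer in an enriched latent space is omitted; names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class HierarchicalHead(nn.Module):
    def __init__(self, feat_dim, n_cat, n_subcat, n_props):
        super().__init__()
        self.cat = nn.Linear(feat_dim, n_cat)
        self.subcat = nn.Linear(feat_dim + n_cat, n_subcat)   # conditioned on parent
        self.props = nn.Linear(feat_dim + n_subcat, n_props)

    def forward(self, feats):
        c = self.cat(feats)
        s = self.subcat(torch.cat([feats, c], dim=-1))
        p = self.props(torch.cat([feats, s], dim=-1))
        return c, s, p
```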
ViTac: Feature Sharing between Vision and Tactile Sensing for Cloth Texture Recognition
Title | ViTac: Feature Sharing between Vision and Tactile Sensing for Cloth Texture Recognition |
Authors | Shan Luo, Wenzhen Yuan, Edward Adelson, Anthony G. Cohn, Raul Fuentes |
Abstract | Vision and touch are two of the important sensing modalities for humans and they offer complementary information for sensing the environment. Robots could also benefit from such multi-modal sensing ability. In this paper, addressing for the first time (to the best of our knowledge) texture recognition from tactile images and vision, we propose a new fusion method named Deep Maximum Covariance Analysis (DMCA) to learn a joint latent space for sharing features through vision and tactile sensing. The features of camera images and tactile data acquired from a GelSight sensor are learned by deep neural networks. However, the learned features are high-dimensional and redundant due to the differences between the two sensing modalities, which deteriorates perception performance. To address this, the learned features are paired using maximum covariance analysis. Results of the algorithm on a newly collected dataset of paired visual and tactile data relating to cloth textures show that a good recognition performance of greater than 90% can be achieved by using the proposed DMCA framework. In addition, we find that the perception performance of either vision or tactile sensing can be improved by employing the shared representation space, compared to learning from unimodal data. |
Tasks | |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07490v2 |
PDF | http://arxiv.org/pdf/1802.07490v2.pdf |
PWC | https://paperswithcode.com/paper/vitac-feature-sharing-between-vision-and |
Repo | https://github.com/jettdlee/vis_tac_cross_modal |
Framework | tf |
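The pairing step can be sketched directly: maximum covariance analysis takes the SVD of the cross-covariance between centred visual and tactile feature matrices, yielding paired projection bases into a shared space. The deep feature extractors are assumed given.

```python
import numpy as np

def mca(X, Y, k=10):
    """X: (n, dx) visual features; Y: (n, dy) tactile features, paired by row."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    C = Xc.T @ Yc / (len(X) - 1)             # cross-covariance matrix (dx, dy)
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return U[:, :k], Vt[:k].T                # top-k paired projection directions
```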
SAI, a Sensible Artificial Intelligence that plays Go
Title | SAI, a Sensible Artificial Intelligence that plays Go |
Authors | Francesco Morandin, Gianluca Amato, Rosa Gini, Carlo Metta, Maurizio Parton, Gian-Carlo Pascutto |
Abstract | We propose a multiple-komi modification of the AlphaGo Zero/Leela Zero paradigm. The winrate as a function of the komi is modeled with a two-parameter sigmoid function, so that the neural network must predict just one more variable to assess the winrate for all komi values. A second novel feature is that training is based on self-play games that occasionally branch – with changed komi – when the position is uneven. With this setting, reinforcement learning is shown to work on 7x7 Go, obtaining very strong playing agents. As a useful byproduct, the sigmoid parameters given by the network allow estimation of the score difference on the board, and evaluation of how much the game is decided. |
Tasks | |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03928v2 |
PDF | http://arxiv.org/pdf/1809.03928v2.pdf |
PWC | https://paperswithcode.com/paper/sai-a-sensible-artificial-intelligence-that |
Repo | https://github.com/sai-dev/sai |
Framework | none |
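One plausible parameterization of the two-parameter sigmoid (the exact form and sign conventions in SAI may differ): alpha acts as the estimated score difference on the board and beta as a steepness scale, so a single network evaluation yields the winrate at any komi.

```python
import math

def winrate(komi, alpha, beta):
    """Predicted winrate as a function of komi under a two-parameter sigmoid."""
    return 1.0 / (1.0 + math.exp(beta * (komi - alpha)))
```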
Tangent-Space Regularization for Neural-Network Models of Dynamical Systems
Title | Tangent-Space Regularization for Neural-Network Models of Dynamical Systems |
Authors | Fredrik Bagge Carlson, Rolf Johansson, Anders Robertsson |
Abstract | This work introduces the concept of tangent space regularization for neural-network models of dynamical systems. The tangent space to the dynamics function of many physical systems of interest in control applications exhibits useful properties, e.g., smoothness, motivating regularization of the model Jacobian along system trajectories using assumptions on the tangent space of the dynamics. Without assumptions, large amounts of training data are required for a neural network to learn the full non-linear dynamics without overfitting. We compare different network architectures on one-step prediction and simulation performance and investigate the propensity of different architectures to learn models with correct input-output Jacobian. Furthermore, the influence of $L_2$ weight regularization on the learned Jacobian eigenvalue spectrum, and hence system stability, is investigated. |
Tasks | |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.09919v1 |
PDF | http://arxiv.org/pdf/1806.09919v1.pdf |
PWC | https://paperswithcode.com/paper/tangent-space-regularization-for-neural |
Repo | https://github.com/baggepinnen/JacProp.jl |
Framework | none |
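A sketch of penalizing the model Jacobian along trajectories, reducing the paper's tangent-space assumptions to one concrete instance: encourage the input-output Jacobian to vary slowly between consecutive states.

```python
import torch
from torch.autograd.functional import jacobian

def tangent_penalty(f, x_t, x_next):
    """f: dynamics network mapping a 1-D state tensor to the next state."""
    J_t = jacobian(f, x_t)
    J_next = jacobian(f, x_next)
    return (J_next - J_t).pow(2).sum()       # penalize fast-varying Jacobians
```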
Image Completion on CIFAR-10
Title | Image Completion on CIFAR-10 |
Authors | Mason Swofford |
Abstract | This project performed image completion on CIFAR-10, a dataset of 60,000 32x32 RGB images, using three different neural network architectures: fully convolutional networks, convolutional networks with fully connected layers, and encoder-decoder convolutional networks. The highest-performing model was a deep fully convolutional network, which achieved a mean squared error of 0.015 when comparing the original image pixel values with the predicted pixel values. Moreover, this network was able to output in-painted images that appeared real to the human eye. |
Tasks | |
Published | 2018-10-07 |
URL | http://arxiv.org/abs/1810.03213v1 |
PDF | http://arxiv.org/pdf/1810.03213v1.pdf |
PWC | https://paperswithcode.com/paper/image-completion-on-cifar-10 |
Repo | https://github.com/mswoff/Image-Completion-on-CIFAR-10 |
Framework | tf |
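A minimal training-step sketch of the setup described: zero out a region, ask a fully convolutional network to reconstruct the full image, and score with MSE against the original pixels. The network below is a toy; the paper's best model is a deeper fully convolutional architecture.

```python
import torch
import torch.nn as nn

net = nn.Sequential(                                    # toy fully convolutional net
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

imgs = torch.rand(8, 3, 32, 32)                         # stand-in for a CIFAR-10 batch
masked = imgs.clone()
masked[:, :, 8:24, 8:24] = 0.0                          # hide a central patch
loss = nn.functional.mse_loss(net(masked), imgs)        # compare predicted vs original
opt.zero_grad(); loss.backward(); opt.step()
```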