October 20, 2019


Paper Group AWR 170

Glioma Segmentation with Cascaded Unet. Recovering affine features from orientation- and scale-invariant ones. PDE-Net 2.0: Learning PDEs from Data with A Numeric-Symbolic Hybrid Deep Network. Multi-Agent Generative Adversarial Imitation Learning. HAQ: Hardware-Aware Automated Quantization with Mixed Precision. Automatically Composing Representatio …

Glioma Segmentation with Cascaded Unet


Title	Glioma Segmentation with Cascaded Unet
Authors	Dmitry Lachinov, Evgeny Vasiliev, Vadim Turlapov
Abstract	MRI analysis takes a central position in brain tumor diagnosis and treatment, thus its precise evaluation is crucially important. However, its 3D nature imposes several challenges, so the analysis is often performed on 2D projections, which reduces the complexity but increases bias. On the other hand, time-consuming 3D evaluation, such as segmentation, is able to provide precise estimation of a number of valuable spatial characteristics, giving us understanding about the course of the disease. Recent studies focusing on the segmentation task report superior performance of Deep Learning methods compared to classical computer vision algorithms. Still, it remains a challenging problem. In this paper we present a deep cascaded approach for automatic brain tumor segmentation. Similar to recent methods for object detection, our implementation is based on neural networks; we propose modifications to the 3D UNet architecture and augmentation strategy to efficiently handle multimodal MRI input. In addition, we introduce an approach to enhance segmentation quality with context obtained from models of the same topology operating on downscaled data. We evaluate the presented approach on the BraTS 2018 dataset and discuss the results.
Tasks	Brain Tumor Segmentation, Object Detection
Published	2018-10-09
URL	http://arxiv.org/abs/1810.04008v1
PDF	http://arxiv.org/pdf/1810.04008v1.pdf
PWC	https://paperswithcode.com/paper/glioma-segmentation-with-cascaded-unet
Repo	https://github.com/lachinov/brats2018-graphlabunn
Framework	mxnet

Recovering affine features from orientation- and scale-invariant ones


Title	Recovering affine features from orientation- and scale-invariant ones
Authors	Daniel Barath
Abstract	An approach is proposed for recovering affine correspondences (ACs) from orientation- and scale-invariant features, e.g. SIFT. The method calculates the affine parameters consistent with a pre-estimated epipolar geometry from the point coordinates and the scales and rotations which the feature detector obtains. The closed-form solution is given by the roots of a quadratic polynomial equation, so it has two possible real candidates and the procedure is fast, i.e. under 1 millisecond. It is shown, as a possible application, that using the proposed algorithm allows us to estimate a homography for every single correspondence independently. Experiments in our synthetic environment and on publicly available real-world datasets confirm that the proposed technique leads to accurate ACs. Also, the estimated homographies have similar accuracy to what the state-of-the-art methods obtain, but because only a single correspondence is required, the robust estimation, e.g. by locally optimized RANSAC, is an order of magnitude faster.
Tasks
Published	2018-07-10
URL	http://arxiv.org/abs/1807.03503v1
PDF	http://arxiv.org/pdf/1807.03503v1.pdf
PWC	https://paperswithcode.com/paper/recovering-affine-features-from-orientation
Repo	https://github.com/danini/recovering-affine-features
Framework	none

PDE-Net 2.0: Learning PDEs from Data with A Numeric-Symbolic Hybrid Deep Network


Title	PDE-Net 2.0: Learning PDEs from Data with A Numeric-Symbolic Hybrid Deep Network
Authors	Zichao Long, Yiping Lu, Bin Dong
Abstract	Partial differential equations (PDEs) are commonly derived based on empirical observations. However, recent advances in technology enable us to collect and store massive amounts of data, which offers new opportunities for data-driven discovery of PDEs. In this paper, we propose a new deep neural network, called PDE-Net 2.0, to discover (time-dependent) PDEs from observed dynamic data with minor prior knowledge of the underlying mechanism that drives the dynamics. The design of PDE-Net 2.0 is based on our earlier work (Long et al., 2018), where the original version of PDE-Net was proposed. PDE-Net 2.0 is a combination of numerical approximation of differential operators by convolutions and a symbolic multi-layer neural network for model recovery. Compared with existing approaches, PDE-Net 2.0 has the most flexibility and expressive power, because it learns both the differential operators and the nonlinear response function of the underlying PDE model. Numerical experiments show that PDE-Net 2.0 has the potential to uncover the hidden PDE of the observed dynamics and to predict the dynamical behavior for a relatively long time, even in a noisy environment.
Tasks
Published	2018-11-30
URL	https://arxiv.org/abs/1812.04426v2
PDF	https://arxiv.org/pdf/1812.04426v2.pdf
PWC	https://paperswithcode.com/paper/pde-net-20-learning-pdes-from-data-with-a
Repo	https://github.com/ZichaoLong/aTEAM
Framework	pytorch
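
The abstract mentions approximating differential operators by convolutions. The following is a minimal sketch of that single idea, not of PDE-Net 2.0 itself: it uses a fixed central-difference stencil, whereas the paper learns its filters. The function names and the test signal are assumptions chosen for illustration.

```python
import math

def correlate_valid(signal, kernel):
    """Slide the kernel over the signal without padding (no kernel flip)."""
    width = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]

def first_derivative(samples, spacing):
    """Approximate du/dx at interior grid points with the stencil [-1, 0, 1] / (2h)."""
    kernel = [-1.0 / (2.0 * spacing), 0.0, 1.0 / (2.0 * spacing)]
    return correlate_valid(samples, kernel)

# Sample u(x) = sin(x) on a uniform grid; the exact derivative is cos(x).
num_points = 201
spacing = 2.0 * math.pi / (num_points - 1)
grid = [i * spacing for i in range(num_points)]
approx = first_derivative([math.sin(x) for x in grid], spacing)

# The "valid" correlation drops one point at each end of the grid.
exact = [math.cos(x) for x in grid[1:-1]]
max_error = max(abs(a - e) for a, e in zip(approx, exact))
print(f"max abs error: {max_error:.2e}")
```

The error of this stencil shrinks quadratically with the grid spacing, which is why a small convolution kernel can stand in for a derivative on a sufficiently fine grid.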

Multi-Agent Generative Adversarial Imitation Learning


Title	Multi-Agent Generative Adversarial Imitation Learning
Authors	Jiaming Song, Hongyu Ren, Dorsa Sadigh, Stefano Ermon
Abstract	Imitation learning algorithms can be used to learn a policy from expert demonstrations without access to a reward signal. However, most existing approaches are not applicable in multi-agent settings due to the existence of multiple (Nash) equilibria and non-stationary environments. We propose a new framework for multi-agent imitation learning for general Markov games, where we build upon a generalized notion of inverse reinforcement learning. We further introduce a practical multi-agent actor-critic algorithm with good empirical performance. Our method can be used to imitate complex behaviors in high-dimensional environments with multiple cooperative or competing agents.
Tasks	Imitation Learning
Published	2018-07-26
URL	http://arxiv.org/abs/1807.09936v1
PDF	http://arxiv.org/pdf/1807.09936v1.pdf
PWC	https://paperswithcode.com/paper/multi-agent-generative-adversarial-imitation
Repo	https://github.com/ermongroup/multiagent-gail
Framework	none

HAQ: Hardware-Aware Automated Quantization with Mixed Precision


Title	HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Authors	Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
Abstract	Model quantization is a widely used technique to compress and accelerate deep neural network (DNN) inference. Emergent DNN hardware accelerators begin to support mixed precision (1-8 bits) to further improve computation efficiency, which raises a great challenge: finding the optimal bitwidth for each layer requires domain experts to explore a vast design space, trading off among accuracy, latency, energy, and model size, which is both time-consuming and sub-optimal. Conventional quantization algorithms ignore the different hardware architectures and quantize all the layers in a uniform way. In this paper, we introduce the Hardware-Aware Automated Quantization (HAQ) framework, which leverages reinforcement learning to automatically determine the quantization policy, and we take the hardware accelerator’s feedback in the design loop. Rather than relying on proxy signals such as FLOPs and model size, we employ a hardware simulator to generate direct feedback signals (latency and energy) to the RL agent. Compared with conventional methods, our framework is fully automated and can specialize the quantization policy for different neural network architectures and hardware architectures. Our framework effectively reduced the latency by 1.4-1.95x and the energy consumption by 1.9x with negligible loss of accuracy compared with fixed-bitwidth (8 bits) quantization. Our framework reveals that the optimal policies on different hardware architectures (i.e., edge and cloud architectures) under different resource constraints (i.e., latency, energy and model size) are drastically different. We interpret the implications of different quantization policies, which offer insights for both neural network architecture design and hardware architecture design.
Tasks	Quantization
Published	2018-11-21
URL	http://arxiv.org/abs/1811.08886v3
PDF	http://arxiv.org/pdf/1811.08886v3.pdf
PWC	https://paperswithcode.com/paper/haq-hardware-aware-automated-quantization
Repo	https://github.com/mit-han-lab/once-for-all
Framework	pytorch
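
To make the notion of a per-layer bitwidth concrete, here is a minimal sketch of mixed-precision quantization. It is not the HAQ framework: there is no RL agent or hardware simulator, and the symmetric linear quantizer, the layer names, the weights, and the policy are all assumptions made for illustration. The sketch also restricts bitwidths to 2 bits and above.

```python
def quantize_linear(weights, bits):
    """Symmetric linear quantization of a list of floats to a signed `bits`-bit grid."""
    if bits < 2:
        raise ValueError("this sketch supports 2 or more bits")
    levels = 2 ** (bits - 1) - 1           # largest positive integer code
    clip = max(abs(w) for w in weights)    # clipping range taken from the data (assumption)
    if clip == 0.0:
        return [0.0 for _ in weights]
    scale = clip / levels
    return [max(-levels, min(levels, round(w / scale))) * scale for w in weights]

def mean_squared_error(original, quantized):
    return sum((o - q) ** 2 for o, q in zip(original, quantized)) / len(original)

def model_size_bits(layers, policy):
    """Total storage for the weights when each layer uses its own bitwidth."""
    return sum(len(weights) * policy[name] for name, weights in layers.items())

# Hypothetical two-layer model and a hypothetical mixed-precision policy.
layers = {
    "conv1": [0.82, -0.41, 0.07, -0.93, 0.33, 0.58],
    "fc": [0.12, -0.77, 0.45, 0.91, -0.26, -0.64],
}
policy = {"conv1": 8, "fc": 4}

mixed_size = model_size_bits(layers, policy)
uniform_size = model_size_bits(layers, {name: 8 for name in layers})
fc_error_4bit = mean_squared_error(layers["fc"], quantize_linear(layers["fc"], 4))
fc_error_8bit = mean_squared_error(layers["fc"], quantize_linear(layers["fc"], 8))
print(mixed_size, uniform_size, fc_error_4bit, fc_error_8bit)
```

Lowering the bitwidth of one layer shrinks the model but raises that layer's quantization error, which is the trade-off a per-layer policy has to balance.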

Automatically Composing Representation Transformations as a Means for Generalization


Title	Automatically Composing Representation Transformations as a Means for Generalization
Authors	Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths
Abstract	A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning – either training a separate learner per task or training a single learner for all tasks – both have difficulty with such generalization because they do not leverage the compositional structure of the task distribution. This paper introduces the compositional problem graph as a broadly applicable formalism to relate tasks of different complexity in terms of problems with shared subproblems. We propose the compositional generalization problem for measuring how readily old knowledge can be reused and hence built upon. As a first step for tackling compositional generalization, we introduce the compositional recursive learner, a domain-general framework for learning algorithmic procedures for composing representation transformations, producing a learner that reasons about what computation to execute by making analogies to previously seen problems. We show on a symbolic and a high-dimensional domain that our compositional approach can generalize to more complex problems than the learner has previously encountered, whereas baselines that are not explicitly compositional do not.
Tasks	Decision Making
Published	2018-07-12
URL	https://arxiv.org/abs/1807.04640v2
PDF	https://arxiv.org/pdf/1807.04640v2.pdf
PWC	https://paperswithcode.com/paper/automatically-composing-representation
Repo	https://github.com/mbchang/crl
Framework	pytorch

Detecting and counting tiny faces


Title	Detecting and counting tiny faces
Authors	Alexandre Attia, Sharone Dayan
Abstract	Finding Tiny Faces (by Hu and Ramanan) proposes a novel approach to find small objects in an image. Our contribution consists in deeply understanding the choices of the paper together with applying and extending a similar method to a real world subject which is the counting of people in a public demonstration.
Tasks
Published	2018-01-19
URL	http://arxiv.org/abs/1801.06504v2
PDF	http://arxiv.org/pdf/1801.06504v2.pdf
PWC	https://paperswithcode.com/paper/detecting-and-counting-tiny-faces
Repo	https://github.com/alexattia/ExtendedTinyFaces
Framework	tf

Unsupervised Video Object Segmentation for Deep Reinforcement Learning


Title	Unsupervised Video Object Segmentation for Deep Reinforcement Learning
Authors	Vik Goel, Jameson Weng, Pascal Poupart
Abstract	We present a new technique for deep reinforcement learning that automatically detects moving objects and uses the relevant information for action selection. The detection of moving objects is done in an unsupervised way by exploiting structure from motion. Instead of directly learning a policy from raw images, the agent first learns to detect and segment moving objects by exploiting flow information in video sequences. The learned representation is then used to focus the policy of the agent on the moving objects. Over time, the agent identifies which objects are critical for decision making and gradually builds a policy based on relevant moving objects. This approach, which we call Motion-Oriented REinforcement Learning (MOREL), is demonstrated on a suite of Atari games where the ability to detect moving objects reduces the amount of interaction needed with the environment to obtain a good policy. Furthermore, the resulting policy is more interpretable than policies that directly map images to actions or values with a black box neural network. We can gain insight into the policy by inspecting the segmentation and motion of each object detected by the agent. This allows practitioners to confirm whether a policy is making decisions based on sensible information.
Tasks	Atari Games, Decision Making, Semantic Segmentation, Unsupervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published	2018-05-20
URL	http://arxiv.org/abs/1805.07780v1
PDF	http://arxiv.org/pdf/1805.07780v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-video-object-segmentation-for
Repo	https://github.com/vik-goel/MOREL
Framework	none

Quicker ADC : Unlocking the hidden potential of Product Quantization with SIMD


Title	Quicker ADC : Unlocking the hidden potential of Product Quantization with SIMD
Authors	Fabien André, Anne-Marie Kermarrec, Nicolas Le Scouarnec
Abstract	Efficient Nearest Neighbor (NN) search in high-dimensional spaces is a foundation of many multimedia retrieval systems. A common approach is to rely on Product Quantization, which allows the storage of large vector databases in memory and efficient distance computations. Yet, implementations of nearest neighbor search with Product Quantization have their performance limited by the many memory accesses they perform. Following this observation, André et al. proposed Quick ADC with up to $6\times$ faster implementations of $m\times{}4$ product quantizers (PQ) leveraging specific SIMD instructions. Quicker ADC is a generalization of Quick ADC not limited to $m\times{}4$ codes and supporting AVX-512, the latest revision of the SIMD instruction set. In doing so, Quicker ADC faces the challenge of efficiently using 5-, 6- and 7-bit shuffles that do not align to computer bytes or words. To this end, we introduce (i) irregular product quantizers combining sub-quantizers of different granularity and (ii) split tables allowing lookup tables larger than registers. We evaluate Quicker ADC with multiple indexes including Inverted Multi-Indexes and IVF HNSW and show that it outperforms the reference optimized implementations (i.e., FAISS and polysemous codes) for numerous configurations. Finally, we release an open-source fork of FAISS enhanced with Quicker ADC at http://github.com/nlescoua/faiss-quickeradc.
Tasks	Quantization
Published	2018-12-21
URL	https://arxiv.org/abs/1812.09162v2
PDF	https://arxiv.org/pdf/1812.09162v2.pdf
PWC	https://paperswithcode.com/paper/quicker-adc-unlocking-the-hidden-potential-of
Repo	https://github.com/technicolor-research/faiss-quickeradc
Framework	none
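
For readers unfamiliar with the distance computation that Quick ADC and Quicker ADC accelerate, the following is a minimal sketch of plain asymmetric distance computation (ADC) with a product quantizer. It contains none of the SIMD shuffles, irregular quantizers, or split tables described in the abstract; the toy codebooks and vectors are assumptions for illustration.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vector, num_parts):
    """Cut a vector into equally sized sub-vectors, one per sub-quantizer."""
    size = len(vector) // num_parts
    return [vector[i * size:(i + 1) * size] for i in range(num_parts)]

def encode(vector, codebooks):
    """Replace each sub-vector with the index of its nearest centroid."""
    parts = split(vector, len(codebooks))
    return [
        min(range(len(book)), key=lambda c: squared_distance(part, book[c]))
        for part, book in zip(parts, codebooks)
    ]

def build_tables(query, codebooks):
    """One lookup table per sub-quantizer: distance from the query part to every centroid."""
    parts = split(query, len(codebooks))
    return [
        [squared_distance(part, centroid) for centroid in book]
        for part, book in zip(parts, codebooks)
    ]

def adc_distance(code, tables):
    """Approximate query-to-vector distance as a sum of table lookups."""
    return sum(table[index] for table, index in zip(tables, code))

# Hypothetical 2x2-bit product quantizer over 4-dimensional vectors.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]],
    [[0.0, 0.0], [2.0, 2.0], [0.0, 2.0], [2.0, 0.0]],
]
database_vector = [0.9, 0.1, 0.2, 1.8]
query = [0.5, 0.5, 1.0, 1.0]

code = encode(database_vector, codebooks)
tables = build_tables(query, codebooks)
distance = adc_distance(code, tables)
print(code, distance)
```

The tables are built once per query and then reused for every database code, so the per-vector cost is a handful of lookups and additions. Those lookups are the memory accesses the abstract identifies as the bottleneck.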

SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment


Title	SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment
Authors	Jisun An, Haewoon Kwak, Yong-Yeol Ahn
Abstract	Because word semantics can substantially change across communities and contexts, capturing domain-specific word semantics is an important challenge. Here, we propose SEMAXIS, a simple yet powerful framework to characterize word semantics using many semantic axes in word-vector spaces beyond sentiment. We demonstrate that SEMAXIS can capture nuanced semantic representations in multiple online communities. We also show that, when the sentiment axis is examined, SEMAXIS outperforms the state-of-the-art approaches in building domain-specific sentiment lexicons.
Tasks
Published	2018-06-14
URL	http://arxiv.org/abs/1806.05521v1
PDF	http://arxiv.org/pdf/1806.05521v1.pdf
PWC	https://paperswithcode.com/paper/semaxis-a-lightweight-framework-to
Repo	https://github.com/ghdi6758/SemAxis
Framework	none
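
The abstract does not spell out how a semantic axis is built. A minimal sketch, assuming an axis is the difference between the averaged vectors of two opposing sets of pole words and that a word is scored by its cosine similarity to that axis, might look like this. The three-dimensional embeddings are made up for illustration.

```python
import math

def mean_vector(vectors):
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

def semantic_axis(positive_vectors, negative_vectors):
    """Axis pointing from the negative pole to the positive pole (assumed definition)."""
    positive = mean_vector(positive_vectors)
    negative = mean_vector(negative_vectors)
    return [p - n for p, n in zip(positive, negative)]

def axis_score(word_vector, axis):
    """Positive scores lean toward the positive pole, negative toward the negative pole."""
    return cosine(word_vector, axis)

# Toy embeddings, hypothetical values only.
embeddings = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "awful": [-0.8, 0.2, 0.1],
    "delightful": [0.7, 0.3, 0.2],
    "dreadful": [-0.7, 0.3, 0.2],
}
axis = semantic_axis(
    [embeddings["good"], embeddings["great"]],
    [embeddings["bad"], embeddings["awful"]],
)
delightful_score = axis_score(embeddings["delightful"], axis)
dreadful_score = axis_score(embeddings["dreadful"], axis)
print(delightful_score, dreadful_score)
```

Swapping in a different pair of pole word sets gives a different axis over the same embeddings, which is one way to read "many semantic axes" in the abstract.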

Semi-Supervised Monaural Singing Voice Separation With a Masking Network Trained on Synthetic Mixtures


Title	Semi-Supervised Monaural Singing Voice Separation With a Masking Network Trained on Synthetic Mixtures
Authors	Michael Michelashvili, Sagie Benaim, Lior Wolf
Abstract	We study the problem of semi-supervised singing voice separation, in which the training data contains a set of samples of mixed music (singing and instrumental) and an unmatched set of instrumental music. Our solution employs a single mapping function g, which, applied to a mixed sample, recovers the underlying instrumental music, and, applied to an instrumental sample, returns the same sample. The network g is trained using purely instrumental samples, as well as on synthetic mixed samples that are created by mixing reconstructed singing voices with random instrumental samples. Our results indicate that we are on a par with or better than fully supervised methods, which are also provided with training samples of unmixed singing voices, and are better than other recent semi-supervised methods.
Tasks	Music Source Separation, Speech Separation
Published	2018-12-14
URL	https://arxiv.org/abs/1812.06087v3
PDF	https://arxiv.org/pdf/1812.06087v3.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-monaural-singing-voice
Repo	https://github.com/sagiebenaim/Singing
Framework	pytorch

SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks


Title	SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks
Authors	Julian Faraone, Nicholas Fraser, Michaela Blott, Philip H. W. Leong
Abstract	Inference for state-of-the-art deep neural networks is computationally expensive, making them difficult to deploy on constrained hardware environments. An efficient way to reduce this complexity is to quantize the weight parameters and/or activations during training by approximating their distributions with a limited entry codebook. For very low-precisions, such as binary or ternary networks with 1-8-bit activations, the information loss from quantization leads to significant accuracy degradation due to large gradient mismatches between the forward and backward functions. In this paper, we introduce a quantization method to reduce this loss by learning a symmetric codebook for particular weight subgroups. These subgroups are determined based on their locality in the weight matrix, such that the hardware simplicity of the low-precision representations is preserved. Empirically, we show that symmetric quantization can substantially improve accuracy for networks with extremely low-precision weights and activations. We also demonstrate that this representation imposes minimal or no hardware implications to more coarse-grained approaches. Source code is available at https://www.github.com/julianfaraone/SYQ.
Tasks	Quantization
Published	2018-07-01
URL	http://arxiv.org/abs/1807.00301v1
PDF	http://arxiv.org/pdf/1807.00301v1.pdf
PWC	https://paperswithcode.com/paper/syq-learning-symmetric-quantization-for
Repo	https://github.com/julianfaraone/SYQ
Framework	tf
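
As an illustration of a symmetric codebook applied to weight subgroups, here is a minimal sketch that ternarizes each row of a weight matrix to {-alpha, 0, +alpha} with its own alpha. It is not the SYQ training procedure: SYQ learns its scaling factors, whereas this sketch sets alpha to the mean magnitude of the retained weights, and the row-wise grouping and threshold ratio are assumptions.

```python
def ternarize_row(row, threshold_ratio=0.05):
    """Map one row to integer codes in {-1, 0, 1} plus a single scale alpha."""
    threshold = threshold_ratio * max(abs(w) for w in row)
    kept = [abs(w) for w in row if abs(w) > threshold]
    alpha = sum(kept) / len(kept) if kept else 0.0
    codes = [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in row]
    return alpha, codes

def dequantize(alpha, codes):
    """Recover the symmetric low-precision weights alpha * code."""
    return [alpha * c for c in codes]

# Hypothetical weight matrix whose rows differ in magnitude by a factor of ten.
weights = [
    [0.50, -0.40, 0.01, 0.60],
    [-0.05, 0.04, 0.001, -0.06],
]
quantized = [ternarize_row(row) for row in weights]
for alpha, codes in quantized:
    print(alpha, codes, dequantize(alpha, codes))
```

With a single scale shared by the whole matrix, the second row would be represented far less accurately. A per-row scale keeps the codes ternary while adapting to each subgroup's magnitude.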

Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders


Title	Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders
Authors	Edgar Schönfeld, Sayna Ebrahimi, Samarth Sinha, Trevor Darrell, Zeynep Akata
Abstract	Many approaches in generalized zero-shot learning rely on cross-modal mapping between the image feature space and the class embedding space. As labeled images are expensive, one direction is to augment the dataset by generating either images or image features. However, the former misses fine-grained details and the latter requires learning a mapping associated with class embeddings. In this work, we take feature generation one step further and propose a model where a shared latent space of image features and class embeddings is learned by modality-specific aligned variational autoencoders. This leaves us with the required discriminative information about the image and classes in the latent features, on which we train a softmax classifier. The key to our approach is that we align the distributions learned from images and from side-information to construct latent features that contain the essential multi-modal information associated with unseen classes. We evaluate our learned latent features on several benchmark datasets, i.e. CUB, SUN, AWA1 and AWA2, and establish a new state of the art on generalized zero-shot as well as on few-shot learning. Moreover, our results on ImageNet with various zero-shot splits show that our latent features generalize well in large-scale settings.
Tasks	Few-Shot Learning, Zero-Shot Learning
Published	2018-12-05
URL	http://arxiv.org/abs/1812.01784v4
PDF	http://arxiv.org/pdf/1812.01784v4.pdf
PWC	https://paperswithcode.com/paper/generalized-zero-and-few-shot-learning-via
Repo	https://github.com/edgarschnfld/CADA-VAE-PyTorch
Framework	pytorch

Semantic Cluster Unary Loss for Efficient Deep Hashing


Title	Semantic Cluster Unary Loss for Efficient Deep Hashing
Authors	Shifeng Zhang, Jianmin Li, Bo Zhang
Abstract	Hashing methods map similar data to binary hashcodes with smaller Hamming distance, and have received broad attention due to their low storage cost and fast retrieval speed. With the rapid development of deep learning, deep hashing methods have achieved promising results in efficient information retrieval. Most of the existing deep hashing methods adopt pairwise or triplet losses to deal with similarities underlying the data, but the training is difficult and less efficient because $O(n^2)$ data pairs and $O(n^3)$ triplets are involved. To address these issues, we propose a novel deep hashing algorithm with unary loss which can be trained very efficiently. First, we introduce a Unary Upper Bound of the traditional triplet loss, thus reducing the complexity to $O(n)$ and bridging the classification-based unary loss and the triplet loss. Second, we propose a novel Semantic Cluster Deep Hashing (SCDH) algorithm by introducing a modified Unary Upper Bound loss, named Semantic Cluster Unary Loss (SCUL). The resultant hashcodes form several compact clusters, which means hashcodes in the same cluster have similar semantic information. We also demonstrate that the proposed SCDH is easy to extend to semi-supervised settings by incorporating state-of-the-art semi-supervised learning algorithms. Experiments on large-scale datasets show that the proposed method is superior to state-of-the-art hashing algorithms.
Tasks	Information Retrieval
Published	2018-05-15
URL	http://arxiv.org/abs/1805.08705v2
PDF	http://arxiv.org/pdf/1805.08705v2.pdf
PWC	https://paperswithcode.com/paper/semantic-cluster-unary-loss-for-efficient
Repo	https://github.com/zsffq999/SCDH
Framework	none
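
Two quantities in the abstract are easy to make concrete: the Hamming distance between binary hashcodes, and the growth in the number of training terms for unary, pairwise, and triplet losses. The sketch below covers only those two points and does not implement SCUL. It counts unordered pairs and triplets as a rough stand-in for the $O(n^2)$ and $O(n^3)$ figures, and the example codes are arbitrary.

```python
import math

def hamming_distance(code_a, code_b):
    """Number of bit positions in which two integer-encoded hashcodes differ."""
    return bin(code_a ^ code_b).count("1")

def training_terms(num_samples):
    """How many terms each kind of loss has to consider for n training samples."""
    return {
        "unary": num_samples,
        "pairwise": math.comb(num_samples, 2),
        "triplet": math.comb(num_samples, 3),
    }

distance = hamming_distance(0b10110010, 0b10010011)
terms = training_terms(1000)
print(distance, terms)
```

Even at 1000 samples the triplet count is in the hundreds of millions, which shows why a bound that needs only one term per sample is attractive.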

Sequential Image-based Attention Network for Inferring Force Estimation without Haptic Sensor


Title	Sequential Image-based Attention Network for Inferring Force Estimation without Haptic Sensor
Authors	Hochul Shin, Hyeon Cho, Dongyi Kim, Daekwan Ko, Soochul Lim, Wonjun Hwang
Abstract	Humans can infer the approximate interaction force between objects from visual information alone because we have already learned it through experience. Based on this idea, we propose a recurrent convolutional neural network-based method using sequential images for inferring interaction force without using a haptic sensor. For training and validating deep learning methods, we collected a large number of images and corresponding interaction forces through an electronic motor-based device. To concentrate on the changing shape of a target object under external force in the images, we propose a sequential image-based attention module, which learns a salient model from temporal dynamics. The proposed sequential image-based attention module consists of a sequential spatial attention module and a sequential channel attention module, which are extended to exploit multiple sequential images. To gain better accuracy, we also created a weighted average pooling layer for both spatial and channel attention modules. Extensive experimental results verify that the proposed method successfully infers interaction forces under various conditions, such as different target materials, illumination changes, and external force directions.
Tasks
Published	2018-11-17
URL	https://arxiv.org/abs/1811.07190v4
PDF	https://arxiv.org/pdf/1811.07190v4.pdf
PWC	https://paperswithcode.com/paper/sequential-image-based-attention-network-for
Repo	https://github.com/cxz1418/SSAM_ForcePrediction
Framework	tf