October 21, 2019

3276 words 16 mins read

Paper Group AWR 17

Learning Semantic Representations for Novel Words: Leveraging Both Form and Context. Domain Adaptive Faster R-CNN for Object Detection in the Wild. Elastic Boundary Projection for 3D Medical Imaging Segmentation. In Defense of Classical Image Processing: Fast Depth Completion on the CPU. Look Across Elapse: Disentangled Representation Learning and …

Learning Semantic Representations for Novel Words: Leveraging Both Form and Context

Title Learning Semantic Representations for Novel Words: Leveraging Both Form and Context
Authors Timo Schick, Hinrich Schütze
Abstract Word embeddings are a key component of high-performing natural language processing (NLP) systems, but it remains a challenge to learn good representations for novel words on the fly, i.e., for words that did not occur in the training data. The general problem setting is that word embeddings are induced on an unlabeled training corpus and then a model is trained that embeds novel words into this induced embedding space. Currently, two approaches for learning embeddings of novel words exist: (i) learning an embedding from the novel word’s surface-form (e.g., subword n-grams) and (ii) learning an embedding from the context in which it occurs. In this paper, we propose an architecture that leverages both sources of information - surface-form and context - and show that it results in large increases in embedding quality. Our architecture obtains state-of-the-art results on the Definitional Nonce and Contextual Rare Words datasets. As input, we only require an embedding set and an unlabeled corpus for training our architecture to produce embeddings appropriate for the induced embedding space. Thus, our model can easily be integrated into any existing NLP system and enhance its capability to handle novel words.
Tasks Learning Semantic Representations, Word Embeddings
Published 2018-11-09
URL http://arxiv.org/abs/1811.03866v1
PDF http://arxiv.org/pdf/1811.03866v1.pdf
PWC https://paperswithcode.com/paper/learning-semantic-representations-for-novel
Repo https://github.com/timoschick/form-context-model
Framework none
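
As a concrete illustration of the form/context combination, here is a minimal PyTorch sketch of a gated interpolation between a pooled surface-form vector and a pooled context vector. The gating layout and dimensions are illustrative assumptions, not the authors' exact architecture (see the linked repo for that).

```python
import torch
import torch.nn as nn

class FormContextGate(nn.Module):
    """Minimal sketch of a gated form/context combination, assuming both
    sources are already pooled into fixed-size vectors (hypothetical setup,
    not the paper's exact model)."""

    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, 1)  # learns how much to trust each source

    def forward(self, v_form, v_context):
        # v_form:    pooled subword n-gram embedding, shape (batch, dim)
        # v_context: averaged embedding of context words, shape (batch, dim)
        alpha = torch.sigmoid(self.gate(torch.cat([v_form, v_context], dim=-1)))
        return alpha * v_form + (1.0 - alpha) * v_context  # interpolated embedding

model = FormContextGate(dim=300)
v_form, v_ctx = torch.randn(4, 300), torch.randn(4, 300)
print(model(v_form, v_ctx).shape)  # torch.Size([4, 300])
```

A learned gate of this kind lets the model lean on subword form for morphologically transparent novel words and on context elsewhere.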

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Title Domain Adaptive Faster R-CNN for Object Detection in the Wild
Authors Yuhua Chen, Wen Li, Christos Sakaridis, Dengxin Dai, Luc Van Gool
Abstract Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch leads to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style and illumination, and 2) the instance-level shift, such as object appearance and size. We build our approach on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on the image level and the instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach on multiple datasets, including Cityscapes, KITTI, and SIM10K. The results demonstrate the effectiveness of our approach for robust object detection in various domain shift scenarios.
Tasks Domain Adaptation, Object Detection, Robust Object Detection, Unsupervised Domain Adaptation
Published 2018-03-08
URL http://arxiv.org/abs/1803.03243v1
PDF http://arxiv.org/pdf/1803.03243v1.pdf
PWC https://paperswithcode.com/paper/domain-adaptive-faster-r-cnn-for-object
Repo https://github.com/yuhuayc/da-faster-rcnn
Framework pytorch
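
The domain classifiers are trained adversarially; a common way to implement this is a gradient reversal layer, sketched below. This is a generic sketch of that pattern, not the paper's exact classifier heads or hyperparameters.

```python
import torch
from torch import nn
from torch.autograd import Function

class GradReverse(Function):
    """Gradient reversal layer: identity in the forward pass, negated
    (and scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DomainClassifier(nn.Module):
    """Small domain classifier head; sizes are illustrative assumptions."""
    def __init__(self, feat_dim, lam=1.0):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, features):
        reversed_feats = GradReverse.apply(features, self.lam)
        return self.net(reversed_feats)  # logit: source (0) vs. target (1)

clf = DomainClassifier(feat_dim=512)
feats = torch.randn(8, 512, requires_grad=True)
domain_logits = clf(feats)  # trained with BCEWithLogitsLoss on domain labels
```

Applied to backbone features (image level) and per-ROI features (instance level), the classifier loss is minimized while the reversed gradients push the feature extractor toward domain-invariant features.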

Elastic Boundary Projection for 3D Medical Imaging Segmentation

Title Elastic Boundary Projection for 3D Medical Imaging Segmentation
Authors Tianwei Ni, Lingxi Xie, Huangjie Zheng, Elliot K. Fishman, Alan L. Yuille
Abstract We focus on an important yet challenging problem: using a 2D deep network to deal with 3D segmentation for medical imaging analysis. Existing approaches either apply multi-view planar (2D) networks or directly use volumetric (3D) networks for this purpose, but neither is ideal: 2D networks cannot capture 3D contexts effectively, and 3D networks are both memory-consuming and less stable, arguably due to the lack of pre-trained models. In this paper, we bridge the gap between 2D and 3D with a novel approach named Elastic Boundary Projection (EBP). The key observation is that, although the object is a 3D volume, what we really need in segmentation is to find its boundary, which is a 2D surface. Therefore, we place a number of pivot points in the 3D space, and for each pivot, we determine its distance to the object boundary along a dense set of directions. This creates an elastic shell around each pivot which is initialized as a perfect sphere. We train a 2D deep network to determine whether each ending point falls within the object, and adjust the shell so that it gradually converges to the actual shape of the boundary, thus achieving the goal of segmentation. EBP allows 3D segmentation without cutting the volume into slices or small patches, which sets it apart from conventional 2D and 3D approaches. EBP achieves promising accuracy in segmenting several abdominal organs from CT scans.
Tasks 3D Medical Imaging Segmentation, Medical Image Segmentation
Published 2018-12-03
URL http://arxiv.org/abs/1812.00518v1
PDF http://arxiv.org/pdf/1812.00518v1.pdf
PWC https://paperswithcode.com/paper/elastic-boundary-projection-for-3d-medical
Repo https://github.com/twni2016/EBP
Framework pytorch
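
To make the shell construction concrete, the NumPy sketch below samples a dense set of ray directions around a pivot and initializes a spherical shell of per-direction radii. The sampling grid, radii, and coordinates are illustrative assumptions rather than the paper's exact scheme.

```python
import numpy as np

def unit_directions(n_azimuth=60, n_polar=30):
    """Dense set of ray directions on the unit sphere (simple lat-long grid;
    the paper's exact sampling may differ)."""
    phi = np.linspace(0.0, 2.0 * np.pi, n_azimuth, endpoint=False)
    theta = np.linspace(1e-3, np.pi - 1e-3, n_polar)
    P, T = np.meshgrid(phi, theta)
    dirs = np.stack([np.sin(T) * np.cos(P), np.sin(T) * np.sin(P), np.cos(T)], axis=-1)
    return dirs.reshape(-1, 3)

dirs = unit_directions()
pivot = np.array([50.0, 60.0, 40.0])       # a pivot point in voxel coordinates
radii = np.full(len(dirs), 5.0)            # shell initialized as a perfect sphere
endpoints = pivot + radii[:, None] * dirs  # ray endpoints queried by the 2D net
# an in/out prediction per endpoint would then grow or shrink each radius,
# iterating until the shell converges to the object boundary
```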

In Defense of Classical Image Processing: Fast Depth Completion on the CPU

Title In Defense of Classical Image Processing: Fast Depth Completion on the CPU
Authors Jason Ku, Ali Harakeh, Steven L. Waslander
Abstract With the rise of data-driven deep neural networks as a realization of universal function approximators, most research on computer vision problems has moved away from hand-crafted classical image processing algorithms. This paper shows that with a well-designed algorithm, we are capable of outperforming neural-network-based methods on the task of depth completion. The proposed algorithm is simple and fast, runs on the CPU, and relies only on basic image processing operations to perform depth completion of sparse LIDAR depth data. We evaluate our algorithm on the challenging KITTI depth completion benchmark, and at the time of submission, our method ranks first on the KITTI test server among all published methods. Furthermore, our algorithm is data independent, requiring no training data to perform the task at hand. The code, written in Python, will be made publicly available at https://github.com/kujason/ip_basic.
Tasks Depth Completion
Published 2018-01-31
URL http://arxiv.org/abs/1802.00036v1
PDF http://arxiv.org/pdf/1802.00036v1.pdf
PWC https://paperswithcode.com/paper/in-defense-of-classical-image-processing-fast
Repo https://github.com/kujason/ip_basic
Framework none
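
The pipeline is built from standard OpenCV morphological operations. The sketch below shows the overall flavor (depth inversion so that dilation favors near points, dilation, hole closing, outlier smoothing); the kernel shapes and step order here are simplified assumptions, and the tuned pipeline lives in the linked repo.

```python
import cv2
import numpy as np

def complete_depth(sparse_depth, max_depth=100.0):
    """Rough sketch of dilation-based depth completion (steps and kernel
    sizes are illustrative, not the repo's tuned pipeline)."""
    depth = sparse_depth.copy()
    valid = depth > 0.1
    # invert so that dilation propagates closer (smaller-depth) pixels
    depth[valid] = max_depth - depth[valid]
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    depth = cv2.dilate(depth, kernel)                         # fill small gaps
    depth = cv2.morphologyEx(depth, cv2.MORPH_CLOSE, kernel)  # close holes
    depth = cv2.medianBlur(depth, 5)                          # smooth outliers
    valid = depth > 0.1
    depth[valid] = max_depth - depth[valid]                   # invert back
    return depth

sparse = np.zeros((352, 1216), np.float32)  # KITTI-sized sparse depth map
sparse[::8, ::8] = 20.0                     # fake LIDAR samples for the demo
dense = complete_depth(sparse)
```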

Look Across Elapse: Disentangled Representation Learning and Photorealistic Cross-Age Face Synthesis for Age-Invariant Face Recognition

Title Look Across Elapse: Disentangled Representation Learning and Photorealistic Cross-Age Face Synthesis for Age-Invariant Face Recognition
Authors Jian Zhao, Yu Cheng, Yi Cheng, Yang Yang, Haochong Lan, Fang Zhao, Lin Xiong, Yan Xu, Jianshu Li, Sugiri Pranata, Shengmei Shen, Junliang Xing, Hengzhu Liu, Shuicheng Yan, Jiashi Feng
Abstract Despite the remarkable progress in face recognition technologies, reliably recognizing faces across ages remains a major challenge. The appearance of a human face changes substantially over time, resulting in significant intra-class variations. As opposed to current techniques for age-invariant face recognition, which either directly extract age-invariant features for recognition, or first synthesize a face that matches the target age before feature extraction, we argue that it is more desirable to perform both tasks jointly so that they can leverage each other. To this end, we propose a deep Age-Invariant Model (AIM) for face recognition in the wild with three distinct novelties. First, AIM presents a novel unified deep architecture that jointly performs cross-age face synthesis and recognition in a mutually boosting way. Second, AIM achieves continuous face rejuvenation/aging with remarkable photorealistic and identity-preserving properties, avoiding the requirement of paired data and the true age of testing samples. Third, we develop effective and novel training strategies for end-to-end learning of the whole deep architecture, which generates powerful age-invariant face representations explicitly disentangled from age variation. Moreover, we propose a new large-scale Cross-Age Face Recognition (CAFR) benchmark dataset to facilitate existing efforts and push the frontiers of age-invariant face recognition research. Extensive experiments on both our CAFR dataset and several other cross-age datasets (MORPH, CACD and FG-NET) demonstrate the superiority of the proposed AIM model over the state of the art. Benchmarking our model on IJB-C, one of the most popular unconstrained face recognition datasets, additionally verifies the promising generalizability of AIM in recognizing faces in the wild.
Tasks Age-Invariant Face Recognition, Face Generation, Face Recognition, Representation Learning
Published 2018-09-02
URL http://arxiv.org/abs/1809.00338v2
PDF http://arxiv.org/pdf/1809.00338v2.pdf
PWC https://paperswithcode.com/paper/look-across-elapse-disentangled
Repo https://github.com/bruinxiong/xionglin.github.io
Framework none

Learning to Represent Edits

Title Learning to Represent Edits
Authors Pengcheng Yin, Graham Neubig, Miltiadis Allamanis, Marc Brockschmidt, Alexander L. Gaunt
Abstract We introduce the problem of learning distributed representations of edits. By combining a “neural editor” with an “edit encoder”, our models learn to represent the salient information of an edit and can be used to apply edits to new inputs. We experiment on natural language and source code edit data. Our evaluation yields promising results that suggest that our neural network models learn to capture the structure and semantics of edits. We hope that this interesting task and data source will inspire other researchers to work further on this problem.
Tasks
Published 2018-10-31
URL http://arxiv.org/abs/1810.13337v2
PDF http://arxiv.org/pdf/1810.13337v2.pdf
PWC https://paperswithcode.com/paper/learning-to-represent-edits
Repo https://github.com/Microsoft/msrc-dpu-learning-to-represent-edits
Framework none
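
A minimal sketch of the edit-encoder idea: compress an aligned (before, after) pair into a single edit vector that a downstream neural editor could condition on. The shapes and the GRU-based layout are assumptions for illustration, not the paper's model.

```python
import torch
from torch import nn

class EditEncoder(nn.Module):
    """Sketch: encode aligned (before, after) token-embedding sequences into
    one edit vector (a stand-in for the paper's edit encoder)."""
    def __init__(self, dim, edit_dim):
        super().__init__()
        self.rnn = nn.GRU(2 * dim, edit_dim, batch_first=True)

    def forward(self, before, after):
        # before/after: aligned embeddings, shape (batch, seq, dim)
        _, h = self.rnn(torch.cat([before, after], dim=-1))
        return h[-1]  # edit representation, shape (batch, edit_dim)

enc = EditEncoder(dim=64, edit_dim=32)
before, after = torch.randn(2, 10, 64), torch.randn(2, 10, 64)
delta = enc(before, after)
# a decoder conditioned on [new_input; delta] would then apply the edit
print(delta.shape)  # torch.Size([2, 32])
```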

Interpreting Neural Networks With Nearest Neighbors

Title Interpreting Neural Networks With Nearest Neighbors
Authors Eric Wallace, Shi Feng, Jordan Boyd-Graber
Abstract Local model interpretation methods explain individual predictions by assigning an importance value to each input feature. This value is often determined by measuring the change in confidence when a feature is removed. However, the confidence of neural networks is not a robust measure of model uncertainty. This issue makes reliably judging the importance of the input features difficult. We address this by changing the test-time behavior of neural networks using Deep k-Nearest Neighbors. Without harming text classification accuracy, this algorithm provides a more robust uncertainty metric which we use to generate feature importance values. The resulting interpretations better align with human perception than baseline methods. Finally, we use our interpretation method to analyze model predictions on dataset annotation artifacts.
Tasks Feature Importance, Text Classification
Published 2018-09-08
URL http://arxiv.org/abs/1809.02847v2
PDF http://arxiv.org/pdf/1809.02847v2.pdf
PWC https://paperswithcode.com/paper/interpreting-neural-networks-with-nearest
Repo https://github.com/Eric-Wallace/trickme-interface
Framework none
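
The interpretation recipe can be sketched as leave-one-out importance under a nearest-neighbor confidence score. Below is a toy version using scikit-learn: `encode` stands in for the trained text encoder, and the neighbor-agreement score is a simplified stand-in for DkNN conformity.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# toy representation space: encoder outputs for training texts, with labels
train_reprs = np.random.randn(100, 16)
train_labels = np.random.randint(0, 2, size=100)
knn = NearestNeighbors(n_neighbors=10).fit(train_reprs)

def knn_confidence(repr_vec, label):
    """Fraction of nearest training neighbors sharing the predicted label,
    a simplified stand-in for DkNN's conformity score."""
    _, idx = knn.kneighbors(repr_vec[None, :])
    return float(np.mean(train_labels[idx[0]] == label))

def word_importance(encode, tokens, label):
    """Leave-one-out importance: confidence drop when each token is removed."""
    base = knn_confidence(encode(tokens), label)
    return [base - knn_confidence(encode(tokens[:i] + tokens[i + 1:]), label)
            for i in range(len(tokens))]

encode = lambda toks: np.random.randn(16)  # dummy encoder for illustration only
print(word_importance(encode, ["a", "great", "movie"], label=1))
```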

SilhoNet: An RGB Method for 6D Object Pose Estimation

Title SilhoNet: An RGB Method for 6D Object Pose Estimation
Authors Gideon Billings, Matthew Johnson-Roberson
Abstract Autonomous robot manipulation involves estimating the translation and orientation of the object to be manipulated as a 6-degree-of-freedom (6D) pose. Methods using RGB-D data have shown great success in solving this problem. However, there are situations where cost constraints or the working environment may limit the use of RGB-D sensors. When limited to monocular camera data only, the problem of object pose estimation is very challenging. In this work, we introduce a novel method called SilhoNet that predicts 6D object pose from monocular images. We use a Convolutional Neural Network (CNN) pipeline that takes in Region of Interest (ROI) proposals to simultaneously predict an intermediate silhouette representation for objects with an associated occlusion mask and a 3D translation vector. The 3D orientation is then regressed from the predicted silhouettes. We show that our method achieves better overall performance on the YCB-Video dataset than two state-of-the-art networks for 6D pose estimation from monocular image input.
Tasks 3D Pose Estimation, 6D Pose Estimation, 6D Pose Estimation using RGB, Pose Estimation
Published 2018-09-18
URL https://arxiv.org/abs/1809.06893v3
PDF https://arxiv.org/pdf/1809.06893v3.pdf
PWC https://paperswithcode.com/paper/silhonet-an-rgb-method-for-6d-object-pose
Repo https://github.com/gidobot/SilhoNet
Framework tf
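
To make the two-stage structure concrete, here is an illustrative sketch: one head predicts a silhouette (with a translation vector) from an ROI crop, and a second head regresses a unit quaternion from the predicted silhouette. The layer sizes and the 64x64 ROI are assumptions, not the published network.

```python
import torch
from torch import nn

class SilhouettePoseSketch(nn.Module):
    """Illustrative two-stage sketch of silhouette-then-orientation
    prediction; all sizes here are hypothetical."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.sil_head = nn.Conv2d(32, 1, 1)            # silhouette logits
        self.trans_head = nn.Linear(32 * 16 * 16, 3)   # 3D translation
        self.quat_head = nn.Sequential(                # orientation from silhouette
            nn.Flatten(), nn.Linear(16 * 16, 64), nn.ReLU(), nn.Linear(64, 4))

    def forward(self, roi):                            # roi: (B, 3, 64, 64)
        feats = self.backbone(roi)                     # (B, 32, 16, 16)
        sil = torch.sigmoid(self.sil_head(feats))      # (B, 1, 16, 16)
        trans = self.trans_head(feats.flatten(1))      # (B, 3)
        quat = self.quat_head(sil)                     # (B, 4), unnormalized
        return sil, trans, quat / quat.norm(dim=-1, keepdim=True)

net = SilhouettePoseSketch()
sil, trans, quat = net(torch.randn(2, 3, 64, 64))
```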

A Regressive Convolution Neural network and Support Vector Regression Model for Electricity Consumption Forecasting

Title A Regressive Convolution Neural network and Support Vector Regression Model for Electricity Consumption Forecasting
Authors Youshan Zhang, Qi Li
Abstract Electricity consumption forecasting has important implications for mineral companies in guiding quarterly work, normal power system operation, and management. However, electricity consumption prediction for a mineral company differs from traditional electricity load prediction, since mineral company electricity consumption is affected by various factors (e.g., ore grade, processing quantity of the crude ore, ball milling fill rate). The problem is non-trivial due to three major challenges for traditional methods: insufficient training data, high computational cost, and low prediction accuracy. To tackle these challenges, we first propose a Regressive Convolution Neural Network (RCNN) to predict electricity consumption. Since RCNN alone still suffers from high computational overhead, we use RCNN to extract features from the historical data and train a Support Vector Regression (SVR) model on these features to predict electricity consumption. The experimental results show that the proposed RCNN-SVR model achieves higher accuracy than the traditional RNN or SVM alone. The MSE, MAPE, and CV-RMSE of the RCNN-SVR model are 0.8564, 1.975%, and 0.0687% respectively, which illustrates the low prediction error rate of the proposed model.
Tasks
Published 2018-10-21
URL http://arxiv.org/abs/1810.08878v2
PDF http://arxiv.org/pdf/1810.08878v2.pdf
PWC https://paperswithcode.com/paper/a-regressive-convolution-neural-network-and
Repo https://github.com/heaventian93/A-Regressive-Convolution-Neural-Network-and-Support-Vector-Regression-Model-for-Electricity-Consumpt
Framework none
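
The two-stage idea, CNN features feeding an SVR, can be sketched in a few lines. The history length, input factors, and network shape below are illustrative assumptions, and the CNN is untrained here for brevity (in the paper it is first trained on the forecasting task); only the pattern of extracting convolutional features and fitting scikit-learn's SVR on them reflects the abstract.

```python
import numpy as np
import torch
from torch import nn
from sklearn.svm import SVR

class FeatureCNN(nn.Module):
    """1-D conv feature extractor over the consumption history (a sketch;
    the paper's exact RCNN layout is not reproduced here)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(5, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(4), nn.Flatten())  # -> 64-dim features

    def forward(self, x):
        return self.net(x)

cnn = FeatureCNN().eval()
history = torch.randn(200, 5, 30)            # (samples, factors, days): assumed shapes
with torch.no_grad():
    feats = cnn(history).numpy()             # CNN features for the SVR stage
targets = np.random.randn(200)               # electricity consumption (dummy)
svr = SVR(kernel="rbf").fit(feats, targets)  # regress consumption on CNN features
pred = svr.predict(feats[:5])
```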

Towards Deep Cellular Phenotyping in Placental Histology

Title Towards Deep Cellular Phenotyping in Placental Histology
Authors Michael Ferlaino, Craig A. Glastonbury, Carolina Motta-Mejia, Manu Vatish, Ingrid Granne, Stephen Kennedy, Cecilia M. Lindgren, Christoffer Nellåker
Abstract The placenta is a complex organ, playing multiple roles during fetal development. Very little is known about the association between placental morphological abnormalities and fetal physiology. In this work, we present an open-source, computationally tractable deep learning pipeline to analyse placental histology at the level of the cell. By utilising two deep Convolutional Neural Network architectures and transfer learning, we can robustly localise and classify placental cells into five classes with an accuracy of 89%. Furthermore, we learn deep embeddings encoding phenotypic knowledge that are capable of both stratifying the five distinct cell populations and capturing intraclass phenotypic variance. We envisage that automating this pipeline for population-scale studies of placental histology has the potential to improve our understanding of basic cellular placental biology and its variations, particularly its role in predicting adverse birth outcomes.
Tasks Transfer Learning
Published 2018-04-09
URL http://arxiv.org/abs/1804.03270v2
PDF http://arxiv.org/pdf/1804.03270v2.pdf
PWC https://paperswithcode.com/paper/towards-deep-cellular-phenotyping-in
Repo https://github.com/Nellaker-group/TowardsDeepPhenotyping
Framework none
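
A standard transfer-learning setup consistent with the abstract: start from an ImageNet-pretrained backbone, replace the head with a five-class classifier for the cell types, and train the head first. The backbone choice and hyperparameters are assumptions, not the paper's exact configuration.

```python
import torch
from torch import nn
from torchvision import models

# ImageNet-pretrained backbone with a new 5-class head (the five placental
# cell classes); only the head is trained at first. Newer torchvision
# versions use the weights= argument instead of pretrained=True.
model = models.resnet50(pretrained=True)
for p in model.parameters():
    p.requires_grad = False                     # freeze pretrained features
model.fc = nn.Linear(model.fc.in_features, 5)   # new classifier head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 224, 224)            # dummy batch of cell crops
labels = torch.randint(0, 5, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```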

Combining Advanced Methods in Japanese-Vietnamese Neural Machine Translation

Title Combining Advanced Methods in Japanese-Vietnamese Neural Machine Translation
Authors Thi-Vinh Ngo, Thanh-Le Ha, Phuong-Thai Nguyen, Le-Minh Nguyen
Abstract Neural machine translation (NMT) systems have recently obtained state-of-the-art results for many popular language pairs because of the availability of data. For low-resource language pairs, there has been little research in this field due to the lack of bilingual data. In this paper, we attempt to build the first NMT systems for a low-resource language pair: Japanese-Vietnamese. We also show significant improvements when combining advanced methods to reduce the adverse impact of data sparsity and improve the quality of NMT systems. In addition, we propose a variant of the Byte-Pair Encoding algorithm to perform effective word segmentation for Vietnamese texts and alleviate the rare-word problem that persists in NMT systems.
Tasks Machine Translation
Published 2018-05-18
URL http://arxiv.org/abs/1805.07133v1
PDF http://arxiv.org/pdf/1805.07133v1.pdf
PWC https://paperswithcode.com/paper/combining-advanced-methods-in-japanese
Repo https://github.com/ngovinhtn/JaViCorpus
Framework none
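
For reference, here is plain Byte-Pair Encoding in a few lines of Python: repeatedly merge the most frequent adjacent symbol pair in the vocabulary. The paper's contribution is a Vietnamese-specific variant of this algorithm; the baseline below is only the standard version.

```python
import collections

def bpe_merges(corpus, num_merges):
    """Plain BPE over space-separated symbols (naive string replace is fine
    for a sketch; real implementations track symbol boundaries)."""
    vocab = collections.Counter(" ".join(list(w)) for w in corpus)
    for _ in range(num_merges):
        pairs = collections.Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]            # most frequent pair
        vocab = collections.Counter(
            {w.replace(f"{a} {b}", f"{a}{b}"): f for w, f in vocab.items()})
    return vocab

print(bpe_merges(["thành", "thành", "phố"], num_merges=3))
```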

Learning deep representations by mutual information estimation and maximization

Title Learning deep representations by mutual information estimation and maximization
Authors R Devon Hjelm, Alex Fedorov, Samuel Lavoie-Marchildon, Karan Grewal, Phil Bachman, Adam Trischler, Yoshua Bengio
Abstract In this work, we perform unsupervised learning of representations by maximizing mutual information between an input and the output of a deep neural network encoder. Importantly, we show that structure matters: incorporating knowledge about locality of the input to the objective can greatly influence a representation’s suitability for downstream tasks. We further control characteristics of the representation by matching to a prior distribution adversarially. Our method, which we call Deep InfoMax (DIM), outperforms a number of popular unsupervised learning methods and competes with fully-supervised learning on several classification tasks. DIM opens new avenues for unsupervised learning of representations and is an important step towards flexible formulations of representation-learning objectives for specific end-goals.
Tasks Representation Learning
Published 2018-08-20
URL http://arxiv.org/abs/1808.06670v5
PDF http://arxiv.org/pdf/1808.06670v5.pdf
PWC https://paperswithcode.com/paper/learning-deep-representations-by-mutual
Repo https://github.com/DuaneNielsen/DeepInfomaxPytorch
Framework pytorch
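
A minimal sketch of DIM's global objective: a small discriminator scores (feature, code) pairs, with shuffled codes as negatives, optimizing a JSD-style bound on mutual information. The network sizes are illustrative, and the paper's full method adds local objectives and adversarial prior matching, which are omitted here.

```python
import torch
from torch import nn
import torch.nn.functional as F

class GlobalMIDiscriminator(nn.Module):
    """Scores (feature, code) pairs; matched pairs should score high,
    mismatched pairs low (sizes are assumptions)."""
    def __init__(self, feat_dim, code_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + code_dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, feats, codes):
        return self.net(torch.cat([feats, codes], dim=-1))

def mi_loss(disc, feats, codes):
    pos = disc(feats, codes)                              # matched pairs
    neg = disc(feats, codes[torch.randperm(len(codes))])  # shuffled negatives
    # maximizing the JSD lower bound == minimizing this loss
    return F.softplus(-pos).mean() + F.softplus(neg).mean()

disc = GlobalMIDiscriminator(feat_dim=128, code_dim=64)
feats, codes = torch.randn(32, 128), torch.randn(32, 64)
loss = mi_loss(disc, feats, codes)
loss.backward()
```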

LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations

Title LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations
Authors Michael Schaarschmidt, Alexander Kuhnle, Ben Ellis, Kai Fricke, Felix Gessert, Eiko Yoneki
Abstract Reinforcement learning approaches have long appealed to the data management community due to their ability to learn to control dynamic behavior from raw system performance. Recent successes in combining deep neural networks with reinforcement learning have sparked significant new interest in this domain. However, practical solutions remain elusive due to large training data requirements, algorithmic instability, and lack of standard tools. In this work, we introduce LIFT, an end-to-end software stack for applying deep reinforcement learning to data management tasks. While prior work has frequently explored applications in simulations, LIFT centers on utilizing human expertise to learn from demonstrations, thus lowering online training times. We further introduce TensorForce, a TensorFlow library for applied deep reinforcement learning exposing a unified declarative interface to common RL algorithms, thus providing a backend to LIFT. We demonstrate the utility of LIFT in two case studies in database compound indexing and resource management in stream processing. Results show LIFT controllers initialized from demonstrations can outperform human baselines and heuristics across latency metrics and space usage by up to 70%.
Tasks
Published 2018-08-23
URL http://arxiv.org/abs/1808.07903v1
PDF http://arxiv.org/pdf/1808.07903v1.pdf
PWC https://paperswithcode.com/paper/lift-reinforcement-learning-in-computer
Repo https://github.com/tensorforce/tensorforce
Framework tf
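
The "learning from demonstrations" component can be approximated by behavior cloning: pretrain the policy on logged expert (state, action) pairs before any online RL. The sketch below shows that pretraining step only; the state features and the indexing action space are hypothetical.

```python
import torch
from torch import nn

# Behavior-cloning pretraining on demonstrations (a generic sketch, not
# LIFT's actual stack); shapes and the 8-way action space are assumptions.
policy = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 8))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

demo_states = torch.randn(512, 20)          # e.g. query/workload features
demo_actions = torch.randint(0, 8, (512,))  # e.g. expert index choices

for _ in range(10):  # supervised pretraining epochs
    loss = criterion(policy(demo_states), demo_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
# the pretrained policy then initializes the online RL controller
```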

A General Multi-agent Epistemic Planner Based on Higher-order Belief Change

Title A General Multi-agent Epistemic Planner Based on Higher-order Belief Change
Authors Xiao Huang, Biqing Fang, Hai Wan, Yongmei Liu
Abstract In recent years, multi-agent epistemic planning has received attention from both the dynamic logic and planning communities. Existing implementations of multi-agent epistemic planning are based on compilation into classical planning and suffer from various limitations, such as generating only linear plans, restriction to public actions, and inability to handle disjunctive beliefs. In this paper, we propose a general representation language for multi-agent epistemic planning in which the initial KB, the goal, and the preconditions and effects of actions can all be arbitrary multi-agent epistemic formulas, and the solution is an action tree branching on sensing results. To support efficient reasoning in the multi-agent KD45 logic, we make use of a normal form called alternating cover disjunctive formulas (ACDFs). We propose basic revision and update algorithms for ACDFs. We also handle static propositional common knowledge, which we call constraints. Based on our reasoning, revision and update algorithms, and adapting the PrAO algorithm for contingent planning from the literature, we implemented a multi-agent epistemic planner called MEPK. Our experimental results show the viability of our approach.
Tasks
Published 2018-06-29
URL http://arxiv.org/abs/1806.11298v2
PDF http://arxiv.org/pdf/1806.11298v2.pdf
PWC https://paperswithcode.com/paper/a-general-multi-agent-epistemic-planner-based
Repo https://github.com/sysulic/MEPK
Framework none

q-means: A quantum algorithm for unsupervised machine learning

Title q-means: A quantum algorithm for unsupervised machine learning
Authors Iordanis Kerenidis, Jonas Landman, Alessandro Luongo, Anupam Prakash
Abstract Quantum machine learning is one of the most promising applications of a full-scale quantum computer. Over the past few years, many quantum machine learning algorithms have been proposed that can potentially offer considerable speedups over the corresponding classical algorithms. In this paper, we introduce q-means, a new quantum algorithm for clustering, a canonical problem in unsupervised machine learning. The $q$-means algorithm has convergence and precision guarantees similar to $k$-means, and it outputs with high probability a good approximation of the $k$ cluster centroids, like the classical algorithm. Given a dataset of $N$ $d$-dimensional vectors $v_i$ (seen as a matrix $V \in \mathbb{R}^{N \times d}$) stored in QRAM, the running time of q-means is $\widetilde{O}\left( k d \frac{\eta}{\delta^2}\kappa(V)(\mu(V) + k \frac{\eta}{\delta}) + k^2 \frac{\eta^{1.5}}{\delta^2} \kappa(V)\mu(V) \right)$ per iteration, where $\kappa(V)$ is the condition number, $\mu(V)$ is a parameter that appears in quantum linear algebra procedures, and $\eta = \max_{i} \lVert v_{i} \rVert^{2}$ is the maximum squared norm of the data points. For a natural notion of well-clusterable datasets, the running time becomes $\widetilde{O}\left( k^2 d \frac{\eta^{2.5}}{\delta^3} + k^{2.5} \frac{\eta^2}{\delta^3} \right)$ per iteration, which is linear in the number of features $d$, and polynomial in the rank $k$, the maximum squared norm $\eta$, and the error parameter $\delta$. Both running times are only polylogarithmic in the number of datapoints $N$. Our algorithm provides substantial savings compared to the classical $k$-means algorithm, which runs in time $O(kdN)$ per iteration, particularly for large datasets.
Tasks Quantum Machine Learning
Published 2018-12-10
URL http://arxiv.org/abs/1812.03584v2
PDF http://arxiv.org/pdf/1812.03584v2.pdf
PWC https://paperswithcode.com/paper/q-means-a-quantum-algorithm-for-unsupervised
Repo https://github.com/Morcu/q-means
Framework none
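
A classical way to build intuition for the guarantees is to run Lloyd's iterations with δ-perturbed centroid updates, mimicking the approximation the quantum procedure introduces (the paper analyzes a similar noise-robust classical variant). The sketch below is that classical analogue, with the uniform noise model as a simplifying assumption.

```python
import numpy as np

def delta_k_means(X, k, delta, iters=20, seed=0):
    """Classical sketch: standard k-means with delta-perturbed centroid
    updates, standing in for q-means' approximate estimates."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)  # the quantum routine estimates these distances
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                noise = rng.uniform(-delta, delta, size=X.shape[1])
                centroids[j] = pts.mean(axis=0) + noise  # delta-approximate update
    return centroids, labels

X = np.vstack([np.random.randn(100, 2) + c
               for c in ([0, 0], [5, 5], [0, 5])]).astype(float)
cents, labels = delta_k_means(X, k=3, delta=0.1)
```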