February 1, 2020

3224 words 16 mins read

Paper Group AWR 260

Ultrametric Fitting by Gradient Descent. Integration of adversarial autoencoders with residual dense convolutional networks for estimation of non-Gaussian hydraulic conductivities. Self-Supervised Generalisation with Meta Auxiliary Learning. OpenDenoising: an Extensible Benchmark for Building Comparative Studies of Image Denoisers. The Architectura …

Ultrametric Fitting by Gradient Descent

Title Ultrametric Fitting by Gradient Descent
Authors Giovanni Chierchia, Benjamin Perret
Abstract We study the problem of fitting an ultrametric distance to a dissimilarity graph in the context of hierarchical cluster analysis. Standard hierarchical clustering methods are specified procedurally, rather than in terms of the cost function to be optimized. We aim to overcome this limitation by presenting a general optimization framework for ultrametric fitting. Our approach consists of modeling the latter as a constrained optimization problem over the continuous space of ultrametrics. In doing so, we can leverage the simple, yet effective, idea of replacing the ultrametric constraint with a min-max operation injected directly into the cost function. The proposed reformulation leads to an unconstrained optimization problem that can be efficiently solved by gradient descent methods. The flexibility of our framework allows us to investigate several cost functions, following the classic paradigm of combining a data fidelity term with a regularization. While we provide no theoretical guarantee to find the global optimum, the numerical results obtained over a number of synthetic and real datasets demonstrate the good performance of our approach with respect to state-of-the-art agglomerative algorithms. This makes us believe that the proposed framework sheds new light on the way to design a new generation of hierarchical clustering methods. Our code is made publicly available at \url{https://github.com/PerretB/ultrametric-fitting}.
Tasks
Published 2019-05-25
URL https://arxiv.org/abs/1905.10566v3
PDF https://arxiv.org/pdf/1905.10566v3.pdf
PWC https://paperswithcode.com/paper/ultrametric-fitting-by-gradient-descent
Repo https://github.com/PerretB/ultrametric-fitting
Framework pytorch
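The min-max operation the abstract mentions corresponds to the subdominant ultrametric: for every pair of nodes, the minimum over all connecting paths of the maximum edge weight along the path. As a hedged, self-contained illustration (this is not the paper's gradient-descent method, only the constraint it relaxes), a Floyd-Warshall-style minimax closure computes it on a toy dissimilarity graph:

```python
# Subdominant ultrametric of a small dissimilarity graph via minimax closure.
# The graph below is an invented 4-node example, not from the paper.
INF = float("inf")

def subdominant_ultrametric(w):
    n = len(w)
    d = [row[:] for row in w]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # best "bottleneck" cost through intermediate node k
                d[i][j] = min(d[i][j], max(d[i][k], d[k][j]))
    return d

# symmetric dissimilarities, 0 on the diagonal, INF for missing edges
w = [[0, 1, 4, INF],
     [1, 0, 2, 5],
     [4, 2, 0, 3],
     [INF, 5, 3, 0]]
u = subdominant_ultrametric(w)

# the result satisfies the ultrametric inequality for every triple
for i in range(4):
    for j in range(4):
        for k in range(4):
            assert u[i][j] <= max(u[i][k], u[k][j])
```

For example, `u[0][3]` becomes 3, the bottleneck of the path 0-1-2-3, even though nodes 0 and 3 share no edge.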

Integration of adversarial autoencoders with residual dense convolutional networks for estimation of non-Gaussian hydraulic conductivities

Title Integration of adversarial autoencoders with residual dense convolutional networks for estimation of non-Gaussian hydraulic conductivities
Authors Shaoxing Mo, Nicholas Zabaras, Xiaoqing Shi, Jichun Wu
Abstract Inverse modeling for the estimation of non-Gaussian hydraulic conductivity fields in subsurface flow and solute transport models remains a challenging problem. This is mainly due to the non-Gaussian property, the non-linear physics, and the fact that many repeated evaluations of the forward model are often required. In this study, we develop a convolutional adversarial autoencoder (CAAE) to parameterize non-Gaussian conductivity fields with heterogeneous conductivity within each facies using a low-dimensional latent representation. In addition, a deep residual dense convolutional network (DRDCN) is proposed for surrogate modeling of forward models with high-dimensional and highly-complex mappings. The two networks are both based on a multilevel residual learning architecture called residual-in-residual dense block. The multilevel residual learning strategy and the dense connection structure ease the training of deep networks, enabling us to efficiently build deeper networks that have a substantially increased capacity for approximating mappings of very high complexity. The CAAE and DRDCN networks are incorporated into an iterative ensemble smoother to formulate an inversion framework. The numerical experiments performed using 2-D and 3-D solute transport models illustrate the performance of the integrated method. The obtained results indicate that the CAAE is a robust parameterization method for non-Gaussian conductivity fields with different heterogeneity patterns. The DRDCN is able to obtain accurate approximations of the forward models with high-dimensional and highly-complex mappings using relatively limited training data. The CAAE and DRDCN methods together significantly reduce the computation time required to achieve accurate inversion results.
Tasks
Published 2019-06-26
URL https://arxiv.org/abs/1906.11828v4
PDF https://arxiv.org/pdf/1906.11828v4.pdf
PWC https://paperswithcode.com/paper/integration-of-adversarial-autoencoders-with
Repo https://github.com/cics-nd/CAAE-DRDCN-inverse
Framework pytorch
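The residual-in-residual dense block both networks build on can be sketched as follows. This is a minimal forward pass on flat feature vectors with dense layers standing in for convolutions; the channel sizes (c=8, growth g=4) and the 0.2 residual scaling are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

def residual_dense_block(x, growth_ws, fuse_w):
    feats = [x]
    for w in growth_ws:
        # dense connection: each layer sees the concatenation of all earlier features
        feats.append(relu(np.concatenate(feats, axis=-1) @ w))
    fused = np.concatenate(feats, axis=-1) @ fuse_w  # fuse back to c channels
    return x + 0.2 * fused                           # local residual connection

c, g = 8, 4
w1 = 0.1 * rng.normal(size=(c, g))
w2 = 0.1 * rng.normal(size=(c + g, g))
wf = 0.1 * rng.normal(size=(c + 2 * g, c))

x = rng.normal(size=(2, c))                          # batch of 2 feature vectors
trunk = residual_dense_block(residual_dense_block(x, [w1, w2], wf), [w1, w2], wf)
out = x + 0.2 * trunk                                # outer skip: residual-in-residual
assert out.shape == (2, c)
```

The point of the structure is that gradients reach early layers through the short skip paths, which is what "eases the training of deep networks" refers to.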

Self-Supervised Generalisation with Meta Auxiliary Learning

Title Self-Supervised Generalisation with Meta Auxiliary Learning
Authors Shikun Liu, Andrew J. Davison, Edward Johns
Abstract Learning with auxiliary tasks can improve the ability of a primary task to generalise. However, this comes at the cost of manually labelling auxiliary data. We propose a new method which automatically learns appropriate labels for an auxiliary task, such that any supervised learning task can be improved without requiring access to any further data. The approach is to train two neural networks: a label-generation network to predict the auxiliary labels, and a multi-task network to train the primary task alongside the auxiliary task. The loss for the label-generation network incorporates the loss of the multi-task network, and so this interaction between the two networks can be seen as a form of meta learning with a double gradient. We show that our proposed method, Meta AuXiliary Learning (MAXL), outperforms single-task learning on 7 image datasets, without requiring any additional data. We also show that MAXL outperforms several other baselines for generating auxiliary labels, and is even competitive when compared with human-defined auxiliary labels. The self-supervised nature of our method leads to a promising new direction towards automated generalisation. Source code can be found at https://github.com/lorenmt/maxl.
Tasks Auxiliary Learning, Meta-Learning, Multi-Task Learning
Published 2019-01-25
URL https://arxiv.org/abs/1901.08933v3
PDF https://arxiv.org/pdf/1901.08933v3.pdf
PWC https://paperswithcode.com/paper/self-supervised-generalisation-with-meta
Repo https://github.com/lorenmt/maxl
Framework pytorch

OpenDenoising: an Extensible Benchmark for Building Comparative Studies of Image Denoisers

Title OpenDenoising: an Extensible Benchmark for Building Comparative Studies of Image Denoisers
Authors Florian Lemarchand, Eduardo Fernandes Montesuma, Maxime Pelcat, Erwan Nogues
Abstract Image denoising has recently taken a leap forward due to machine learning. However, image denoisers, both expert-based and learning-based, are mostly tested on well-behaved generated noises (usually Gaussian) rather than on real-life noises, making performance comparisons difficult in real-world conditions. This is especially true for learning-based denoisers, whose performance depends on training data. Hence, choosing which method to use for a specific denoising problem is difficult. This paper proposes a comparative study of existing denoisers, as well as an extensible open tool that makes it possible to reproduce and extend the study. MWCNN is shown to outperform other methods when trained for a real-world image interception noise, and additionally is the second least compute-hungry of the tested methods. To evaluate the robustness of the conclusions, three test sets are compared. A Kendall’s Tau correlation of only 60% is obtained on method rankings between noise types, demonstrating the need for a benchmarking tool.
Tasks Denoising, Image Denoising
Published 2019-10-18
URL https://arxiv.org/abs/1910.08328v1
PDF https://arxiv.org/pdf/1910.08328v1.pdf
PWC https://paperswithcode.com/paper/opendenoising-an-extensible-benchmark-for
Repo https://github.com/opendenoising/opendenoising-benchmark
Framework pytorch
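The 60% figure is a Kendall's Tau rank correlation between method rankings under different noise types. A minimal sketch of the statistic (the two rankings below are invented, not the paper's data):

```python
from itertools import combinations

def kendall_tau(a, b):
    # a[i], b[i]: rank of method i under two noise types (no ties assumed)
    pairs = list(combinations(range(len(a)), 2))
    concordant = sum(1 for i, j in pairs
                     if (a[i] - a[j]) * (b[i] - b[j]) > 0)
    discordant = len(pairs) - concordant
    return (concordant - discordant) / len(pairs)

rank_gaussian = [1, 2, 3, 4, 5]   # ranking under synthetic Gaussian noise
rank_real     = [2, 1, 3, 5, 4]   # ranking under a real-world noise
tau = kendall_tau(rank_gaussian, rank_real)
assert tau == 0.6                 # only moderate agreement between the rankings
```

A tau of 0.6 means 8 of the 10 method pairs keep their relative order across noise types, which is exactly the kind of instability that motivates testing on the intended noise rather than trusting Gaussian-noise benchmarks.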

The Architectural Implications of Facebook’s DNN-based Personalized Recommendation

Title The Architectural Implications of Facebook’s DNN-based Personalized Recommendation
Authors Udit Gupta, Carole-Jean Wu, Xiaodong Wang, Maxim Naumov, Brandon Reagen, David Brooks, Bradford Cottel, Kim Hazelwood, Bill Jia, Hsien-Hsin S. Lee, Andrey Malevich, Dheevatsa Mudigere, Mikhail Smelyanskiy, Liang Xiong, Xuan Zhang
Abstract The widespread application of deep learning has changed the landscape of computation in the data center. In particular, personalized recommendation for content ranking is now largely accomplished leveraging deep neural networks. However, despite the importance of these models and the amount of compute cycles they consume, relatively little research attention has been devoted to systems for recommendation. To facilitate research and to advance the understanding of these workloads, this paper presents a set of real-world, production-scale DNNs for personalized recommendation coupled with relevant performance metrics for evaluation. In addition to releasing a set of open-source workloads, we conduct in-depth analysis that underpins future system design and optimization for at-scale recommendation: Inference latency varies by 60% across three Intel server generations, batching and co-location of inferences can drastically improve latency-bounded throughput, and the diverse composition of recommendation models leads to different optimization strategies.
Tasks
Published 2019-06-06
URL https://arxiv.org/abs/1906.03109v4
PDF https://arxiv.org/pdf/1906.03109v4.pdf
PWC https://paperswithcode.com/paper/the-architectural-implications-of-facebooks
Repo https://github.com/facebookresearch/dlrm
Framework pytorch
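Why batching "drastically improves latency-bounded throughput" can be seen from a toy cost model (the numbers below are assumptions for illustration, not Facebook's measurements): per-batch fixed overhead is amortized across the batch while the latency target is still met.

```python
# Toy latency model: fixed per-batch overhead plus a per-item cost.
fixed_ms, per_item_ms, latency_budget_ms = 2.0, 0.05, 10.0

def latency(batch):
    return fixed_ms + per_item_ms * batch

def throughput(batch):          # items served per millisecond
    return batch / latency(batch)

# largest batch that still meets the latency budget
best = max(b for b in range(1, 1000) if latency(b) <= latency_budget_ms)
assert best == 160
assert throughput(best) > throughput(1)   # far higher throughput at the same SLA
```

Under these assumed numbers, batch 160 serves 16 items/ms versus roughly 0.5 items/ms for batch 1, at the same 10 ms latency bound.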

Learning Representations for Predicting Future Activities

Title Learning Representations for Predicting Future Activities
Authors Mohammadreza Zolfaghari, Özgün Çiçek, Syed Mohsin Ali, Farzaneh Mahdisoltani, Can Zhang, Thomas Brox
Abstract Foreseeing the future is one of the key factors of intelligence. It involves understanding of the past and current environment as well as decent experience of its possible dynamics. In this work, we address future prediction at the abstract level of activities. We propose a network module for learning embeddings of the environment’s dynamics in a self-supervised way. To take the ambiguities and high variances in the future activities into account, we use a multi-hypotheses scheme that can represent multiple futures. We demonstrate the approach by classifying future activities on the Epic-Kitchens and Breakfast datasets. Moreover, we generate captions that describe the future activities.
Tasks Future prediction
Published 2019-05-09
URL https://arxiv.org/abs/1905.03578v1
PDF https://arxiv.org/pdf/1905.03578v1.pdf
PWC https://paperswithcode.com/paper/190503578
Repo https://github.com/lmb-freiburg/PreFAct
Framework none

Feature Map Transform Coding for Energy-Efficient CNN Inference

Title Feature Map Transform Coding for Energy-Efficient CNN Inference
Authors Brian Chmiel, Chaim Baskin, Ron Banner, Evgenii Zheltonozhskii, Yevgeny Yermolin, Alex Karbachevsky, Alex M. Bronstein, Avi Mendelson
Abstract Convolutional neural networks (CNNs) achieve state-of-the-art accuracy in a variety of tasks in computer vision and beyond. One of the major obstacles hindering the ubiquitous use of CNNs for inference on low-power edge devices is their high computational complexity and memory bandwidth requirements. The latter often dominates the energy footprint on modern hardware. In this paper, we introduce a lossy transform coding approach, inspired by image and video compression, designed to reduce the memory bandwidth due to the storage of intermediate activation calculation results. Our method does not require fine-tuning the network weights and halves the data transfer volumes to the main memory by compressing feature maps, which are highly correlated, with variable length coding. Our method outperforms previous approaches in terms of the number of bits per value with minor accuracy degradation on ResNet-34 and MobileNetV2. We analyze the performance of our approach on a variety of CNN architectures and demonstrate that FPGA implementation of ResNet-18 with our approach results in a reduction of around 40% in the memory energy footprint, compared to a quantized network, with negligible impact on accuracy. When allowing accuracy degradation of up to 2%, a reduction of 60% is achieved. A reference implementation is available at https://github.com/CompressTeam/TransformCodingInference
Tasks Video Compression
Published 2019-05-26
URL https://arxiv.org/abs/1905.10830v3
PDF https://arxiv.org/pdf/1905.10830v3.pdf
PWC https://paperswithcode.com/paper/feature-map-transform-coding-for-energy
Repo https://github.com/CompressTeam/TransformCodingInference
Framework pytorch
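The core idea, quantize activations and then spend a variable number of bits per value, can be sketched with the empirical entropy of a quantized feature map, which lower-bounds the bits per value any variable-length code needs. The activation statistics and quantization step below are synthetic assumptions, not the paper's:

```python
import math, random
from collections import Counter

random.seed(0)
# synthetic post-ReLU activations: mostly zero, occasionally large (highly peaked),
# mimicking the correlated, compressible feature maps the paper exploits
acts = [0.0 if random.random() < 0.8 else random.expovariate(1.0)
        for _ in range(10000)]

step = 0.5                              # assumed uniform quantization step
q = [round(a / step) for a in acts]     # quantized symbols

counts = Counter(q)
n = len(q)
entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
assert entropy < 8.0   # far fewer bits/value than an 8-bit fixed-width code
```

Because the symbol distribution is dominated by zeros, the entropy lands well under 2 bits per value here, which is the headroom a variable-length code converts into reduced memory traffic.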

Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on n-Spheres

Title Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on n-Spheres
Authors Shuai Liao, Efstratios Gavves, Cees G. M. Snoek
Abstract Many computer vision challenges require continuous outputs, but tend to be solved by discrete classification. The reason is classification’s natural containment within a probability $n$-simplex, as defined by the popular softmax activation function. Regular regression lacks such a closed geometry, leading to unstable training and convergence to suboptimal local minima. Starting from this insight we revisit regression in convolutional neural networks. We observe many continuous output problems in computer vision are naturally contained in closed geometrical manifolds, like the Euler angles in viewpoint estimation or the normals in surface normal estimation. A natural framework for posing such continuous output problems are $n$-spheres, which are naturally closed geometric manifolds defined in the $\mathbb{R}^{(n+1)}$ space. By introducing a spherical exponential mapping on $n$-spheres at the regression output, we obtain well-behaved gradients, leading to stable training. We show how our spherical regression can be utilized for several computer vision challenges, specifically viewpoint estimation, surface normal estimation and 3D rotation estimation. For all these problems our experiments demonstrate the benefit of spherical regression. All paper resources are available at https://github.com/leoshine/Spherical_Regression.
Tasks Viewpoint Estimation
Published 2019-04-10
URL http://arxiv.org/abs/1904.05404v1
PDF http://arxiv.org/pdf/1904.05404v1.pdf
PWC https://paperswithcode.com/paper/spherical-regression-learning-viewpoints
Repo https://github.com/leoshine/Spherical_Regression
Framework pytorch
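One reading of the spherical exponential mapping, element-wise exponentiation followed by L2 normalization, constrains raw network outputs to the unit n-sphere, giving the closed output geometry the abstract argues for. This interpretation is an assumption based on the abstract, not a verified transcription of the paper's formula:

```python
import math

def spherical_exp(o):
    # map raw outputs o in R^(n+1) onto the unit n-sphere
    e = [math.exp(v) for v in o]
    norm = math.sqrt(sum(x * x for x in e))
    return [x / norm for x in e]

p = spherical_exp([0.3, -1.2, 2.0, 0.0])
# the output lies exactly on the unit sphere, whatever the raw values were
assert abs(sum(x * x for x in p) - 1.0) < 1e-9
```

Note that this mapping only produces non-negative coordinates; handling the signs of the target (e.g. of a quaternion) would need a separate mechanism, which this sketch does not cover.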

Automating the search for a patent’s prior art with a full text similarity search

Title Automating the search for a patent’s prior art with a full text similarity search
Authors Lea Helmers, Franziska Horn, Franziska Biegler, Tim Oppermann, Klaus-Robert Müller
Abstract More than ever, technical inventions are the symbol of our society’s advance. Patents guarantee their creators protection against infringement. For an invention being patentable, its novelty and inventiveness have to be assessed. Therefore, a search for published work that describes similar inventions to a given patent application needs to be performed. Currently, this so-called search for prior art is executed with semi-automatically composed keyword queries, which is not only time consuming, but also prone to errors. In particular, errors may systematically arise by the fact that different keywords for the same technical concepts may exist across disciplines. In this paper, a novel approach is proposed, where the full text of a given patent application is compared to existing patents using machine learning and natural language processing techniques to automatically detect inventions that are similar to the one described in the submitted document. Various state-of-the-art approaches for feature extraction and document comparison are evaluated. In addition to that, the quality of the current search process is assessed based on ratings of a domain expert. The evaluation results show that our automated approach, besides accelerating the search process, also improves the search results for prior art with respect to their quality.
Tasks
Published 2019-01-10
URL http://arxiv.org/abs/1901.03136v2
PDF http://arxiv.org/pdf/1901.03136v2.pdf
PWC https://paperswithcode.com/paper/automating-the-search-for-a-patents-prior-art
Repo https://github.com/helmersl/patent_similarity_search
Framework tf
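A minimal sketch of the full-text comparison idea, using plain TF-IDF with cosine similarity (one standard feature-extraction baseline of the kind the paper evaluates; the three toy documents are invented):

```python
import math
from collections import Counter

docs = {
    "application": "a neural network for image denoising with residual layers",
    "patent_a":    "method for image denoising using a residual neural network",
    "patent_b":    "apparatus for brewing coffee with a thermal carafe",
}
tokenized = {k: v.split() for k, v in docs.items()}
vocab = sorted({w for toks in tokenized.values() for w in toks})
# inverse document frequency: rare terms weigh more
idf = {w: math.log(len(docs) / sum(w in toks for toks in tokenized.values()))
       for w in vocab}

def tfidf(tokens):
    tf = Counter(tokens)
    return [tf[w] * idf[w] for w in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

q = tfidf(tokenized["application"])
scores = {k: cosine(q, tfidf(toks))
          for k, toks in tokenized.items() if k != "application"}
assert scores["patent_a"] > scores["patent_b"]   # topically similar patent ranks first
```

Ranking all existing patents by this score replaces the hand-composed keyword query, which is the shift the paper argues reduces both effort and cross-discipline vocabulary errors.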

Survey on Publicly Available Sinhala Natural Language Processing Tools and Research

Title Survey on Publicly Available Sinhala Natural Language Processing Tools and Research
Authors Nisansa de Silva
Abstract Sinhala is the native language of the Sinhalese people who make up the largest ethnic group of Sri Lanka. The language belongs to the globe-spanning language tree, Indo-European. However, due to poverty in both linguistic and economic capital, Sinhala, in the perspective of Natural Language Processing tools and research, remains a resource-poor language which has neither the economic drive its cousin English has nor the sheer push of the law of numbers a language such as Chinese has. A number of research groups from Sri Lanka have noticed this dearth and the resultant dire need for proper tools and research for Sinhala natural language processing. However, due to various reasons, these attempts seem to lack coordination and awareness of each other. The objective of this paper is to fill that gap with a comprehensive literature survey of the publicly available Sinhala natural language tools and research, so that the researchers working in this field can better utilize contributions of their peers. As such, we shall upload this paper to arXiv and update it periodically to reflect the advances made in the field.
Tasks
Published 2019-06-05
URL https://arxiv.org/abs/1906.02358v5
PDF https://arxiv.org/pdf/1906.02358v5.pdf
PWC https://paperswithcode.com/paper/survey-on-publicly-available-sinhala-natural
Repo https://github.com/lknlp/lknlp.github.io
Framework none

Heterogeneous Graph Learning for Visual Commonsense Reasoning

Title Heterogeneous Graph Learning for Visual Commonsense Reasoning
Authors Weijiang Yu, Jingwen Zhou, Weihao Yu, Xiaodan Liang, Nong Xiao
Abstract Visual commonsense reasoning task aims at leading the research field into solving cognition-level reasoning with the ability of predicting correct answers and meanwhile providing convincing reasoning paths, resulting in three sub-tasks, i.e., Q->A, QA->R and Q->AR. It poses great challenges over the proper semantic alignment between vision and linguistic domains and knowledge reasoning to generate persuasive reasoning paths. Existing works either resort to a powerful end-to-end network that cannot produce interpretable reasoning paths or solely explore intra-relationship of visual objects (homogeneous graph) while ignoring the cross-domain semantic alignment among visual concepts and linguistic words. In this paper, we propose a new Heterogeneous Graph Learning (HGL) framework for seamlessly integrating the intra-graph and inter-graph reasoning in order to bridge the vision and language domains. Our HGL consists of a primal vision-to-answer heterogeneous graph (VAHG) module and a dual question-to-answer heterogeneous graph (QAHG) module to interactively refine reasoning paths for semantic agreement. Moreover, our HGL integrates a contextual voting module to exploit a long-range visual context for better global reasoning. Experiments on the large-scale Visual Commonsense Reasoning benchmark demonstrate the superior performance of our proposed modules on three tasks (improving 5% accuracy on Q->A, 3.5% on QA->R, 5.8% on Q->AR).
Tasks Visual Commonsense Reasoning
Published 2019-10-25
URL https://arxiv.org/abs/1910.11475v1
PDF https://arxiv.org/pdf/1910.11475v1.pdf
PWC https://paperswithcode.com/paper/heterogeneous-graph-learning-for-visual
Repo https://github.com/yuweijiang/HGL-pytorch
Framework pytorch

Fast transcription of speech in low-resource languages

Title Fast transcription of speech in low-resource languages
Authors Mark Hasegawa-Johnson, Camille Goudeseune, Gina-Anne Levow
Abstract We present software that, in only a few hours, transcribes forty hours of recorded speech in a surprise language, using only a few tens of megabytes of noisy text in that language, and a zero-resource grapheme to phoneme (G2P) table. A pretrained acoustic model maps acoustic features to phonemes; a reversed G2P maps these to graphemes; then a language model maps these to a most-likely grapheme sequence, i.e., a transcription. This software has worked successfully with corpora in Arabic, Assam, Kinyarwanda, Russian, Sinhalese, Swahili, Tagalog, and Tamil.
Tasks Language Modelling
Published 2019-09-16
URL https://arxiv.org/abs/1909.07285v1
PDF https://arxiv.org/pdf/1909.07285v1.pdf
PWC https://paperswithcode.com/paper/fast-transcription-of-speech-in-low-resource
Repo https://github.com/uiuc-sst/asr24
Framework none
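The reversed-G2P step in the pipeline above amounts to inverting a grapheme-to-phoneme table so that phoneme strings from the acoustic model can be mapped back to graphemes. A hedged sketch with an invented toy table (real G2P tables are many-to-many and need disambiguation this sketch omits):

```python
g2p = {"ch": "CH", "a": "AA", "t": "T"}      # grapheme -> phoneme (toy table)
p2g = {ph: gr for gr, ph in g2p.items()}     # reversed G2P: phoneme -> grapheme

def phonemes_to_graphemes(phonemes):
    # acoustic model output (phoneme sequence) -> candidate transcription
    return "".join(p2g[p] for p in phonemes)

# a language model would then rescore competing candidate transcriptions
assert phonemes_to_graphemes(["CH", "AA", "T"]) == "chat"
```

In the actual system the language model resolves ambiguity among the candidate grapheme sequences, selecting the most likely transcription.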

Learning to Copy for Automatic Post-Editing

Title Learning to Copy for Automatic Post-Editing
Authors Xuancheng Huang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun
Abstract Automatic post-editing (APE), which aims to correct errors in the output of machine translation systems in a post-processing step, is an important task in natural language processing. While recent work has achieved considerable performance gains by using neural networks, how to model the copying mechanism for APE remains a challenge. In this work, we propose a new method for modeling copying for APE. To better identify translation errors, our method learns the representations of source sentences and system outputs in an interactive way. These representations are used to explicitly indicate which words in the system outputs should be copied, which is useful to help CopyNet (Gu et al., 2016) better generate post-edited translations. Experiments on the datasets of the WMT 2016-2017 APE shared tasks show that our approach outperforms all best published results.
Tasks Automatic Post-Editing, Machine Translation
Published 2019-11-09
URL https://arxiv.org/abs/1911.03627v1
PDF https://arxiv.org/pdf/1911.03627v1.pdf
PWC https://paperswithcode.com/paper/learning-to-copy-for-automatic-post-editing-1
Repo https://github.com/THUNLP-MT/THUMT
Framework tf
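The CopyNet-style output distribution the method builds on can be sketched with toy numbers (these are not the paper's model weights): the final probability of a word mixes a generation distribution over the vocabulary with a copy distribution over the machine translation output, weighted by a gate.

```python
vocab = ["the", "cat", "sat", "dog"]
p_vocab = [0.5, 0.2, 0.2, 0.1]     # generation distribution (softmax output)
mt_output = ["the", "dog"]         # system-output words eligible for copying
copy_attn = [0.3, 0.7]             # attention over the MT output
p_gen = 0.6                        # gate: probability of generating vs. copying

p_final = {w: p_gen * p for w, p in zip(vocab, p_vocab)}
for w, a in zip(mt_output, copy_attn):
    p_final[w] += (1 - p_gen) * a  # add copy probability mass

assert abs(sum(p_final.values()) - 1.0) < 1e-12
assert p_final["dog"] > p_final["cat"]   # copying boosts words from the MT output
```

The paper's contribution sits upstream of this mixture: the interactive source/output representations decide which words deserve the copy mass in the first place.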

Self-paced Ensemble for Highly Imbalanced Massive Data Classification

Title Self-paced Ensemble for Highly Imbalanced Massive Data Classification
Authors Zhining Liu, Wei Cao, Zhifeng Gao, Jiang Bian, Hechang Chen, Yi Chang, Tie-Yan Liu
Abstract Many real-world applications reveal difficulties in learning classifiers from imbalanced data. The rising big data era has been witnessing more classification tasks with large-scale but extremely imbalanced and low-quality datasets. Most existing learning methods suffer from poor performance or low computation efficiency under such a scenario. To tackle this problem, we conduct deep investigations into the nature of class imbalance, which reveals that not only the disproportion between classes, but also other difficulties embedded in the nature of data, especially noises and class overlapping, prevent us from learning effective classifiers. Taking those factors into consideration, we propose a novel framework for imbalance classification that aims to generate a strong ensemble by self-paced harmonizing of data hardness via under-sampling. Extensive experiments have shown that this new framework, while being very computationally efficient, can lead to robust performance even under highly overlapping classes and extremely skewed distributions. Note that our methods can be easily adapted to most existing learning methods (e.g., C4.5, SVM, GBDT and Neural Network) to boost their performance on imbalanced data.
Tasks
Published 2019-09-08
URL https://arxiv.org/abs/1909.03500v2
PDF https://arxiv.org/pdf/1909.03500v2.pdf
PWC https://paperswithcode.com/paper/training-effective-ensemble-on-imbalanced
Repo https://github.com/ZhiningLiu1998/self-paced-ensemble
Framework none
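The "self-paced harmonizing of data hardness via under-sampling" can be sketched as binning majority-class examples by hardness and sampling evenly across bins, so neither the easy nor the hard regions dominate the subset. This is a simplified reading of the abstract (the real method reweights bins over iterations as the ensemble grows), with invented data:

```python
import random

random.seed(0)
# (hardness, example_id) for 1000 majority-class examples; hardness in [0, 1),
# e.g. the current ensemble's classification error on each example
majority = [(random.random(), i) for i in range(1000)]
n_bins, per_bin = 5, 20

bins = [[] for _ in range(n_bins)]
for h, i in majority:
    bins[min(int(h * n_bins), n_bins - 1)].append(i)

# harmonized under-sampling: the same budget from every hardness bin
subset = [i for b in bins for i in random.sample(b, min(per_bin, len(b)))]

assert len(subset) <= n_bins * per_bin
assert len(set(subset)) == len(subset)   # no duplicates across disjoint bins
```

Each base classifier in the ensemble is then trained on such a balanced, hardness-harmonized subset plus the minority class, which keeps training cheap while still exposing the model to the hard, overlapping regions.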

Learning Invariants through Soft Unification

Title Learning Invariants through Soft Unification
Authors Nuri Cingillioglu, Alessandra Russo
Abstract Human reasoning involves recognising common underlying principles across many examples by utilising variables. The by-products of such reasoning are invariants that capture patterns across examples such as “if someone went somewhere then they are there” without mentioning specific people or places. Humans learn what variables are and how to use them at a young age, and the question this paper addresses is whether machines can also learn and use variables solely from examples without requiring human pre-engineering. We propose Unification Networks that incorporate soft unification into neural networks to learn variables and by doing so lift examples into invariants that can then be used to solve a given task. We evaluate our approach on four datasets to demonstrate that learning invariants captures patterns in the data and can improve performance over baselines.
Tasks
Published 2019-09-16
URL https://arxiv.org/abs/1909.07328v1
PDF https://arxiv.org/pdf/1909.07328v1.pdf
PWC https://paperswithcode.com/paper/learning-invariants-through-soft-unification
Repo https://github.com/nuric/softuni
Framework none