Paper Group ANR 997
On a convergence property of a geometrical algorithm for statistical manifolds. Theme Aware Aesthetic Distribution Prediction with Full Resolution Photos. SRGAN: Training Dataset Matters. Ship Instance Segmentation From Remote Sensing Images Using Sequence Local Context Module. Implicit Pairs for Boosting Unpaired Image-to-Image Translation. PHYRE: A New Benchmark for Physical Reasoning. Rethinking System Health Management. How to pick the domain randomization parameters for sim-to-real transfer of reinforcement learning policies? System Demo for Transfer Learning across Vision and Text using Domain Specific CNN Accelerator for On-Device NLP Applications. Machine Learning based Simulation Optimisation for Trailer Management. Semi-Decentralized Coordinated Online Learning for Continuous Games with Coupled Constraints via Augmented Lagrangian. Multiple-Identity Image Attacks Against Face-based Identity Verification. Bringing Giant Neural Networks Down to Earth with Unlabeled Data. Improving Bidirectional Decoding with Dynamic Target Semantics in Neural Machine Translation. Three-dimensional Generative Adversarial Nets for Unsupervised Metal Artifact Reduction.
On a convergence property of a geometrical algorithm for statistical manifolds
Title | On a convergence property of a geometrical algorithm for statistical manifolds |
Authors | Shotaro Akaho, Hideitsu Hino, Noboru Murata |
Abstract | In this paper, we examine a geometrical projection algorithm for statistical inference. The algorithm is based on the Pythagorean relation, and it is derivative-free as well as representation-free, which is useful in nonparametric cases. We derive a bound on the learning rate that guarantees local convergence. For the special cases of m-mixture and e-mixture estimation problems, we calculate specific forms of the bound that can be used easily in practice. |
Tasks | |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12644v1 |
https://arxiv.org/pdf/1909.12644v1.pdf | |
PWC | https://paperswithcode.com/paper/on-a-convergence-property-of-a-geometrical |
Repo | |
Framework | |
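The learning-rate bound from the abstract above is not reproduced here, but the flavor of such a condition can be sketched with a generic damped fixed-point iteration (my own illustration under simplifying assumptions, not the authors' result). Write the projection update as

$$\theta^{t+1} = (1-\eta)\,\theta^{t} + \eta\, T(\theta^{t}),$$

where $T$ is the (derivative-free) projection map with fixed point $\theta^{\ast} = T(\theta^{\ast})$. Linearizing around $\theta^{\ast}$, the error evolves as $e^{t+1} \approx \big((1-\eta)I + \eta\, J_T(\theta^{\ast})\big)\, e^{t}$, so local convergence is guaranteed whenever

$$\rho\big((1-\eta)I + \eta\, J_T(\theta^{\ast})\big) < 1,$$

which, for real eigenvalues $\mu_i < 1$ of $J_T(\theta^{\ast})$, amounts to a learning-rate bound of the form $0 < \eta < \min_i 2/(1-\mu_i)$. The paper's m-mixture and e-mixture bounds play an analogous role but in their specific geometric setting.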
Theme Aware Aesthetic Distribution Prediction with Full Resolution Photos
Title | Theme Aware Aesthetic Distribution Prediction with Full Resolution Photos |
Authors | Gengyun Jia, Peipei Li, Ran He |
Abstract | Aesthetic quality assessment (AQA) of photos is a challenging task due to the subjective and diverse factors in the human assessment process. Nowadays, it is common to tackle AQA with deep neural networks (DNNs) because of their superior performance in modeling such complex relations. However, traditional DNNs require fixed-size inputs, and resizing various inputs to a uniform size may significantly change their aesthetic features. Such transformations lead to mismatches between photos and their aesthetic evaluations. Existing methods usually adopt one of two solutions. Some directly crop fixed-size patches from the inputs; others capture the aesthetic features from pre-defined multi-size inputs by inserting adaptive pooling or removing fully connected layers. However, the former destroys the global structure and layout information, which are crucial in most situations, while the latter has to resize images into several pre-defined sizes, which is not enough to reflect the diversity of image sizes, and the aesthetic features are still damaged. To address this issue, we propose a simple and effective method that handles batch inputs of arbitrary size and achieves AQA on full-resolution images by combining image padding with ROI (region of interest) pooling. Padding keeps the inputs the same size, while ROI pooling cuts off the forward propagation of features in the padded regions, thus eliminating the side effects of padding. Besides, we observe that the same image may receive different scores under different themes, which we call the theme criterion bias. Previous works focus only on the aesthetic features of the images and ignore the criterion bias brought by their themes. In this paper, we introduce the theme information and propose a theme-aware model. Extensive experiments demonstrate the effectiveness of the proposed method over the state of the art. |
Tasks | |
Published | 2019-08-04 |
URL | https://arxiv.org/abs/1908.01308v1 |
https://arxiv.org/pdf/1908.01308v1.pdf | |
PWC | https://paperswithcode.com/paper/theme-aware-aesthetic-distribution-prediction |
Repo | |
Framework | |
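A minimal sketch of the padding + ROI pooling idea from the abstract above: images of different sizes are zero-padded to a common size, and ROI pooling restricted to each image's true extent keeps the padded regions out of the pooled features. This is not the authors' code; the backbone, sizes, and names are illustrative only.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_pool

def pad_to_common_size(images):
    """Zero-pad a list of CxHxW tensors to the max H and W in the batch."""
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)
    batch, boxes = [], []
    for i, img in enumerate(images):
        c, h, w = img.shape
        canvas = img.new_zeros(c, max_h, max_w)
        canvas[:, :h, :w] = img
        batch.append(canvas)
        # ROI covering only the original (unpadded) region: (batch_idx, x1, y1, x2, y2).
        boxes.append(torch.tensor([i, 0.0, 0.0, float(w - 1), float(h - 1)]))
    return torch.stack(batch), torch.stack(boxes)

backbone = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU())

images = [torch.rand(3, 240, 320), torch.rand(3, 480, 300)]  # arbitrary-size inputs
x, rois = pad_to_common_size(images)
feats = backbone(x)                                   # stride-2 feature map
pooled = roi_pool(feats, rois, output_size=(7, 7),
                  spatial_scale=0.5)                  # map ROI coords to feature scale
print(pooled.shape)                                   # (2, 16, 7, 7)
```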
SRGAN: Training Dataset Matters
Title | SRGAN: Training Dataset Matters |
Authors | Nao Takano, Gita Alaghband |
Abstract | Generative Adversarial Networks (GANs) in supervised settings can generate photo-realistic output corresponding to low-definition input (SRGAN). Using the architecture presented in the original SRGAN paper [2], we explore how the choice of training dataset affects the outcome, using three different datasets to show that SRGAN fundamentally learns objects, with their shape, color, and texture, and redraws them in the output rather than merely attempting to sharpen edges. This is further underscored by our demonstration that, once the network has learned the images of the dataset, it can generate a photo-like image from a very blurry edged sketch that gives only a slight hint of what the original might look like. Given a set of inference images, a network trained on the same dataset produces a better outcome than one trained on an arbitrary set of images, and we report the significance of the difference numerically with the Fréchet Inception Distance score [22]. |
Tasks | |
Published | 2019-03-24 |
URL | http://arxiv.org/abs/1903.09922v1 |
http://arxiv.org/pdf/1903.09922v1.pdf | |
PWC | https://paperswithcode.com/paper/srgan-training-dataset-matters |
Repo | |
Framework | |
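A minimal sketch of the Fréchet Inception Distance (FID) used in the abstract above to compare networks trained on different datasets. It assumes you already have Inception activations for the two image sets; feature extraction is omitted and the random inputs below are stand-ins.

```python
import numpy as np
from scipy import linalg

def frechet_inception_distance(feats_a, feats_b):
    """FID between two sets of Inception features of shape (N, D)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_a @ cov_b, disp=False)
    covmean = covmean.real  # discard tiny imaginary parts from numerical error
    diff = mu_a - mu_b
    return diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean)

# Toy usage with random "features"; real use would pass Inception pool3 activations.
rng = np.random.default_rng(0)
print(frechet_inception_distance(rng.normal(size=(256, 64)),
                                 rng.normal(loc=0.5, size=(256, 64))))
```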
Ship Instance Segmentation From Remote Sensing Images Using Sequence Local Context Module
Title | Ship Instance Segmentation From Remote Sensing Images Using Sequence Local Context Module |
Authors | Yingchao Feng, Wenhui Diao, Zhonghan Chang, Menglong Yan, Xian Sun, Xin Gao |
Abstract | The performance of object instance segmentation in remote sensing images has been greatly improved through the introduction of many landmark frameworks based on convolutional neural networks. However, the issue of densely packed objects still affects the accuracy of such segmentation frameworks: objects of the same class are easily confused, most likely because of the close docking between objects. We believe context information is critical to addressing this issue, so we propose a novel framework called SLCMASK-Net, in which a sequence local context (SLC) module is introduced to avoid confusion between objects of the same class. The SLC module applies a sequence of dilated convolution blocks to progressively learn multi-scale context information in the mask branch. Besides, we add the SLC module at different locations in our framework and study the effect of different parameter settings. Comparative experiments are conducted on remote sensing images acquired by QuickBird with a resolution of $0.5m-1m$, and the results show that the proposed method achieves state-of-the-art performance. |
Tasks | Instance Segmentation, Semantic Segmentation |
Published | 2019-04-22 |
URL | http://arxiv.org/abs/1904.09823v1 |
http://arxiv.org/pdf/1904.09823v1.pdf | |
PWC | https://paperswithcode.com/paper/ship-instance-segmentation-from-remote |
Repo | |
Framework | |
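A hedged sketch of a "sequence of dilated convolution blocks" in the spirit of the SLC module described above. The channel counts, kernel sizes, and dilation rates are guesses for illustration; this is not the authors' module.

```python
import torch
import torch.nn as nn

class SequenceLocalContext(nn.Module):
    """Stack of dilated 3x3 conv blocks with a progressively growing receptive field."""
    def __init__(self, channels=256, dilations=(1, 2, 4)):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=d, dilation=d),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        ])

    def forward(self, x):
        # Each block sees a larger spatial context around every location.
        for block in self.blocks:
            x = block(x)
        return x

mask_feats = torch.rand(2, 256, 14, 14)          # e.g. per-RoI mask-branch features
print(SequenceLocalContext()(mask_feats).shape)  # torch.Size([2, 256, 14, 14])
```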
Implicit Pairs for Boosting Unpaired Image-to-Image Translation
Title | Implicit Pairs for Boosting Unpaired Image-to-Image Translation |
Authors | Yiftach Ginger, Dov Danon, Hadar Averbuch-Elor, Daniel Cohen-Or |
Abstract | In image-to-image translation, the goal is to learn a mapping from one image domain to another. In supervised approaches, the mapping is learned from paired samples. However, collecting large sets of image pairs is often either prohibitively expensive or not possible. As a result, in recent years more attention has been given to techniques that learn the mapping from unpaired sets. In our work, we show that injecting implicit pairs into unpaired sets strengthens the mapping between the two domains, improves the compatibility of their distributions, and boosts the performance of unsupervised techniques by over 14% across several measurements. The benefit of implicit pairs is further pronounced with the use of pseudo-pairs, i.e., paired samples that only approximate a real pair. We demonstrate the effect of the approximated implicit samples on image-to-image translation problems where such pseudo-pairs may be synthesized in one direction but not in the other. We further show that pseudo-pairs are significantly more effective as implicit pairs in an unpaired setting than when used directly and explicitly in a paired setting. |
Tasks | Image-to-Image Translation |
Published | 2019-04-15 |
URL | https://arxiv.org/abs/1904.06913v2 |
https://arxiv.org/pdf/1904.06913v2.pdf | |
PWC | https://paperswithcode.com/paper/implicit-pairs-for-boosting-unpaired-image-to |
Repo | |
Framework | |
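A rough sketch of how pseudo-pairs might be injected into an otherwise unpaired objective: the usual adversarial term on unpaired data plus a direct reconstruction term on synthesized pseudo-pairs. The generators, critic, and loss weight are placeholders, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def translation_loss(G_ab, D_b, real_a, pseudo_a, pseudo_b, lam_pseudo=10.0):
    # Adversarial term on unpaired A-domain images: fool the B-domain critic.
    fake_b = G_ab(real_a)
    critic_scores = D_b(fake_b)
    adv = F.binary_cross_entropy_with_logits(critic_scores,
                                             torch.ones_like(critic_scores))
    # Supervised-style term on pseudo-pairs (approximate pairs injected into training).
    pseudo = F.l1_loss(G_ab(pseudo_a), pseudo_b)
    return adv + lam_pseudo * pseudo
```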
PHYRE: A New Benchmark for Physical Reasoning
Title | PHYRE: A New Benchmark for Physical Reasoning |
Authors | Anton Bakhtin, Laurens van der Maaten, Justin Johnson, Laura Gustafson, Ross Girshick |
Abstract | Understanding and reasoning about physics is an important ability of intelligent agents. We develop the PHYRE benchmark for physical reasoning that contains a set of simple classical mechanics puzzles in a 2D physical environment. The benchmark is designed to encourage the development of learning algorithms that are sample-efficient and generalize well across puzzles. We test several modern learning algorithms on PHYRE and find that these algorithms fall short in solving the puzzles efficiently. We expect that PHYRE will encourage the development of novel sample-efficient agents that learn efficient but useful models of physics. For code and to play PHYRE for yourself, please visit https://player.phyre.ai. |
Tasks | |
Published | 2019-08-15 |
URL | https://arxiv.org/abs/1908.05656v1 |
https://arxiv.org/pdf/1908.05656v1.pdf | |
PWC | https://paperswithcode.com/paper/phyre-a-new-benchmark-for-physical-reasoning |
Repo | |
Framework | |
Rethinking System Health Management
Title | Rethinking System Health Management |
Authors | Edward Balaban, Stephen B. Johnson, Mykel J. Kochenderfer |
Abstract | Health management of complex dynamic systems has traditionally evolved separately from automated control, planning, and scheduling (generally referred to in the paper as decision making). A goal of Integrated System Health Management has been to enable coordination between system health management and decision making, although successful practical implementations have remained limited. This paper proposes that, rather than being treated as connected, yet distinct entities, system health management and decision making should be unified in their formulations. Enabled by advances in modeling and computing, we argue that the unified approach will increase a system’s operational effectiveness and may also lead to a lower overall system complexity. We overview the prevalent system health management methodology and illustrate its limitations through numerical examples. We then describe the proposed unification approach and show how it accommodates the typical system health management concepts. |
Tasks | Decision Making |
Published | 2019-03-10 |
URL | http://arxiv.org/abs/1903.03948v1 |
http://arxiv.org/pdf/1903.03948v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-system-health-management |
Repo | |
Framework | |
How to pick the domain randomization parameters for sim-to-real transfer of reinforcement learning policies?
Title | How to pick the domain randomization parameters for sim-to-real transfer of reinforcement learning policies? |
Authors | Quan Vuong, Sharad Vikram, Hao Su, Sicun Gao, Henrik I. Christensen |
Abstract | Recently, reinforcement learning (RL) algorithms have demonstrated remarkable success in learning complicated behaviors from minimally processed input. However, most of this success is limited to simulation. While there are promising successes in applying RL algorithms directly on real systems, their performance on more complex systems remains bottlenecked by the relative data inefficiency of RL algorithms. Domain randomization is a promising direction of research that has demonstrated impressive results using RL algorithms to control real robots. At a high level, domain randomization works by training a policy on a distribution of environmental conditions in simulation. If the environments are diverse enough, then the policy trained on this distribution will plausibly generalize to the real world. A human-specified design choice in domain randomization is the form and parameters of the distribution of simulated environments. It is unclear how best to pick the form and parameters of this distribution, and prior work uses hand-tuned distributions. This extended abstract demonstrates that the choice of the distribution plays a major role in the performance of the trained policies in the real world and that the parameters of this distribution can be optimized to maximize that performance. |
Tasks | |
Published | 2019-03-28 |
URL | http://arxiv.org/abs/1903.11774v1 |
http://arxiv.org/pdf/1903.11774v1.pdf | |
PWC | https://paperswithcode.com/paper/how-to-pick-the-domain-randomization |
Repo | |
Framework | |
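A minimal illustration of the design choice discussed above: the randomization distribution is itself parameterized (here, uniform ranges over a few physical properties), so those range parameters can be tuned or optimized rather than hand-set. The parameter names and ranges are made up for illustration.

```python
import random

def sample_env_params(dist_params):
    """Draw one simulated environment from the randomization distribution."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in dist_params.items()}

dist_params = {          # the "form and parameters of the distribution"
    "friction":   (0.5, 1.5),
    "link_mass":  (0.8, 1.2),
    "motor_gain": (0.9, 1.1),
}

for episode in range(3):
    env_params = sample_env_params(dist_params)
    print(episode, env_params)   # train the policy in a simulator built from env_params
```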
System Demo for Transfer Learning across Vision and Text using Domain Specific CNN Accelerator for On-Device NLP Applications
Title | System Demo for Transfer Learning across Vision and Text using Domain Specific CNN Accelerator for On-Device NLP Applications |
Authors | Baohua Sun, Lin Yang, Michael Lin, Wenhan Zhang, Patrick Dong, Charles Young, Jason Dong |
Abstract | Power-efficient CNN Domain Specific Accelerator (CNN-DSA) chips are currently available for wide use in mobile devices and are mainly used in computer vision applications. However, recent work on the Super Characters method for text classification and sentiment analysis using two-dimensional CNN models has also achieved state-of-the-art results through transfer learning from vision to text. In this paper, we implement text classification and sentiment analysis applications on mobile devices using CNN-DSA chips. Compact network representations using one-bit and three-bit precision for coefficients and five-bit precision for activations are used in the CNN-DSA chip, with power consumption below 300 mW. For edge devices under memory and compute constraints, the network is further compressed by approximating the external fully connected (FC) layers within the CNN-DSA chip. At the workshop, we present two system demonstrations for NLP tasks. The first demo classifies an input English Wikipedia sentence into one of 14 ontologies. The second demo classifies a Chinese online-shopping review as positive or negative. |
Tasks | Sentiment Analysis, Text Classification, Transfer Learning |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01145v1 |
https://arxiv.org/pdf/1906.01145v1.pdf | |
PWC | https://paperswithcode.com/paper/system-demo-for-transfer-learning-across |
Repo | |
Framework | |
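A sketch of the kind of low-bit compact representation mentioned above (e.g. 3-bit coefficients, 5-bit activations), using plain symmetric uniform quantization. The CNN-DSA chip's actual quantization scheme is not described here; this is a generic stand-in.

```python
import numpy as np

def quantize_symmetric(x, num_bits):
    """Quantize to 2**num_bits - 1 symmetric levels and return the dequantized values."""
    levels = 2 ** (num_bits - 1) - 1          # e.g. 3 bits -> integer levels in [-3, 3]
    scale = np.max(np.abs(x)) / levels if np.max(np.abs(x)) > 0 else 1.0
    q = np.clip(np.round(x / scale), -levels, levels)
    return q * scale

weights = np.random.randn(64)
print(np.abs(weights - quantize_symmetric(weights, num_bits=3)).max())  # quantization error
```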
Machine Learning based Simulation Optimisation for Trailer Management
Title | Machine Learning based Simulation Optimisation for Trailer Management |
Authors | Dylan Rijnen, Jason Rhuggenaath, Paulo R. de O. da Costa, Yingqian Zhang |
Abstract | In many situations, simulation models are developed to handle complex real-world business optimisation problems. For example, a discrete-event simulation model is used to simulate the trailer management process in a large fast-moving consumer goods company. To address the problem of finding suitable inputs to this simulator for optimising the fleet configuration, we propose a simulation optimisation approach in this paper. The simulation optimisation model combines a metaheuristic search (genetic algorithm) with an approximation-model filter (feed-forward neural network) to optimise the parameter configuration of the simulation model. We introduce an ensure probability that overrules the rejection of potential solutions by the approximation model and demonstrate its effectiveness. In addition, we evaluate the impact of the parameters of the optimisation model on its effectiveness and show that parameters such as population size, filter threshold, and mutation probability can have a significant impact on the overall optimisation performance. Moreover, we compare the proposed method with a single global approximation-model approach and a random-based approach. The results show the effectiveness of our method in terms of computation time and solution quality. |
Tasks | |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07568v1 |
https://arxiv.org/pdf/1907.07568v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-based-simulation |
Repo | |
Framework | |
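A sketch of the "ensure probability" idea described above: a surrogate filter screens candidate configurations before running the expensive simulator, but with probability `ensure_prob` a rejected candidate is evaluated anyway, so the filter cannot permanently hide promising regions. All names and thresholds are illustrative only.

```python
import random

def should_simulate(candidate, surrogate_predict, threshold, ensure_prob=0.1):
    predicted_quality = surrogate_predict(candidate)
    if predicted_quality >= threshold:
        return True                       # surrogate thinks it is promising
    return random.random() < ensure_prob  # occasionally overrule the rejection

# Toy usage: a fake surrogate that scores candidates by their sum.
surrogate = lambda c: sum(c)
candidates = [[random.random() for _ in range(3)] for _ in range(10)]
to_run = [c for c in candidates if should_simulate(c, surrogate, threshold=1.8)]
print(f"{len(to_run)} of {len(candidates)} candidates sent to the simulator")
```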
Semi-Decentralized Coordinated Online Learning for Continuous Games with Coupled Constraints via Augmented Lagrangian
Title | Semi-Decentralized Coordinated Online Learning for Continuous Games with Coupled Constraints via Augmented Lagrangian |
Authors | Ezra Tampubolon, Holger Boche |
Abstract | We consider a class of concave continuous games in which the admissible strategy profile of each player is subject to affine coupling constraints. We propose a novel algorithm that drives the relevant population dynamics toward a Nash equilibrium. The algorithm is based on a mirror ascent method, which fits the framework of no-regret online learning, and on the augmented Lagrangian method. Its semi-decentralized character lies in the fact that each player's iterate requires only local information about how she contributes to the coupling constraints, together with the price vector broadcast by a central coordinator; no player needs to know the population action. Moreover, no specific control by the central coordinator is required. We give a condition on the step sizes and the degree of augmentation of the Lagrangian under which the proposed algorithm converges to a generalized Nash equilibrium. |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09276v1 |
https://arxiv.org/pdf/1910.09276v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-decentralized-coordinated-online |
Repo | |
Framework | |
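As a rough sketch (a generic price-based scheme, not the paper's exact algorithm), a mirror-ascent update for player $i$ under affine coupling constraints $\sum_i A_i x_i \le b$ with prices $\lambda$ broadcast by the coordinator could look like

$$
x_i^{t+1} = \arg\max_{x_i \in \mathcal{X}_i}\Big\{ \eta_t \big\langle v_i(x^t) - A_i^{\top}\lambda^t,\; x_i \big\rangle - D_{\psi_i}\!\big(x_i, x_i^t\big) \Big\},
\qquad
\lambda^{t+1} = \Big[\lambda^t + \beta_t \Big(\textstyle\sum_i A_i x_i^{t+1} - b\Big)\Big]_+,
$$

where $v_i$ is player $i$'s payoff gradient and $D_{\psi_i}$ a Bregman divergence. Each player only needs her own constraint contribution $A_i x_i$ and the broadcast price $\lambda^t$. The paper's augmented Lagrangian additionally penalizes constraint violation, and its convergence condition on the step sizes and the augmentation degree is the actual contribution, which is not reproduced here.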
Multiple-Identity Image Attacks Against Face-based Identity Verification
Title | Multiple-Identity Image Attacks Against Face-based Identity Verification |
Authors | Jerone T. A. Andrews, Thomas Tanay, Lewis D. Griffin |
Abstract | Facial verification systems are vulnerable to poisoning attacks that make use of multiple-identity images (MIIs)—face images stored in a database that resemble multiple persons, such that novel images of any of the constituent persons are verified as matching the identity of the MII. Research on this mode of attack has focused on defence by detection, with no explanation as to why the vulnerability exists. New quantitative results are presented that support an explanation in terms of the geometry of the representation spaces used by the verification systems. In the spherical geometry of those spaces, the angular distance distributions of matching and non-matching pairs of face representations are only modestly separated, approximately centred at 40-60 and 90 degrees, respectively. This is sufficient for open-set verification on normal data but provides an opportunity for MII attacks. Our analysis considers ideal MII algorithms, demonstrating that, if realisable, they would deliver faces roughly 45 degrees from their constituent faces, thus classed as matching them. We study the performance of three methods for MII generation—gallery search, image space morphing, and representation space inversion—and show that the latter two realise the ideal well enough to produce effective attacks, while the former could succeed but only with an implausibly large gallery to search. Gallery search and inversion MIIs depend on having access to a facial comparator for optimisation, but our results show that these attacks can still be effective when attacking disparate comparators; thus, securing a deployed comparator is an insufficient defence. |
Tasks | |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08507v1 |
https://arxiv.org/pdf/1906.08507v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-identity-image-attacks-against-face |
Repo | |
Framework | |
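A tiny numerical check of the geometric claim above: for two unit-norm face representations roughly 90 degrees apart, their normalized average lies about 45 degrees from each, inside a typical "matching" range. Purely illustrative; the vectors below are random, not face embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.normal(size=512); u /= np.linalg.norm(u)
v = rng.normal(size=512); v -= (v @ u) * u; v /= np.linalg.norm(v)   # orthogonal to u

morph = (u + v) / np.linalg.norm(u + v)      # an idealized multiple-identity vector
angle = lambda a, b: np.degrees(np.arccos(np.clip(a @ b, -1.0, 1.0)))
print(angle(u, v), angle(morph, u), angle(morph, v))   # ~90, ~45, ~45 degrees
```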
Bringing Giant Neural Networks Down to Earth with Unlabeled Data
Title | Bringing Giant Neural Networks Down to Earth with Unlabeled Data |
Authors | Yehui Tang, Shan You, Chang Xu, Boxin Shi, Chao Xu |
Abstract | Compressing giant neural networks has gained much attention because of their extensive applications on edge devices such as cellphones. During the compression process, one of the most important procedures is to retrain the pre-trained model using the original training dataset. However, for reasons of security, privacy, or commercial interest, in practice only a fraction of the training samples may be made available, which makes such retraining infeasible. To solve this issue, this paper proposes to resort to unlabeled data, which can be cheaper to acquire. Specifically, we exploit the unlabeled data to mimic the classification characteristics of the giant network, so that its original capacity can be preserved. Nevertheless, there exists a dataset bias between the labeled and unlabeled data, which may disturb training and degrade performance. We fix this bias with an adversarial loss that aligns the distributions of their low-level feature representations. We further provide a theoretical discussion of how the unlabeled data help compressed networks generalize better. Experimental results demonstrate that the unlabeled data can significantly improve the performance of the compressed networks. |
Tasks | |
Published | 2019-07-13 |
URL | https://arxiv.org/abs/1907.06065v2 |
https://arxiv.org/pdf/1907.06065v2.pdf | |
PWC | https://paperswithcode.com/paper/bringing-giant-neural-networks-down-to-earth |
Repo | |
Framework | |
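A sketch of distilling a large ("giant") network into a compressed one using only unlabeled images, in the spirit of the abstract above: the student mimics the teacher's soft predictions, so no ground-truth labels are needed. The adversarial feature-alignment term is omitted for brevity, and the models and temperature are placeholders.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student output distributions."""
    t = temperature
    return F.kl_div(F.log_softmax(student_logits / t, dim=1),
                    F.softmax(teacher_logits / t, dim=1),
                    reduction="batchmean") * (t * t)

# Usage on an unlabeled batch: only teacher and student logits are required.
student_logits = torch.randn(8, 100, requires_grad=True)
teacher_logits = torch.randn(8, 100)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```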
Improving Bidirectional Decoding with Dynamic Target Semantics in Neural Machine Translation
Title | Improving Bidirectional Decoding with Dynamic Target Semantics in Neural Machine Translation |
Authors | Yong Shan, Yang Feng, Jinchao Zhang, Fandong Meng, Wen Zhang |
Abstract | Generally, neural machine translation models generate target words in a left-to-right (L2R) manner and fail to exploit any future (right-side) semantic information, which often produces unbalanced translations. Recent works attempt to utilize a right-to-left (R2L) decoder in bidirectional decoding to alleviate this problem. In this paper, we propose a novel Dynamic Interaction Module (DIM) to dynamically exploit target semantics from the R2L translation for enhancing the L2R translation quality. Different from other bidirectional decoding approaches, DIM first extracts helpful target information through addressing and reading operations, then updates the target semantics to track the interaction history. Additionally, we introduce an agreement regularization term into the training objective to narrow the gap between L2R and R2L translations. Experimental results on NIST Chinese$\Rightarrow$English and WMT’16 English$\Rightarrow$Romanian translation tasks show that our system achieves significant improvements over baseline systems and reaches results comparable to the state-of-the-art Transformer model with far fewer parameters. |
Tasks | Machine Translation |
Published | 2019-11-05 |
URL | https://arxiv.org/abs/1911.01597v1 |
https://arxiv.org/pdf/1911.01597v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-bidirectional-decoding-with-dynamic |
Repo | |
Framework | |
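A hedged sketch of an agreement-style regularizer between L2R and R2L decoders: a symmetric KL term between their per-position output distributions, assuming the R2L distributions have already been re-aligned to L2R order. This is a generic stand-in, not the paper's DIM or its exact agreement term.

```python
import torch
import torch.nn.functional as F

def agreement_regularizer(l2r_logits, r2l_logits_aligned):
    """Symmetric KL between the two decoders' per-position distributions."""
    p = F.log_softmax(l2r_logits, dim=-1)
    q = F.log_softmax(r2l_logits_aligned, dim=-1)
    kl_pq = F.kl_div(q, p.exp(), reduction="batchmean")  # KL(P_l2r || Q_r2l)
    kl_qp = F.kl_div(p, q.exp(), reduction="batchmean")  # KL(Q_r2l || P_l2r)
    return 0.5 * (kl_pq + kl_qp)

l2r = torch.randn(2, 7, 32000)   # (batch, target length, vocabulary)
r2l = torch.randn(2, 7, 32000)
print(agreement_regularizer(l2r, r2l))
```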
Three-dimensional Generative Adversarial Nets for Unsupervised Metal Artifact Reduction
Title | Three-dimensional Generative Adversarial Nets for Unsupervised Metal Artifact Reduction |
Authors | Megumi Nakao, Keiho Imanishi, Nobuhiro Ueda, Yuichiro Imai, Tadaaki Kirita, Tetsuya Matsuda |
Abstract | The reduction of metal artifacts in computed tomography (CT) images, specifically for strong artifacts generated from multiple metal objects, is a challenging issue in medical imaging research. Although there have been some studies on supervised metal artifact reduction through the learning of synthesized artifacts, it is difficult for simulated artifacts to cover the complexity of the real physical phenomena that may be observed in X-ray propagation. In this paper, we introduce metal artifact reduction methods based on an unsupervised volume-to-volume translation learned from clinical CT images. We construct three-dimensional adversarial nets with a regularized loss function designed for metal artifacts from multiple dental fillings. The results of experiments using 915 CT volumes from real patients demonstrate that the proposed framework has an outstanding capacity to reduce strong artifacts and to recover underlying missing voxels, while preserving the anatomical features of soft tissues and tooth structures from the original images. |
Tasks | Computed Tomography (CT), Metal Artifact Reduction |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08105v2 |
https://arxiv.org/pdf/1911.08105v2.pdf | |
PWC | https://paperswithcode.com/paper/three-dimensional-generative-adversarial-nets |
Repo | |
Framework | |
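A minimal sketch of a volume-to-volume generator block built from 3D convolutions, in the spirit of the three-dimensional adversarial nets described above. The depth, channel counts, and normalization choices are illustrative, not the authors' architecture, and the discriminator and regularized loss are omitted.

```python
import torch
import torch.nn as nn

class Conv3dBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.InstanceNorm3d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

generator = nn.Sequential(Conv3dBlock(1, 16), Conv3dBlock(16, 16),
                          nn.Conv3d(16, 1, kernel_size=1))  # CT volume in, artifact-reduced volume out

ct_volume = torch.rand(1, 1, 32, 64, 64)     # (batch, channel, depth, height, width)
print(generator(ct_volume).shape)            # same spatial size as the input
```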