Paper Group ANR 183
Routing Driverless Transport Vehicles in Car Assembly with Answer Set Programming
Title | Routing Driverless Transport Vehicles in Car Assembly with Answer Set Programming |
Authors | Martin Gebser, Philipp Obermeier, Michel Ratsch-Heitmann, Mario Runge, Torsten Schaub |
Abstract | Automated storage and retrieval systems are principal components of modern production and warehouse facilities. In particular, automated guided vehicles nowadays substitute human-operated pallet trucks in transporting production materials between storage locations and assembly stations. While low-level control systems take care of navigating such driverless vehicles along programmed routes and avoid collisions even under unforeseen circumstances, in the common case of multiple vehicles sharing the same operation area, the problem remains how to set up routes such that a collection of transport tasks is accomplished most effectively. We address this prevalent problem in the context of car assembly at Mercedes-Benz Ludwigsfelde GmbH, a large-scale producer of commercial vehicles, where routes for automated guided vehicles used in the production process have traditionally been hand-coded by human engineers. Such ad-hoc methods may suffice as long as a running production process remains in place, while any change in the factory layout or production targets necessitates tedious manual reconfiguration, not to mention the missing portability between different production plants. Unlike this, we propose a declarative approach based on Answer Set Programming to optimize the routes taken by automated guided vehicles for accomplishing transport tasks. The advantages include a transparent and executable problem formalization, provable optimality of routes relative to objective criteria, as well as elaboration tolerance towards particular factory layouts and production targets. Moreover, we demonstrate that our approach is efficient enough to deal with the transport tasks evolving in realistic production processes at the car factory of Mercedes-Benz Ludwigsfelde GmbH. |
Tasks | |
Published | 2018-04-27 |
URL | http://arxiv.org/abs/1804.10437v1 |
http://arxiv.org/pdf/1804.10437v1.pdf | |
PWC | https://paperswithcode.com/paper/routing-driverless-transport-vehicles-in-car |
Repo | |
Framework | |
Orthogonal Deep Features Decomposition for Age-Invariant Face Recognition
Title | Orthogonal Deep Features Decomposition for Age-Invariant Face Recognition |
Authors | Yitong Wang, Dihong Gong, Zheng Zhou, Xing Ji, Hao Wang, Zhifeng Li, Wei Liu, Tong Zhang |
Abstract | As facial appearance is subject to significant intra-class variations caused by the aging process over time, age-invariant face recognition (AIFR) remains a major challenge in the face recognition community. To reduce the intra-class discrepancy caused by aging, in this paper we propose a novel approach (namely, Orthogonal Embedding CNNs, or OE-CNNs) to learn age-invariant deep face features. Specifically, we decompose deep face features into two orthogonal components to represent age-related and identity-related features. As a result, identity-related features that are robust to aging are then used for AIFR. Besides, to complement the existing cross-age datasets and advance research in this field, we construct a brand-new large-scale Cross-Age Face dataset (CAF). Extensive experiments conducted on three public domain face aging datasets (MORPH Album 2, CACD-VS and FG-NET) have shown the effectiveness of the proposed approach and the value of the constructed CAF dataset on AIFR. Benchmarking our algorithm on one of the most popular general face recognition (GFR) datasets, LFW, additionally demonstrates comparable generalization performance on GFR. |
Tasks | Age-Invariant Face Recognition, Face Recognition |
Published | 2018-10-17 |
URL | http://arxiv.org/abs/1810.07599v1 |
http://arxiv.org/pdf/1810.07599v1.pdf | |
PWC | https://paperswithcode.com/paper/orthogonal-deep-features-decomposition-for |
Repo | |
Framework | |
CASED: Curriculum Adaptive Sampling for Extreme Data Imbalance
Title | CASED: Curriculum Adaptive Sampling for Extreme Data Imbalance |
Authors | Andrew Jesson, Nicolas Guizard, Sina Hamidi Ghalehjegh, Damien Goblot, Florian Soudan, Nicolas Chapados |
Abstract | We introduce CASED, a novel curriculum sampling algorithm that facilitates the optimization of deep learning segmentation or detection models on data sets with extreme class imbalance. We evaluate the CASED learning framework on the task of lung nodule detection in chest CT. In contrast to two-stage solutions, wherein nodule candidates are first proposed by a segmentation model and refined by a second detection stage, CASED improves the training of deep nodule segmentation models (e.g. UNet) to the point where state of the art results are achieved using only a trivial detection stage. CASED improves the optimization of deep segmentation models by allowing them to first learn how to distinguish nodules from their immediate surroundings, while continuously adding a greater proportion of difficult-to-classify global context, until uniformly sampling from the empirical data distribution. Using CASED during training yields a minimalist proposal to the lung nodule detection problem that tops the LUNA16 nodule detection benchmark with an average sensitivity score of 88.35%. Furthermore, we find that models trained using CASED are robust to nodule annotation quality by showing that comparable results can be achieved when only a point and radius for each ground truth nodule are provided during training. Finally, the CASED learning framework makes no assumptions with regard to imaging modality or segmentation target and should generalize to other medical imaging problems where class imbalance is a persistent problem. |
Tasks | Lung Nodule Detection |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10819v1 |
http://arxiv.org/pdf/1807.10819v1.pdf | |
PWC | https://paperswithcode.com/paper/cased-curriculum-adaptive-sampling-for |
Repo | |
Framework | |
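The curriculum described in the CASED abstract (start with patches centred on nodules, then mix in more global context until sampling is uniform over the data) can be sketched as follows. This is a minimal illustration only: the linear schedule and the names `nodule_focus_probability` and `sample_patch` are assumptions, not the schedule or code used by the authors.

```python
import random


def nodule_focus_probability(step, total_steps):
    """Probability of forcing a nodule-centred patch at a given training step.

    Assumed linear decay from 1.0 (only nodule neighbourhoods) to 0.0
    (plain uniform sampling); the paper's actual schedule may differ.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 1.0 - progress


def sample_patch(step, total_steps, nodule_patches, all_patches, rng):
    """Draw one training patch according to the curriculum."""
    if rng.random() < nodule_focus_probability(step, total_steps):
        # Early in training: learn nodules versus their immediate surroundings.
        return rng.choice(nodule_patches)
    # Later in training: sample uniformly from the empirical data distribution.
    return rng.choice(all_patches)
```

Once `step` reaches `total_steps`, the sampler reduces to uniform sampling over `all_patches`, which matches the end state the abstract describes.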
Phase Retrieval Under a Generative Prior
Title | Phase Retrieval Under a Generative Prior |
Authors | Paul Hand, Oscar Leong, Vladislav Voroninski |
Abstract | The phase retrieval problem asks to recover a natural signal $y_0 \in \mathbb{R}^n$ from $m$ quadratic observations, where $m$ is to be minimized. As is common in many imaging problems, natural signals are considered sparse with respect to a known basis, and the generic sparsity prior is enforced via $\ell_1$ regularization. While successful in the realm of linear inverse problems, such $\ell_1$ methods have encountered possibly fundamental limitations, as no computationally efficient algorithm for phase retrieval of a $k$-sparse signal has been proven to succeed with fewer than $O(k^2\log n)$ generic measurements, exceeding the theoretical optimum of $O(k \log n)$. In this paper, we propose a novel framework for phase retrieval by 1) modeling natural signals as being in the range of a deep generative neural network $G : \mathbb{R}^k \rightarrow \mathbb{R}^n$ and 2) enforcing this prior directly by optimizing an empirical risk objective over the domain of the generator. Our formulation has provably favorable global geometry for gradient methods, as soon as $m = O(kd^2\log n)$, where $d$ is the depth of the network. Specifically, when suitable deterministic conditions on the generator and measurement matrix are met, we construct a descent direction for any point outside of a small neighborhood around the unique global minimizer and its negative multiple, and show that such conditions hold with high probability under Gaussian ensembles of multilayer fully-connected generator networks and measurement matrices. This formulation for structured phase retrieval thus has two advantages over sparsity based methods: 1) deep generative priors can more tightly represent natural signals and 2) information theoretically optimal sample complexity. We corroborate these results with experiments showing that exploiting generative models in phase retrieval tasks outperforms sparse phase retrieval methods. |
Tasks | |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.04261v1 |
http://arxiv.org/pdf/1807.04261v1.pdf | |
PWC | https://paperswithcode.com/paper/phase-retrieval-under-a-generative-prior |
Repo | |
Framework | |
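For orientation, one common way to write an empirical risk objective of the kind the abstract mentions is given below. The measurement matrix $A \in \mathbb{R}^{m \times n}$ and the amplitude-based form of the loss are assumptions made here for illustration; the abstract itself only states that the risk is minimized over the domain of the generator $G$.

```latex
\min_{x \in \mathbb{R}^k} \; f(x)
  = \frac{1}{2} \left\| \, \lvert A\,G(x) \rvert - \lvert A\,y_0 \rvert \, \right\|_2^2 ,
\qquad G : \mathbb{R}^k \rightarrow \mathbb{R}^n , \quad A \in \mathbb{R}^{m \times n}
```

Here $\lvert \cdot \rvert$ acts entrywise, so only the magnitudes of the $m$ measurements enter the loss. Under this form, a latent code and a sign-flipped counterpart can produce the same measurements, which is consistent with the abstract's mention of the global minimizer and its negative multiple.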
Incorporating Dictionaries into Deep Neural Networks for the Chinese Clinical Named Entity Recognition
Title | Incorporating Dictionaries into Deep Neural Networks for the Chinese Clinical Named Entity Recognition |
Authors | Qi Wang, Yuhang Xia, Yangming Zhou, Tong Ruan, Daqi Gao, Ping He |
Abstract | Clinical Named Entity Recognition (CNER) aims to identify and classify clinical terms such as diseases, symptoms, treatments, exams, and body parts in electronic health records, which is a fundamental and crucial task for clinical and translational research. In recent years, deep neural networks have achieved significant success in named entity recognition and many other Natural Language Processing (NLP) tasks. Most of these algorithms are trained end to end, and can automatically learn features from large scale labeled datasets. However, these data-driven methods typically lack the capability of processing rare or unseen entities. Previous statistical methods and feature engineering practice have demonstrated that human knowledge can provide valuable information for handling rare and unseen cases. In this paper, we address the problem by incorporating dictionaries into deep neural networks for the Chinese CNER task. Two different architectures that extend the Bi-directional Long Short-Term Memory (Bi-LSTM) neural network and five different feature representation schemes are proposed to handle the task. Computational results on the CCKS-2017 Task 2 benchmark dataset show that the proposed method achieves highly competitive performance compared with state-of-the-art deep learning methods. |
Tasks | Feature Engineering, Named Entity Recognition |
Published | 2018-04-13 |
URL | http://arxiv.org/abs/1804.05017v1 |
http://arxiv.org/pdf/1804.05017v1.pdf | |
PWC | https://paperswithcode.com/paper/incorporating-dictionaries-into-deep-neural |
Repo | |
Framework | |
Nonparametric Gaussian Mixture Models for the Multi-Armed Contextual Bandit
Title | Nonparametric Gaussian Mixture Models for the Multi-Armed Contextual Bandit |
Authors | Iñigo Urteaga, Chris H. Wiggins |
Abstract | We here adopt Bayesian nonparametric mixture models to extend multi-armed bandits in general, and Thompson sampling in particular, to complex scenarios where there is reward model uncertainty. The multi-armed bandit is a sequential allocation task where an agent must learn a policy that maximizes long term payoff, where only the reward of the played arm is observed at each interaction with the world. In the stochastic bandit setting, at each interaction, the reward for the selected action is generated from an unknown distribution. Thompson sampling is a generative and interpretable multi-armed bandit algorithm that has been shown both to perform well in practice, and to enjoy optimality properties for certain reward functions. Nevertheless, Thompson sampling requires knowledge of the true reward model, for calculation of expected rewards and sampling from its parameter posterior. In this work, we extend Thompson sampling to complex scenarios where there is model uncertainty, by adopting a very flexible set of reward distributions: nonparametric Gaussian mixture models. The generative process of Bayesian nonparametric mixtures naturally aligns with the Bayesian modeling of multi-armed bandits: the nonparametric model autonomously determines its complexity in an online fashion, as new rewards are observed for the played arms. By characterizing each arm’s reward distribution with independent Dirichlet process mixtures and per-mixture parameters, the proposed method sequentially learns the model that best approximates the true underlying reward distribution, achieving successful performance in synthetic and real datasets. Our contribution is valuable for practical scenarios, as it avoids stringent case-by-case model specifications, and yet attains reduced regret in diverse bandit settings. |
Tasks | Multi-Armed Bandits |
Published | 2018-08-08 |
URL | https://arxiv.org/abs/1808.02932v2 |
https://arxiv.org/pdf/1808.02932v2.pdf | |
PWC | https://paperswithcode.com/paper/nonparametric-gaussian-mixture-models-for-the |
Repo | |
Framework | |
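As background for the abstract above, here is a minimal sketch of standard Thompson sampling in the simplest setting, where the reward model is known to be Bernoulli and each arm has a Beta posterior. It is not the nonparametric Gaussian mixture method the paper proposes; the function name `thompson_bernoulli` and its parameters are illustrative assumptions.

```python
import random


def thompson_bernoulli(true_means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return the pull count per arm."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [1] * n_arms  # Beta(1, 1) uniform prior on each arm's mean
    failures = [1] * n_arms
    pulls = [0] * n_arms
    for _ in range(horizon):
        # Sample one plausible mean per arm from its current posterior.
        samples = [rng.betavariate(successes[a], failures[a]) for a in range(n_arms)]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Only the reward of the played arm is observed.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls
```

The posterior update here relies on knowing that rewards are Bernoulli. That dependence on a known reward model is the limitation the abstract addresses by using Dirichlet process mixtures instead.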
Intentions of Vulnerable Road Users - Detection and Forecasting by Means of Machine Learning
Title | Intentions of Vulnerable Road Users - Detection and Forecasting by Means of Machine Learning |
Authors | Michael Goldhammer, Sebastian Köhler, Stefan Zernetsch, Konrad Doll, Bernhard Sick, Klaus Dietmayer |
Abstract | Avoiding collisions with vulnerable road users (VRUs) using sensor-based early recognition of critical situations is one of the manifold opportunities provided by the current development in the field of intelligent vehicles. As pedestrians and cyclists in particular are very agile and have a variety of movement options, modeling their behavior in traffic scenes is a challenging task. In this article we propose movement models based on machine learning methods, in particular artificial neural networks, in order to classify the current motion state and to predict the future trajectory of VRUs. Both model types are also combined to enable the application of specifically trained motion predictors based on a continuously updated pseudo probabilistic state classification. Furthermore, the architecture is used to evaluate motion-specific physical models for starting and stopping and video-based pedestrian motion classification. A comprehensive dataset consisting of 1068 pedestrian and 494 cyclist scenes acquired at an urban intersection is used for optimization, training, and evaluation of the different models. The results show substantially higher classification rates and the ability to recognize motion state changes earlier with the machine learning approaches compared to interacting multiple model (IMM) Kalman Filtering. The trajectory prediction quality is also improved for all kinds of test scenes, especially when starting and stopping motions are included. Here, 37% and 41% lower position errors were achieved on average, respectively. |
Tasks | Trajectory Prediction |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03577v1 |
http://arxiv.org/pdf/1803.03577v1.pdf | |
PWC | https://paperswithcode.com/paper/intentions-of-vulnerable-road-users-detection |
Repo | |
Framework | |
A Cross Entropy based Optimization Algorithm with Global Convergence Guarantees
Title | A Cross Entropy based Optimization Algorithm with Global Convergence Guarantees |
Authors | Ajin George Joseph, Shalabh Bhatnagar |
Abstract | The cross entropy (CE) method is a model based search method to solve optimization problems where the objective function has minimal structure. The Monte-Carlo version of the CE method employs the naive sample averaging technique which is inefficient, both computationally and space wise. We provide a novel stochastic approximation version of the CE method, where the sample averaging is replaced with incremental geometric averaging. This approach can save considerable computational and storage costs. Our algorithm is incremental in nature and possesses additional attractive features such as accuracy, stability, robustness and convergence to the global optimum for a particular class of objective functions. We evaluate the algorithm on a variety of global optimization benchmark problems and the results obtained corroborate our theoretical findings. |
Tasks | |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1801.10291v1 |
http://arxiv.org/pdf/1801.10291v1.pdf | |
PWC | https://paperswithcode.com/paper/a-cross-entropy-based-optimization-algorithm |
Repo | |
Framework | |
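For reference, the Monte-Carlo version of the CE method that the abstract contrasts itself with can be sketched in one dimension as below. This is the plain sample-averaging baseline, not the stochastic approximation algorithm proposed in the paper; the Gaussian sampling model, the function name `cross_entropy_minimize`, and all default parameters are assumptions chosen for illustration.

```python
import random
import statistics


def cross_entropy_minimize(objective, mean, std, n_samples=100, n_elite=10,
                           n_iters=50, seed=0):
    """Minimize a 1-D objective with the basic Monte-Carlo cross entropy method."""
    rng = random.Random(seed)
    for _ in range(n_iters):
        # Draw a fresh batch from the current Gaussian sampling model.
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        # Keep the best-scoring fraction (the elite set).
        elite = sorted(samples, key=objective)[:n_elite]
        # Refit the model to the elite set by naive sample averaging.
        mean = statistics.fmean(elite)
        std = statistics.pstdev(elite) + 1e-6  # small floor keeps std positive
    return mean


# Example: the minimizer of (x - 3)^2 is x = 3.
estimate = cross_entropy_minimize(lambda x: (x - 3.0) ** 2, mean=0.0, std=5.0)
```

Every iteration stores and sorts a full batch of samples before refitting, which is the computational and storage cost the abstract says incremental averaging is meant to reduce.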
Decision-support for the Masses by Enabling Conversations with Open Data
Title | Decision-support for the Masses by Enabling Conversations with Open Data |
Authors | Biplav Srivastava |
Abstract | Open data refers to data that is freely available for reuse. Although there has been a rapid increase in the availability of open data to the public in the last decade, this has not translated into better decision-support tools for them. We propose intelligent conversation generators as a grand challenge that would automatically create data-driven conversation interfaces (CIs), also known as chatbots or dialog systems, from open data and deliver personalized analytical insights to users based on their contextual needs. Such generators will not only help bring Artificial Intelligence (AI)-based solutions for important societal problems to the masses but also advance AI by providing an integrative testbed for human-centric AI and filling gaps in the state of the art towards this aim. |
Tasks | |
Published | 2018-09-16 |
URL | http://arxiv.org/abs/1809.06723v2 |
http://arxiv.org/pdf/1809.06723v2.pdf | |
PWC | https://paperswithcode.com/paper/decision-support-for-the-masses-by-enabling |
Repo | |
Framework | |
An Approximate Shading Model with Detail Decomposition for Object Relighting
Title | An Approximate Shading Model with Detail Decomposition for Object Relighting |
Authors | Zicheng Liao, Kevin Karsch, Hongyi Zhang, David Forsyth |
Abstract | We present an object relighting system that allows an artist to select an object from an image and insert it into a target scene. Through simple interactions, the system can adjust illumination on the inserted object so that it appears natural in the scene. To support image-based relighting, we build an object model from the image, and propose a perceptually-inspired approximate shading model for the relighting. It decomposes the shading field into (a) a rough shape term that can be reshaded, (b) a parametric shading detail that encodes missing features from the first term, and (c) a geometric detail term that captures fine-scale material properties. With this decomposition, the shading model combines 3D rendering and image-based composition and allows more flexible compositing than image-based methods. Quantitative evaluation and a set of user studies suggest our method is a promising alternative to existing methods of object insertion. |
Tasks | |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07514v1 |
http://arxiv.org/pdf/1804.07514v1.pdf | |
PWC | https://paperswithcode.com/paper/an-approximate-shading-model-with-detail |
Repo | |
Framework | |
A Channel-based Exact Inference Algorithm for Bayesian Networks
Title | A Channel-based Exact Inference Algorithm for Bayesian Networks |
Authors | Bart Jacobs |
Abstract | This paper describes a new algorithm for exact Bayesian inference that is based on a recently proposed compositional semantics of Bayesian networks in terms of channels. The paper concentrates on the ideas behind this algorithm, involving a linearisation ('stretching') of the Bayesian network, followed by a combination of forward state transformation and backward predicate transformation, while evidence is accumulated along the way. The performance of a prototype implementation of the algorithm in Python is briefly compared to a standard implementation (pgmpy): first results show competitive performance. |
Tasks | Bayesian Inference |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.08032v1 |
http://arxiv.org/pdf/1804.08032v1.pdf | |
PWC | https://paperswithcode.com/paper/a-channel-based-exact-inference-algorithm-for |
Repo | |
Framework | |
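The two operations named in the abstract, forward state transformation and backward predicate transformation, can be illustrated on a single channel. The sketch below assumes that a channel is a dictionary of conditional distributions, a state is a discrete distribution, and a predicate maps outcomes to values in [0, 1]. The function names and this representation are assumptions and are not taken from the author's prototype or from pgmpy.

```python
def state_transform(channel, state):
    """Forward: push a distribution on X through a channel X -> Y."""
    out = {}
    for x, px in state.items():
        for y, pyx in channel[x].items():
            out[y] = out.get(y, 0.0) + px * pyx
    return out


def predicate_transform(channel, predicate):
    """Backward: pull a predicate on Y back to a predicate on X."""
    return {
        x: sum(pyx * predicate.get(y, 0.0) for y, pyx in dist.items())
        for x, dist in channel.items()
    }


def condition(state, predicate):
    """Update a state with a predicate (Bayes' rule in this representation)."""
    validity = sum(px * predicate.get(x, 0.0) for x, px in state.items())
    if validity == 0:
        raise ValueError("predicate has zero validity in this state")
    return {x: px * predicate.get(x, 0.0) / validity for x, px in state.items()}


# Hypothetical two-node network: disease -> test result.
prior = {"disease": 0.01, "healthy": 0.99}
test_channel = {
    "disease": {"pos": 0.9, "neg": 0.1},
    "healthy": {"pos": 0.05, "neg": 0.95},
}
evidence = {"pos": 1.0, "neg": 0.0}

predicted = state_transform(test_channel, prior)  # distribution over test results
posterior = condition(prior, predicate_transform(test_channel, evidence))
```

In this example the evidence on the test result is pulled back along the channel and then used to update the prior, which gives the posterior over the disease variable.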
PS-FCN: A Flexible Learning Framework for Photometric Stereo
Title | PS-FCN: A Flexible Learning Framework for Photometric Stereo |
Authors | Guanying Chen, Kai Han, Kwan-Yee K. Wong |
Abstract | This paper addresses the problem of photometric stereo for non-Lambertian surfaces. Existing approaches often adopt simplified reflectance models to make the problem more tractable, but this greatly hinders their applications on real-world objects. In this paper, we propose a deep fully convolutional network, called PS-FCN, that takes an arbitrary number of images of a static object captured under different light directions with a fixed camera as input, and predicts a normal map of the object in a fast feed-forward pass. Unlike the recently proposed learning based method, PS-FCN does not require a pre-defined set of light directions during training and testing, and can handle multiple images and light directions in an order-agnostic manner. Although we train PS-FCN on synthetic data, it can generalize well on real datasets. We further show that PS-FCN can be easily extended to handle the problem of uncalibrated photometric stereo. Extensive experiments on public real datasets show that PS-FCN outperforms existing approaches in calibrated photometric stereo, and promising results are achieved in the uncalibrated scenario, clearly demonstrating its effectiveness. |
Tasks | |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.08696v1 |
http://arxiv.org/pdf/1807.08696v1.pdf | |
PWC | https://paperswithcode.com/paper/ps-fcn-a-flexible-learning-framework-for |
Repo | |
Framework | |
Automated rating of recorded classroom presentations using speech analysis in Kazakh
Title | Automated rating of recorded classroom presentations using speech analysis in Kazakh |
Authors | Akzharkyn Izbassarova, Aidana Irmanova, A. P. James |
Abstract | Effective presentation skills can help one succeed in business, career and academia. This paper presents the design of speech assessment during an oral presentation and an algorithm for speech evaluation based on criteria of optimal intonation. As the pace of speech and its optimal intonation vary from language to language, automatic identification of the language during the presentation is required. The proposed algorithm was tested with presentations delivered in the Kazakh language. For testing purposes, the features of Kazakh phonemes were extracted using MFCC and PLP methods, and a Hidden Markov Model (HMM) [5] of Kazakh phonemes was created. Kazakh vowel formants were defined, and the correlation between the deviation rate in fundamental frequency and the liveliness of the speech was analyzed in order to evaluate the intonation of the presentation. It was established that the threshold value between monotone and dynamic speech is 0.16 and the error for intonation evaluation is 19%. |
Tasks | |
Published | 2018-01-01 |
URL | http://arxiv.org/abs/1801.00453v1 |
http://arxiv.org/pdf/1801.00453v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-rating-of-recorded-classroom |
Repo | |
Framework | |
Backdoor Decomposable Monotone Circuits and their Propagation Complete Encodings
Title | Backdoor Decomposable Monotone Circuits and their Propagation Complete Encodings |
Authors | Petr Kučera, Petr Savický |
Abstract | We describe a compilation language of backdoor decomposable monotone circuits (BDMCs) which generalizes several concepts appearing in the literature, e.g. DNNFs and backdoor trees. A $\mathcal{C}$-BDMC sentence is a monotone circuit which satisfies the decomposability property (such as in DNNF) and in which the inputs (or leaves) are associated with CNF encodings from a given base class $\mathcal{C}$. We consider different base classes which consist of encodings with various propagation strength. In particular, we consider encodings which implement a consistency checker (CC) or domain consistency (DC) by unit propagation, as well as unit refutation complete (URC) and propagation complete (PC) encodings. We show that a representation of a boolean function with a $\mathcal{C}$-BDMC can be transformed in polynomial time into an encoding from $\mathcal{C}$ for any of the classes of CNF encodings mentioned above. |
Tasks | |
Published | 2018-11-23 |
URL | https://arxiv.org/abs/1811.09435v3 |
https://arxiv.org/pdf/1811.09435v3.pdf | |
PWC | https://paperswithcode.com/paper/backdoor-decomposable-monotone-circuits-and |
Repo | |
Framework | |
Instance Map based Image Synthesis with a Denoising Generative Adversarial Network
Title | Instance Map based Image Synthesis with a Denoising Generative Adversarial Network |
Authors | Ziqiang Zheng, Chao Wang, Zhibin Yu, Haiyong Zheng, Bing Zheng |
Abstract | Semantic layout based image synthesis, which has benefited from the success of Generative Adversarial Networks (GANs), has drawn much attention these days. How to enhance the quality of synthesized images while keeping the stochasticity of the GAN is still a challenge. We propose a novel denoising framework to handle this problem. Generating overlapped objects is another challenging task when synthesizing images from a semantic layout to a realistic RGB photo. To overcome this deficiency, we include a one-hot semantic label map to force the generator to pay more attention to the generation of overlapped objects. Furthermore, we improve the loss function of the discriminator by considering perturb loss and cascade layer loss to guide the generation process. We applied our methods on the Cityscapes, Facades and NYU datasets and demonstrate the image generation ability of our model. |
Tasks | Denoising, Image Generation |
Published | 2018-01-10 |
URL | http://arxiv.org/abs/1801.03252v1 |
http://arxiv.org/pdf/1801.03252v1.pdf | |
PWC | https://paperswithcode.com/paper/instance-map-based-image-synthesis-with-a |
Repo | |
Framework | |