Paper Group ANR 968
Multi-Level Sensor Fusion with Deep Learning
Title | Multi-Level Sensor Fusion with Deep Learning |
Authors | Valentin Vielzeuf, Alexis Lechervy, Stéphane Pateux, Frédéric Jurie |
Abstract | In the context of deep learning, this article presents an original deep network, namely CentralNet, for the fusion of information coming from different sensors. This approach is designed to efficiently and automatically balance the trade-off between early and late fusion (i.e. between the fusion of low-level vs high-level information). More specifically, at each level of abstraction (the different levels of deep networks), unimodal representations of the data are fed to a central neural network which combines them into a common embedding. In addition, a multi-objective regularization is introduced, helping to optimize both the central network and the unimodal networks. Experiments on four multimodal datasets not only show state-of-the-art performance, but also demonstrate that CentralNet can actually choose the best possible fusion strategy for a given problem. |
Tasks | Sensor Fusion |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.02447v1 |
PDF | http://arxiv.org/pdf/1811.02447v1.pdf |
PWC | https://paperswithcode.com/paper/multi-level-sensor-fusion-with-deep-learning |
Repo | |
Framework | |
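The fusion step described in the CentralNet abstract above can be illustrated with a minimal sketch. It assumes that, at one level of abstraction, the central network forms a weighted sum of its own hidden vector and the unimodal hidden vectors, with one scalar weight per input; the abstract does not spell out the combination rule, so the function name `central_fuse`, the weights, and the toy vectors are all hypothetical.

```python
def central_fuse(central, unimodal, alpha_central, alpha_unimodal):
    """Weighted sum of the central vector and each unimodal vector.

    All vectors must have the same length. In a trained network the
    weights would be learned; here they are set by hand (assumption).
    """
    fused = [alpha_central * c for c in central]
    for weight, hidden in zip(alpha_unimodal, unimodal):
        for j, value in enumerate(hidden):
            fused[j] += weight * value
    return fused


# Two modalities at one abstraction level, hidden size 3 (toy values).
audio = [1.0, 0.0, 2.0]
video = [0.0, 4.0, 2.0]
central = [0.0, 0.0, 0.0]

# Zero weights: this level contributes nothing, so fusion is deferred.
skipped = central_fuse(central, [audio, video], 1.0, [0.0, 0.0])
# Equal weights: both modalities are mixed at this level.
mixed = central_fuse(central, [audio, video], 1.0, [0.5, 0.5])
print(skipped, mixed)
```

Under this assumption, weights near zero at low levels and larger weights at high levels would behave like late fusion, and the reverse like early fusion, which is one way to read the trade-off the abstract mentions.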
Three-Dimensional GPU-Accelerated Active Contours for Automated Localization of Cells in Large Images
Title | Three-Dimensional GPU-Accelerated Active Contours for Automated Localization of Cells in Large Images |
Authors | Mahsa Lotfollahi, Sebastian Berisha, Leila Saadatifard, Laura Montier, Jokubas Ziburkus, David Mayerich |
Abstract | Cell segmentation in microscopy is a challenging problem, since cells are often asymmetric and densely packed. This becomes particularly challenging for extremely large images, since manual intervention and processing time can make segmentation intractable. In this paper, we present an efficient and highly parallel formulation for symmetric three-dimensional (3D) contour evolution that extends previous work on fast two-dimensional active contours. We provide a formulation for optimization on 3D images, as well as a strategy for accelerating computation on consumer graphics hardware. The proposed software takes advantage of Monte-Carlo sampling schemes in order to speed up convergence and reduce thread divergence. Experimental results show that this method provides superior performance for large 2D and 3D cell segmentation tasks when compared to existing methods on large 3D brain images. |
Tasks | Cell Segmentation |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06304v1 |
PDF | http://arxiv.org/pdf/1804.06304v1.pdf |
PWC | https://paperswithcode.com/paper/three-dimensional-gpu-accelerated-active |
Repo | |
Framework | |
Fine-Grained Attention Mechanism for Neural Machine Translation
Title | Fine-Grained Attention Mechanism for Neural Machine Translation |
Authors | Heeyoul Choi, Kyunghyun Cho, Yoshua Bengio |
Abstract | Neural machine translation (NMT) has been a new paradigm in machine translation, and the attention mechanism has become the dominant approach with the state-of-the-art records in many language pairs. While there are variants of the attention mechanism, all of them use only temporal attention where one scalar value is assigned to one context vector corresponding to a source word. In this paper, we propose a fine-grained (or 2D) attention mechanism where each dimension of a context vector will receive a separate attention score. In experiments with the task of En-De and En-Fi translation, the fine-grained attention method improves the translation quality in terms of BLEU score. In addition, our alignment analysis reveals how the fine-grained attention mechanism exploits the internal structure of context vectors. |
Tasks | Machine Translation |
Published | 2018-03-30 |
URL | http://arxiv.org/abs/1803.11407v2 |
PDF | http://arxiv.org/pdf/1803.11407v2.pdf |
PWC | https://paperswithcode.com/paper/fine-grained-attention-mechanism-for-neural |
Repo | |
Framework | |
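The difference between scalar and fine-grained attention in the abstract above can be sketched as follows. This is a minimal illustration, assuming the per-dimension scores are already given; how the paper computes them is not stated in the abstract, and the names `softmax`, `fine_grained_context`, `annotations`, and `scores` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fine_grained_context(annotations, scores):
    """Build a context vector with one attention distribution per dimension.

    annotations[t][d]: d-th dimension of the vector for source position t.
    scores[t][d]: unnormalized attention score for that same entry.
    Standard attention would use a single score per position t instead.
    """
    dims = len(annotations[0])
    context = []
    for d in range(dims):
        # Normalize over source positions, separately for each dimension.
        weights = softmax([row[d] for row in scores])
        context.append(sum(w * h[d] for w, h in zip(weights, annotations)))
    return context


# Two source positions, two dimensions (toy values).
annotations = [[1.0, 10.0], [3.0, 20.0]]
scores = [[0.0, 100.0], [0.0, -100.0]]
ctx = fine_grained_context(annotations, scores)
print(ctx)
```

In this toy input, dimension 0 averages both positions while dimension 1 attends almost entirely to the first position, something a single scalar weight per position could not express.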
Deep-6DPose: Recovering 6D Object Pose from a Single RGB Image
Title | Deep-6DPose: Recovering 6D Object Pose from a Single RGB Image |
Authors | Thanh-Toan Do, Ming Cai, Trung Pham, Ian Reid |
Abstract | Detecting objects and their 6D poses from only RGB images is an important task for many robotic applications. While deep learning methods have made significant progress in visual object detection and segmentation, the object pose estimation task is still challenging. In this paper, we introduce an end-to-end deep learning framework, named Deep-6DPose, that jointly detects, segments, and most importantly recovers 6D poses of object instances from a single RGB image. In particular, we extend the recent state-of-the-art instance segmentation network Mask R-CNN with a novel pose estimation branch to directly regress 6D object poses without any post-refinements. Our key technical contribution is the decoupling of pose parameters into translation and rotation so that the rotation can be regressed via a Lie algebra representation. The resulting pose regression loss is differentiable and unconstrained, making the training tractable. The experiments on two standard pose benchmarking datasets show that our proposed approach compares favorably with the state-of-the-art RGB-based multi-stage pose estimation methods. Importantly, due to the end-to-end architecture, Deep-6DPose is considerably faster than competing multi-stage methods, offering an inference speed of 10 fps that is well suited for robotic applications. |
Tasks | Instance Segmentation, Object Detection, Pose Estimation, Semantic Segmentation |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10367v1 |
PDF | http://arxiv.org/pdf/1802.10367v1.pdf |
PWC | https://paperswithcode.com/paper/deep-6dpose-recovering-6d-object-pose-from-a |
Repo | |
Framework | |
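The Lie algebra representation mentioned in the Deep-6DPose abstract above lets a network regress an unconstrained 3-vector that is then mapped to a valid rotation matrix. The sketch below shows the standard exponential map from so(3) to SO(3) (Rodrigues' formula); it illustrates the representation only and is not taken from the paper's code. The function name `so3_exp` is hypothetical.

```python
import math


def so3_exp(omega):
    """Map an axis-angle vector (element of so(3)) to a 3x3 rotation matrix.

    Uses Rodrigues' formula: R = I + (sin t / t) K + ((1 - cos t) / t^2) K^2,
    where t = |omega| and K is the skew-symmetric matrix of omega.
    """
    wx, wy, wz = omega
    theta = math.sqrt(wx * wx + wy * wy + wz * wz)
    K = [[0.0, -wz, wy],
         [wz, 0.0, -wx],
         [-wy, wx, 0.0]]
    if theta < 1e-12:
        # Small-angle limits of the two coefficients.
        a, b = 1.0, 0.5
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


# A quarter turn about the z-axis.
R = so3_exp((0.0, 0.0, math.pi / 2))
for row in R:
    print([round(v, 6) for v in row])
```

Because any 3-vector maps to a valid rotation, a regression loss on this vector needs no orthogonality or unit-norm constraint, which is consistent with the abstract's description of the loss as unconstrained.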
Expressivity in TTS from Semantics and Pragmatics
Title | Expressivity in TTS from Semantics and Pragmatics |
Authors | Rodolfo Delmonte |
Abstract | In this paper we present ongoing work to produce an expressive TTS reader that can be used both in text and dialogue applications. The system, called SPARSAR, has so far been used to read (English) poetry, but it can now be applied to any text. The text is fully analyzed both at phonetic and phonological level, and at syntactic and semantic level. In addition, the system has access to a restricted list of typical pragmatically marked phrases and expressions that are used to convey specific discourse functions and speech acts and need specialized intonational contours. The text is transformed into poem-like structures, where each line corresponds to a Breath Group, semantically and syntactically consistent. Stanzas correspond to paragraph boundaries. Analogical parameters are related to ToBI theoretical indices but their number is doubled. In this paper, we concentrate on short stories and fables. |
Tasks | |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07295v1 |
PDF | http://arxiv.org/pdf/1803.07295v1.pdf |
PWC | https://paperswithcode.com/paper/expressivity-in-tts-from-semantics-and |
Repo | |
Framework | |
Predictive Process Monitoring Methods: Which One Suits Me Best?
Title | Predictive Process Monitoring Methods: Which One Suits Me Best? |
Authors | Chiara Di Francescomarino, Chiara Ghidini, Fabrizio Maria Maggi, Fredrik Milani |
Abstract | Predictive process monitoring has recently gained traction in academia and is also maturing in companies. However, with the growing body of research, it might be daunting for companies to navigate this domain in order to find, given certain data, what can be predicted and what methods to use. The main objective of this paper is to develop a value-driven framework for classifying existing work on predictive process monitoring. This objective is achieved by systematically identifying, categorizing, and analyzing existing approaches for predictive process monitoring. The review is then used to develop a value-driven framework that can help organizations navigate the predictive process monitoring field, find value, and exploit the opportunities enabled by these analysis techniques. |
Tasks | |
Published | 2018-04-06 |
URL | http://arxiv.org/abs/1804.02422v1 |
PDF | http://arxiv.org/pdf/1804.02422v1.pdf |
PWC | https://paperswithcode.com/paper/predictive-process-monitoring-methods-which |
Repo | |
Framework | |
A Convex Duality Framework for GANs
Title | A Convex Duality Framework for GANs |
Authors | Farzan Farnia, David Tse |
Abstract | Generative adversarial network (GAN) is a minimax game between a generator mimicking the true model and a discriminator distinguishing the samples produced by the generator from the real training samples. Given an unconstrained discriminator able to approximate any function, this game reduces to finding the generative model minimizing a divergence measure, e.g. the Jensen-Shannon (JS) divergence, to the data distribution. However, in practice the discriminator is constrained to be in a smaller class $\mathcal{F}$ such as neural nets. Then, a natural question is how the divergence minimization interpretation changes as we constrain $\mathcal{F}$. In this work, we address this question by developing a convex duality framework for analyzing GANs. For a convex set $\mathcal{F}$, this duality framework interprets the original GAN formulation as finding the generative model with minimum JS-divergence to the distributions penalized to match the moments of the data distribution, with the moments specified by the discriminators in $\mathcal{F}$. We show that this interpretation more generally holds for f-GAN and Wasserstein GAN. As a byproduct, we apply the duality framework to a hybrid of f-divergence and Wasserstein distance. Unlike the f-divergence, we prove that the proposed hybrid divergence changes continuously with the generative model, which suggests regularizing the discriminator’s Lipschitz constant in f-GAN and vanilla GAN. We numerically evaluate the power of the suggested regularization schemes for improving GAN’s training performance. |
Tasks | |
Published | 2018-10-28 |
URL | http://arxiv.org/abs/1810.11740v1 |
PDF | http://arxiv.org/pdf/1810.11740v1.pdf |
PWC | https://paperswithcode.com/paper/a-convex-duality-framework-for-gans |
Repo | |
Framework | |
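For reference, the minimax game described in the abstract above is conventionally written as below. This is the standard vanilla GAN objective with the discriminator restricted to the class $\mathcal{F}$; the symbols $G$, $D$, $p_{\text{data}}$, $p_z$, and $p_G$ are notation assumed here, not quoted from the abstract.

$$
\min_{G} \; \max_{D \in \mathcal{F}} \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
$$

When $D$ is unconstrained, the inner maximum equals $2\,\mathrm{JS}(p_{\text{data}} \,\|\, p_G) - \log 4$, which is the divergence-minimization reading the abstract starts from before constraining $\mathcal{F}$.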
Block Mean Approximation for Efficient Second Order Optimization
Title | Block Mean Approximation for Efficient Second Order Optimization |
Authors | Yao Lu, Mehrtash Harandi, Richard Hartley, Razvan Pascanu |
Abstract | Advanced optimization algorithms such as Newton's method and AdaGrad benefit from second order derivatives or second order statistics to achieve better descent directions and faster convergence rates. At their heart, such algorithms need to compute the inverse or inverse square root of a matrix whose size is quadratic in the dimensionality of the search space. For high dimensional search spaces, computing the matrix inverse or inverse square root becomes overwhelming, which in turn demands approximate methods. In this work, we propose a new matrix approximation method which divides a matrix into blocks and represents each block by one or two numbers. The method allows efficient computation of matrix inverse and inverse square root. We apply our method to AdaGrad in training deep neural networks. Experiments show encouraging results compared to the diagonal approximation. |
Tasks | |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05484v3 |
PDF | http://arxiv.org/pdf/1804.05484v3.pdf |
PWC | https://paperswithcode.com/paper/block-mean-approximation-for-efficient-second |
Repo | |
Framework | |
Subtask Gated Networks for Non-Intrusive Load Monitoring
Title | Subtask Gated Networks for Non-Intrusive Load Monitoring |
Authors | Changho Shin, Sunghwan Joo, Jaeryun Yim, Hyoseop Lee, Taesup Moon, Wonjong Rhee |
Abstract | Non-intrusive load monitoring (NILM), also known as energy disaggregation, is a blind source separation problem where a household’s aggregate electricity consumption is broken down into electricity usages of individual appliances. In this way, the cost and trouble of installing many measurement devices over numerous household appliances can be avoided, and only one device needs to be installed. The problem has been well-known since Hart’s seminal paper in 1992, and recently significant performance improvements have been achieved by adopting deep networks. In this work, we focus on the idea that appliances have on/off states, and develop a deep network for further performance improvements. Specifically, we propose a subtask gated network that combines the main regression network with an on/off classification subtask network. Unlike typical multitask learning algorithms where multiple tasks simply share the network parameters to take advantage of the relevance among tasks, the subtask gated network multiplies the main network’s regression output with the subtask’s classification probability. When standby-power is additionally learned, the proposed solution surpasses the state-of-the-art performance for most of the benchmark cases. The subtask gated network can be very effective for any problem that inherently has on/off states. |
Tasks | Non-Intrusive Load Monitoring |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.06692v1 |
PDF | http://arxiv.org/pdf/1811.06692v1.pdf |
PWC | https://paperswithcode.com/paper/subtask-gated-networks-for-non-intrusive-load |
Repo | |
Framework | |
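The gating described in the abstract above, where the regression output is multiplied by the on/off classification probability, can be sketched in a few lines. The two networks themselves are not modeled here: their outputs are passed in as plain lists, and the names `sigmoid`, `gated_output`, `power_estimates`, and `on_off_logits` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_output(power_estimates, on_off_logits):
    """Multiply each regression estimate by the predicted 'on' probability."""
    return [p * sigmoid(z) for p, z in zip(power_estimates, on_off_logits)]


# The regression branch predicts 100 W at both time steps (toy values).
# The classifier is undecided at step 0 and confident 'off' at step 1.
gated = gated_output([100.0, 100.0], [0.0, -50.0])
print(gated)
```

When the classifier is confident the appliance is off, the gated estimate is pushed toward zero regardless of what the regression branch outputs, which is the behavior the abstract contrasts with plain parameter sharing.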
Chart Parsing Multimodal Grammars
Title | Chart Parsing Multimodal Grammars |
Authors | Richard Moot |
Abstract | This short note describes the chart parser for multimodal type-logical grammars which has been developed in conjunction with the type-logical treebank for French. The chart parser presents an incomplete but fast implementation of proof search for multimodal type-logical grammars using the “deductive parsing” framework. Proofs found can be transformed to natural deduction proofs. |
Tasks | |
Published | 2018-04-06 |
URL | http://arxiv.org/abs/1804.02286v1 |
PDF | http://arxiv.org/pdf/1804.02286v1.pdf |
PWC | https://paperswithcode.com/paper/chart-parsing-multimodal-grammars |
Repo | |
Framework | |
Wearable-based Mediation State Detection in Individuals with Parkinson’s Disease
Title | Wearable-based Mediation State Detection in Individuals with Parkinson’s Disease |
Authors | Murtadha D. Hssayeni, Michelle A. Burack (M.D.), Joohi Jimenez-Shahed (M.D.), Behnaz Ghoraani (Ph.D.) |
Abstract | One of the most prevalent complaints of individuals with mid-stage and advanced Parkinson’s disease (PD) is the fluctuating response to their medication (i.e., ON state with maximum benefit from medication and OFF state with no benefit from medication). In order to address these motor fluctuations, the patients go through periodic clinical examination where the treating physician reviews the patients’ self-report about duration in different medication states and optimizes therapy accordingly. Unfortunately, the patients’ self-report can be unreliable and suffer from recall bias. There is a need for a technology-based system that can provide objective measures about the duration in different medication states that can be used by the treating physician to successfully adjust the therapy. In this paper, we developed a medication state detection algorithm to detect medication states using two wearable motion sensors. A series of significant features are extracted from the motion data and used in a classifier that is based on a support vector machine with fuzzy labeling. The developed algorithm is evaluated using a dataset with 19 PD subjects and a total duration of 1,052.24 minutes (17.54 hours). The algorithm resulted in an average classification accuracy of 90.5%, sensitivity of 94.2%, and specificity of 85.4%. |
Tasks | |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.06973v1 |
PDF | http://arxiv.org/pdf/1809.06973v1.pdf |
PWC | https://paperswithcode.com/paper/wearable-based-mediation-state-detection-in |
Repo | |
Framework | |
Coregionalised Locomotion Envelopes - A Qualitative Approach
Title | Coregionalised Locomotion Envelopes - A Qualitative Approach |
Authors | Neil Dhir, Houman Dallali, Mo Rastgaar |
Abstract | ‘Sharing of statistical strength’ is a phrase often employed in machine learning and signal processing. In sensor networks, for example, missing signals from certain sensors may be predicted by exploiting their correlation with observed signals acquired from other sensors. For humans, our hands move synchronously with our legs, and we can exploit these implicit correlations for predicting new poses and for generating new natural-looking walking sequences. We can also go much further and exploit this form of transfer learning, to develop new control schemas for robust control of rehabilitation robots. In this short paper we introduce coregionalised locomotion envelopes - a method for multi-dimensional manifold regression, on human locomotion variates. Herein we render a qualitative description of this method. |
Tasks | Transfer Learning |
Published | 2018-03-13 |
URL | http://arxiv.org/abs/1803.04965v1 |
PDF | http://arxiv.org/pdf/1803.04965v1.pdf |
PWC | https://paperswithcode.com/paper/coregionalised-locomotion-envelopes-a |
Repo | |
Framework | |
Multi-scale Geometric Summaries for Similarity-based Sensor Fusion
Title | Multi-scale Geometric Summaries for Similarity-based Sensor Fusion |
Authors | Christopher J. Tralie, Paul Bendich, John Harer |
Abstract | In this work, we address fusion of heterogeneous sensor data using wavelet-based summaries of fused self-similarity information from each sensor. The technique we develop is quite general, does not require domain specific knowledge or physical models, and requires no training. Nonetheless, it can perform surprisingly well at the general task of differentiating classes of time-ordered behavior sequences which are sensed by more than one modality. As a demonstration of our capabilities in the audio to video context, we focus on the differentiation of speech sequences. Data from two or more modalities first are represented using self-similarity matrices (SSMs) corresponding to time-ordered point clouds in feature spaces of each of these data sources; we note that these feature spaces can be of entirely different scale and dimensionality. A fused similarity template is then derived from the modality-specific SSMs using a technique called similarity network fusion (SNF). We investigate pipelines using SNF as both an upstream (feature-level) and a downstream (ranking-level) fusion technique. Multiscale geometric features of this template are then extracted using a recently-developed technique called the scattering transform, and these features are then used to differentiate speech sequences. This method outperforms unsupervised techniques which operate directly on the raw data, and it also outperforms stovepiped methods which operate on SSMs separately derived from the distinct modalities. The benefits of this method become even more apparent as the simulated peak signal to noise ratio decreases. |
Tasks | Sensor Fusion |
Published | 2018-10-13 |
URL | http://arxiv.org/abs/1810.10324v2 |
PDF | http://arxiv.org/pdf/1810.10324v2.pdf |
PWC | https://paperswithcode.com/paper/multi-scale-geometric-summaries-for |
Repo | |
Framework | |
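The first step in the abstract above, representing each modality as a self-similarity matrix (SSM), can be sketched as follows. This minimal version assumes Euclidean distance between time-ordered feature vectors; the abstract does not name the metric, and the SNF and scattering-transform stages are not shown. The function name and toy features are hypothetical.

```python
import math


def self_similarity_matrix(points):
    """Pairwise Euclidean distances between time-ordered feature vectors.

    points[t] is the feature vector at time step t. The result is a
    T x T matrix regardless of the feature dimensionality.
    """
    return [[math.dist(p, q) for q in points] for p in points]


# Three time steps seen by two modalities with different dimensionality.
audio_features = [(0.0,), (1.0,), (5.0,)]                        # 1-D
video_features = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 8.0, 0.0)]  # 3-D

audio_ssm = self_similarity_matrix(audio_features)
video_ssm = self_similarity_matrix(video_features)
print(audio_ssm)
print(video_ssm)
```

Both matrices are 3 x 3 even though the feature spaces differ in dimension and scale, which is what makes the SSMs comparable inputs for a later fusion step.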
Osteoarthritis Disease Detection System using Self Organizing Maps Method based on Ossa Manus X-Ray
Title | Osteoarthritis Disease Detection System using Self Organizing Maps Method based on Ossa Manus X-Ray |
Authors | Putri Kurniasih, Dian Pratiwi |
Abstract | Osteoarthritis is a disease found throughout the world, including in Indonesia. The purpose of this study was to detect osteoarthritis using Self Organizing Maps (SOM) and to understand the procedure this artificial intelligence method follows. The system proceeds in several stages. X-ray images of the Ossa Manus, both normal and diseased, at a resolution of 150 x 200 pixels, undergo contrast enhancement, grayscale conversion, thresholding, and histogram processing. The final stage is training and testing on the images, which are stored as text data (.text). The data set comprises 42 images, of which 12 are normal and 30 are diseased. In training, 8 normal X-ray images and 19 diseased X-ray images were classified correctly, giving a training accuracy of 96.42%. In testing, 4 normal images were correctly identified as normal, 9 diseased images were correctly identified as diseased, and 1 diseased image was classified incorrectly, giving a testing accuracy of 92.8%. |
Tasks | |
Published | 2018-02-19 |
URL | http://arxiv.org/abs/1802.06624v1 |
PDF | http://arxiv.org/pdf/1802.06624v1.pdf |
PWC | https://paperswithcode.com/paper/osteoarthritis-disease-detection-system-using |
Repo | |
Framework | |
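The accuracy figures reported in the abstract above can be checked with simple arithmetic. The test-set size of 14 follows from the counts given (4 + 9 + 1); the training-set size of 28 is an inference (42 images in total minus the 14 test images) and is not stated in the abstract.

```python
def accuracy_percent(correct, total):
    """Share of correctly classified images, as a percentage."""
    return 100.0 * correct / total


# Training: 8 normal + 19 diseased images classified correctly.
# Assumption: 28 training images (42 total minus 14 test images).
train_acc = accuracy_percent(8 + 19, 28)

# Testing: 4 normal + 9 diseased correct, 1 diseased incorrect.
test_acc = accuracy_percent(4 + 9, 4 + 9 + 1)

print(f"training accuracy: {train_acc:.2f}%")
print(f"testing accuracy:  {test_acc:.2f}%")
```

Under that assumption, both computed values agree with the reported 96.42% and 92.8% once the reported figures are read as truncated rather than rounded.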
A Local Regret in Nonconvex Online Learning
Title | A Local Regret in Nonconvex Online Learning |
Authors | Sergul Aydore, Lee Dicker, Dean Foster |
Abstract | We consider an online learning process to forecast a sequence of outcomes for nonconvex models. A typical measure to evaluate online learning algorithms is regret, but the standard definition of regret is intractable for nonconvex models even in offline settings. Hence, gradient-based definitions of regret are common for both offline and online nonconvex problems. Recently, a notion of local gradient-based regret was introduced. Inspired by the concept of calibration and a local gradient-based regret, we introduce another definition of regret and we discuss why our definition is more interpretable for forecasting problems. We also provide bound analysis for our regret under certain assumptions. |
Tasks | Calibration |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05095v2 |
PDF | http://arxiv.org/pdf/1811.05095v2.pdf |
PWC | https://paperswithcode.com/paper/a-local-regret-in-nonconvex-online-learning |
Repo | |
Framework | |