Paper Group ANR 710
Predicting Remaining Useful Life using Time Series Embeddings based on Recurrent Neural Networks
Title | Predicting Remaining Useful Life using Time Series Embeddings based on Recurrent Neural Networks |
Authors | Narendhar Gugulothu, Vishnu TV, Pankaj Malhotra, Lovekesh Vig, Puneet Agarwal, Gautam Shroff |
Abstract | We consider the problem of estimating the remaining useful life (RUL) of a system or a machine from sensor data. Many approaches for RUL estimation based on sensor data make assumptions about how machines degrade. Additionally, sensor data from machines is noisy and often suffers from missing values in many practical settings. We propose Embed-RUL: a novel approach for RUL estimation from sensor data that does not rely on any degradation-trend assumptions, is robust to noise, and handles missing values. Embed-RUL utilizes a sequence-to-sequence model based on Recurrent Neural Networks (RNNs) to generate embeddings for multivariate time series subsequences. The embeddings for normal and degraded machines tend to be different, and are therefore found to be useful for RUL estimation. We show that the embeddings capture the overall pattern in the time series while filtering out the noise, so that the embeddings of two machines with similar operational behavior are close to each other, even when their sensor readings have significant and varying levels of noise content. We perform experiments on the publicly available turbofan engine dataset and a proprietary real-world dataset, and demonstrate that Embed-RUL outperforms the previously reported state-of-the-art on several metrics. |
Tasks | Time Series |
Published | 2017-09-04 |
URL | http://arxiv.org/abs/1709.01073v2 |
PDF | http://arxiv.org/pdf/1709.01073v2.pdf |
PWC | https://paperswithcode.com/paper/predicting-remaining-useful-life-using-time |
Repo | |
Framework | |
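The core idea lends itself to a compact illustration: encode multivariate sensor windows with an RNN and compare test embeddings against embeddings of known-healthy windows. The sketch below is an assumption-laden simplification (a GRU encoder, cosine similarity as the health index, random toy data), not the authors' released implementation; in the paper the encoder is trained as an RNN sequence-to-sequence autoencoder and RUL is derived from a health-index curve.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RNNEncoder(nn.Module):
    """GRU encoder mapping a multivariate window to a fixed-length embedding."""
    def __init__(self, n_sensors, emb_dim=64):
        super().__init__()
        self.rnn = nn.GRU(n_sensors, emb_dim, batch_first=True)

    def forward(self, x):            # x: (batch, time, n_sensors)
        _, h = self.rnn(x)           # h: (1, batch, emb_dim)
        return h.squeeze(0)          # final hidden state as the embedding

def health_index(test_windows, normal_windows, encoder):
    """Similarity of test embeddings to known-normal embeddings (higher = healthier)."""
    with torch.no_grad():
        z_test = F.normalize(encoder(test_windows), dim=1)
        z_norm = F.normalize(encoder(normal_windows), dim=1)
    sim = z_test @ z_norm.t()                      # cosine similarities
    return (sim.max(dim=1).values + 1) / 2         # mapped to [0, 1]

# Toy usage: 24 sensors, windows of 30 cycles; the untrained encoder only
# illustrates the interface, not the trained Embed-RUL model.
encoder = RNNEncoder(n_sensors=24)
normal = torch.randn(50, 30, 24)    # windows from healthy operation
test = torch.randn(8, 30, 24)       # windows from a monitored engine
print(health_index(test, normal, encoder))
```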
Path Planning with Kinematic Constraints for Robot Groups
Title | Path Planning with Kinematic Constraints for Robot Groups |
Authors | Wolfgang Hönig, T. K. Satish Kumar, Liron Cohen, Hang Ma, Sven Koenig, Nora Ayanian |
Abstract | Path planning for multiple robots is well studied in the AI and robotics communities. For a given discretized environment, robots need to find collision-free paths to a set of specified goal locations. Robots can be fully anonymous, non-anonymous, or organized in groups. Although powerful solvers for this abstract problem exist, they make simplifying assumptions by ignoring kinematic constraints, making it difficult to use the resulting plans on actual robots. In this paper, we present a solution which takes kinematic constraints, such as maximum velocities, into account, while guaranteeing a user-specified minimum safety distance between robots. We demonstrate our approach in simulation and on real robots in 2D and 3D environments. |
Tasks | |
Published | 2017-04-25 |
URL | http://arxiv.org/abs/1704.07538v1 |
PDF | http://arxiv.org/pdf/1704.07538v1.pdf |
PWC | https://paperswithcode.com/paper/path-planning-with-kinematic-constraints-for |
Repo | |
Framework | |
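As a minimal illustration of one kinematic constraint mentioned in the abstract, the sketch below assigns arrival times to a discrete path so that no segment exceeds a maximum velocity. It is an assumed single-robot simplification (the function name and interface are made up); the paper's method additionally guarantees a user-specified minimum safety distance between robots, which requires coordinating the schedules of all robots rather than timing each path independently.

```python
import numpy as np

def time_parameterize(waypoints, v_max):
    """Assign arrival times to a sequence of 2D/3D waypoints so that the
    average speed on every segment stays at or below v_max."""
    waypoints = np.asarray(waypoints, dtype=float)
    seg_len = np.linalg.norm(np.diff(waypoints, axis=0), axis=1)
    dt = seg_len / v_max                    # minimum feasible segment durations
    return np.concatenate(([0.0], np.cumsum(dt)))

# Toy usage: a grid path from a discrete planner, robot limited to 0.5 m/s
path = [(0, 0), (1, 0), (1, 1), (2, 1)]
print(time_parameterize(path, v_max=0.5))   # arrival time at each waypoint
```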
Deep Sparse Coding Using Optimized Linear Expansion of Thresholds
Title | Deep Sparse Coding Using Optimized Linear Expansion of Thresholds |
Authors | Debabrata Mahapatra, Subhadip Mukherjee, Chandra Sekhar Seelamantula |
Abstract | We address the problem of reconstructing sparse signals from noisy and compressive measurements using a feed-forward deep neural network (DNN) with an architecture motivated by the iterative shrinkage-thresholding algorithm (ISTA). We maintain the weights and biases of the network links as prescribed by ISTA and model the nonlinear activation function using a linear expansion of thresholds (LET), which has been very successful in image denoising and deconvolution. The optimal set of coefficients of the parametrized activation is learned over a training dataset containing measurement-sparse signal pairs, corresponding to a fixed sensing matrix. For training, we develop an efficient second-order algorithm, which requires only matrix-vector product computations in every training epoch (Hessian-free optimization) and offers superior convergence performance compared with gradient-descent optimization. Subsequently, we derive an improved network architecture inspired by FISTA, a faster version of ISTA, to achieve similar signal estimation performance with about 50% of the number of layers. The resulting architecture turns out to be a deep residual network, which has recently been shown to exhibit superior performance in several visual recognition tasks. Numerical experiments demonstrate that the proposed DNN architectures lead to 3 to 4 dB improvement in the reconstruction signal-to-noise ratio (SNR), compared with the state-of-the-art sparse coding algorithms. |
Tasks | Denoising, Image Denoising |
Published | 2017-05-20 |
URL | http://arxiv.org/abs/1705.07290v1 |
PDF | http://arxiv.org/pdf/1705.07290v1.pdf |
PWC | https://paperswithcode.com/paper/deep-sparse-coding-using-optimized-linear |
Repo | |
Framework | |
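To make the unrolling concrete, here is a hedged sketch of ISTA unfolded into a fixed number of layers, with the usual soft threshold replaced by a linear expansion of thresholds (a weighted sum of elementary kernels). The kernel family, coefficient values, and depth are illustrative assumptions; in the paper the LET coefficients are learned from measurement-signal training pairs with a Hessian-free second-order method, and a FISTA-style variant roughly halves the depth.

```python
import numpy as np

def let_activation(x, coeffs, tau=1.0):
    """Linear expansion of thresholds: a pointwise nonlinearity built as a
    weighted sum of kernels phi_k(x) = x * exp(-k * x^2 / (2 tau^2))."""
    return sum(c * x * np.exp(-k * x**2 / (2.0 * tau**2))
               for k, c in enumerate(coeffs))

def unrolled_ista(y, A, coeffs, n_layers=10):
    """Feed-forward network with ISTA-prescribed weights and a LET activation."""
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):
        x = let_activation(x + A.T @ (y - A @ x) / L, coeffs)
    return x

# Toy usage: 20x50 Gaussian sensing matrix, hand-picked (not learned) LET coefficients
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50)) / np.sqrt(20)
x_true = np.zeros(50); x_true[[3, 17, 41]] = [1.0, -0.8, 0.5]
y = A @ x_true + 0.01 * rng.standard_normal(20)
x_hat = unrolled_ista(y, A, coeffs=[0.1, 0.8, 0.1])
```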
A multiobjective deep learning approach for predictive classification in Neuroblastoma
Title | A multiobjective deep learning approach for predictive classification in Neuroblastoma |
Authors | Valerio Maggio, Marco Chierici, Giuseppe Jurman, Cesare Furlanello |
Abstract | Neuroblastoma is a strongly heterogeneous cancer with very diverse clinical courses that may vary from spontaneous regression to fatal progression; accurate patient risk estimation at diagnosis is essential to design appropriate tumor treatment strategies. Neuroblastoma is a paradigm disease where different diagnostic and prognostic endpoints should be predicted from common molecular and clinical information, with increasing complexity, as shown in the FDA MAQC-II study. Here we introduce the novel multiobjective deep learning architecture CDRP (Concatenated Diagnostic Relapse Prognostic), composed of 8 layers, to obtain a combined diagnostic and prognostic prediction from high-throughput transcriptomics data. Two distinct loss functions are optimized for the Event Free Survival (EFS) and Overall Survival (OS) prognosis, respectively. We use the High-Risk (HR) diagnostic information as an additional input generated by an autoencoder embedding. The latter is used as a network regulariser, based on a clinical algorithm commonly adopted for stratifying patients from cancer stage, age at disease onset, and the MYCN molecular marker. The architecture was applied to Illumina HiSeq2000 RNA-Seq data for 498 neuroblastoma patients (176 at high risk) from the Sequencing Quality Control (SEQC) study, obtaining state-of-the-art results on the diagnostic endpoint and improving prediction of prognosis over the HR cohort. |
Tasks | |
Published | 2017-11-22 |
URL | http://arxiv.org/abs/1711.08198v3 |
PDF | http://arxiv.org/pdf/1711.08198v3.pdf |
PWC | https://paperswithcode.com/paper/a-multiobjective-deep-learning-approach-for |
Repo | |
Framework | |
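A hedged PyTorch sketch of the multiobjective setup described above: a shared trunk over the expression features plus a high-risk (HR) code, with separate EFS and OS heads whose two losses are optimized jointly. Layer sizes, the HR-code dimensionality, and the unweighted sum of the two losses are assumptions; in the paper the HR input comes from an autoencoder embedding of the clinical risk stratification.

```python
import torch
import torch.nn as nn

class CDRPSketch(nn.Module):
    """Shared trunk over expression features + HR code, with EFS and OS heads."""
    def __init__(self, n_genes, hr_dim=8, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(n_genes + hr_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.efs_head = nn.Linear(hidden, 1)    # event-free survival endpoint
        self.os_head = nn.Linear(hidden, 1)     # overall survival endpoint

    def forward(self, expr, hr_code):
        h = self.trunk(torch.cat([expr, hr_code], dim=1))
        return self.efs_head(h), self.os_head(h)

# Toy usage with random data; in practice hr_code would be the autoencoder embedding.
model = CDRPSketch(n_genes=2000)
bce = nn.BCEWithLogitsLoss()
expr, hr_code = torch.randn(16, 2000), torch.randn(16, 8)
efs_y = torch.randint(0, 2, (16, 1)).float()
os_y = torch.randint(0, 2, (16, 1)).float()
efs_logit, os_logit = model(expr, hr_code)
loss = bce(efs_logit, efs_y) + bce(os_logit, os_y)   # two losses optimized jointly
loss.backward()
```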
Learning the structure of Bayesian Networks: A quantitative assessment of the effect of different algorithmic schemes
Title | Learning the structure of Bayesian Networks: A quantitative assessment of the effect of different algorithmic schemes |
Authors | Stefano Beretta, Mauro Castelli, Ivo Goncalves, Roberto Henriques, Daniele Ramazzotti |
Abstract | One of the most challenging tasks when adopting Bayesian Networks (BNs) is learning their structure from data. This task is complicated by the huge search space of possible solutions and by the fact that the problem is NP-hard. Hence, full enumeration of all the possible solutions is not always feasible and approximations are often required. However, to the best of our knowledge, a quantitative analysis of the performance and characteristics of the different heuristics to solve this problem has never been done before. For this reason, in this work, we provide a detailed comparison of many different state-of-the-art methods for structural learning on simulated data, considering BNs with both discrete and continuous variables and with different rates of noise in the data. In particular, we investigate the performance of different widespread scores and algorithmic approaches proposed for the inference, and the statistical pitfalls within them. |
Tasks | |
Published | 2017-04-27 |
URL | http://arxiv.org/abs/1704.08676v2 |
PDF | http://arxiv.org/pdf/1704.08676v2.pdf |
PWC | https://paperswithcode.com/paper/learning-the-structure-of-bayesian-networks-a |
Repo | |
Framework | |
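Score-based structure learning, one family of heuristics compared in the paper, hinges on a decomposable network score. Below is a minimal local BIC score for discrete data (an illustrative choice, not the paper's full benchmark); a greedy hill climber would add, delete, or reverse edges to maximize the sum of these local scores while keeping the graph acyclic.

```python
import numpy as np
import pandas as pd

def local_bic(df, child, parents):
    """BIC contribution of one node given a candidate parent set (discrete data)."""
    n, r = len(df), df[child].nunique()
    if parents:
        groups = [g for _, g in df.groupby(list(parents))[child]]
    else:
        groups = [df[child]]
    ll = 0.0
    for g in groups:                        # log-likelihood per parent configuration
        counts = g.value_counts().to_numpy(dtype=float)
        ll += (counts * np.log(counts / counts.sum())).sum()
    n_params = len(groups) * (r - 1)        # free parameters in the conditional table
    return ll - 0.5 * np.log(n) * n_params

# Toy usage: does B look like a parent of C?
rng = np.random.default_rng(0)
b = rng.integers(0, 2, 500)
c = (b ^ (rng.random(500) < 0.1)).astype(int)     # C is a noisy copy of B
df = pd.DataFrame({"B": b, "C": c})
print(local_bic(df, "C", []), local_bic(df, "C", ["B"]))   # higher is better
```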
Incremental Boosting Convolutional Neural Network for Facial Action Unit Recognition
Title | Incremental Boosting Convolutional Neural Network for Facial Action Unit Recognition |
Authors | Shizhong Han, Zibo Meng, Ahmed Shehab Khan, Yan Tong |
Abstract | Recognizing facial action units (AUs) from spontaneous facial expressions is still a challenging problem. Most recently, CNNs have shown promise on facial AU recognition. However, the learned CNNs are often overfitted and do not generalize well to unseen subjects due to limited AU-coded training images. We propose a novel Incremental Boosting CNN (IB-CNN) to integrate boosting into the CNN via an incremental boosting layer that selects discriminative neurons from the lower layer and is incrementally updated on successive mini-batches. In addition, a novel loss function that accounts for errors from both the incremental boosted classifier and individual weak classifiers is proposed to fine-tune the IB-CNN. Experimental results on four benchmark AU databases have demonstrated that the IB-CNN yields significant improvement over the traditional CNN and the boosting CNN without incremental learning, as well as outperforming the state-of-the-art CNN-based methods in AU recognition. The improvement is more impressive for the AUs that have the lowest frequencies in the databases. |
Tasks | Facial Action Unit Detection |
Published | 2017-07-17 |
URL | http://arxiv.org/abs/1707.05395v1 |
PDF | http://arxiv.org/pdf/1707.05395v1.pdf |
PWC | https://paperswithcode.com/paper/incremental-boosting-convolutional-neural |
Repo | |
Framework | |
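The incremental boosting layer's bookkeeping can be caricatured outside of a CNN: on each mini-batch, score the incoming neurons as weak classifiers, select the most discriminative ones, and blend their weights into a running ensemble. The scoring rule (absolute correlation), the blending rate, and the NumPy setting are all assumptions for illustration; the actual IB-CNN performs the selection and weighting inside the network and fine-tunes it with a loss that combines ensemble and weak-classifier errors.

```python
import numpy as np

def incremental_boosting_step(ensemble_w, activations, labels, lr=0.1, top_k=10):
    """One mini-batch update of the boosted-neuron weights.
    activations: (batch, n_neurons) outputs feeding the boosting layer
    labels:      (batch,) in {-1, +1}
    ensemble_w:  (n_neurons,) running ensemble weights, or None on the first batch."""
    # score each neuron as a weak classifier by |correlation| with the labels
    scores = np.abs(np.corrcoef(activations.T, labels)[-1, :-1])
    selected = np.argsort(scores)[-top_k:]            # most discriminative neurons
    batch_w = np.zeros(activations.shape[1])
    batch_w[selected] = scores[selected] / scores[selected].sum()
    if ensemble_w is None:
        return batch_w
    return (1 - lr) * ensemble_w + lr * batch_w       # incremental ensemble update

# Toy usage over two mini-batches of random activations
rng = np.random.default_rng(0)
w = None
for _ in range(2):
    acts = rng.standard_normal((32, 64))
    y = np.sign(rng.standard_normal(32))
    w = incremental_boosting_step(w, acts, y)
print(w.round(3))
```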
Sentence Correction Based on Large-scale Language Modelling
Title | Sentence Correction Based on Large-scale Language Modelling |
Authors | Ji Wen |
Abstract | With the further development of informatization, more and more data is stored in the form of text, and some of this text is lost during generation and transmission. This paper aims to establish a language model based on a large-scale corpus to restore the missing text. We introduce a novel measure for finding the missing words and a way of building a comprehensive candidate lexicon from which the correct words are chosen for insertion. The paper also introduces several effective optimization methods, which greatly improve the efficiency of the text restoration and reduce the time needed to process 1000 sentences to 3.6 seconds. Keywords: language model, sentence correction, word imputation, parallel optimization |
Tasks | Imputation, Language Modelling |
Published | 2017-09-22 |
URL | http://arxiv.org/abs/1709.07777v2 |
PDF | http://arxiv.org/pdf/1709.07777v2.pdf |
PWC | https://paperswithcode.com/paper/sentence-correction-based-on-large-scale |
Repo | |
Framework | |
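To ground the word-imputation idea, here is a small hedged sketch: an add-one-smoothed bigram language model scores every (position, candidate word) insertion and keeps the highest-scoring sentence. The toy corpus, the candidate lexicon, and the restriction to interior insertion positions are assumptions; the paper builds its model from a large-scale corpus and adds parallel optimizations for speed.

```python
import math
from collections import Counter

corpus = ("the cat sat on the mat . the cat sat on the sofa . "
          "the dog ran in the park .").split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)
V = len(unigrams)

def log_prob(tokens):
    """Add-one-smoothed bigram log-probability of a token sequence."""
    return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + V))
               for a, b in zip(tokens, tokens[1:]))

def restore(tokens, candidates):
    """Try every candidate word at every interior position; keep the best sentence."""
    _, i, w = max((log_prob(tokens[:i] + [w] + tokens[i:]), i, w)
                  for i in range(1, len(tokens)) for w in candidates)
    return tokens[:i] + [w] + tokens[i:]

print(restore("the sat on the mat".split(), candidates=["cat", "dog", "sofa"]))
# -> ['the', 'cat', 'sat', 'on', 'the', 'mat']
```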
Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator
Title | Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator |
Authors | Stephen Tu, Benjamin Recht |
Abstract | Reinforcement learning (RL) has been successfully used to solve many continuous control tasks. Despite its impressive results, however, fundamental questions regarding the sample complexity of RL on continuous problems remain open. We study the performance of RL in this setting by considering the behavior of the Least-Squares Temporal Difference (LSTD) estimator on the classic Linear Quadratic Regulator (LQR) problem from optimal control. We give the first finite-time analysis of the number of samples needed to estimate the value function for a fixed static state-feedback policy to within $\varepsilon$-relative error. In the process of deriving our result, we give a general characterization for when the minimum eigenvalue of the empirical covariance matrix formed along the sample path of a fast-mixing stochastic process concentrates above zero, extending a result by Koltchinskii and Mendelson in the independent covariates setting. Finally, we provide experimental evidence indicating that our analysis correctly captures the qualitative behavior of LSTD on several LQR instances. |
Tasks | Continuous Control |
Published | 2017-12-22 |
URL | http://arxiv.org/abs/1712.08642v1 |
PDF | http://arxiv.org/pdf/1712.08642v1.pdf |
PWC | https://paperswithcode.com/paper/least-squares-temporal-difference-learning |
Repo | |
Framework | |
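The estimator under study is easy to state concretely. The sketch below runs plain LSTD with quadratic-plus-bias features on a toy stable closed-loop system and compares the recovered quadratic term to the true value-function matrix; the instance, the feature choice, and the trajectory length are assumptions, and none of the paper's finite-time analysis is reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, gamma, T = 3, 0.95, 20000

# Toy closed-loop LQR instance: x_{t+1} = A x_t + w_t, per-step cost x^T Q x
A = 0.8 * np.eye(n) + 0.05 * rng.standard_normal((n, n))
Q = np.eye(n)

def features(x):
    """Quadratic features plus a bias term, so V(x) is approx. <w, phi(x)>."""
    return np.concatenate([np.outer(x, x).ravel(), [1.0]])

# Roll out a single trajectory under the fixed policy
X = [rng.standard_normal(n)]
for _ in range(T):
    X.append(A @ X[-1] + 0.1 * rng.standard_normal(n))

# LSTD normal equations: sum phi_t (phi_t - gamma phi_{t+1})^T w = sum phi_t c_t
M = np.zeros((n * n + 1, n * n + 1))
b = np.zeros(n * n + 1)
for t in range(T):
    phi, phi_next = features(X[t]), features(X[t + 1])
    M += np.outer(phi, phi - gamma * phi_next)
    b += phi * (X[t] @ Q @ X[t])
w = np.linalg.lstsq(M, b, rcond=None)[0]

P_hat = w[:-1].reshape(n, n)          # estimated quadratic part of the value function
P_true = Q.copy()                     # true P solves P = Q + gamma * A^T P A
for _ in range(500):
    P_true = Q + gamma * A.T @ P_true @ A
print(np.linalg.norm(P_hat - P_true) / np.linalg.norm(P_true))   # small for long trajectories
```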
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
Title | ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes |
Authors | Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner |
Abstract | A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets. Unfortunately, in the context of RGB-D scene understanding, very little data is available – current datasets cover a small range of scene views and have limited semantic annotations. To address this issue, we introduce ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations. To collect this data, we designed an easy-to-use and scalable RGB-D capture system that includes automated surface reconstruction and crowdsourced semantic annotation. We show that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks, including 3D object classification, semantic voxel labeling, and CAD model retrieval. The dataset is freely available at http://www.scan-net.org. |
Tasks | 3D Object Classification, Object Classification, Scene Understanding, Semantic Segmentation |
Published | 2017-02-14 |
URL | http://arxiv.org/abs/1702.04405v2 |
PDF | http://arxiv.org/pdf/1702.04405v2.pdf |
PWC | https://paperswithcode.com/paper/scannet-richly-annotated-3d-reconstructions |
Repo | |
Framework | |
Identifying Reference Spans: Topic Modeling and Word Embeddings help IR
Title | Identifying Reference Spans: Topic Modeling and Word Embeddings help IR |
Authors | Luis Moraes, Shahryar Baki, Rakesh Verma, Daniel Lee |
Abstract | The CL-SciSumm 2016 shared task introduced an interesting problem: given a document D and a piece of text that cites D, how do we identify the text spans of D being referenced by the piece of text? The shared task provided the first annotated dataset for studying this problem. We present an analysis of our continued work in improving our system’s performance on this task. We demonstrate how topic models and word embeddings can be used to surpass the previously best performing system. |
Tasks | Topic Models, Word Embeddings |
Published | 2017-08-09 |
URL | http://arxiv.org/abs/1708.02989v1 |
PDF | http://arxiv.org/pdf/1708.02989v1.pdf |
PWC | https://paperswithcode.com/paper/identifying-reference-spans-topic-modeling |
Repo | |
Framework | |
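The first stage of such a system can be sketched with nothing more than TF-IDF cosine similarity between the citing sentence and each candidate reference sentence, which is roughly the lexical baseline that the topic models and word embeddings are meant to improve upon. The sentences, the vectorizer settings, and the single-best-span output below are illustrative assumptions, not the shared-task data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Candidate sentences from the referenced document D (toy examples)
reference_sentences = [
    "We propose a neural model for summarizing scientific articles.",
    "The dataset contains annotated citation spans for each citing paper.",
    "Our evaluation uses ROUGE scores against human-written summaries.",
]
citance = "They evaluate summaries with ROUGE against human references."

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(reference_sentences)     # TF-IDF vectors of candidate spans
q = vec.transform([citance])                   # vector of the citing sentence
scores = cosine_similarity(q, X).ravel()
best = scores.argmax()
print(best, reference_sentences[best])         # most similar span as the predicted reference
```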
Weighted parallel SGD for distributed unbalanced-workload training system
Title | Weighted parallel SGD for distributed unbalanced-workload training system |
Authors | Cheng Daning, Li Shigang, Zhang Yunquan |
Abstract | Stochastic gradient descent (SGD) is a popular stochastic optimization method in machine learning. Traditional parallel SGD algorithms, e.g., SimuParallel SGD, often require all nodes to have the same performance or to consume equal quantities of data. However, these requirements are difficult to satisfy when the parallel SGD algorithms run in a heterogeneous computing environment; low-performance nodes will exert a negative influence on the final result. In this paper, we propose an algorithm called weighted parallel SGD (WP-SGD). WP-SGD combines weighted model parameters from different nodes in the system to produce the final output. WP-SGD makes use of the reduction in standard deviation to compensate for the loss from the inconsistency in performance of nodes in the cluster, which means that WP-SGD does not require that all nodes consume equal quantities of data. We also analyze the theoretical feasibility of running two other parallel SGD algorithms combined with WP-SGD in a heterogeneous environment. The experimental results show that WP-SGD significantly outperforms the traditional parallel SGD algorithms on distributed training systems with an unbalanced workload. |
Tasks | Stochastic Optimization |
Published | 2017-08-16 |
URL | http://arxiv.org/abs/1708.04801v1 |
PDF | http://arxiv.org/pdf/1708.04801v1.pdf |
PWC | https://paperswithcode.com/paper/weighted-parallel-sgd-for-distributed |
Repo | |
Framework | |
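The final combination step is simple to illustrate: each node contributes its locally trained parameters, and the outputs are merged with node-specific weights instead of a plain average. In the sketch below the weights are simply proportional to how many samples each node processed, which is an assumption for illustration; WP-SGD derives its weights from its convergence analysis to offset slow, data-starved nodes.

```python
import numpy as np

def weighted_parallel_average(node_params, node_samples):
    """Combine per-node parameter vectors with weights reflecting how much
    data each node actually processed (assumed proportional weighting;
    WP-SGD obtains its weights from its theoretical analysis)."""
    w = np.asarray(node_samples, dtype=float)
    w /= w.sum()
    return sum(wi * p for wi, p in zip(w, node_params))

# Toy usage: three heterogeneous nodes, the slow one processed far less data
params = [np.array([1.0, 2.0]), np.array([1.1, 1.9]), np.array([3.0, 0.0])]
print(weighted_parallel_average(params, node_samples=[10000, 9000, 500]))
```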
Acting Thoughts: Towards a Mobile Robotic Service Assistant for Users with Limited Communication Skills
Title | Acting Thoughts: Towards a Mobile Robotic Service Assistant for Users with Limited Communication Skills |
Authors | Felix Burget, Lukas Dominique Josef Fiederer, Daniel Kuhner, Martin Völker, Johannes Aldinger, Robin Tibor Schirrmeister, Chau Do, Joschka Boedecker, Bernhard Nebel, Tonio Ball, Wolfram Burgard |
Abstract | As autonomous service robots become more affordable and thus available also for the general public, there is a growing need for user-friendly interfaces to control the robotic system. Currently available control modalities typically expect users to be able to express their desire through either touch, speech or gesture commands. While this requirement is fulfilled for the majority of users, paralyzed users may not be able to use such systems. In this paper, we present a novel framework that allows these users to interact with a robotic service assistant in a closed-loop fashion, using only thoughts. The brain-computer interface (BCI) system is composed of several interacting components, i.e., non-invasive neuronal signal recording and decoding, high-level task planning, motion and manipulation planning, as well as environment perception. In various experiments, we demonstrate its applicability and robustness in real-world scenarios, considering fetch-and-carry tasks and tasks involving human-robot interaction. As our results demonstrate, our system is capable of adapting to frequent changes in the environment and reliably completing given tasks within a reasonable amount of time. Combined with high-level planning and autonomous robotic systems, interesting new perspectives open up for non-invasive BCI-based human-robot interactions. |
Tasks | |
Published | 2017-07-20 |
URL | http://arxiv.org/abs/1707.06633v4 |
PDF | http://arxiv.org/pdf/1707.06633v4.pdf |
PWC | https://paperswithcode.com/paper/acting-thoughts-towards-a-mobile-robotic |
Repo | |
Framework | |
Improving Legal Information Retrieval by Distributional Composition with Term Order Probabilities
Title | Improving Legal Information Retrieval by Distributional Composition with Term Order Probabilities |
Authors | Danilo S. Carvalho, Duc-Vu Tran, Van-Khanh Tran, Le-Nguyen Minh |
Abstract | Legal professionals worldwide are currently trying to keep pace with the explosive growth in legal document availability through digital means. This drives a need for highly efficient Legal Information Retrieval (IR) and Question Answering (QA) methods. The IR task in particular has a set of unique challenges that invite the use of semantically motivated NLP techniques. In this work, a two-stage method for Legal Information Retrieval is proposed, combining lexical statistics and distributional sentence representations in the context of the Competition on Legal Information Extraction/Entailment (COLIEE). The combination is done with the use of disambiguation rules, applied over the rankings obtained through n-gram statistics. After the ranking is done, its results are evaluated for ambiguity, and disambiguation is applied if a result is deemed unreliable for a given query. Competition and experimental results indicate small gains in overall retrieval performance using the proposed approach. Additionally, an analysis of error and improvement cases is presented for a better understanding of the contributions. |
Tasks | Information Retrieval, Question Answering |
Published | 2017-06-04 |
URL | http://arxiv.org/abs/1706.01038v2 |
PDF | http://arxiv.org/pdf/1706.01038v2.pdf |
PWC | https://paperswithcode.com/paper/improving-legal-information-retrieval-by |
Repo | |
Framework | |
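The two-stage control flow can be sketched independently of the actual features: rank articles by an n-gram overlap score, and when the top results are too close to call, apply a disambiguation rule that re-ranks them with a second score. The scores, the ambiguity margin, and the toy articles below are assumptions; the paper uses lexical statistics combined with distributional sentence representations on the COLIEE corpus.

```python
def ngram_score(query, doc, n=1):
    """Overlap of word n-grams between a query and a candidate article."""
    grams = lambda toks: set(zip(*(toks[i:] for i in range(n))))
    q, d = grams(query.lower().split()), grams(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def retrieve(query, articles, margin=0.1):
    """Stage 1: unigram ranking. Stage 2: if the top results are too close
    (ambiguous), re-rank them with a bigram score as a disambiguation rule."""
    ranked = sorted(articles, key=lambda a: ngram_score(query, a), reverse=True)
    top = [a for a in ranked
           if ngram_score(query, ranked[0]) - ngram_score(query, a) < margin]
    if len(top) > 1:
        top.sort(key=lambda a: ngram_score(query, a, n=2), reverse=True)
    return top[0]

articles = [
    "A contract is formed when an offer is accepted and consideration is given.",
    "A contract may be rescinded when consent was obtained by fraud.",
]
print(retrieve("When is a contract formed by acceptance of an offer?", articles))
```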
Estimating Quality in Multi-Objective Bandits Optimization
Title | Estimating Quality in Multi-Objective Bandits Optimization |
Authors | Audrey Durand, Christian Gagné |
Abstract | Many real-world applications are characterized by a number of conflicting performance measures. As optimizing in a multi-objective setting leads to a set of non-dominated solutions, a preference function is required for selecting the solution with the appropriate trade-off between the objectives. The question is: how good do estimations of these objectives have to be in order for the solution maximizing the preference function to remain unchanged? In this paper, we introduce the concept of preference radius to characterize the robustness of the preference function and provide guidelines for controlling the quality of estimations in the multi-objective setting. More specifically, we provide a general formulation of multi-objective optimization under the bandits setting. We show how the preference radius relates to the optimal gap and we use this concept to provide a theoretical analysis of the Thompson sampling algorithm from multivariate normal priors. We finally present experiments to support the theoretical results and highlight the fact that one cannot simply scalarize multi-objective problems into single-objective problems. |
Tasks | |
Published | 2017-01-04 |
URL | http://arxiv.org/abs/1701.01095v3 |
PDF | http://arxiv.org/pdf/1701.01095v3.pdf |
PWC | https://paperswithcode.com/paper/estimating-quality-in-multi-objective-bandits |
Repo | |
Framework | |
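A small hedged sketch of the bandit setting discussed above: each arm has a vector of objective means, Thompson sampling draws a mean vector per arm from a Gaussian posterior, and the arm maximizing a fixed preference function of the sampled means is pulled. The linear preference function, the independent unit-information Gaussian posteriors, and the horizon are illustrative assumptions, not the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_arms, n_obj, horizon = 4, 2, 2000
true_means = rng.random((n_arms, n_obj))             # unknown per-arm objective means
preference = lambda mu: 0.7 * mu[0] + 0.3 * mu[1]    # fixed scalarizing preference

post_mean = np.zeros((n_arms, n_obj))                # posterior means per arm and objective
counts = np.zeros(n_arms)

for _ in range(horizon):
    # sample a mean vector per arm from the (assumed) Gaussian posterior
    sampled = rng.normal(post_mean, 1.0 / np.sqrt(counts[:, None] + 1.0))
    arm = int(np.argmax([preference(m) for m in sampled]))
    reward = true_means[arm] + 0.1 * rng.standard_normal(n_obj)
    counts[arm] += 1
    post_mean[arm] += (reward - post_mean[arm]) / counts[arm]   # running average

best = int(np.argmax([preference(m) for m in true_means]))
print("pull counts:", counts, "preference-optimal arm:", best)
# pulls should increasingly concentrate on the preference-optimal arm as the horizon grows
```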
Efficient coordinate-wise leading eigenvector computation
Title | Efficient coordinate-wise leading eigenvector computation |
Authors | Jialei Wang, Weiran Wang, Dan Garber, Nathan Srebro |
Abstract | We develop and analyze efficient “coordinate-wise” methods for finding the leading eigenvector, where each step involves only a vector-vector product. We establish global convergence with overall runtime guarantees that are at least as good as those of Lanczos’s method and dominate it for slowly decaying spectra. Our methods are based on combining a shift-and-invert approach with coordinate-wise algorithms for linear regression. |
Tasks | |
Published | 2017-02-25 |
URL | http://arxiv.org/abs/1702.07834v1 |
PDF | http://arxiv.org/pdf/1702.07834v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-coordinate-wise-leading-eigenvector |
Repo | |
Framework | |
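The combination mentioned in the abstract can be sketched directly: run shift-and-invert power iteration, but solve each shifted linear system with coordinate-wise (Gauss-Seidel-style) updates so that every step touches only one coordinate and a single row of the matrix. The Gershgorin-bound shift, the iteration counts, and the dense toy matrix are assumptions; the paper's methods use considerably more refined shifts and coordinate schemes with runtime guarantees.

```python
import numpy as np

def leading_eigvec_coordinate(A, sigma, n_outer=200, n_sweeps=50):
    """Shift-and-invert power iteration where each solve of (sigma*I - A) w = v
    is done by coordinate descent (Gauss-Seidel), one coordinate at a time."""
    n = A.shape[0]
    M = sigma * np.eye(n) - A                 # positive definite when sigma > lambda_max
    v = np.random.default_rng(0).standard_normal(n)
    v /= np.linalg.norm(v)
    for _ in range(n_outer):
        w = v.copy()
        for _ in range(n_sweeps):
            for i in range(n):                # coordinate-wise update of w_i
                w[i] += (v[i] - M[i] @ w) / M[i, i]
        v = w / np.linalg.norm(w)
    return v

# Toy usage on a random symmetric matrix
rng = np.random.default_rng(1)
B = rng.standard_normal((30, 30))
A = (B + B.T) / 2
sigma = np.abs(A).sum(axis=1).max()           # Gershgorin bound, so sigma >= lambda_max
v = leading_eigvec_coordinate(A, sigma)
print(float(v @ A @ v))                       # Rayleigh quotient, close to the top eigenvalue
```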