Paper Group ANR 479
End-to-End Training of Hybrid CNN-CRF Models for Stereo
Title | End-to-End Training of Hybrid CNN-CRF Models for Stereo |
Authors | Patrick Knöbelreiter, Christian Reinbacher, Alexander Shekhovtsov, Thomas Pock |
Abstract | We propose a novel and principled hybrid CNN+CRF model for stereo estimation. Our model exploits the advantages of both convolutional neural networks (CNNs) and conditional random fields (CRFs) in a unified approach. The CNNs compute expressive features for matching and distinctive color edges, which in turn are used to compute the unary and binary costs of the CRF. For inference, we apply a recently proposed, highly parallel dual block descent algorithm, which needs only a small fixed number of iterations to compute a high-quality approximate minimizer. As the main contribution of the paper, we propose a theoretically sound method based on the structured output support vector machine (SSVM) to train the hybrid CNN+CRF model on large-scale data end-to-end. Our trained models perform very well despite the fact that we use shallow CNNs and do not apply any kind of post-processing to the final output of the CRF. We evaluate our combined models on challenging stereo benchmarks such as Middlebury 2014 and KITTI 2015 and also investigate the performance of each individual component. |
Tasks | |
Published | 2016-11-30 |
URL | http://arxiv.org/abs/1611.10229v2, http://arxiv.org/pdf/1611.10229v2.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-training-of-hybrid-cnn-crf-models |
Repo | |
Framework | |
Modular Multitask Reinforcement Learning with Policy Sketches
Title | Modular Multitask Reinforcement Learning with Policy Sketches |
Authors | Jacob Andreas, Dan Klein, Sergey Levine |
Abstract | We describe a framework for multitask deep reinforcement learning guided by policy sketches. Sketches annotate tasks with sequences of named subtasks, providing information about high-level structural relationships among tasks but not how to implement them—specifically not providing the detailed guidance used by much previous work on learning policy abstractions for RL (e.g. intermediate rewards, subtask completion signals, or intrinsic motivations). To learn from sketches, we present a model that associates every subtask with a modular subpolicy, and jointly maximizes reward over full task-specific policies by tying parameters across shared subpolicies. Optimization is accomplished via a decoupled actor–critic training objective that facilitates learning common behaviors from multiple dissimilar reward functions. We evaluate the effectiveness of our approach in three environments featuring both discrete and continuous control, and with sparse rewards that can be obtained only after completing a number of high-level subgoals. Experiments show that using our approach to learn policies guided by sketches gives better performance than existing techniques for learning task-specific or shared policies, while naturally inducing a library of interpretable primitive behaviors that can be recombined to rapidly adapt to new tasks. |
Tasks | Continuous Control |
Published | 2016-11-06 |
URL | http://arxiv.org/abs/1611.01796v2, http://arxiv.org/pdf/1611.01796v2.pdf |
PWC | https://paperswithcode.com/paper/modular-multitask-reinforcement-learning-with |
Repo | |
Framework | |
Contextual Relationship-based Activity Segmentation on an Event Stream in the IoT Environment with Multi-user Activities
Title | Contextual Relationship-based Activity Segmentation on an Event Stream in the IoT Environment with Multi-user Activities |
Authors | Minkyoung Cho, Younggi Kim, Younghee Lee |
Abstract | Human activity recognition in the IoT environment plays a central role in ambient assisted living, where human activities can be represented as a concatenated event stream generated from various smart objects. Each activity must be separated out of the concatenated event stream so that activity recognition can provide the services users may need. Accurately segmenting the entire stream at the precise boundary of each activity is therefore an indispensable, high-priority task for realizing activity recognition. Multiple human activities in an IoT environment generate varying event stream patterns, and the unpredictability of these patterns means that they include redundant or missing events. In dealing with this complex segmentation problem, we found that the dynamic and confusing patterns cause major problems due to inclusive event streams, redundant events, and shared events. To address these problems, we exploited the contextual relationships associated with the activity status, that is, whether an activity is ongoing or has terminated/started. To discover the intrinsic relationships between the events in a stream, we adapted the LSTM model for activity segmentation. The inferred boundaries were then revised by our validation algorithm to correct slightly shifted boundaries. Our experiments show a high accuracy of above 95% on our own testbed with various smart objects. This is superior to prior works, which do not even assume an environment with multi-user activities and whose accuracies are slightly above 80% in their test environments. It shows that our work is feasible enough to be applied in the IoT environment. |
Tasks | Activity Recognition, Human Activity Recognition |
Published | 2016-09-20 |
URL | http://arxiv.org/abs/1609.06024v1, http://arxiv.org/pdf/1609.06024v1.pdf |
PWC | https://paperswithcode.com/paper/contextual-relationship-based-activity |
Repo | |
Framework | |
Finding Significant Fourier Coefficients: Clarifications, Simplifications, Applications and Limitations
Title | Finding Significant Fourier Coefficients: Clarifications, Simplifications, Applications and Limitations |
Authors | Steven D. Galbraith, Joel Laity, Barak Shani |
Abstract | Ideas from Fourier analysis have been used in cryptography for the last three decades. Akavia, Goldwasser and Safra unified some of these ideas to give a complete algorithm that finds significant Fourier coefficients of functions on any finite abelian group. Their algorithm stimulated a lot of interest in the cryptography community, especially in the context of 'bit security'. This manuscript attempts to be a friendly and comprehensive guide to the tools and results in this field. The intended readership is cryptographers who have heard about these tools and seek an understanding of their mechanics and their usefulness and limitations. A compact overview of the algorithm is presented with emphasis on the ideas behind it. We show how these ideas can be extended to a 'modulus-switching' variant of the algorithm. We survey some applications of this algorithm, and explain that several results should be taken in the right context. In particular, we point out that some of the most important bit security problems are still open. Our original contributions include: a discussion of the limitations on the usefulness of these tools; an answer to an open question about the modular inversion hidden number problem. |
Tasks | |
Published | 2016-07-06 |
URL | http://arxiv.org/abs/1607.01842v4, http://arxiv.org/pdf/1607.01842v4.pdf |
PWC | https://paperswithcode.com/paper/finding-significant-fourier-coefficients |
Repo | |
Framework | |
Generating Visual Explanations
Title | Generating Visual Explanations |
Authors | Lisa Anne Hendricks, Zeynep Akata, Marcus Rohrbach, Jeff Donahue, Bernt Schiele, Trevor Darrell |
Abstract | Clearly explaining a rationale for a classification decision to an end-user can be as important as the decision itself. Existing approaches for deep visual recognition are generally opaque and do not output any justification text; contemporary vision-language models can describe image content but fail to take into account class-discriminative image aspects which justify visual predictions. We propose a new model that focuses on the discriminating properties of the visible object, jointly predicts a class label, and explains why the predicted label is appropriate for the image. We propose a novel loss function based on sampling and reinforcement learning that learns to generate sentences that realize a global sentence property, such as class specificity. Our results on a fine-grained bird species classification dataset show that our model is able to generate explanations which are not only consistent with an image but also more discriminative than descriptions produced by existing captioning methods. |
Tasks | |
Published | 2016-03-28 |
URL | http://arxiv.org/abs/1603.08507v1, http://arxiv.org/pdf/1603.08507v1.pdf |
PWC | https://paperswithcode.com/paper/generating-visual-explanations |
Repo | |
Framework | |
De-noising, Stabilizing and Completing 3D Reconstructions On-the-go using Plane Priors
Title | De-noising, Stabilizing and Completing 3D Reconstructions On-the-go using Plane Priors |
Authors | Maksym Dzitsiuk, Jürgen Sturm, Robert Maier, Lingni Ma, Daniel Cremers |
Abstract | Creating 3D maps on robots and other mobile devices has become a reality in recent years. Online 3D reconstruction enables many exciting applications in robotics and AR/VR gaming. However, the reconstructions are noisy and generally incomplete. Moreover, during online reconstruction, the surface changes with every newly integrated depth image, which poses a significant challenge for physics engines and path planning algorithms. This paper presents a novel, fast and robust method for obtaining and using information about planar surfaces, such as walls, floors, and ceilings, as a stage in 3D reconstruction based on Signed Distance Fields. Our algorithm recovers clean and accurate surfaces, reduces the movement of individual mesh vertices caused by noise during online reconstruction and fills in the occluded and unobserved regions. We implemented and evaluated two different strategies to generate plane candidates and two strategies for merging them. Our implementation is optimized to run in real-time on mobile devices such as the Tango tablet. In an extensive set of experiments, we validated that our approach works well in a large number of natural environments despite the presence of a significant amount of occlusion, clutter and noise, which occur frequently. We further show that plane fitting enables in many cases a meaningful semantic segmentation of real-world scenes. |
Tasks | 3D Reconstruction, Semantic Segmentation |
Published | 2016-09-27 |
URL | http://arxiv.org/abs/1609.08267v2, http://arxiv.org/pdf/1609.08267v2.pdf |
PWC | https://paperswithcode.com/paper/de-noising-stabilizing-and-completing-3d |
Repo | |
Framework | |
Automated Inference on Sociopsychological Impressions of Attractive Female Faces
Title | Automated Inference on Sociopsychological Impressions of Attractive Female Faces |
Authors | Xiaolin Wu, Xi Zhang, Chang Liu |
Abstract | This article is a sequel to our earlier work [25]. The main objective of our research is to explore the potential of supervised machine learning in face-induced social computing and cognition, riding on the momentum of the much-heralded successes of face processing, analysis and recognition on the tasks of biometric-based identification. We present a case study of automated statistical inference on sociopsychological perceptions of female faces controlled for race, attractiveness, age and nationality. Our empirical evidence points to the possibility of training machine learning algorithms, using example face images characterized by internet users, to predict perceptions of personality traits and demeanors. |
Tasks | |
Published | 2016-12-11 |
URL | http://arxiv.org/abs/1612.04158v2, http://arxiv.org/pdf/1612.04158v2.pdf |
PWC | https://paperswithcode.com/paper/automated-inference-on-sociopsychological |
Repo | |
Framework | |
Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections
Title | Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections |
Authors | Zakaria Mhammedi, Andrew Hellicar, Ashfaqur Rahman, James Bailey |
Abstract | The problem of learning long-term dependencies in sequences using Recurrent Neural Networks (RNNs) is still a major challenge. Recent methods have been suggested to solve this problem by constraining the transition matrix to be unitary during training, which ensures that its norm is equal to one and prevents exploding gradients. These methods either have limited expressiveness or scale poorly with the size of the network when compared with the simple RNN case, especially when using stochastic gradient descent with a small mini-batch size. Our contributions are as follows: we first show that constraining the transition matrix to be unitary is a special case of an orthogonal constraint. Then we present a new parametrisation of the transition matrix which allows efficient training of an RNN while ensuring that the matrix is always orthogonal. Our results show that the orthogonal constraint on the transition matrix applied through our parametrisation gives similar benefits to the unitary constraint, without the time complexity limitations. |
Tasks | |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00188v5, http://arxiv.org/pdf/1612.00188v5.pdf |
PWC | https://paperswithcode.com/paper/efficient-orthogonal-parametrisation-of |
Repo | |
Framework | |
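The entry above relies on the fact that a product of Householder reflections is an orthogonal matrix, so applying it to a hidden state preserves the state's norm. The following is a minimal sketch of that general property only, not the authors' parametrisation or training procedure; the function names `householder_apply`, `apply_reflections` and `norm` are hypothetical and chosen for illustration.

```python
import math


def householder_apply(u, h):
    """Apply the reflection H(u) = I - 2 u u^T / (u^T u) to the vector h.

    The matrix H(u) is never built explicitly, so the cost is linear in len(h).
    """
    uu = sum(x * x for x in u)
    uh = sum(x * y for x, y in zip(u, h))
    scale = 2.0 * uh / uu
    return [y - scale * x for x, y in zip(u, h)]


def apply_reflections(us, h):
    """Apply a product of Householder reflections, one per vector in us."""
    for u in us:
        h = householder_apply(u, h)
    return h


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


# Illustrative values (assumptions, not taken from the paper).
reflection_vectors = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]
state = [3.0, -4.0, 12.0]
new_state = apply_reflections(reflection_vectors, state)
print(norm(state), norm(new_state))
```

Because each reflection is its own inverse and preserves length, the two printed norms agree up to floating-point rounding, which is the property that keeps gradients from exploding through the transition matrix.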
Greedy Deep Dictionary Learning
Title | Greedy Deep Dictionary Learning |
Authors | Snigdha Tariyal, Angshul Majumdar, Richa Singh, Mayank Vatsa |
Abstract | In this work we propose a new deep learning tool called deep dictionary learning. Multi-level dictionaries are learnt in a greedy fashion, one layer at a time. This requires solving a simple (shallow) dictionary learning problem, the solution to which is well known. We apply the proposed technique on some benchmark deep learning datasets. We compare our results with other deep learning tools like stacked autoencoder and deep belief network, and with state-of-the-art supervised dictionary learning tools like discriminative KSVD and label consistent KSVD. Our method yields better results than all of them. |
Tasks | Dictionary Learning |
Published | 2016-01-31 |
URL | http://arxiv.org/abs/1602.00203v1, http://arxiv.org/pdf/1602.00203v1.pdf |
PWC | https://paperswithcode.com/paper/greedy-deep-dictionary-learning |
Repo | |
Framework | |
A flexible state space model for learning nonlinear dynamical systems
Title | A flexible state space model for learning nonlinear dynamical systems |
Authors | Andreas Svensson, Thomas B. Schön |
Abstract | We consider a nonlinear state-space model with the state transition and observation functions expressed as basis function expansions. The coefficients in the basis function expansions are learned from data. Using a connection to Gaussian processes, we also develop priors on the coefficients, for tuning the model flexibility and to prevent overfitting to data, akin to a Gaussian process state-space model. The priors can alternatively be seen as a regularization, and help the model generalize from the data without sacrificing the richness offered by the basis function expansion. To learn the coefficients and other unknown parameters efficiently, we tailor an algorithm using state-of-the-art sequential Monte Carlo methods, which comes with theoretical guarantees on the learning. Our approach indicates promising results when evaluated on a classical benchmark as well as real data. |
Tasks | Gaussian Processes |
Published | 2016-03-17 |
URL | http://arxiv.org/abs/1603.05486v2, http://arxiv.org/pdf/1603.05486v2.pdf |
PWC | https://paperswithcode.com/paper/a-flexible-state-space-model-for-learning |
Repo | |
Framework | |
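The entry above describes a state transition function written as a basis function expansion with learned coefficients. The following is a minimal sketch of what such a transition could look like for a scalar state. The sine basis, the interval half-width, and the names `basis`, `transition` and `simulate` are assumptions made for illustration; the abstract does not specify them, and the learning of the weights (priors, sequential Monte Carlo) is not shown.

```python
import math
import random


def basis(x, j, half_width):
    """j-th sine basis function on [-half_width, half_width] (assumed basis)."""
    return math.sin(math.pi * j * (x + half_width) / (2.0 * half_width))


def transition(x, weights, half_width=4.0):
    """State transition f(x) = sum_j w_j * phi_j(x), with j starting at 1."""
    return sum(w * basis(x, j, half_width)
               for j, w in enumerate(weights, start=1))


def simulate(x0, weights, steps, noise_std=0.0, seed=0):
    """Roll the model forward: x_{t+1} = f(x_t) + Gaussian process noise."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(steps):
        xs.append(transition(xs[-1], weights) + rng.gauss(0.0, noise_std))
    return xs


# Hypothetical weights; in the setting described above they would be learned.
trajectory = simulate(x0=0.0, weights=[2.0, 0.5, -0.3], steps=5, noise_std=0.1)
print(trajectory)
```

Adding more basis functions makes the transition more flexible, which is why the abstract pairs the expansion with priors on the coefficients that act as a regularizer.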
From virtual demonstration to real-world manipulation using LSTM and MDN
Title | From virtual demonstration to real-world manipulation using LSTM and MDN |
Authors | Rouhollah Rahmatizadeh, Pooya Abolghasemi, Aman Behal, Ladislau Bölöni |
Abstract | Robots assisting the disabled or elderly must perform complex manipulation tasks and must adapt to the home environment and preferences of their user. Learning from demonstration is a promising choice that would allow the non-technical user to teach the robot different tasks. However, collecting demonstrations in the home environment of a disabled user is time-consuming, disruptive to the comfort of the user, and presents safety challenges. It would be desirable to perform the demonstrations in a virtual environment. In this paper we describe a solution to the challenging problem of behavior transfer from virtual demonstration to a physical robot. The virtual demonstrations are used to train a deep neural network based controller, which uses a Long Short Term Memory (LSTM) recurrent neural network to generate trajectories. The training process uses a Mixture Density Network (MDN) to calculate an error signal suitable for the multimodal nature of demonstrations. The controller learned in the virtual environment is transferred to a physical robot (a Rethink Robotics Baxter). An off-the-shelf vision component is used to substitute for geometric knowledge available in the simulation and an inverse kinematics module is used to allow the Baxter to enact the trajectory. Our experimental studies validate the three contributions of the paper: (1) the controller learned from virtual demonstrations can be used to successfully perform the manipulation tasks on a physical robot, (2) the LSTM+MDN architectural choice outperforms other choices, such as the use of feedforward networks and mean-squared error based training signals, and (3) allowing imperfect demonstrations in the training set also allows the controller to learn how to correct its manipulation mistakes. |
Tasks | |
Published | 2016-03-12 |
URL | http://arxiv.org/abs/1603.03833v4, http://arxiv.org/pdf/1603.03833v4.pdf |
PWC | https://paperswithcode.com/paper/from-virtual-demonstration-to-real-world |
Repo | |
Framework | |
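The entry above uses a Mixture Density Network to produce an error signal suited to multimodal demonstrations. The following is a minimal sketch of the standard idea behind such a signal: the negative log-likelihood of a target under a Gaussian mixture. It is a one-dimensional illustration under assumed values, not the authors' network or loss; the function name `mdn_nll` and all numbers are hypothetical.

```python
import math


def mdn_nll(weights, means, stds, target):
    """Negative log-likelihood of a scalar target under a 1-D Gaussian mixture.

    weights: mixing coefficients (non-negative, summing to 1)
    means, stds: per-component mean and standard deviation
    """
    density = 0.0
    for w, m, s in zip(weights, means, stds):
        z = (target - m) / s
        density += w * math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))
    return -math.log(density)


# Two equally valid demonstrated actions at -1 and +1 (assumed example).
# A single Gaussian centred on their average explains neither well,
# while a two-component mixture can place mass on both modes.
unimodal = mdn_nll([1.0], [0.0], [1.0], target=1.0)
bimodal = mdn_nll([0.5, 0.5], [-1.0, 1.0], [0.2, 0.2], target=1.0)
print(unimodal, bimodal)
```

In this example the mixture assigns a lower loss to the demonstrated action than the single Gaussian does, which illustrates why a mean-squared-error style signal can average distinct demonstrations into an action nobody demonstrated.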
Learning Continuous Semantic Representations of Symbolic Expressions
Title | Learning Continuous Semantic Representations of Symbolic Expressions |
Authors | Miltiadis Allamanis, Pankajan Chanthirasegaran, Pushmeet Kohli, Charles Sutton |
Abstract | Combining abstract, symbolic reasoning with continuous neural reasoning is a grand challenge of representation learning. As a step in this direction, we propose a new architecture, called neural equivalence networks, for the problem of learning continuous semantic representations of algebraic and logical expressions. These networks are trained to represent semantic equivalence, even of expressions that are syntactically very different. The challenge is that semantic representations must be computed in a syntax-directed manner, because semantics is compositional, but at the same time, small changes in syntax can lead to very large changes in semantics, which can be difficult for continuous neural architectures. We perform an exhaustive evaluation on the task of checking equivalence on a highly diverse class of symbolic algebraic and boolean expression types, showing that our model significantly outperforms existing architectures. |
Tasks | Representation Learning |
Published | 2016-11-04 |
URL | http://arxiv.org/abs/1611.01423v2, http://arxiv.org/pdf/1611.01423v2.pdf |
PWC | https://paperswithcode.com/paper/learning-continuous-semantic-representations |
Repo | |
Framework | |
Identifying and Categorizing Anomalies in Retinal Imaging Data
Title | Identifying and Categorizing Anomalies in Retinal Imaging Data |
Authors | Philipp Seeböck, Sebastian Waldstein, Sophie Klimscha, Bianca S. Gerendas, René Donner, Thomas Schlegl, Ursula Schmidt-Erfurth, Georg Langs |
Abstract | The identification and quantification of markers in medical images is critical for diagnosis, prognosis and management of patients in clinical practice. Supervised or weakly supervised training enables the detection of findings that are known a priori. It does not scale well, and a priori definition limits the vocabulary of markers to known entities, reducing the accuracy of diagnosis and prognosis. Here, we propose the identification of anomalies in large-scale medical imaging data using healthy examples as a reference. We detect and categorize candidates for anomaly findings untypical for the observed data. A deep convolutional autoencoder is trained on healthy retinal images. The learned model generates a new feature representation, and the distribution of healthy retinal patches is estimated by a one-class support vector machine. Results demonstrate that we can identify pathologic regions in images without using expert annotations. A subsequent clustering categorizes findings into clinically meaningful classes. In addition, the learned features outperform standard embedding approaches in a classification task. |
Tasks | |
Published | 2016-12-02 |
URL | http://arxiv.org/abs/1612.00686v1, http://arxiv.org/pdf/1612.00686v1.pdf |
PWC | https://paperswithcode.com/paper/identifying-and-categorizing-anomalies-in |
Repo | |
Framework | |
Weightless neural network parameters and architecture selection in a quantum computer
Title | Weightless neural network parameters and architecture selection in a quantum computer |
Authors | Adenilton J. da Silva, Wilson R. de Oliveira, Teresa B. Ludermir |
Abstract | Training artificial neural networks requires a tedious empirical evaluation to determine a suitable neural network architecture. To avoid this empirical process, several techniques have been proposed to automatise the architecture selection process. In this paper, we propose a method to perform parameter and architecture selection for a quantum weightless neural network (qWNN). The architecture selection is performed through the learning procedure of a qWNN with a learning algorithm that uses the principle of quantum superposition and a non-linear quantum operator. The main advantage of the proposed method is that it performs a global search in the space of qWNN architectures and parameters rather than a local search. |
Tasks | |
Published | 2016-01-12 |
URL | http://arxiv.org/abs/1601.03277v1, http://arxiv.org/pdf/1601.03277v1.pdf |
PWC | https://paperswithcode.com/paper/weightless-neural-network-parameters-and |
Repo | |
Framework | |
Solving the Wastewater Treatment Plant Problem with SMT
Title | Solving the Wastewater Treatment Plant Problem with SMT |
Authors | Miquel Bofill, Víctor Muñoz, Javier Murillo |
Abstract | In this paper we introduce the Wastewater Treatment Plant Problem, a real-world scheduling problem, and compare the performance of several tools on it. We show that, for a naive modeling, state-of-the-art SMT solvers outperform other tools ranging from mathematical programming to constraint programming. We use both real and randomly generated benchmarks. From this and similar results, we argue for the usefulness of developing compiler front-ends that can translate from constraint programming languages to the SMT-LIB standard language. |
Tasks | |
Published | 2016-09-17 |
URL | http://arxiv.org/abs/1609.05367v1, http://arxiv.org/pdf/1609.05367v1.pdf |
PWC | https://paperswithcode.com/paper/solving-the-wastewater-treatment-plant |
Repo | |
Framework | |