Paper Group ANR 1143
Proceedings of the fourth “international Traveling Workshop on Interactions between low-complexity data models and Sensing Techniques” (iTWIST’18)
Title | Proceedings of the fourth “international Traveling Workshop on Interactions between low-complexity data models and Sensing Techniques” (iTWIST’18) |
Authors | Sandrine Anthoine, Yannick Boursier, Laurent Jacques |
Abstract | The iTWIST workshop series aims at fostering collaboration between international scientific teams for developing new theories, applications and generalizations of low-complexity models. These events emphasize dissemination of ideas through both specific oral and poster presentations, as well as free discussions. For this fourth edition, iTWIST’18 gathered in CIRM, Marseille, France, 74 international participants and featured 7 invited talks, 16 oral presentations, and 21 posters. From iTWIST’18, the scientific committee has decided that the workshop proceedings will adopt the episcience.org philosophy, combined with arXiv.org: in a nutshell, “the proceedings are equivalent to an overlay page, built above arXiv.org; they add value to these archives by attaching a scientific caution to the validated papers.” This means that all papers listed in the HTML page of this arXiv publication (see the menu on the right) have been thoroughly evaluated and approved by two independent reviewers, and authors have revised their work according to the comments provided by these reviewers. |
Tasks | |
Published | 2018-12-03 |
URL | http://arxiv.org/abs/1812.00648v2 |
http://arxiv.org/pdf/1812.00648v2.pdf | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-fourth-international-2 |
Repo | |
Framework | |
Learning from multivariate discrete sequential data using a restricted Boltzmann machine model
Title | Learning from multivariate discrete sequential data using a restricted Boltzmann machine model |
Authors | Jefferson Hernandez, Andres G. Abad |
Abstract | A restricted Boltzmann machine (RBM) is a generative neural-network model with many novel applications such as collaborative filtering and acoustic modeling. An RBM lacks the capacity to retain memory, making it inappropriate for dynamic data modeling as in time-series analysis. In this paper we address this issue by proposing the p-RBM model, a generalization of the regular RBM model, capable of retaining memory of p past states. We further show how to train the p-RBM model using contrastive divergence and test our model on the problem of predicting the stock market direction considering 100 stocks of the NASDAQ-100 index. The results show that the p-RBM offers promising prediction potential. |
Tasks | Time Series, Time Series Analysis |
Published | 2018-04-28 |
URL | http://arxiv.org/abs/1804.10839v1 |
http://arxiv.org/pdf/1804.10839v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-multivariate-discrete |
Repo | |
Framework | |
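The abstract above mentions training with contrastive divergence. As a rough illustration only, the following is a minimal sketch of one-step contrastive divergence (CD-1) for a regular binary RBM, not the p-RBM proposed in the paper. All function names, the toy data, and the hyper-parameters are assumptions made for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, c, rng):
    """Return (probabilities, binary sample) of the hidden units given visible v."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b, rng):
    """Return (probabilities, binary sample) of the visible units given hidden h."""
    probs = [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h))))
             for i in range(len(b))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def cd1_update(v0, W, b, c, lr, rng):
    """One CD-1 step on a single binary visible vector; updates W, b, c in place."""
    ph0, h0 = sample_hidden(v0, W, c, rng)   # positive phase
    _, v1 = sample_visible(h0, W, b, rng)    # one Gibbs step: reconstruction
    ph1, _ = sample_hidden(v1, W, c, rng)    # negative phase
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])


# Toy setup (hypothetical): 4 visible units, 2 hidden units, two binary patterns.
rng = random.Random(0)
n_visible, n_hidden = 4, 2
W = [[rng.gauss(0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
b = [0.0] * n_visible
c = [0.0] * n_hidden
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
for _ in range(200):
    for v in data:
        cd1_update(v, W, b, c, lr=0.1, rng=rng)
```

The sketch uses hidden-unit probabilities rather than sampled states in the update statistics, which is a common variance-reduction choice; whether the paper does the same is not stated in the abstract.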
Learning and Evaluating Sparse Interpretable Sentence Embeddings
Title | Learning and Evaluating Sparse Interpretable Sentence Embeddings |
Authors | Valentin Trifonov, Octavian-Eugen Ganea, Anna Potapenko, Thomas Hofmann |
Abstract | Previous research on word embeddings has shown that sparse representations, which can be either learned on top of existing dense embeddings or obtained through model constraints during training time, have the benefit of increased interpretability properties: to some degree, each dimension can be understood by a human and associated with a recognizable feature in the data. In this paper, we transfer this idea to sentence embeddings and explore several approaches to obtain a sparse representation. We further introduce a novel, quantitative and automated evaluation metric for sentence embedding interpretability, based on topic coherence methods. We observe an increase in interpretability compared to dense models, on a dataset of movie dialogs and on the scene descriptions from the MS COCO dataset. |
Tasks | Sentence Embedding, Sentence Embeddings, Word Embeddings |
Published | 2018-09-23 |
URL | http://arxiv.org/abs/1809.08621v2 |
http://arxiv.org/pdf/1809.08621v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-and-evaluating-sparse-interpretable |
Repo | |
Framework | |
Aerial LaneNet: Lane Marking Semantic Segmentation in Aerial Imagery using Wavelet-Enhanced Cost-sensitive Symmetric Fully Convolutional Neural Networks
Title | Aerial LaneNet: Lane Marking Semantic Segmentation in Aerial Imagery using Wavelet-Enhanced Cost-sensitive Symmetric Fully Convolutional Neural Networks |
Authors | Seyed Majid Azimi, Peter Fischer, Marco Körner, Peter Reinartz |
Abstract | The knowledge about the placement and appearance of lane markings is a prerequisite for the creation of maps with high precision, necessary for autonomous driving, infrastructure monitoring, lane-wise traffic management, and urban planning. Lane markings are one of the important components of such maps. Lane markings convey the rules of roads to drivers. While these rules are learned by humans, an autonomous driving vehicle should be taught to learn them to localize itself. Therefore, accurate and reliable lane marking semantic segmentation in the imagery of roads and highways is needed to achieve such goals. We use airborne imagery which can capture a large area in a short period of time by introducing an aerial lane marking dataset. In this work, we propose a Symmetric Fully Convolutional Neural Network enhanced by Wavelet Transform in order to automatically carry out lane marking segmentation in aerial imagery. Due to a heavily unbalanced problem in terms of number of lane marking pixels compared with background pixels, we use a customized loss function as well as a new type of data augmentation step. We achieve a very high accuracy in pixel-wise localization of lane markings without using 3rd-party information. In this work, we introduce the first high-quality dataset used within our experiments which contains a broad range of situations and classes of lane markings representative of current transportation systems. This dataset will be publicly available and hence, it can be used as the benchmark dataset for future algorithms within this domain. |
Tasks | Autonomous Driving, Data Augmentation, Semantic Segmentation |
Published | 2018-03-19 |
URL | http://arxiv.org/abs/1803.06904v2 |
http://arxiv.org/pdf/1803.06904v2.pdf | |
PWC | https://paperswithcode.com/paper/aerial-lanenet-lane-marking-semantic |
Repo | |
Framework | |
Binarized Convolutional Neural Networks for Efficient Inference on GPUs
Title | Binarized Convolutional Neural Networks for Efficient Inference on GPUs |
Authors | Mir Khan, Heikki Huttunen, Jani Boutellier |
Abstract | Convolutional neural networks have recently achieved significant breakthroughs in various image classification tasks. However, they are computationally expensive, which can make their feasible implementation on embedded and low-power devices difficult. In this paper, convolutional neural network binarization is implemented on GPU-based platforms for real-time inference on resource-constrained devices. In binarized networks, all weights and intermediate computations between layers are quantized to +1 and -1, allowing multiplications and additions to be replaced with bit-wise operations between 32-bit words. This representation completely eliminates the need for floating point multiplications and additions and decreases both the computational load and the memory footprint compared to a full-precision network implemented in floating point, making it well-suited for resource-constrained environments. We compare the performance of our implementation with an equivalent floating point implementation on one desktop and two embedded GPU platforms. Our implementation achieves a maximum speed-up of 7.4X with only 4.4% loss in accuracy compared to a reference implementation. |
Tasks | Image Classification |
Published | 2018-08-01 |
URL | http://arxiv.org/abs/1808.00209v1 |
http://arxiv.org/pdf/1808.00209v1.pdf | |
PWC | https://paperswithcode.com/paper/binarized-convolutional-neural-networks-for |
Repo | |
Framework | |
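To make the bit-wise replacement described in the abstract above concrete, here is a minimal sketch of a dot product between two +1/-1 vectors computed with XNOR and a population count. It is plain Python for illustration, not the authors' GPU implementation; the helper names `pack_bits` and `binary_dot` are hypothetical.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an int: bit i is set when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    xnor = ~(a_word ^ b_word) & mask   # bit is 1 where the two signs agree
    matches = bin(xnor).count("1")     # population count
    return 2 * matches - n             # agreements minus disagreements


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
reference = sum(x * y for x, y in zip(a, b))   # ordinary multiply-accumulate
result = binary_dot(pack_bits(a), pack_bits(b), len(a))
```

Each agreeing position contributes +1 and each disagreeing position contributes -1, so the dot product equals twice the number of matches minus the vector length. On hardware, the same idea is applied to fixed-width words such as the 32-bit words mentioned in the abstract.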
One-Shot Hierarchical Imitation Learning of Compound Visuomotor Tasks
Title | One-Shot Hierarchical Imitation Learning of Compound Visuomotor Tasks |
Authors | Tianhe Yu, Pieter Abbeel, Sergey Levine, Chelsea Finn |
Abstract | We consider the problem of learning multi-stage vision-based tasks on a real robot from a single video of a human performing the task, while leveraging demonstration data of subtasks with other objects. This problem presents a number of major challenges. Video demonstrations without teleoperation are easy for humans to provide, but do not provide any direct supervision. Learning policies from raw pixels enables full generality but calls for large function approximators with many parameters to be learned. Finally, compound tasks can require impractical amounts of demonstration data, when treated as a monolithic skill. To address these challenges, we propose a method that learns both how to learn primitive behaviors from video demonstrations and how to dynamically compose these behaviors to perform multi-stage tasks by “watching” a human demonstrator. Our results on a simulated Sawyer robot and real PR2 robot illustrate our method for learning a variety of order fulfillment and kitchen serving tasks with novel objects and raw pixel inputs. |
Tasks | Imitation Learning |
Published | 2018-10-25 |
URL | http://arxiv.org/abs/1810.11043v1 |
http://arxiv.org/pdf/1810.11043v1.pdf | |
PWC | https://paperswithcode.com/paper/one-shot-hierarchical-imitation-learning-of |
Repo | |
Framework | |
Improving Multi-Person Pose Estimation using Label Correction
Title | Improving Multi-Person Pose Estimation using Label Correction |
Authors | Naoki Kato, Tianqi Li, Kohei Nishino, Yusuke Uchida |
Abstract | Significant attention is being paid to multi-person pose estimation methods recently, as there has been rapid progress in the field owing to convolutional neural networks. In particular, a recent method that exploits part confidence maps and Part Affinity Fields (PAFs) has achieved accurate real-time prediction of multi-person keypoints. However, human annotated labels are sometimes inappropriate for learning models. For example, if there is a limb that extends outside an image, a keypoint for the limb may not have annotations because it exists outside of the image, and thus the labels for the limb cannot be generated. If a model is trained with data including such missing labels, the output of the model for the location, even though it is correct, is penalized as a false positive, which is likely to cause negative effects on the performance of the model. In this paper, we point out the existence of some patterns of inappropriate labels, and propose a novel method for correcting such labels with a teacher model trained on such incomplete data. Experiments on the COCO dataset show that training with the corrected labels improves the performance of the model and also speeds up training. |
Tasks | Multi-Person Pose Estimation, Pose Estimation |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03331v1 |
http://arxiv.org/pdf/1811.03331v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-multi-person-pose-estimation-using |
Repo | |
Framework | |
Weakly Supervised One-Shot Detection with Attention Similarity Networks
Title | Weakly Supervised One-Shot Detection with Attention Similarity Networks |
Authors | Gil Keren, Maximilian Schmitt, Thomas Kehrenberg, Björn Schuller |
Abstract | Neural network models that are not conditioned on class identities were shown to facilitate knowledge transfer between classes and to be well-suited for one-shot learning tasks. Following this motivation, we further explore and establish such models and present a novel neural network architecture for the task of weakly supervised one-shot detection. Our model is only conditioned on a single exemplar of an unseen class and a larger target example that may or may not contain an instance of the same class as the exemplar. By pairing a Siamese similarity network with an attention mechanism, we design a model that manages to simultaneously identify and localise instances of classes unseen at training time. In experiments with datasets from the computer vision and audio domains, the proposed method considerably outperforms the baseline methods for the weakly supervised one-shot detection task. |
Tasks | One-Shot Learning, Transfer Learning |
Published | 2018-01-10 |
URL | http://arxiv.org/abs/1801.03329v3 |
http://arxiv.org/pdf/1801.03329v3.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-one-shot-detection-with |
Repo | |
Framework | |
Embedded-State Latent Conditional Random Fields for Sequence Labeling
Title | Embedded-State Latent Conditional Random Fields for Sequence Labeling |
Authors | Dung Thai, Sree Harsha Ramesh, Shikhar Murty, Luke Vilnis, Andrew McCallum |
Abstract | Complex textual information extraction tasks are often posed as sequence labeling or shallow parsing, where fields are extracted using local labels made consistent through probabilistic inference in a graphical model with constrained transitions. Recently, it has become common to locally parametrize these models using rich features extracted by recurrent neural networks (such as LSTM), while enforcing consistent outputs through a simple linear-chain model, representing Markovian dependencies between successive labels. However, the simple graphical model structure belies the often complex non-local constraints between output labels. For example, many fields, such as a first name, can only occur a fixed number of times, or in the presence of other fields. While RNNs have provided increasingly powerful context-aware local features for sequence tagging, they have yet to be integrated with a global graphical model of similar expressivity in the output distribution. Our model goes beyond the linear chain CRF to incorporate multiple hidden states per output label, but parametrizes their transitions parsimoniously with low-rank log-potential scoring matrices, effectively learning an embedding space for hidden states. This augmented latent space of inference variables complements the rich feature representation of the RNN, and allows exact global inference obeying complex, learned non-local output constraints. We experiment with several datasets and show that the model outperforms baseline CRF+RNN models when global output constraints are necessary at inference-time, and explore the interpretable latent structure. |
Tasks | |
Published | 2018-09-28 |
URL | http://arxiv.org/abs/1809.10835v1 |
http://arxiv.org/pdf/1809.10835v1.pdf | |
PWC | https://paperswithcode.com/paper/embedded-state-latent-conditional-random |
Repo | |
Framework | |
Change Point Estimation in a Dynamic Stochastic Block Model
Title | Change Point Estimation in a Dynamic Stochastic Block Model |
Authors | Monika Bhattacharjee, Moulinath Banerjee, George Michailidis |
Abstract | We consider the problem of estimating the location of a single change point in a dynamic stochastic block model. We propose two methods of estimating the change point, together with the model parameters. The first employs a least squares criterion function and takes into consideration the full structure of the stochastic block model and is evaluated at each point in time. Hence, as an intermediate step, it requires estimating the community structure based on a clustering algorithm at every time point. The second method comprises the following two steps: in the first one, a least squares function is used and evaluated at each time point, but ignores the community structures and just considers a random graph generating mechanism exhibiting a change point. Once the change point is identified, in the second step, all network data before and after it are used together with a clustering algorithm to obtain the corresponding community structures and subsequently estimate the generating stochastic block model parameters. A comparison between these two methods is illustrated. Further, for both methods under their respective identifiability and certain additional regularity conditions, we establish rates of convergence and derive the asymptotic distributions of the change point estimators. The results are illustrated on synthetic data. |
Tasks | |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.03090v1 |
http://arxiv.org/pdf/1812.03090v1.pdf | |
PWC | https://paperswithcode.com/paper/change-point-estimation-in-a-dynamic |
Repo | |
Framework | |
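The least squares criterion evaluated at each time point, as described in the abstract above, can be illustrated in a heavily simplified scalar form. The sketch below assumes each network snapshot has already been reduced to a single number (for example, its edge density) and scans all candidate change points; it does not model community structure and is not the paper's estimator. The function name and the toy series are assumptions.

```python
def least_squares_change_point(series):
    """Return the index tau that minimizes the within-segment squared error
    when the series is split into series[:tau] and series[tau:]."""
    n = len(series)
    best_tau, best_cost = None, float("inf")
    for tau in range(1, n):            # both segments must be non-empty
        left, right = series[:tau], series[tau:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        cost = (sum((x - mean_left) ** 2 for x in left)
                + sum((x - mean_right) ** 2 for x in right))
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau


# Hypothetical edge densities of eight snapshots, with a jump after the fourth.
densities = [0.10, 0.12, 0.09, 0.11, 0.30, 0.31, 0.29, 0.32]
tau = least_squares_change_point(densities)
```

In the two-step method from the abstract, the community structures would then be estimated separately from the data before and after the estimated change point.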
On Characterizing the Capacity of Neural Networks using Algebraic Topology
Title | On Characterizing the Capacity of Neural Networks using Algebraic Topology |
Authors | William H. Guss, Ruslan Salakhutdinov |
Abstract | The learnability of different neural architectures can be characterized directly by computable measures of data complexity. In this paper, we reframe the problem of architecture selection as understanding how data determines the most expressive and generalizable architectures suited to that data, beyond inductive bias. After suggesting algebraic topology as a measure for data complexity, we show that the power of a network to express the topological complexity of a dataset in its decision region is a strictly limiting factor in its ability to generalize. We then provide the first empirical characterization of the topological capacity of neural networks. Our empirical analysis shows that at every level of dataset complexity, neural networks exhibit topological phase transitions. This observation allowed us to connect existing theory to empirically driven conjectures on the choice of architectures for fully-connected neural networks. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04443v1 |
http://arxiv.org/pdf/1802.04443v1.pdf | |
PWC | https://paperswithcode.com/paper/on-characterizing-the-capacity-of-neural |
Repo | |
Framework | |
Improving PSO Global Method for Feature Selection According to Iterations Global Search and Chaotic Theory
Title | Improving PSO Global Method for Feature Selection According to Iterations Global Search and Chaotic Theory |
Authors | Shahin Pourbahrami |
Abstract | Making a simple model by choosing a limited number of features with the purpose of reducing the computational complexity of the algorithms involved in classification is one of the main issues in machine learning and data mining. The aim of Feature Selection (FS) is to reduce the number of redundant and irrelevant features and improve the accuracy of classification in a data set. We propose an efficient ISPSO-GLOBAL (Improved Seeding Particle Swarm Optimization GLOBAL) method which investigates the specified iterations to produce prominent features and store them in a storage list. The goal is to find informative features based on their iteration frequency with favorable fitness for the next generation and high exploration. Our method exploits a new initialization strategy in PSO which improves space search and utilizes chaos theory to enhance the population initialization; we then offer a new formula to determine the feature size used in the proposed method. Our experiments with real-world data sets show that the performance of ISPSO-GLOBAL is superior to that of state-of-the-art methods on most of the data sets. |
Tasks | Feature Selection |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08701v1 |
http://arxiv.org/pdf/1811.08701v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-pso-global-method-for-feature |
Repo | |
Framework | |
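The abstract above says chaos theory is used to enhance population initialization but does not name a specific chaotic map. As one possible illustration, the sketch below seeds a binary feature-selection swarm from a logistic map; the choice of map, the threshold, and all names are assumptions, not details taken from the paper.

```python
def logistic_map_sequence(x0, length, r=4.0):
    """Generate `length` values of the logistic map x <- r * x * (1 - x)."""
    seq = []
    x = x0
    for _ in range(length):
        x = r * x * (1.0 - x)
        seq.append(x)
    return seq


def chaotic_binary_population(n_particles, n_features, x0=0.7, threshold=0.5):
    """Build a swarm of 0/1 feature masks by thresholding a chaotic sequence."""
    values = logistic_map_sequence(x0, n_particles * n_features)
    return [
        [1 if values[p * n_features + f] > threshold else 0
         for f in range(n_features)]
        for p in range(n_particles)
    ]


# Hypothetical swarm: 5 particles, each a mask over 8 candidate features.
population = chaotic_binary_population(n_particles=5, n_features=8)
```

Each particle is a mask in which 1 marks a selected feature. A deterministic chaotic sequence is sometimes preferred over a pseudo-random generator for initialization because it is reproducible while still spreading values across the unit interval.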
An Efficient Deep Reinforcement Learning Model for Urban Traffic Control
Title | An Efficient Deep Reinforcement Learning Model for Urban Traffic Control |
Authors | Yilun Lin, Xingyuan Dai, Li Li, Fei-Yue Wang |
Abstract | Urban Traffic Control (UTC) plays an essential role in Intelligent Transportation System (ITS) but remains difficult. Since model-based UTC methods may not accurately describe the complex nature of traffic dynamics in all situations, model-free data-driven UTC methods, especially reinforcement learning (RL) based UTC methods, have received increasing interest in the last decade. However, existing DL approaches did not propose an efficient algorithm to solve the complicated multiple intersections control problems whose state-action spaces are vast. To solve this problem, we propose a Deep Reinforcement Learning (DRL) algorithm that combines several tricks to master an appropriate control strategy within an acceptable time. This new algorithm relaxes the fixed traffic demand pattern assumption and reduces human intervention in parameter tuning. Simulation experiments have shown that our method outperforms traditional rule-based approaches and has the potential to handle more complex traffic problems in the real world. |
Tasks | |
Published | 2018-08-06 |
URL | http://arxiv.org/abs/1808.01876v2 |
http://arxiv.org/pdf/1808.01876v2.pdf | |
PWC | https://paperswithcode.com/paper/an-efficient-deep-reinforcement-learning |
Repo | |
Framework | |
Wasserstein Autoencoders for Collaborative Filtering
Title | Wasserstein Autoencoders for Collaborative Filtering |
Authors | Jingbin Zhong, Xiaofeng Zhang |
Abstract | Recommender systems have long been investigated in the literature. Recently, users’ implicit feedback like “click” or “browse” is considered to be able to enhance the recommendation performance. Therefore, a number of attempts have been made to resolve this issue. Among them, the variational autoencoders (VAE) approach already achieves a superior performance. However, the distributions of the encoded latent variables overlap a lot, which may restrict its recommendation ability. To cope with this challenge, this paper tries to extend the Wasserstein autoencoders (WAE) for collaborative filtering. Particularly, the loss function of the adapted WAE is re-designed by introducing two additional loss terms: (1) the mutual information loss between the distribution of latent variables and the assumed ground truth distribution, and (2) the L1 regularization loss introduced to restrict the encoded latent variables to be sparse. Two different cost functions are designed for measuring the distance between the implicit feedback data and its re-generated version of data. Experiments are evaluated on three widely adopted data sets, i.e., ML-20M, Netflix and LASTFM. Both the baseline and the state-of-the-art approaches are chosen for the performance comparison, which are Mult-DAE, Mult-VAE, CDAE and Slim. The proposed approach outperforms the compared methods with respect to evaluation criteria Recall@1, Recall@5 and NDCG@10, and this demonstrates the efficacy of the proposed approach. |
Tasks | Recommendation Systems |
Published | 2018-09-15 |
URL | http://arxiv.org/abs/1809.05662v3 |
http://arxiv.org/pdf/1809.05662v3.pdf | |
PWC | https://paperswithcode.com/paper/wasserstein-autoencoders-for-collaborative |
Repo | |
Framework | |
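The evaluation criteria named in the abstract above, Recall@k and NDCG@k, can be sketched for a single user with binary relevance as follows. This uses the plain textbook definitions; the paper may normalize differently (some collaborative-filtering work divides recall by min(k, number of relevant items)), so treat the exact formulas, function names, and toy data as assumptions.

```python
import math


def recall_at_k(ranked_items, relevant, k):
    """Fraction of the relevant items that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked_items, relevant, k):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked_items[:k])
              if item in relevant)
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical ranking for one user; items "a" and "d" are the held-out positives.
ranked = ["a", "b", "c", "d", "e"]
relevant = {"a", "d"}
recall_3 = recall_at_k(ranked, relevant, 3)
ndcg_3 = ndcg_at_k(ranked, relevant, 3)
```

Recall@k ignores where in the top-k a hit occurs, while NDCG@k discounts hits logarithmically by rank, which is why the two are usually reported together.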
Sample-Derived Disjunctive Rules for Secure Power System Operation
Title | Sample-Derived Disjunctive Rules for Secure Power System Operation |
Authors | Jochen L. Cremer, Ioannis Konstantelos, Simon H. Tindemans, Goran Strbac |
Abstract | Machine learning techniques have been used in the past using Monte Carlo samples to construct predictors of the dynamic stability of power systems. In this paper we move beyond the task of prediction and propose a comprehensive approach to use predictors, such as Decision Trees (DT), within a standard optimization framework for pre- and post-fault control purposes. In particular, we present a generalizable method for embedding rules derived from DTs in an operation decision-making model. We begin by pointing out the specific challenges entailed when moving from a prediction to a control framework. We proceed with introducing the solution strategy based on generalized disjunctive programming (GDP) as well as a two-step search method for identifying optimal hyper-parameters for balancing cost and control accuracy. We showcase how the proposed approach constructs security proxies that cover multiple contingencies while facing high-dimensional uncertainty with respect to operating conditions with the use of a case study on the IEEE 39-bus system. The method is shown to achieve efficient system control at a marginal increase in system price compared to an oracle model. |
Tasks | Decision Making |
Published | 2018-04-09 |
URL | http://arxiv.org/abs/1804.02948v1 |
http://arxiv.org/pdf/1804.02948v1.pdf | |
PWC | https://paperswithcode.com/paper/sample-derived-disjunctive-rules-for-secure |
Repo | |
Framework | |