Paper Group AWR 125
Event Representations with Tensor-based Compositions. Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification. SGD for robot motion? The effectiveness of stochastic optimization on a new benchmark for biped locomotion tasks. NormFace: L2 Hypersphere Embedding for Face Verification. Predicting Visual Features from Text for Image and Video Caption Retrieval. Temporal Stability in Predictive Process Monitoring. Learning from Between-class Examples for Deep Sound Recognition. A Bridge Between Hyperparameter Optimization and Learning-to-learn. Masked Autoregressive Flow for Density Estimation. Semi-Adversarial Networks: Convolutional Autoencoders for Imparting Privacy to Face Images. NoScope: Optimizing Neural Network Queries over Video at Scale. Learning for Disparity Estimation through Feature Constancy. Multiple Instance Detection Network with Online Instance Classifier Refinement. MURA: Large Dataset for Abnormality Detection in Musculoskeletal Radiographs. A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem.
Event Representations with Tensor-based Compositions
Title | Event Representations with Tensor-based Compositions |
Authors | Noah Weber, Niranjan Balasubramanian, Nathanael Chambers |
Abstract | Robust and flexible event representations are important to many core areas in language understanding. Scripts were proposed early on as a way of representing sequences of events for such understanding, and have recently attracted renewed attention. However, obtaining effective representations for modeling script-like event sequences is challenging. It requires representations that can capture both event-level and scenario-level semantics. We propose a new tensor-based composition method for creating event representations. The method captures more subtle semantic interactions between an event and its entities and yields representations that are effective at multiple event-related tasks. With the continuous representations, we also devise a simple schema generation method which produces better schemas than a prior method based on discrete representations. Our analysis shows that the tensors capture distinct usages of a predicate even when there are only subtle differences in their surface realizations. |
Tasks | |
Published | 2017-11-21 |
URL | http://arxiv.org/abs/1711.07611v1 |
http://arxiv.org/pdf/1711.07611v1.pdf | |
PWC | https://paperswithcode.com/paper/event-representations-with-tensor-based |
Repo | https://github.com/stonybrooknlp/event-tensors |
Framework | tf |
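The abstract describes the composition only at a high level; below is a minimal numpy sketch of a generic 3-way tensor (bilinear) composition of a predicate with its arguments. The tensor shape, the tanh combination, and all dimensions are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50                      # embedding dimension (illustrative)

# Toy embeddings for a predicate and its two arguments.
pred, subj, obj = rng.normal(size=(3, d))

# A 3-way composition tensor T of shape (d, d, d):
# event[k] = sum_ij arg[i] * T[i, j, k] * pred[j]
T = rng.normal(scale=0.01, size=(d, d, d))

def compose(arg, pred, T):
    """Bilinear composition of one argument with the predicate."""
    return np.einsum('i,ijk,j->k', arg, T, pred)

# Combine subject-side and object-side interactions into one event vector.
event = np.tanh(compose(subj, pred, T) + compose(obj, pred, T))
print(event.shape)          # (50,)
```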
Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification
Title | Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification |
Authors | Jan Deriu, Aurelien Lucchi, Valeria De Luca, Aliaksei Severyn, Simon Müller, Mark Cieliebak, Thomas Hofmann, Martin Jaggi |
Abstract | This paper presents a novel approach for multi-lingual sentiment classification in short texts. This is a challenging task, as the amount of training data in languages other than English is very limited. Previously proposed multi-lingual approaches typically require establishing a correspondence to English, for which powerful classifiers are already available. In contrast, our method does not require such supervision. We leverage large amounts of weakly-supervised data in various languages to train a multi-layer convolutional network and demonstrate the importance of pre-training such networks. We thoroughly evaluate our approach on various multi-lingual datasets, including the recent SemEval-2016 sentiment prediction benchmark (Task 4), where we achieved state-of-the-art performance. We also compare the performance of our model trained individually for each language to a variant trained for all languages at once. We show that the latter model reaches slightly worse, but still acceptable, performance compared to the single-language model, while benefiting from better generalization properties across languages. |
Tasks | Language Modelling, Sentiment Analysis |
Published | 2017-03-07 |
URL | http://arxiv.org/abs/1703.02504v1 |
http://arxiv.org/pdf/1703.02504v1.pdf | |
PWC | https://paperswithcode.com/paper/leveraging-large-amounts-of-weakly-supervised |
Repo | https://github.com/spinningbytes/deep-mlsa |
Framework | tf |
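As a rough sketch of the kind of convolutional text classifier the paper pre-trains on weak labels, here is a single-layer numpy toy with max-over-time pooling; the paper's network is multi-layer and trained end to end, so all sizes and the single-filter-bank design here are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, d, k, n_filters = 1000, 50, 3, 8   # toy sizes

E = rng.normal(scale=0.1, size=(vocab, d))          # word embeddings
W = rng.normal(scale=0.1, size=(n_filters, k, d))   # conv filters, width k
w_out = rng.normal(scale=0.1, size=n_filters)       # logistic output

def forward(token_ids):
    """Single conv layer + max-over-time pooling + logistic output."""
    x = E[token_ids]                                  # (T, d)
    T = len(token_ids)
    feats = np.array([
        [np.sum(W[f] * x[t:t + k]) for t in range(T - k + 1)]
        for f in range(n_filters)
    ])                                                # (n_filters, T-k+1)
    pooled = feats.max(axis=1)
    return 1 / (1 + np.exp(-pooled @ w_out))          # P(positive)

print(forward(rng.integers(0, vocab, size=20)))
```

Weakly supervised pre-training (e.g., on emoticon-labeled tweets) would use this same forward pass before fine-tuning on the small supervised sets.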
SGD for robot motion? The effectiveness of stochastic optimization on a new benchmark for biped locomotion tasks
Title | SGD for robot motion? The effectiveness of stochastic optimization on a new benchmark for biped locomotion tasks |
Authors | Martim Brandao, Kenji Hashimoto, Atsuo Takanishi |
Abstract | Trajectory optimization and posture generation are hard problems in robot locomotion, which can be non-convex and have multiple local optima. Progress on these problems is further hindered by a lack of open benchmarks, since comparisons of different solutions are difficult to make. In this paper we introduce a new benchmark for trajectory optimization and posture generation of legged robots, using a pre-defined scenario, robot and constraints, as well as evaluation criteria. We evaluate state-of-the-art trajectory optimization algorithms based on sequential quadratic programming (SQP) on the benchmark, as well as new stochastic and incremental optimization methods borrowed from the large-scale machine learning literature. Interestingly, we show that some of these stochastic and incremental methods, which are based on stochastic gradient descent (SGD), achieve higher success rates than SQP on tough initializations. Inspired by this observation, we also propose a new incremental variant of SQP which updates only a random subset of the costs and constraints at each iteration. This algorithm performs best in both success rate and convergence speed, improving over SQP by up to 30% in both criteria. The benchmark’s resources and a solution evaluation script are made openly available. |
Tasks | Legged Robots, Stochastic Optimization |
Published | 2017-10-09 |
URL | http://arxiv.org/abs/1710.03029v1 |
http://arxiv.org/pdf/1710.03029v1.pdf | |
PWC | https://paperswithcode.com/paper/sgd-for-robot-motion-the-effectiveness-of |
Repo | https://github.com/martimbrandao/legopt-benchmark |
Framework | none |
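The key idea of the incremental methods — update against a random subset of the cost terms per iteration — can be shown on a toy stand-in. The quadratic per-waypoint costs below are an assumption for illustration; the paper's actual costs and constraints come from the locomotion benchmark.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy trajectory-optimization stand-in: minimize a sum of per-waypoint
# costs c_i(x) = ||x - target_i||^2 by sampling a random subset per step,
# mimicking the incremental update of the paper's SGD-style methods.
targets = rng.normal(size=(200, 10))     # one quadratic cost per "waypoint"
x = np.zeros(10)
lr, batch = 0.05, 16

for step in range(500):
    idx = rng.choice(len(targets), size=batch, replace=False)
    grad = 2 * (x - targets[idx]).mean(axis=0)   # gradient of sampled costs
    x -= lr * grad

print(np.allclose(x, targets.mean(axis=0), atol=0.1))  # near the full optimum
```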
NormFace: L2 Hypersphere Embedding for Face Verification
Title | NormFace: L2 Hypersphere Embedding for Face Verification |
Authors | Feng Wang, Xiang Xiang, Jian Cheng, Alan L. Yuille |
Abstract | Thanks to recent developments in Convolutional Neural Networks, the performance of face verification methods has increased rapidly. In a typical face verification method, feature normalization is a critical step for boosting performance. This motivates us to introduce and study the effect of normalization during training. We find this is non-trivial, despite normalization being differentiable. We identify and study four issues related to normalization through mathematical analysis, which yields understanding and helps with parameter settings. Based on this analysis we propose two strategies for training using normalized features. The first is a modification of the softmax loss, which optimizes cosine similarity instead of the inner product. The second is a reformulation of metric learning that introduces an agent vector for each class. We show that both strategies, and small variants of them, consistently improve performance by between 0.2% and 0.4% on the LFW dataset with two models. This is significant because the performance of the two models on the LFW dataset is close to saturation at over 98%. Code and models are released at https://github.com/happynear/NormFace |
Tasks | Face Verification, Metric Learning |
Published | 2017-04-21 |
URL | http://arxiv.org/abs/1704.06369v4 |
http://arxiv.org/pdf/1704.06369v4.pdf | |
PWC | https://paperswithcode.com/paper/normface-l2-hypersphere-embedding-for-face |
Repo | https://github.com/anax32/dimensionality-reduction |
Framework | tf |
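A minimal numpy sketch of the first strategy, a softmax over scaled cosine similarities between L2-normalized features and class weights; the scale value 30 and the toy dimensions are illustrative choices, not the paper's tuned settings.

```python
import numpy as np

def l2_normalize(v, axis=-1, eps=1e-12):
    return v / (np.linalg.norm(v, axis=axis, keepdims=True) + eps)

def cosine_softmax_loss(feat, W, label, s=30.0):
    """Softmax over s * cos(theta) instead of raw inner products.

    s is the scale hyperparameter the paper argues is needed once both
    features and class weights live on the unit hypersphere.
    """
    logits = s * l2_normalize(feat) @ l2_normalize(W, axis=0)  # (n_classes,)
    logits -= logits.max()                                     # stability
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[label])

rng = np.random.default_rng(3)
feat = rng.normal(size=128)          # one face embedding
W = rng.normal(size=(128, 10))       # 10 identity weight vectors
print(cosine_softmax_loss(feat, W, label=0))
```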
Predicting Visual Features from Text for Image and Video Caption Retrieval
Title | Predicting Visual Features from Text for Image and Video Caption Retrieval |
Authors | Jianfeng Dong, Xirong Li, Cees G. M. Snoek |
Abstract | This paper strives to find, amidst a set of sentences, the one best describing the content of a given image or video. Different from existing works, which rely on a joint subspace for their image and video caption retrieval, we propose to do so in a visual space exclusively. Apart from this conceptual novelty, we contribute Word2VisualVec, a deep neural network architecture that learns to predict a visual feature representation from textual input. Example captions are encoded into a textual embedding based on multi-scale sentence vectorization and further transferred into a deep visual feature of choice via a simple multi-layer perceptron. We further generalize Word2VisualVec for video caption retrieval, by predicting from text both 3-D convolutional neural network features as well as a visual-audio representation. Experiments on Flickr8k, Flickr30k, the Microsoft Video Description dataset and the very recent NIST TrecVid challenge for video caption retrieval detail Word2VisualVec’s properties, its benefit over textual embeddings, the potential for multimodal query composition and its state-of-the-art results. |
Tasks | Video Description |
Published | 2017-09-05 |
URL | http://arxiv.org/abs/1709.01362v3 |
http://arxiv.org/pdf/1709.01362v3.pdf | |
PWC | https://paperswithcode.com/paper/predicting-visual-features-from-text-for |
Repo | https://github.com/danieljf24/w2vv |
Framework | tf |
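The core of the approach — an MLP mapping a sentence vector into the visual feature space, with retrieval scored there — fits in a few lines. The layer sizes and the single hidden layer below are assumptions; the paper's sentence vectorization is multi-scale and richer than the random vectors used here.

```python
import numpy as np

rng = np.random.default_rng(4)
d_text, d_hidden, d_visual = 300, 512, 2048   # illustrative sizes

W1 = rng.normal(scale=0.02, size=(d_text, d_hidden))
W2 = rng.normal(scale=0.02, size=(d_hidden, d_visual))

def text_to_visual(sent_vec):
    """MLP that maps a sentence vector into the visual feature space."""
    return np.maximum(sent_vec @ W1, 0) @ W2      # ReLU MLP

# Caption retrieval: score sentences by cosine similarity to the image
# feature *in visual space*, as the paper advocates (no joint subspace).
img_feat = rng.normal(size=d_visual)
captions = rng.normal(size=(5, d_text))           # 5 candidate sentence vectors
preds = np.array([text_to_visual(c) for c in captions])
sims = preds @ img_feat / (np.linalg.norm(preds, axis=1) * np.linalg.norm(img_feat))
print(int(sims.argmax()))                          # index of best caption
```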
Temporal Stability in Predictive Process Monitoring
Title | Temporal Stability in Predictive Process Monitoring |
Authors | Irene Teinemaa, Marlon Dumas, Anna Leontjeva, Fabrizio Maria Maggi |
Abstract | Predictive process monitoring is concerned with the analysis of events produced during the execution of a business process in order to predict as early as possible the final outcome of an ongoing case. Traditionally, predictive process monitoring methods are optimized with respect to accuracy. However, in environments where users make decisions and take actions in response to the predictions they receive, it is equally important to optimize the stability of the successive predictions made for each case. To this end, this paper defines a notion of temporal stability for binary classification tasks in predictive process monitoring and evaluates existing methods with respect to both temporal stability and accuracy. We find that methods based on XGBoost and LSTM neural networks exhibit the highest temporal stability. We then show that temporal stability can be enhanced by hyperparameter-optimizing random forests and XGBoost classifiers with respect to inter-run stability. Finally, we show that time series smoothing techniques can further enhance temporal stability at the expense of slightly lower accuracy. |
Tasks | Time Series |
Published | 2017-12-12 |
URL | http://arxiv.org/abs/1712.04165v3 |
http://arxiv.org/pdf/1712.04165v3.pdf | |
PWC | https://paperswithcode.com/paper/temporal-stability-in-predictive-process |
Repo | https://github.com/irhete/stability-predictive-monitoring |
Framework | tf |
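A sketch of the two quantities the abstract discusses: a per-case temporal stability score (the paper's definition aggregates over cases; this shows the per-case idea under that assumption) and the smoothing post-processing that trades a little accuracy for stability.

```python
import numpy as np

def temporal_stability(pred_seq):
    """1 minus the mean absolute change between successive predictions
    for one case -- higher means the prediction score flips less."""
    p = np.asarray(pred_seq, dtype=float)
    return 1.0 - np.abs(np.diff(p)).mean()

def exp_smooth(pred_seq, alpha=0.8):
    """Exponential smoothing of a prediction series, the kind of
    post-processing the paper shows raises stability."""
    out = [pred_seq[0]]
    for p in pred_seq[1:]:
        out.append(alpha * out[-1] + (1 - alpha) * p)
    return out

preds = [0.2, 0.7, 0.3, 0.8, 0.75, 0.9]   # successive scores for one case
print(temporal_stability(preds))
print(temporal_stability(exp_smooth(preds)))  # smoothing raises stability
```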
Learning from Between-class Examples for Deep Sound Recognition
Title | Learning from Between-class Examples for Deep Sound Recognition |
Authors | Yuji Tokozume, Yoshitaka Ushiku, Tatsuya Harada |
Abstract | Deep learning methods have achieved high performance in sound recognition tasks. Deciding how to feed the training data is important for further performance improvement. We propose a novel learning method for deep sound recognition: Between-Class learning (BC learning). Our strategy is to learn a discriminative feature space by recognizing between-class sounds as between-class sounds. We generate between-class sounds by mixing two sounds belonging to different classes with a random ratio. We then input the mixed sound to the model and train the model to output the mixing ratio. The advantages of BC learning are not limited to increasing the variation of the training data; BC learning leads to an enlargement of Fisher’s criterion in the feature space and a regularization of the positional relationship among the feature distributions of the classes. The experimental results show that BC learning improves performance across various sound recognition networks, datasets, and data augmentation schemes, proving consistently beneficial. Furthermore, we construct a new deep sound recognition network (EnvNet-v2) and train it with BC learning. As a result, we achieve performance that surpasses the human level. |
Tasks | Data Augmentation |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10282v2 |
http://arxiv.org/pdf/1711.10282v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-between-class-examples-for-deep |
Repo | https://github.com/mil-tokyo/bc_learning_image |
Framework | torch |
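The mixing step and the ratio-matching loss are simple to write down. This sketch omits the sound-pressure-level compensation the paper applies when mixing, so treat the plain linear mix as a simplifying assumption.

```python
import numpy as np

rng = np.random.default_rng(5)

def bc_mix(x1, x2, y1, y2):
    """Mix two training sounds of different classes with a random ratio r
    and use the ratio spread over their labels as the soft target."""
    r = rng.uniform()
    x = r * x1 + (1 - r) * x2     # the paper also compensates for
                                  # sound-pressure levels; omitted here
    y = r * y1 + (1 - r) * y2     # y1, y2 are one-hot label vectors
    return x, y

def kl_loss(target, model_probs, eps=1e-12):
    """KL divergence between the mixing-ratio target and model output."""
    t = np.clip(target, eps, 1)
    p = np.clip(model_probs, eps, 1)
    return np.sum(t * np.log(t / p))

x1, x2 = rng.normal(size=(2, 16000))               # two 1-second waveforms
y1, y2 = np.eye(10)[3], np.eye(10)[7]              # classes 3 and 7
x, y = bc_mix(x1, x2, y1, y2)
print(kl_loss(y, np.full(10, 0.1)))                # vs. a uniform output
```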
A Bridge Between Hyperparameter Optimization and Learning-to-learn
Title | A Bridge Between Hyperparameter Optimization and Learning-to-learn |
Authors | Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil |
Abstract | We consider a class of nested optimization problems involving inner and outer objectives. We observe that by explicitly taking into account the optimization dynamics for the inner objective, it is possible to derive a general framework that unifies gradient-based hyperparameter optimization and meta-learning (or learning-to-learn). Depending on the specific setting, the variables of the outer objective take either the meaning of hyperparameters in a supervised learning problem or parameters of a meta-learner. We show that some recently proposed methods in the latter setting can be instantiated in our framework and tackled with the same gradient-based algorithms. Finally, we discuss possible design patterns for learning-to-learn and present encouraging preliminary experiments for few-shot learning. |
Tasks | Few-Shot Learning, Hyperparameter Optimization, Meta-Learning |
Published | 2017-12-18 |
URL | https://arxiv.org/abs/1712.06283v3 |
https://arxiv.org/pdf/1712.06283v3.pdf | |
PWC | https://paperswithcode.com/paper/a-bridge-between-hyperparameter-optimization |
Repo | https://github.com/lucfra/FAR-HO |
Framework | tf |
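To make "differentiating through the inner optimization dynamics" concrete, here is a forward-mode hypergradient on a toy quadratic where the hyperparameter is the inner learning rate; the quadratic losses and single scalar hyperparameter are assumptions chosen so the derivative propagation is exact and easy to check.

```python
import numpy as np

# Forward-mode hypergradient through the inner optimization dynamics,
# in the spirit of the paper's unified framework, for a toy problem:
#   inner loss   L(w)   = 0.5 * ||w - a||^2   (training)
#   outer loss   E(w_T) = 0.5 * ||w_T - b||^2 (validation)
# The hyperparameter is the inner learning rate eta.
a = np.array([1.0, -2.0])
b = np.array([1.5, -1.5])
eta, T = 0.3, 20

w = np.zeros(2)
z = np.zeros(2)              # z_t = dw_t / d eta, propagated forward
for _ in range(T):
    grad = w - a             # inner gradient; its Hessian is the identity
    z = z - grad - eta * z   # differentiate w_{t+1} = w_t - eta * grad
    w = w - eta * grad

hypergrad = (w - b) @ z      # dE/d eta via the chain rule
print(hypergrad)
```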
Masked Autoregressive Flow for Density Estimation
Title | Masked Autoregressive Flow for Density Estimation |
Authors | George Papamakarios, Theo Pavlakou, Iain Murray |
Abstract | Autoregressive models are among the best performing neural density estimators. We describe an approach for increasing the flexibility of an autoregressive model, based on modelling the random numbers that the model uses internally when generating data. By constructing a stack of autoregressive models, each modelling the random numbers of the next model in the stack, we obtain a type of normalizing flow suitable for density estimation, which we call Masked Autoregressive Flow. This type of flow is closely related to Inverse Autoregressive Flow and is a generalization of Real NVP. Masked Autoregressive Flow achieves state-of-the-art performance in a range of general-purpose density estimation tasks. |
Tasks | Density Estimation |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.07057v4 |
http://arxiv.org/pdf/1705.07057v4.pdf | |
PWC | https://paperswithcode.com/paper/masked-autoregressive-flow-for-density |
Repo | https://github.com/e-hulten/maf |
Framework | pytorch |
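The density evaluation in a MAF layer is a single pass: u_i = (x_i - mu_i(x_<i)) * exp(-alpha_i(x_<i)), with log-determinant -sum_i alpha_i. The sketch below replaces the paper's MADE conditioner with strictly lower-triangular linear maps, an assumption that keeps the autoregressive structure while staying a few lines long.

```python
import numpy as np

rng = np.random.default_rng(6)
D = 3

# Toy autoregressive conditioner: mu_i and log-scale alpha_i depend only
# on x_<i (strictly lower-triangular weights here; MADE in the paper).
Wm = np.tril(rng.normal(scale=0.3, size=(D, D)), k=-1)
Wa = np.tril(rng.normal(scale=0.3, size=(D, D)), k=-1)

def maf_inverse_and_logdensity(x):
    """Map data x back to base noise u and evaluate log p(x).

    u_i = (x_i - mu_i(x_<i)) * exp(-alpha_i(x_<i))
    log p(x) = log N(u; 0, I) - sum_i alpha_i
    """
    mu, alpha = Wm @ x, Wa @ x
    u = (x - mu) * np.exp(-alpha)
    log_base = -0.5 * (u @ u + D * np.log(2 * np.pi))
    return u, log_base - alpha.sum()

x = rng.normal(size=D)
u, logp = maf_inverse_and_logdensity(x)
print(u, logp)
```

Stacking several such layers, each modelling the previous layer's noise, gives the flow described in the abstract.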
Semi-Adversarial Networks: Convolutional Autoencoders for Imparting Privacy to Face Images
Title | Semi-Adversarial Networks: Convolutional Autoencoders for Imparting Privacy to Face Images |
Authors | Vahid Mirjalili, Sebastian Raschka, Anoop Namboodiri, Arun Ross |
Abstract | In this paper, we design and evaluate a convolutional autoencoder that perturbs an input face image to impart privacy to a subject. Specifically, the proposed autoencoder transforms an input face image such that the transformed image can be successfully used for face recognition but not for gender classification. In order to train this autoencoder, we propose a novel training scheme, referred to as semi-adversarial training in this work. The training is facilitated by attaching a semi-adversarial module consisting of a pseudo gender classifier and a pseudo face matcher to the autoencoder. The objective function utilized for training this network has three terms: one to ensure that the perturbed image is a realistic face image; another to ensure that the gender attributes of the face are confounded; and a third to ensure that biometric recognition performance due to the perturbed image is not impacted. Extensive experiments confirm the efficacy of the proposed architecture in extending gender privacy to face images. |
Tasks | Face Recognition |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.00321v3 |
http://arxiv.org/pdf/1712.00321v3.pdf | |
PWC | https://paperswithcode.com/paper/semi-adversarial-networks-convolutional |
Repo | https://github.com/gianluca-pepe/semi-adversarial-network |
Framework | tf |
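The three-term objective can be sketched as a weighted sum; the term shapes and unit weights below are illustrative assumptions standing in for the paper's actual loss functions.

```python
def san_loss(recon_err, p_gender, match_sim, lam=(1.0, 1.0, 1.0)):
    """Three-term objective in the spirit of semi-adversarial training
    (illustrative weights and term shapes, not the paper's exact losses):
      (1) keep the perturbed image realistic  -> reconstruction error,
      (2) confound the pseudo gender classifier -> push output toward 0.5,
      (3) preserve the pseudo face matcher score -> keep similarity high."""
    l_real = recon_err
    l_gender = (p_gender - 0.5) ** 2
    l_match = 1.0 - match_sim
    return lam[0] * l_real + lam[1] * l_gender + lam[2] * l_match

# Example: a perturbation that leaves the gender classifier at chance
# (p = 0.5) while keeping match similarity at 0.9 and low recon error.
print(san_loss(recon_err=0.05, p_gender=0.5, match_sim=0.9))
```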
NoScope: Optimizing Neural Network Queries over Video at Scale
Title | NoScope: Optimizing Neural Network Queries over Video at Scale |
Authors | Daniel Kang, John Emmons, Firas Abuzaid, Peter Bailis, Matei Zaharia |
Abstract | Recent advances in computer vision, in the form of deep neural networks, have made it possible to query increasing volumes of video data with high accuracy. However, neural network inference is computationally expensive at scale: applying a state-of-the-art object detector in real time (i.e., 30+ frames per second) to a single video requires a $4000 GPU. In response, we present NoScope, a system for querying videos that can reduce the cost of neural network video analysis by up to three orders of magnitude via inference-optimized model search. Given a target video, an object to detect, and a reference neural network, NoScope automatically searches for and trains a sequence, or cascade, of models that preserves the accuracy of the reference network but is specialized to the target video and is therefore far less computationally expensive. NoScope cascades two types of models: specialized models that forego the full generality of the reference model but faithfully mimic its behavior for the target video and object; and difference detectors that highlight temporal differences across frames. We show that the optimal cascade architecture differs across videos and objects, so NoScope uses an efficient cost-based optimizer to search across models and cascades. With this approach, NoScope achieves speed-ups of two to three orders of magnitude (265-15,500x real-time) on binary classification tasks over fixed-angle webcam and surveillance video while maintaining accuracy within 1-5% of state-of-the-art neural networks. |
Tasks | |
Published | 2017-03-07 |
URL | http://arxiv.org/abs/1703.02529v3 |
http://arxiv.org/pdf/1703.02529v3.pdf | |
PWC | https://paperswithcode.com/paper/noscope-optimizing-neural-network-queries |
Repo | https://github.com/stanford-futuredata/noscope |
Framework | tf |
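The cascade logic reads naturally as a short loop: skip near-duplicate frames, trust the cheap specialized model when its score is confident, and fall back to the reference network otherwise. The stand-in models and all thresholds below are assumptions; NoScope tunes real thresholds per video and query with its cost-based optimizer.

```python
import numpy as np

rng = np.random.default_rng(7)

def reference_model(frame):          # stand-in for the expensive detector
    return frame.mean() > 0.5

def specialized_model(frame):        # cheap mimic returning a score in [0, 1]
    return frame.mean()

def noscope_cascade(frames, diff_thresh=0.01, lo=0.3, hi=0.7):
    """Sketch of the cascade: difference detector, then specialized
    model, then expensive fallback only in the uncertain band."""
    labels, prev, prev_label = [], None, False
    for f in frames:
        if prev is not None and np.mean((f - prev) ** 2) < diff_thresh:
            labels.append(prev_label)            # near-duplicate: reuse label
        else:
            s = specialized_model(f)
            if s < lo or s > hi:
                prev_label = s > hi              # specialized model is sure
            else:
                prev_label = reference_model(f)  # expensive fallback
            labels.append(prev_label)
        prev = f
    return labels

frames = rng.uniform(size=(10, 8, 8))
print(noscope_cascade(frames))
```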
Learning for Disparity Estimation through Feature Constancy
Title | Learning for Disparity Estimation through Feature Constancy |
Authors | Zhengfa Liang, Yiliu Feng, Yulan Guo, Hengzhu Liu, Wei Chen, Linbo Qiao, Li Zhou, Jianfeng Zhang |
Abstract | Stereo matching algorithms usually consist of four steps: matching cost calculation, matching cost aggregation, disparity calculation, and disparity refinement. Existing CNN-based methods only adopt CNNs to solve parts of the four steps, or use different networks to deal with different steps, making it difficult to obtain the overall optimal solution. In this paper, we propose a network architecture that incorporates all steps of stereo matching. The network consists of three parts. The first part calculates multi-scale shared features. The second part performs matching cost calculation, matching cost aggregation and disparity calculation to estimate the initial disparity using the shared features. The initial disparity and the shared features are used to calculate the feature constancy, which measures the correctness of the correspondence between the two input images. The initial disparity and the feature constancy are then fed to a sub-network to refine the initial disparity. The proposed method has been evaluated on the Scene Flow and KITTI datasets. It achieves state-of-the-art performance on the KITTI 2012 and KITTI 2015 benchmarks while maintaining a very fast running time. |
Tasks | Disparity Estimation, Stereo Matching |
Published | 2017-12-04 |
URL | http://arxiv.org/abs/1712.01039v2 |
http://arxiv.org/pdf/1712.01039v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-for-disparity-estimation-through |
Repo | https://github.com/leonzfa/iResNet |
Framework | none |
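Feature constancy amounts to warping the right-view features by the initial disparity and measuring the residual difference against the left-view features. The integer warp and nearest-pixel sampling below are simplifying assumptions; the network computes this on learned feature maps.

```python
import numpy as np

def warp_by_disparity(feat_right, disparity):
    """Warp right-view features to the left view using the initial
    disparity (rounded to integers here for simplicity)."""
    H, W, C = feat_right.shape
    warped = np.zeros_like(feat_right)
    for y in range(H):
        for x in range(W):
            xr = x - int(round(disparity[y, x]))   # left x maps to right x-d
            if 0 <= xr < W:
                warped[y, x] = feat_right[y, xr]
    return warped

rng = np.random.default_rng(8)
feat_l = rng.normal(size=(4, 8, 3))
feat_r = rng.normal(size=(4, 8, 3))
disp = np.full((4, 8), 2.0)

# Feature constancy error map: where it is large, the initial disparity
# is likely wrong, and the refinement sub-network should correct it.
constancy_err = np.abs(feat_l - warp_by_disparity(feat_r, disp)).sum(axis=-1)
print(constancy_err.shape)   # (4, 8)
```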
Multiple Instance Detection Network with Online Instance Classifier Refinement
Title | Multiple Instance Detection Network with Online Instance Classifier Refinement |
Authors | Peng Tang, Xinggang Wang, Xiang Bai, Wenyu Liu |
Abstract | Weakly supervised object detection has recently become of great importance in object recognition. Based on deep learning, weakly supervised detectors have achieved many promising results. However, compared with fully supervised detection, it is more challenging to train deep network based detectors in a weakly supervised manner. Here we formulate weakly supervised detection as a Multiple Instance Learning (MIL) problem, where instance classifiers (object detectors) are put into the network as hidden nodes. We propose a novel online instance classifier refinement algorithm to integrate MIL and the instance classifier refinement procedure into a single deep network, and train the network end-to-end with only image-level supervision, i.e., without object location information. More precisely, instance labels inferred from weak supervision are propagated to their spatially overlapping instances to refine the instance classifiers online. The iterative instance classifier refinement procedure is implemented using multiple streams in a deep network, where each stream supervises its successor. Weakly supervised object detection experiments are carried out on the challenging PASCAL VOC 2007 and 2012 benchmarks. We obtain 47% mAP on VOC 2007, significantly outperforming the previous state of the art. |
Tasks | Multiple Instance Learning, Object Detection, Object Recognition, Weakly Supervised Object Detection |
Published | 2017-04-01 |
URL | http://arxiv.org/abs/1704.00138v1 |
http://arxiv.org/pdf/1704.00138v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-instance-detection-network-with |
Repo | https://github.com/ppengtang/oicr |
Framework | pytorch |
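The label-propagation step at the heart of the refinement is easy to sketch: the top-scoring proposal for an image-level class passes its label to spatially overlapping proposals, which then supervise the next classifier stream. The IoU threshold of 0.5 and the toy boxes below are illustrative assumptions.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def propagate_labels(boxes, scores, image_class, iou_thresh=0.5):
    """Give the top proposal's class label to overlapping proposals."""
    top = int(np.argmax(scores))
    labels = np.zeros(len(boxes), dtype=int)
    for i, b in enumerate(boxes):
        if iou(boxes[top], b) >= iou_thresh:
            labels[i] = image_class        # positive for this class
    return labels

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.6, 0.2])         # current stream's class scores
print(propagate_labels(boxes, scores, image_class=1))   # [1 1 0]
```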
MURA: Large Dataset for Abnormality Detection in Musculoskeletal Radiographs
Title | MURA: Large Dataset for Abnormality Detection in Musculoskeletal Radiographs |
Authors | Pranav Rajpurkar, Jeremy Irvin, Aarti Bagul, Daisy Ding, Tony Duan, Hershel Mehta, Brandon Yang, Kaylie Zhu, Dillon Laird, Robyn L. Ball, Curtis Langlotz, Katie Shpanskaya, Matthew P. Lungren, Andrew Y. Ng |
Abstract | We introduce MURA, a large dataset of musculoskeletal radiographs containing 40,561 images from 14,863 studies, where each study is manually labeled by radiologists as either normal or abnormal. To evaluate models robustly and to get an estimate of radiologist performance, we collect additional labels from six board-certified Stanford radiologists on the test set, consisting of 207 musculoskeletal studies. On this test set, the majority vote of a group of three radiologists serves as gold standard. We train a 169-layer DenseNet baseline model to detect and localize abnormalities. Our model achieves an AUROC of 0.929, with an operating point of 0.815 sensitivity and 0.887 specificity. We compare our model and radiologists on the Cohen’s kappa statistic, which expresses the agreement of our model and of each radiologist with the gold standard. Model performance is comparable to the best radiologist performance in detecting abnormalities on finger and wrist studies. However, model performance is lower than best radiologist performance in detecting abnormalities on elbow, forearm, hand, humerus, and shoulder studies. We believe that the task is a good challenge for future research. To encourage advances, we have made our dataset freely available at https://stanfordmlgroup.github.io/competitions/mura . |
Tasks | Anomaly Detection |
Published | 2017-12-11 |
URL | http://arxiv.org/abs/1712.06957v4 |
http://arxiv.org/pdf/1712.06957v4.pdf | |
PWC | https://paperswithcode.com/paper/mura-large-dataset-for-abnormality-detection |
Repo | https://github.com/rajkumargithub/densenet.mura |
Framework | pytorch |
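Since the abstract compares model and radiologists via Cohen's kappa against the gold standard, here is a minimal implementation of that statistic for two binary raters; the example labels are made up for illustration.

```python
import numpy as np

def cohens_kappa(y1, y2):
    """Cohen's kappa for two binary raters (e.g., model vs. gold standard):
    observed agreement corrected for chance agreement."""
    y1, y2 = np.asarray(y1), np.asarray(y2)
    po = (y1 == y2).mean()                       # observed agreement
    p_yes = y1.mean() * y2.mean()                # chance agreement, class 1
    p_no = (1 - y1.mean()) * (1 - y2.mean())     # chance agreement, class 0
    pe = p_yes + p_no
    return (po - pe) / (1 - pe)

gold  = [1, 0, 1, 1, 0, 0, 1, 0]                 # abnormal = 1 (toy labels)
model = [1, 0, 1, 0, 0, 1, 1, 0]
print(cohens_kappa(gold, model))                 # 0.5 for this toy example
```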
A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem
Title | A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem |
Authors | Zhengyao Jiang, Dixing Xu, Jinjun Liang |
Abstract | Financial portfolio management is the process of constantly redistributing a fund into different financial products. This paper presents a financial-model-free Reinforcement Learning framework to provide a deep machine learning solution to the portfolio management problem. The framework consists of the Ensemble of Identical Independent Evaluators (EIIE) topology, a Portfolio-Vector Memory (PVM), an Online Stochastic Batch Learning (OSBL) scheme, and a fully exploiting and explicit reward function. The framework is realized in three instances in this work: a Convolutional Neural Network (CNN), a basic Recurrent Neural Network (RNN), and a Long Short-Term Memory (LSTM). They are, along with a number of recently reviewed or published portfolio-selection strategies, examined in three back-test experiments with a trading period of 30 minutes in a cryptocurrency market. Cryptocurrencies are electronic and decentralized alternatives to government-issued money, with Bitcoin as the best-known example. All three instances of the framework occupy the top three positions in all experiments, outdistancing the other compared trading algorithms. Even with a high commission rate of 0.25% in the back-tests, the framework achieves at least 4-fold returns in 50 days. |
Tasks | Portfolio Optimization |
Published | 2017-06-30 |
URL | http://arxiv.org/abs/1706.10059v2 |
http://arxiv.org/pdf/1706.10059v2.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-reinforcement-learning-framework-for |
Repo | https://github.com/AlphaHounds/UBS |
Framework | tf |
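A rough sketch of the per-period log return that underlies the explicit reward: portfolio growth from the price-relative vector, shrunk by a commission factor proportional to how much the weights shift. The proportional cost below is a simplification of the paper's fixed-point transaction-remainder formula, and the assets and numbers are illustrative.

```python
import numpy as np

def period_log_return(w_prev, w_new, price_relatives, commission=0.0025):
    """One-period log return of the portfolio: price_relatives y_t are
    per-asset close/open ratios (cash first, always 1); the commission
    factor mu approximates trading cost (simplified vs. the paper)."""
    growth = float(np.dot(price_relatives, w_prev))
    mu = 1 - commission * np.abs(w_new - w_prev).sum()   # trading cost
    return np.log(mu * growth)

w_prev = np.array([0.2, 0.5, 0.3])       # [cash, asset 1, asset 2] (toy)
w_new  = np.array([0.1, 0.6, 0.3])
y_t    = np.array([1.0, 1.02, 0.99])     # price relatives over 30 minutes
print(period_log_return(w_prev, w_new, y_t))
```

Summing these log returns over the back-test window gives the cumulative reward the agents maximize.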