Paper Group AWR 86
Towards Wide Learning: Experiments in Healthcare. Optimal structure and parameter learning of Ising models. Polysemous codes. Temporal Ensembling for Semi-Supervised Learning. A Primer on the Signature Method in Machine Learning. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition. Generating Abstractive Summaries from Mee …
Towards Wide Learning: Experiments in Healthcare
Title | Towards Wide Learning: Experiments in Healthcare |
Authors | Snehasis Banerjee, Tanushyam Chattopadhyay, Swagata Biswas, Rohan Banerjee, Anirban Dutta Choudhury, Arpan Pal, Utpal Garain |
Abstract | In this paper, a Wide Learning architecture is proposed that attempts to automate the feature engineering portion of the machine learning (ML) pipeline. Feature engineering is widely considered the most time-consuming and expertise-demanding portion of any ML task. The proposed feature recommendation approach is tested on 3 healthcare datasets: a) PhysioNet Challenge 2016 dataset of phonocardiogram (PCG) signals, b) MIMIC II blood pressure classification dataset of photoplethysmogram (PPG) signals and c) an emotion classification dataset of PPG signals. While the proposed method beats the state-of-the-art techniques for the 2nd and 3rd datasets, it reaches 94.38% of the accuracy level of the winner of PhysioNet Challenge 2016. In all cases, the effort to reach a satisfactory performance was drastically less (a few days) than with manual feature engineering. |
Tasks | Emotion Classification, Feature Engineering |
Published | 2016-12-17 |
URL | http://arxiv.org/abs/1612.05730v2 |
http://arxiv.org/pdf/1612.05730v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-wide-learning-experiments-in |
Repo | https://github.com/sayakpaul/Generating-categories-from-arXiv-paper-titles |
Framework | none |
Optimal structure and parameter learning of Ising models
Title | Optimal structure and parameter learning of Ising models |
Authors | Andrey Y. Lokhov, Marc Vuffray, Sidhant Misra, Michael Chertkov |
Abstract | Reconstruction of the structure and parameters of an Ising model from binary samples is a problem of practical importance in a variety of disciplines, ranging from statistical physics and computational biology to image processing and machine learning. The focus of the research community has shifted towards developing universal reconstruction algorithms which are both computationally efficient and require the minimal amount of expensive data. We introduce a new method, Interaction Screening, which accurately estimates the model parameters using local optimization problems. The algorithm provably achieves perfect graph structure recovery with an information-theoretically optimal number of samples, notably in the low-temperature regime, which is known to be the hardest for learning. The efficacy of Interaction Screening is assessed through extensive numerical tests on synthetic Ising models of various topologies with different types of interactions, as well as on real data produced by a D-Wave quantum computer. This study shows that the Interaction Screening method is an exact, tractable and optimal technique universally solving the inverse Ising problem. |
Tasks | |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.05024v2 |
http://arxiv.org/pdf/1612.05024v2.pdf | |
PWC | https://paperswithcode.com/paper/optimal-structure-and-parameter-learning-of |
Repo | https://github.com/lanl-ansi/inverse_ising |
Framework | none |
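The "local optimization problems" mentioned in the abstract above can be illustrated with a small sketch. It assumes the per-node objective is the sample average of exp(-s_u * sum_j theta_uj * s_j), which the abstract itself does not spell out. The function names, the plain gradient descent, the step size, and the toy samples are all hypothetical, and any regularization or thresholding step is left out.

```python
import math


def screening_objective(samples, u, theta):
    """Average of exp(-s_u * sum_j theta[j] * s_j) over all samples."""
    total = 0.0
    for s in samples:
        local_field = sum(theta[j] * s[j] for j in theta)
        total += math.exp(-s[u] * local_field)
    return total / len(samples)


def screening_gradient(samples, u, theta):
    """Gradient of the objective with respect to each coupling theta[j]."""
    grad = {j: 0.0 for j in theta}
    for s in samples:
        local_field = sum(theta[j] * s[j] for j in theta)
        weight = math.exp(-s[u] * local_field)
        for j in theta:
            grad[j] -= weight * s[u] * s[j]
    return {j: g / len(samples) for j, g in grad.items()}


def fit_node(samples, u, step=0.5, iterations=200):
    """Estimate the couplings of spin u by plain gradient descent (assumed solver)."""
    num_spins = len(samples[0])
    theta = {j: 0.0 for j in range(num_spins) if j != u}
    for _ in range(iterations):
        grad = screening_gradient(samples, u, theta)
        theta = {j: theta[j] - step * grad[j] for j in theta}
    return theta


# Toy data (hypothetical): spin 1 tends to agree with spin 0,
# spin 2 is uncorrelated with spin 0.
samples = [(1, 1, 1), (1, 1, -1), (-1, -1, 1), (-1, -1, -1), (1, -1, 1), (1, -1, -1)]
couplings = fit_node(samples, u=0)
```

In this toy run the estimated coupling to spin 1 is positive and the coupling to spin 2 stays at zero, so only the pair that is actually correlated in the samples receives an edge. Each node is fitted independently, which is what makes the problem local.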
Polysemous codes
Title | Polysemous codes |
Authors | Matthijs Douze, Hervé Jégou, Florent Perronnin |
Abstract | This paper considers the problem of approximate nearest neighbor search in the compressed domain. We introduce polysemous codes, which offer both the distance estimation quality of product quantization and the efficient comparison of binary codes with Hamming distance. Their design is inspired by algorithms introduced in the 90’s to construct channel-optimized vector quantizers. At search time, this dual interpretation accelerates the search. Most of the indexed vectors are filtered out with Hamming distance, leaving only a fraction of the vectors to be ranked with an asymmetric distance estimator. The method is complementary with a coarse partitioning of the feature space such as the inverted multi-index. This is shown by our experiments performed on several public benchmarks such as the BIGANN dataset comprising one billion vectors, for which we report state-of-the-art results for query times below 0.3 milliseconds per core. Last but not least, our approach allows the approximate computation of the k-NN graph associated with the Yahoo Flickr Creative Commons 100M, described by CNN image descriptors, in less than 8 hours on a single machine. |
Tasks | Quantization |
Published | 2016-09-07 |
URL | http://arxiv.org/abs/1609.01882v2 |
http://arxiv.org/pdf/1609.01882v2.pdf | |
PWC | https://paperswithcode.com/paper/polysemous-codes |
Repo | https://github.com/bitsun/faiss-windows |
Framework | none |
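The two-stage search described in the Polysemous codes abstract (a cheap Hamming filter, then ranking of the survivors with an asymmetric distance estimator) can be sketched roughly as follows. This is a toy under stated assumptions: the codebooks, the 2-bit index assignment, the threshold, and all function names are hypothetical, and the step that optimizes the index assignment so that Hamming distance tracks centroid distance is not shown.

```python
def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def pack(indices, bits_per_index):
    """Concatenate sub-quantizer indices into one integer read as a binary code."""
    code = 0
    for idx in indices:
        code = (code << bits_per_index) | idx
    return code


def adc_distance(indices, tables):
    """Asymmetric distance estimate: sum of per-sub-quantizer table lookups."""
    return sum(tables[m][idx] for m, idx in enumerate(indices))


def search(query_subvectors, codebooks, database, bits_per_index, threshold, k):
    # tables[m][c] = squared distance from query sub-vector m to centroid c
    tables = [
        [sum((q - c) ** 2 for q, c in zip(sub, centroid)) for centroid in book]
        for sub, book in zip(query_subvectors, codebooks)
    ]
    # Quantize the query so it also has a binary interpretation.
    query_indices = [min(range(len(t)), key=t.__getitem__) for t in tables]
    query_bits = pack(query_indices, bits_per_index)
    # Stage 1: keep only codes within the Hamming threshold.
    survivors = [
        (i, idxs) for i, idxs in enumerate(database)
        if hamming(query_bits, pack(idxs, bits_per_index)) <= threshold
    ]
    # Stage 2: rank the survivors with the more accurate estimator.
    survivors.sort(key=lambda item: adc_distance(item[1], tables))
    return [i for i, _ in survivors[:k]]


# Hypothetical 1-D codebook; indices 0, 1, 3, 2 map to centroids 0, 1, 2, 3
# so that neighbouring centroids differ in a single bit.
book = [[0.0], [1.0], [3.0], [2.0]]
codebooks = [book, book]
database = [(0, 1), (1, 1), (2, 2), (0, 0), (3, 2)]
result = search([[0.2], [0.9]], codebooks, database, bits_per_index=2, threshold=1, k=2)
```

The same stored indices serve both stages, which is the dual interpretation the abstract refers to: they are compared as bits for filtering and used as table lookups for ranking.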
Temporal Ensembling for Semi-Supervised Learning
Title | Temporal Ensembling for Semi-Supervised Learning |
Authors | Samuli Laine, Timo Aila |
Abstract | In this paper, we present a simple and efficient method for training deep neural networks in a semi-supervised setting where only a small portion of training data is labeled. We introduce self-ensembling, where we form a consensus prediction of the unknown labels using the outputs of the network-in-training on different epochs, and most importantly, under different regularization and input augmentation conditions. This ensemble prediction can be expected to be a better predictor for the unknown labels than the output of the network at the most recent training epoch, and can thus be used as a target for training. Using our method, we set new records for two standard semi-supervised learning benchmarks, reducing the (non-augmented) classification error rate from 18.44% to 7.05% in SVHN with 500 labels and from 18.63% to 16.55% in CIFAR-10 with 4000 labels, and further to 5.12% and 12.16% by enabling the standard augmentations. We additionally obtain a clear improvement in CIFAR-100 classification accuracy by using random images from the Tiny Images dataset as unlabeled extra inputs during training. Finally, we demonstrate good tolerance to incorrect labels. |
Tasks | Semi-Supervised Image Classification |
Published | 2016-10-07 |
URL | http://arxiv.org/abs/1610.02242v3 |
http://arxiv.org/pdf/1610.02242v3.pdf | |
PWC | https://paperswithcode.com/paper/temporal-ensembling-for-semi-supervised |
Repo | https://github.com/tensorfreitas/Temporal-Ensembling-for-Semi-Supervised-Learning |
Framework | tf |
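The consensus prediction described in the Temporal Ensembling abstract can be illustrated with a running average of per-epoch network outputs. The sketch below assumes an exponential moving average with a startup bias correction. The momentum value `alpha`, the function names, and the squared-error consistency term are assumptions made for illustration and are not taken from the abstract.

```python
def update_ensemble(ensemble, predictions, alpha, epoch):
    """Fold this epoch's predictions into the running ensemble.

    Returns (new_ensemble, targets). `epoch` counts from 1; dividing by
    (1 - alpha**epoch) corrects the bias from starting the average at zero.
    """
    new_ensemble = [alpha * e + (1.0 - alpha) * p for e, p in zip(ensemble, predictions)]
    correction = 1.0 - alpha ** epoch
    targets = [e / correction for e in new_ensemble]
    return new_ensemble, targets


def consistency_loss(predictions, targets):
    """Mean squared difference between current outputs and ensemble targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


# Outputs of a hypothetical network for one unlabeled sample over two epochs.
ensemble = [0.0, 0.0]
ensemble, targets_epoch1 = update_ensemble(ensemble, [0.8, 0.2], alpha=0.6, epoch=1)
ensemble, targets_epoch2 = update_ensemble(ensemble, [0.6, 0.4], alpha=0.6, epoch=2)
loss = consistency_loss([0.6, 0.4], targets_epoch2)
```

After the first epoch the corrected target equals the first prediction; later targets blend older and newer outputs, which is what lets them serve as a steadier training signal than the most recent output alone.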
A Primer on the Signature Method in Machine Learning
Title | A Primer on the Signature Method in Machine Learning |
Authors | Ilya Chevyrev, Andrey Kormilitzin |
Abstract | In these notes, we wish to provide an introduction to the signature method, focusing on its basic theoretical properties and recent numerical applications. The notes are split into two parts. The first part focuses on the definition and fundamental properties of the signature of a path, or the path signature. We have aimed for a minimalistic approach, assuming only familiarity with classical real analysis and integration theory, and supplementing theory with straightforward examples. We have chosen to focus in detail on the principal properties of the signature which we believe are fundamental to understanding its role in applications. We also present an informal discussion on some of its deeper properties and briefly mention the role of the signature in rough paths theory, which we hope could serve as a light introduction to rough paths for the interested reader. The second part of these notes discusses practical applications of the path signature to the area of machine learning. The signature approach represents a non-parametric way of extracting characteristic features from data. The data are converted into a multi-dimensional path by means of various embedding algorithms and then processed for computation of individual terms of the signature which summarise certain information contained in the data. The signature thus transforms raw data into a set of features which are used in machine learning tasks. We will review current progress in applications of signatures to machine learning problems. |
Tasks | |
Published | 2016-03-11 |
URL | http://arxiv.org/abs/1603.03788v1 |
http://arxiv.org/pdf/1603.03788v1.pdf | |
PWC | https://paperswithcode.com/paper/a-primer-on-the-signature-method-in-machine |
Repo | https://github.com/patrick-kidger/signatory |
Framework | pytorch |
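To make the "individual terms of the signature" in the abstract above concrete, here is a minimal sketch that computes the first two signature levels of a piecewise-linear path. It relies on two standard facts: a straight segment with increment d has level-1 term d and level-2 term d⊗d/2, and consecutive segments combine through Chen's identity. The function name and the example path are hypothetical.

```python
def signature_level2(path):
    """Return (level1, level2) signature terms of a piecewise-linear path.

    `path` is a sequence of points of equal dimension. level1[i] is the total
    increment in coordinate i; level2[i][j] is the iterated integral of
    coordinate i against coordinate j.
    """
    dim = len(path[0])
    level1 = [0.0] * dim
    level2 = [[0.0] * dim for _ in range(dim)]
    for start, end in zip(path, path[1:]):
        delta = [b - a for a, b in zip(start, end)]
        # Chen's identity: the cross term uses level 1 of the path so far,
        # plus the segment's own level-2 term delta_i * delta_j / 2.
        for i in range(dim):
            for j in range(dim):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(dim):
            level1[i] += delta[i]
    return level1, level2


# Move right by one unit, then up by one unit.
level1, level2 = signature_level2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
levy_area = 0.5 * (level2[0][1] - level2[1][0])
```

For this path the level-1 term is the total displacement (1, 1), and the antisymmetric part of level 2 gives a signed area of 0.5 between the path and its chord, a feature that depends on the order in which the coordinates moved.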
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
Title | Temporal Segment Networks: Towards Good Practices for Deep Action Recognition |
Authors | Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc Van Gool |
Abstract | Deep convolutional networks have achieved great success for visual recognition in still images. However, for action recognition in videos, the advantage over traditional methods is not so evident. This paper aims to discover the principles to design effective ConvNet architectures for action recognition in videos and learn these models given limited training samples. Our first contribution is temporal segment network (TSN), a novel framework for video-based action recognition, which is based on the idea of long-range temporal structure modeling. It combines a sparse temporal sampling strategy and video-level supervision to enable efficient and effective learning using the whole action video. The other contribution is our study on a series of good practices in learning ConvNets on video data with the help of temporal segment network. Our approach obtains state-of-the-art performance on the datasets of HMDB51 (69.4%) and UCF101 (94.2%). We also visualize the learned ConvNet models, which qualitatively demonstrates the effectiveness of temporal segment network and the proposed good practices. |
Tasks | Action Recognition In Videos, Multimodal Activity Recognition, Temporal Action Localization |
Published | 2016-08-02 |
URL | http://arxiv.org/abs/1608.00859v1 |
http://arxiv.org/pdf/1608.00859v1.pdf | |
PWC | https://paperswithcode.com/paper/temporal-segment-networks-towards-good |
Repo | https://github.com/johnhuang87/temporal-segment-networks |
Framework | pytorch |
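The sparse temporal sampling and video-level supervision mentioned in the Temporal Segment Networks abstract can be sketched as follows. The sketch assumes the video is split into equal-length segments, one snippet is drawn at random from each, and per-snippet class scores are averaged into a single video-level prediction. The segment count, the averaging rule, and the function names are illustrative assumptions.

```python
import random


def sample_snippet_indices(num_frames, num_segments, rng):
    """Pick one random frame index from each of `num_segments` equal segments."""
    if num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    indices = []
    for k in range(num_segments):
        start = (k * num_frames) // num_segments
        end = ((k + 1) * num_frames) // num_segments  # exclusive bound
        indices.append(rng.randrange(start, end))
    return indices


def average_consensus(snippet_scores):
    """Average per-snippet class scores into one video-level score vector."""
    num_snippets = len(snippet_scores)
    return [sum(column) / num_snippets for column in zip(*snippet_scores)]


rng = random.Random(0)  # fixed seed so the example is repeatable
indices = sample_snippet_indices(num_frames=90, num_segments=3, rng=rng)
# Hypothetical two-class scores for the three sampled snippets.
video_scores = average_consensus([[1.0, 3.0], [3.0, 5.0], [2.0, 4.0]])
```

Because the number of snippets is fixed, the cost per video does not grow with its length, while the samples still span the whole clip.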
Generating Abstractive Summaries from Meeting Transcripts
Title | Generating Abstractive Summaries from Meeting Transcripts |
Authors | Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama |
Abstract | Summaries of meetings are very important as they convey the essential content of discussions in a concise form. Generally, it is time consuming to read and understand a whole document. Therefore, summaries play an important role as the readers are interested in only the important context of discussions. In this work, we address the task of meeting document summarization. Automatic summarization systems on meeting conversations developed so far have been primarily extractive, resulting in unacceptable summaries that are hard to read. The extracted utterances contain disfluencies that affect the quality of the extractive summaries. To make summaries much more readable, we propose an approach to generating abstractive summaries by fusing important content from several utterances. We first separate meeting transcripts into various topic segments, and then identify the important utterances in each segment using a supervised learning approach. The important utterances are then combined together to generate a one-sentence summary. In the text generation step, the dependency parses of the utterances in each segment are combined together to create a directed graph. The most informative and well-formed sub-graph obtained by integer linear programming (ILP) is selected to generate a one-sentence summary for each topic segment. The ILP formulation reduces disfluencies by leveraging grammatical relations that are more prominent in non-conversational style of text, and therefore generates summaries that are comparable to human-written abstractive summaries. Experimental results show that our method can generate more informative summaries than the baselines. In addition, readability assessments by human judges as well as log-likelihood estimates obtained from the dependency parser show that our generated summaries are significantly readable and well-formed. |
Tasks | Document Summarization, Text Generation |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.07033v1 |
http://arxiv.org/pdf/1609.07033v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-abstractive-summaries-from-meeting |
Repo | https://github.com/hussam123/Text-Summarization |
Framework | none |
Bank distress in the news: Describing events through deep learning
Title | Bank distress in the news: Describing events through deep learning |
Authors | Samuel Rönnqvist, Peter Sarlin |
Abstract | While many models are purposed for detecting the occurrence of significant events in financial systems, the task of providing qualitative detail on the developments is not usually as well automated. We present a deep learning approach for detecting relevant discussion in text and extracting natural language descriptions of events. Supervised by only a small set of event information, comprising entity names and dates, the model is leveraged by unsupervised learning of semantic vector representations on extensive text data. We demonstrate applicability to the study of financial risk based on news (6.6M articles), particularly bank distress and government interventions (243 events), where indices can signal the level of bank-stress-related reporting at the entity level, or aggregated at national or European level, while being coupled with explanations. Thus, we exemplify how text, as timely, widely available and descriptive data, can serve as a useful complementary source of information for financial and systemic risk analytics. |
Tasks | |
Published | 2016-03-17 |
URL | http://arxiv.org/abs/1603.05670v2 |
http://arxiv.org/pdf/1603.05670v2.pdf | |
PWC | https://paperswithcode.com/paper/bank-distress-in-the-news-describing-events |
Repo | https://github.com/michaelhur/BankDistress |
Framework | none |
ReconNet: Non-Iterative Reconstruction of Images from Compressively Sensed Random Measurements
Title | ReconNet: Non-Iterative Reconstruction of Images from Compressively Sensed Random Measurements |
Authors | Kuldeep Kulkarni, Suhas Lohit, Pavan Turaga, Ronan Kerviche, Amit Ashok |
Abstract | The goal of this paper is to present a non-iterative and, more importantly, extremely fast algorithm to reconstruct images from compressively sensed (CS) random measurements. To this end, we propose a novel convolutional neural network (CNN) architecture which takes in CS measurements of an image as input and outputs an intermediate reconstruction. We call this network ReconNet. The intermediate reconstruction is fed into an off-the-shelf denoiser to obtain the final reconstructed image. On a standard dataset of images we show significant improvements in reconstruction results (both in terms of PSNR and time complexity) over state-of-the-art iterative CS reconstruction algorithms at various measurement rates. Further, through qualitative experiments on real data collected using our block single pixel camera (SPC), we show that our network is highly robust to sensor noise and can recover visually better quality images than competitive algorithms at extremely low sensing rates of 0.1 and 0.04. To demonstrate that our algorithm can recover semantically informative images even at a low measurement rate of 0.01, we present a very robust proof-of-concept real-time visual tracking application. |
Tasks | Real-Time Visual Tracking, Visual Tracking |
Published | 2016-01-26 |
URL | http://arxiv.org/abs/1601.06892v2 |
http://arxiv.org/pdf/1601.06892v2.pdf | |
PWC | https://paperswithcode.com/paper/reconnet-non-iterative-reconstruction-of |
Repo | https://github.com/Chinmayrane16/ReconNet-PyTorch |
Framework | pytorch |
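The input side of the ReconNet pipeline, compressively sensed random measurements taken at a given measurement rate, can be illustrated with a short sketch. It assumes the measurement rate is the ratio of measurements to pixels and that each measurement is a random Gaussian projection of a flattened image block. The block size, the function names, and the rounding rule are hypothetical, and the reconstruction network itself is not shown.

```python
import random


def num_measurements(measurement_rate, num_pixels):
    """Number of measurements for a block, assuming rate = measurements / pixels."""
    return max(1, round(measurement_rate * num_pixels))


def random_measurement_matrix(m, n, rng):
    """m x n matrix with independent standard Gaussian entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]


def measure(phi, block):
    """Compute y = phi * x for a flattened image block x."""
    return [sum(w * x for w, x in zip(row, block)) for row in phi]


rng = random.Random(0)  # fixed seed so the example is repeatable
block = [rng.random() for _ in range(100)]  # hypothetical flattened 10x10 block
m = num_measurements(0.1, len(block))
phi = random_measurement_matrix(m, len(block), rng)
y = measure(phi, block)
```

At a rate of 0.1 the 100-pixel block is reduced to 10 numbers, and at 0.01 to a single number, which gives a sense of how little information the reconstruction step receives at the lowest rates quoted in the abstract.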
Superpixel Hierarchy
Title | Superpixel Hierarchy |
Authors | Xing Wei, Qingxiong Yang, Yihong Gong, Ming-Hsuan Yang, Narendra Ahuja |
Abstract | Superpixel segmentation is becoming ubiquitous in computer vision. In practice, an object can either be represented by a number of segments in finer levels of detail or included in a surrounding region at coarser levels of detail, and thus a superpixel segmentation hierarchy is useful for applications that require different levels of image segmentation detail depending on the particular image objects segmented. Unfortunately, there is no method that can generate all scales of superpixels accurately in real-time. As a result, a simple yet effective algorithm named Super Hierarchy (SH) is proposed in this paper. It is as accurate as the state-of-the-art but 1-2 orders of magnitude faster. The proposed method can be directly integrated with recent efficient edge detectors like the structured forest edges to significantly outperform the state-of-the-art in terms of segmentation accuracy. Quantitative and qualitative evaluation on a number of computer vision applications was conducted, demonstrating that the proposed method is the top performer. |
Tasks | Semantic Segmentation |
Published | 2016-05-20 |
URL | http://arxiv.org/abs/1605.06325v1 |
http://arxiv.org/pdf/1605.06325v1.pdf | |
PWC | https://paperswithcode.com/paper/superpixel-hierarchy |
Repo | https://github.com/semiquark1/boruvka-superpixel |
Framework | none |
Rank-One NMF-Based Initialization for NMF and Relative Error Bounds under a Geometric Assumption
Title | Rank-One NMF-Based Initialization for NMF and Relative Error Bounds under a Geometric Assumption |
Authors | Zhaoqiang Liu, Vincent Y. F. Tan |
Abstract | We propose a geometric assumption on nonnegative data matrices such that under this assumption, we are able to provide upper bounds (both deterministic and probabilistic) on the relative error of nonnegative matrix factorization (NMF). The algorithm we propose first uses the geometric assumption to obtain an exact clustering of the columns of the data matrix; subsequently, it employs several rank-one NMFs to obtain the final decomposition. When applied to data matrices generated from our statistical model, we observe that our proposed algorithm produces factor matrices with comparable relative errors vis-à-vis classical NMF algorithms but with much faster speeds. On face image and hyperspectral imaging datasets, we demonstrate that our algorithm provides an excellent initialization for applying other NMF algorithms at a low computational cost. Finally, we show on face and text datasets that the combinations of our algorithm and several classical NMF algorithms outperform other algorithms in terms of clustering performance. |
Tasks | |
Published | 2016-12-27 |
URL | http://arxiv.org/abs/1612.08549v2 |
http://arxiv.org/pdf/1612.08549v2.pdf | |
PWC | https://paperswithcode.com/paper/rank-one-nmf-based-initialization-for-nmf-and |
Repo | https://github.com/zhaoqiangliu/cr1-nmf |
Framework | none |
Learning to Generate with Memory
Title | Learning to Generate with Memory |
Authors | Chongxuan Li, Jun Zhu, Bo Zhang |
Abstract | Memory units have been widely used to enrich the capabilities of deep networks on capturing long-term dependencies in reasoning and prediction tasks, but little investigation exists on deep generative models (DGMs) which are good at inferring high-level invariant representations from unlabeled data. This paper presents a deep generative model with a possibly large external memory and an attention mechanism to capture the local detail information that is often lost in the bottom-up abstraction process in representation learning. By adopting a smooth attention model, the whole network is trained end-to-end by optimizing a variational bound of data likelihood via auto-encoding variational Bayesian methods, where an asymmetric recognition network is learnt jointly to infer high-level invariant representations. The asymmetric architecture can reduce the competition between bottom-up invariant feature extraction and top-down generation of instance details. Our experiments on several datasets demonstrate that memory can significantly boost the performance of DGMs and even achieve state-of-the-art results on various tasks, including density estimation, image generation, and missing value imputation. |
Tasks | Density Estimation, Image Generation, Imputation, Representation Learning |
Published | 2016-02-24 |
URL | http://arxiv.org/abs/1602.07416v2 |
http://arxiv.org/pdf/1602.07416v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-generate-with-memory |
Repo | https://github.com/zhenxuan00/MEM_DGM |
Framework | none |
Temporal Tessellation: A Unified Approach for Video Analysis
Title | Temporal Tessellation: A Unified Approach for Video Analysis |
Authors | Dotan Kaufman, Gil Levi, Tal Hassner, Lior Wolf |
Abstract | We present a general approach to video understanding, inspired by semantic transfer techniques that have been successfully used for 2D image analysis. Our method considers a video to be a 1D sequence of clips, each one associated with its own semantics. The nature of these semantics – natural language captions or other labels – depends on the task at hand. A test video is processed by forming correspondences between its clips and the clips of reference videos with known semantics, following which, reference semantics can be transferred to the test video. We describe two matching methods, both designed to ensure that (a) reference clips appear similar to test clips and (b), taken together, the semantics of the selected reference clips is consistent and maintains temporal coherence. We use our method for video captioning on the LSMDC’16 benchmark, video summarization on the SumMe and TVSum benchmarks, Temporal Action Detection on the Thumos2014 benchmark, and sound prediction on the Greatest Hits benchmark. Our method not only surpasses the state of the art, in four out of five benchmarks, but importantly, it is the only single method we know of that was successfully applied to such a diverse range of tasks. |
Tasks | Action Detection, Video Captioning, Video Summarization, Video Understanding |
Published | 2016-12-21 |
URL | http://arxiv.org/abs/1612.06950v2 |
http://arxiv.org/pdf/1612.06950v2.pdf | |
PWC | https://paperswithcode.com/paper/temporal-tessellation-a-unified-approach-for |
Repo | https://github.com/dot27/temporal-tessellation |
Framework | tf |
Adversarial Images for Variational Autoencoders
Title | Adversarial Images for Variational Autoencoders |
Authors | Pedro Tabacof, Julia Tavares, Eduardo Valle |
Abstract | We investigate adversarial attacks for autoencoders. We propose a procedure that distorts the input image to mislead the autoencoder into reconstructing a completely different target image. We attack the internal latent representations, attempting to make the adversarial input produce an internal representation as similar as possible to the target’s. We find that autoencoders are much more robust to the attack than classifiers: while some examples have tolerably small input distortion, and reasonable similarity to the target image, there is a quasi-linear trade-off between those aims. We report results on MNIST and SVHN datasets, and also test regular deterministic autoencoders, reaching similar conclusions in all cases. Finally, we show that the usual adversarial attack for classifiers, while being much easier, also presents a direct proportion between distortion on the input, and misdirection on the output. That proportionality, however, is hidden by the normalization of the output, which maps a linear layer into non-linear probabilities. |
Tasks | Adversarial Attack |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00155v1 |
http://arxiv.org/pdf/1612.00155v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-images-for-variational |
Repo | https://github.com/tabacof/adv_vae |
Framework | none |
Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge
Title | Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge |
Authors | Nicholas Locascio, Karthik Narasimhan, Eduardo DeLeon, Nate Kushman, Regina Barzilay |
Abstract | This paper explores the task of translating natural language queries into regular expressions which embody their meaning. In contrast to prior work, the proposed neural model does not utilize domain-specific crafting, learning to translate directly from a parallel corpus. To fully explore the potential of neural models, we propose a methodology for collecting a large corpus of (regular expression, natural language) pairs. Our resulting model achieves a performance gain of 19.6% over previous state-of-the-art models. |
Tasks | |
Published | 2016-08-09 |
URL | http://arxiv.org/abs/1608.03000v1 |
http://arxiv.org/pdf/1608.03000v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-generation-of-regular-expressions-from |
Repo | https://github.com/nicholaslocascio/deep-regex |
Framework | torch |