Paper Group ANR 333
Spatio-Temporal Instance Learning: Action Tubes from Class Supervision
Title | Spatio-Temporal Instance Learning: Action Tubes from Class Supervision |
Authors | Pascal Mettes, Cees G. M. Snoek |
Abstract | The goal of this work is spatio-temporal action localization in videos, using only the supervision from video-level class labels. The state-of-the-art casts this weakly-supervised action localization regime as a Multiple Instance Learning problem, where instances are a priori computed spatio-temporal proposals. Rather than disconnecting the spatio-temporal learning from the training, we propose Spatio-Temporal Instance Learning, which enables action localization directly from box proposals in video frames. We outline the assumptions of our model and propose a max-margin objective and optimization with latent variables that enable spatio-temporal learning of actions from video labels. We also provide an efficient linking algorithm and two reranking strategies to facilitate and further improve the action localization. Experimental evaluation on four action datasets demonstrates the effectiveness of our approach for localization from weak supervision. Moreover, we show how to incorporate other supervision levels and mixtures, as a step towards determining optimal supervision strategies for action localization. |
Tasks | Action Localization, Multiple Instance Learning, Spatio-Temporal Action Localization, Temporal Action Localization, Weakly Supervised Action Localization |
Published | 2018-07-08 |
URL | http://arxiv.org/abs/1807.02800v2 |
http://arxiv.org/pdf/1807.02800v2.pdf | |
PWC | https://paperswithcode.com/paper/spatio-temporal-instance-learning-action |
Repo | |
Framework | |
Self Configuration in Machine Learning
Title | Self Configuration in Machine Learning |
Authors | Eugene Wong |
Abstract | In this paper we first present a class of algorithms for training multi-level neural networks with a quadratic cost function one layer at a time, starting from the input layer. The algorithm is based on the fact that for any layer to be trained, the effect of a direct connection to an optimized linear output layer can be computed without the connection being made. Thus, starting from the input layer, we can train each layer in succession in isolation from the other layers. Once trained, the weights are kept fixed and the outputs of the trained layer then serve as the inputs to the next layer to be trained. The result is a very fast algorithm. The simplicity of this training arrangement allows the activation function and step size in weight adjustment to be adaptive and self-adjusting. Furthermore, the stability of the training process allows relatively large steps to be taken, thereby achieving even greater speeds. Finally, in our context configuring the network means determining the number of outputs for each layer. By decomposing the overall cost function into separate components related to approximation and estimation, we obtain an optimization formula for determining the number of outputs for each layer. With the ability to self-configure and set parameters, we now have more than a fast training algorithm: we have the ability to automatically build a fully trained deep neural network starting with nothing more than data. |
Tasks | |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06463v1 |
http://arxiv.org/pdf/1809.06463v1.pdf | |
PWC | https://paperswithcode.com/paper/self-configuration-in-machine-learning |
Repo | |
Framework | |
Deep Choice Model Using Pointer Networks for Airline Itinerary Prediction
Title | Deep Choice Model Using Pointer Networks for Airline Itinerary Prediction |
Authors | Alejandro Mottini, Rodrigo Acuna-Agost |
Abstract | Travel providers such as airlines and on-line travel agents are becoming more and more interested in understanding how passengers choose among alternative itineraries when searching for flights. This knowledge helps them better display and adapt their offer, taking into account market conditions and customer needs. Some common applications are not only filtering and sorting alternatives, but also changing certain attributes in real-time (e.g., changing the price). In this paper, we concentrate on the problem of modeling air passenger choices of flight itineraries. This problem has historically been tackled using classical Discrete Choice Modelling techniques. Traditional statistical approaches, in particular the Multinomial Logit model (MNL), are widely used in industrial applications due to their simplicity and generally good performance. However, MNL models present several shortcomings and assumptions that might not hold in real applications. To overcome these difficulties, we present a new choice model based on Pointer Networks. Given an input sequence, this type of deep neural architecture combines Recurrent Neural Networks with the Attention Mechanism to learn the conditional probability of an output whose values correspond to positions in an input sequence. Therefore, given a sequence of different alternatives presented to a customer, the model can learn to point to the one most likely to be chosen by the customer. The proposed method was evaluated on a real dataset that combines on-line user search logs and airline flight bookings. Experimental results show that the proposed model outperforms the traditional MNL model on several metrics. |
Tasks | |
Published | 2018-03-15 |
URL | http://arxiv.org/abs/1803.05976v1 |
http://arxiv.org/pdf/1803.05976v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-choice-model-using-pointer-networks-for |
Repo | |
Framework | |
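For readers unfamiliar with the MNL baseline named in the abstract above, the following is a minimal sketch of how a Multinomial Logit model turns per-alternative utilities into choice probabilities (a softmax over the choice set). The itinerary features, the coefficient values, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def linear_utility(features, coefficients):
    """Utility of one alternative as a weighted sum of its features."""
    return sum(f * c for f, c in zip(features, coefficients))


def mnl_choice_probabilities(utilities):
    """Softmax over the utilities of all alternatives in one choice set."""
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Hypothetical choice set: (price in hundreds, number of stops) per itinerary.
itineraries = [(2.0, 0), (1.5, 1), (1.2, 2)]
# Hypothetical taste parameters: both price and stops reduce utility.
coefficients = (-1.0, -0.8)

utilities = [linear_utility(x, coefficients) for x in itineraries]
probabilities = mnl_choice_probabilities(utilities)

# Index of the alternative the model considers most likely to be chosen.
predicted_index = max(range(len(probabilities)), key=probabilities.__getitem__)
print(probabilities, predicted_index)
```

A pointer network, as described in the abstract, also outputs a distribution over positions in the input sequence, but learns it with a recurrent encoder and attention rather than a fixed linear utility.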
Training LSTM Networks with Resistive Cross-Point Devices
Title | Training LSTM Networks with Resistive Cross-Point Devices |
Authors | Tayfun Gokmen, Malte Rasch, Wilfried Haensch |
Abstract | In our previous work we have shown that resistive cross point devices, so called Resistive Processing Unit (RPU) devices, can provide significant power and speed benefits when training deep fully connected networks as well as convolutional neural networks. In this work, we further extend the RPU concept for training recurrent neural networks (RNNs) namely LSTMs. We show that the mapping of recurrent layers is very similar to the mapping of fully connected layers and therefore the RPU concept can potentially provide large acceleration factors for RNNs as well. In addition, we study the effect of various device imperfections and system parameters on training performance. Symmetry of updates becomes even more crucial for RNNs; already a few percent asymmetry results in an increase in the test error compared to the ideal case trained with floating point numbers. Furthermore, the input signal resolution to device arrays needs to be at least 7 bits for successful training. However, we show that a stochastic rounding scheme can reduce the input signal resolution back to 5 bits. Further, we find that RPU device variations and hardware noise are enough to mitigate overfitting, so that there is less need for using dropout. We note that the models trained here are roughly 1500 times larger than the fully connected network trained on MNIST dataset in terms of the total number of multiplication and summation operations performed per epoch. Thus, here we attempt to study the validity of the RPU approach for large scale networks. |
Tasks | |
Published | 2018-06-01 |
URL | http://arxiv.org/abs/1806.00166v1 |
http://arxiv.org/pdf/1806.00166v1.pdf | |
PWC | https://paperswithcode.com/paper/training-lstm-networks-with-resistive-cross |
Repo | |
Framework | |
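The abstract above mentions a stochastic rounding scheme for reducing input signal resolution but does not spell it out. The sketch below shows the generic textbook form of stochastic rounding, assuming inputs in [-1, 1] and a symmetric signed grid; the function name, the grid layout, and the clipping are assumptions made for illustration and may differ from the scheme used in the paper.

```python
import math
import random


def stochastic_round(x, num_bits, rng):
    """Round x in [-1, 1] to a signed num_bits grid, unbiased in expectation.

    Assumption: the grid has 2**(num_bits - 1) - 1 steps on each side of zero.
    """
    scale = 2 ** (num_bits - 1) - 1
    x = max(-1.0, min(1.0, x))       # clip to the representable range
    scaled = x * scale
    lower = math.floor(scaled)
    fraction = scaled - lower        # distance to the lower grid point, in [0, 1)
    # Round up with probability equal to the fractional part.
    rounded = lower + (1 if rng.random() < fraction else 0)
    return rounded / scale


rng = random.Random(0)
samples = [stochastic_round(0.3, 5, rng) for _ in range(20000)]
mean_value = sum(samples) / len(samples)
print(sorted(set(samples)), mean_value)
```

Each individual sample lands on one of the two neighbouring grid points, while the average over many samples stays close to the original value, which is the property that lets a coarser grid carry finer information on average.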
Selectivity or Invariance: Boundary-aware Salient Object Detection
Title | Selectivity or Invariance: Boundary-aware Salient Object Detection |
Authors | Jinming Su, Jia Li, Yu Zhang, Changqun Xia, Yonghong Tian |
Abstract | Typically, a salient object detection (SOD) model faces opposite requirements in processing object interiors and boundaries. The features of interiors should be invariant to strong appearance change so as to pop-out the salient object as a whole, while the features of boundaries should be selective to slight appearance change to distinguish salient objects and background. To address this selectivity-invariance dilemma, we propose a novel boundary-aware network with successive dilation for image-based SOD. In this network, the feature selectivity at boundaries is enhanced by incorporating a boundary localization stream, while the feature invariance at interiors is guaranteed with a complex interior perception stream. Moreover, a transition compensation stream is adopted to amend the probable failures in transitional regions between interiors and boundaries. In particular, an integrated successive dilation module is proposed to enhance the feature invariance at interiors and transitional regions. Extensive experiments on six datasets show that the proposed approach outperforms 16 state-of-the-art methods. |
Tasks | Object Detection, Salient Object Detection |
Published | 2018-12-25 |
URL | https://arxiv.org/abs/1812.10066v3 |
https://arxiv.org/pdf/1812.10066v3.pdf | |
PWC | https://paperswithcode.com/paper/selectivity-or-invariance-boundary-aware |
Repo | |
Framework | |
Performance Analysis of Plug-and-Play ADMM: A Graph Signal Processing Perspective
Title | Performance Analysis of Plug-and-Play ADMM: A Graph Signal Processing Perspective |
Authors | Stanley H. Chan |
Abstract | The Plug-and-Play (PnP) ADMM algorithm is a powerful image restoration framework that allows advanced image denoising priors to be integrated into physical forward models to generate high quality image restoration results. However, despite the enormous number of applications and several theoretical studies trying to prove the convergence by leveraging tools in convex analysis, very little is known about why the algorithm is doing so well. The goal of this paper is to fill the gap by discussing the performance of PnP ADMM. By restricting the denoisers to the class of graph filters under a linearity assumption, or more specifically the symmetric smoothing filters, we offer three contributions: (1) We show conditions under which an equivalent maximum-a-posteriori (MAP) optimization exists, (2) we present a geometric interpretation and show that the performance gain is due to an intrinsic pre-denoising characteristic of the PnP prior, (3) we introduce a new analysis technique via the concept of consensus equilibrium, and provide interpretations to problems involving multiple priors. |
Tasks | Denoising, Image Denoising, Image Restoration |
Published | 2018-08-31 |
URL | https://arxiv.org/abs/1809.00020v3 |
https://arxiv.org/pdf/1809.00020v3.pdf | |
PWC | https://paperswithcode.com/paper/performance-analysis-of-plug-and-play-admm-a |
Repo | |
Framework | |
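To make the setting of the abstract above concrete, here is a minimal sketch of Plug-and-Play ADMM on a 1D denoising problem, with the denoiser restricted to a linear symmetric smoothing filter as the abstract describes. The specific kernel [1/4, 1/2, 1/4], the circular boundary handling, the quadratic data term, and the value of rho are assumptions chosen for illustration; they are not taken from the paper.

```python
def smooth(signal):
    """Symmetric smoothing filter [1/4, 1/2, 1/4] with circular boundaries."""
    n = len(signal)
    return [
        0.25 * signal[i - 1] + 0.5 * signal[i] + 0.25 * signal[(i + 1) % n]
        for i in range(n)
    ]


def pnp_admm_denoise(y, rho=1.0, num_iterations=60):
    """PnP ADMM for the data term 0.5 * ||x - y||^2 with `smooth` as the prior step."""
    n = len(y)
    v = [0.0] * n  # auxiliary variable produced by the denoiser
    u = [0.0] * n  # scaled dual variable
    x = list(y)
    for _ in range(num_iterations):
        # x-update: closed-form minimizer of
        # 0.5 * ||x - y||^2 + (rho / 2) * ||x - (v - u)||^2
        x = [(yi + rho * (vi - ui)) / (1.0 + rho) for yi, vi, ui in zip(y, v, u)]
        # v-update: the proximal step of the prior is replaced by the denoiser
        v = smooth([xi + ui for xi, ui in zip(x, u)])
        # dual update
        u = [ui + xi - vi for ui, xi, vi in zip(u, x, v)]
    return x, v


noisy = [0.0, 1.0, 0.0, 1.0, 4.0, 5.0, 4.0, 5.0]
x, v = pnp_admm_denoise(noisy)
print(x)
```

Swapping `smooth` for a stronger denoiser is what makes the framework "plug-and-play"; the linear symmetric case is the one that the abstract says admits an equivalent MAP formulation under suitable conditions.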
Open Domain Suggestion Mining: Problem Definition and Datasets
Title | Open Domain Suggestion Mining: Problem Definition and Datasets |
Authors | Sapna Negi, Maarten de Rijke, Paul Buitelaar |
Abstract | We propose a formal definition for the task of suggestion mining in the context of a wide range of open domain applications. Human perception of the term "suggestion" is subjective, and this affects the preparation of hand-labeled datasets for the task of suggestion mining. Existing work either lacks a formal problem definition and annotation procedure, or provides domain and application specific definitions. Moreover, many previously used manually labeled datasets remain proprietary. We first present an annotation study, and based on our observations propose a formal task definition and annotation procedure for creating benchmark datasets for suggestion mining. With this study, we also provide publicly available labeled datasets for suggestion mining in multiple domains. |
Tasks | |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02179v2 |
http://arxiv.org/pdf/1806.02179v2.pdf | |
PWC | https://paperswithcode.com/paper/open-domain-suggestion-mining-problem |
Repo | |
Framework | |
Self-Attentive Neural Collaborative Filtering
Title | Self-Attentive Neural Collaborative Filtering |
Authors | Yi Tay, Shuai Zhang, Luu Anh Tuan, Siu Cheung Hui |
Abstract | This paper has been withdrawn as we discovered a bug in our TensorFlow implementation that involved accidental mixing of vectors across batches. This led to different inference results given different batch sizes, which is completely strange. The performance scores still remain the same, but we concluded that it was not the self-attention that contributed to the performance. We are withdrawing the paper because this renders the main claim of the paper false. Thanks to Guan Xinyu from NUS for discovering this issue in our previously open source code. |
Tasks | |
Published | 2018-06-17 |
URL | http://arxiv.org/abs/1806.06446v2 |
http://arxiv.org/pdf/1806.06446v2.pdf | |
PWC | https://paperswithcode.com/paper/self-attentive-neural-collaborative-filtering |
Repo | |
Framework | |
A Progressive Batching L-BFGS Method for Machine Learning
Title | A Progressive Batching L-BFGS Method for Machine Learning |
Authors | Raghu Bollapragada, Dheevatsa Mudigere, Jorge Nocedal, Hao-Jun Michael Shi, Ping Tak Peter Tang |
Abstract | The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full batch approach, but since small batch sizes give rise to faster algorithms with better generalization properties, L-BFGS is currently not considered an algorithm of choice for large-scale machine learning applications. One need not, however, choose between the two extremes represented by the full batch or highly stochastic regimes, and may instead follow a progressive batching approach in which the sample size increases during the course of the optimization. In this paper, we present a new version of the L-BFGS algorithm that combines three basic components - progressive batching, a stochastic line search, and stable quasi-Newton updating - and that performs well on training logistic regression and deep neural networks. We provide supporting convergence theory for the method. |
Tasks | |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05374v2 |
http://arxiv.org/pdf/1802.05374v2.pdf | |
PWC | https://paperswithcode.com/paper/a-progressive-batching-l-bfgs-method-for |
Repo | |
Framework | |
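The progressive batching idea in the abstract above (a sample size that increases during optimization) can be illustrated with a simple schedule. The sketch below assumes a fixed geometric growth factor capped at the full dataset size; this is only a stand-in for illustration, since the abstract does not state the rule the paper uses to decide when and how much to grow the batch, and the function name and parameter values are hypothetical.

```python
def progressive_batch_sizes(initial, growth, full_size, num_iterations):
    """Batch size per iteration: geometric growth, capped at the full dataset."""
    sizes = []
    size = float(initial)
    for _ in range(num_iterations):
        sizes.append(min(full_size, int(round(size))))
        size *= growth
    return sizes


# Hypothetical settings: start at 32 samples, double each iteration,
# and stop growing once the full dataset of 1000 samples is reached.
schedule = progressive_batch_sizes(initial=32, growth=2.0, full_size=1000, num_iterations=7)
print(schedule)
```

Early iterations are cheap and noisy, as in stochastic methods, while later iterations approach the full-batch regime in which line searches and quasi-Newton updates are more reliable.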
3D Normal Coordinate Systems for Cortical Areas
Title | 3D Normal Coordinate Systems for Cortical Areas |
Authors | J. Tilak Ratnanather, Sylvain Arguillère, Kwame S. Kutten, Peter Hubka, Andrej Kral, Laurent Younes |
Abstract | A surface-based diffeomorphic algorithm to generate 3D coordinate grids in the cortical ribbon is described. In the grid, normal coordinate lines are generated by the diffeomorphic evolution from the grey/white (inner) surface to the grey/csf (outer) surface. Specifically, the cortical ribbon is described by two triangulated surfaces with open boundaries. Conceptually, the inner surface sits on top of the white matter structure and the outer on top of the gray matter. It is assumed that the cortical ribbon consists of cortical columns which are orthogonal to the white matter surface. This might be viewed as a consequence of the development of the columns in the embryo. It is also assumed that the columns are orthogonal to the outer surface so that the resultant vector field is orthogonal to the evolving surface. Then the distance of the normal lines from the vector field such that the inner surface evolves diffeomorphically towards the outer one can be construed as a measure of thickness. Applications are described for the auditory cortices in human adults and cats with normal hearing or hearing loss. The approach offers great potential for cortical morphometry. |
Tasks | |
Published | 2018-06-28 |
URL | https://arxiv.org/abs/1806.11169v3 |
https://arxiv.org/pdf/1806.11169v3.pdf | |
PWC | https://paperswithcode.com/paper/3d-normal-coordinate-systems-for-cortical |
Repo | |
Framework | |
Graph based Question Answering System
Title | Graph based Question Answering System |
Authors | Piyush Mital, Saurabh Agarwal, Bhargavi Neti, Yashodhara Haribhakta, Vibhavari Kamble, Krishnanjan Bhattacharjee, Debashri Das, Swati Mehta, Ajai Kumar |
Abstract | In today’s digital age, in the dawning era of big data analytics, it is not the information but the linking of information through entities and actions which defines the discourse. Any textual data, whether available on the Internet or off-line (such as newspaper data, Wikipedia dumps, etc.), is essentially connected information which cannot be treated in isolation if its full semantics are to be captured. There is a need for an automated retrieval process with proper information extraction to structure the data for relevant and fast text analytics. The first big challenge is the conversion of unstructured textual data to structured data. Unlike other databases, graph databases handle relationships and connections elegantly. Our project aims at developing a graph-based information extraction and retrieval system. |
Tasks | Question Answering |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.01828v1 |
http://arxiv.org/pdf/1812.01828v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-based-question-answering-system |
Repo | |
Framework | |
Neural language representations predict outcomes of scientific research
Title | Neural language representations predict outcomes of scientific research |
Authors | James P. Bagrow, Daniel Berenberg, Joshua Bongard |
Abstract | Many research fields codify their findings in standard formats, often by reporting correlations between quantities of interest. But the space of all testable correlates is far larger than scientific resources can currently address, so the ability to accurately predict correlations would be useful to plan research and allocate resources. Using a dataset of approximately 170,000 correlational findings extracted from leading social science journals, we show that a trained neural network can accurately predict the reported correlations using only the text descriptions of the correlates. Accurate predictive models such as these can guide scientists towards promising untested correlates, better quantify the information gained from new findings, and have implications for moving artificial intelligence systems from predicting structures to predicting relationships in the real world. |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06879v1 |
http://arxiv.org/pdf/1805.06879v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-language-representations-predict |
Repo | |
Framework | |
SD-CNN: a Shallow-Deep CNN for Improved Breast Cancer Diagnosis
Title | SD-CNN: a Shallow-Deep CNN for Improved Breast Cancer Diagnosis |
Authors | Fei Gao, Teresa Wu, Jing Li, Bin Zheng, Lingxiang Ruan, Desheng Shang, Bhavika Patel |
Abstract | Breast cancer is the second leading cause of cancer death among women worldwide. Nevertheless, it is also one of the most treatable malignancies if detected early. Screening for breast cancer with digital mammography (DM) has been widely used. However, it demonstrates limited sensitivity for women with dense breasts. An emerging technology in the field is contrast-enhanced digital mammography (CEDM), which includes a low energy (LE) image similar to DM, and a recombined image leveraging tumor neoangiogenesis similar to breast magnetic resonance imaging (MRI). CEDM has shown better diagnostic accuracy than DM. While promising, CEDM is not yet widely available across medical centers. In this research, we propose a Shallow-Deep Convolutional Neural Network (SD-CNN) where a shallow CNN is developed to derive “virtual” recombined images from LE images, and a deep CNN is employed to extract novel features from LE, recombined or “virtual” recombined images for ensemble models to classify the cases as benign vs. cancer. To evaluate the validity of our approach, we first develop a deep-CNN using 49 CEDM cases collected from Mayo Clinic to prove the contributions from recombined images for improved breast cancer diagnosis (0.86 in accuracy using LE imaging vs. 0.90 in accuracy using both LE and recombined imaging). We then develop a shallow-CNN using the same 49 CEDM cases to learn the nonlinear mapping from LE to recombined images. Next, we use 69 DM cases collected from the hospital located at Zhejiang University, China to generate “virtual” recombined images. Using DM alone provides 0.91 in accuracy, whereas SD-CNN improves the diagnostic accuracy to 0.95. |
Tasks | |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00663v2 |
http://arxiv.org/pdf/1803.00663v2.pdf | |
PWC | https://paperswithcode.com/paper/sd-cnn-a-shallow-deep-cnn-for-improved-breast |
Repo | |
Framework | |
Zeta Distribution and Transfer Learning Problem
Title | Zeta Distribution and Transfer Learning Problem |
Authors | Eray Özkural |
Abstract | We explore the relations between the zeta distribution and algorithmic information theory via a new model of the transfer learning problem. The program distribution is approximated by a zeta distribution with parameter near 1. We model the training sequence as a stochastic process. We analyze the upper temporal bound for learning a training sequence and its entropy rates, assuming an oracle for the transfer learning problem. We argue from empirical evidence that power-law models are suitable for natural processes. Four sequence models are proposed. The random typing model is like no-free-lunch, where transfer learning does not work. The zeta process independently samples programs from the zeta distribution. A model of common sub-programs inspired by genetics uses a database of sub-programs. An evolutionary zeta process samples mutations from the zeta distribution. The analysis of stochastic processes inspired by evolution suggests that AI may be feasible in nature, countering no-free-lunch sorts of arguments. |
Tasks | Transfer Learning |
Published | 2018-06-23 |
URL | http://arxiv.org/abs/1806.08908v1 |
http://arxiv.org/pdf/1806.08908v1.pdf | |
PWC | https://paperswithcode.com/paper/zeta-distribution-and-transfer-learning |
Repo | |
Framework | |
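As background for the abstract above: the zeta distribution assigns probability proportional to k^(-s) to each positive integer k, for a parameter s > 1. The sketch below computes and samples from a truncated version, assuming a finite support 1..max_k so that the normalizer can be computed with the standard library alone; the truncation, the parameter value 1.1, and the function name are assumptions for illustration, not details from the paper.

```python
import random


def truncated_zeta_pmf(s, max_k):
    """Probabilities proportional to k**(-s) for k = 1..max_k (truncated support)."""
    weights = [k ** (-s) for k in range(1, max_k + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameter close to 1, giving a heavy power-law tail.
s = 1.1
pmf = truncated_zeta_pmf(s, max_k=1000)

# Draw a few integer samples; small values dominate, large ones remain possible.
rng = random.Random(0)
draws = rng.choices(range(1, len(pmf) + 1), weights=pmf, k=10)
print(pmf[:3], draws)
```

With s close to 1 the tail decays slowly, so a truncated normalizer only approximates the true distribution; a larger max_k tightens the approximation.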
Time-Agnostic Prediction: Predicting Predictable Video Frames
Title | Time-Agnostic Prediction: Predicting Predictable Video Frames |
Authors | Dinesh Jayaraman, Frederik Ebert, Alexei A. Efros, Sergey Levine |
Abstract | Prediction is arguably one of the most basic functions of an intelligent system. In general, the problem of predicting events in the future or between two waypoints is exceedingly difficult. However, most phenomena naturally pass through relatively predictable bottlenecks—while we cannot predict the precise trajectory of a robot arm between being at rest and holding an object up, we can be certain that it must have picked the object up. To exploit this, we decouple visual prediction from a rigid notion of time. While conventional approaches predict frames at regularly spaced temporal intervals, our time-agnostic predictors (TAP) are not tied to specific times so that they may instead discover predictable “bottleneck” frames no matter when they occur. We evaluate our approach for future and intermediate frame prediction across three robotic manipulation tasks. Our predictions are not only of higher visual quality, but also correspond to coherent semantic subgoals in temporally extended tasks. |
Tasks | |
Published | 2018-08-23 |
URL | http://arxiv.org/abs/1808.07784v3 |
http://arxiv.org/pdf/1808.07784v3.pdf | |
PWC | https://paperswithcode.com/paper/time-agnostic-prediction-predicting |
Repo | |
Framework | |