October 19, 2019

Paper Group ANR 333

Spatio-Temporal Instance Learning: Action Tubes from Class Supervision. Self Configuration in Machine Learning. Deep Choice Model Using Pointer Networks for Airline Itinerary Prediction. Training LSTM Networks with Resistive Cross-Point Devices. Selectivity or Invariance: Boundary-aware Salient Object Detection. Performance Analysis of Plug-and-Pla …

Spatio-Temporal Instance Learning: Action Tubes from Class Supervision


Title	Spatio-Temporal Instance Learning: Action Tubes from Class Supervision
Authors	Pascal Mettes, Cees G. M. Snoek
Abstract	The goal of this work is spatio-temporal action localization in videos, using only the supervision from video-level class labels. The state of the art casts this weakly-supervised action localization regime as a Multiple Instance Learning problem, where instances are a priori computed spatio-temporal proposals. Rather than disconnecting the spatio-temporal learning from the training, we propose Spatio-Temporal Instance Learning, which enables action localization directly from box proposals in video frames. We outline the assumptions of our model and propose a max-margin objective and optimization with latent variables that enable spatio-temporal learning of actions from video labels. We also provide an efficient linking algorithm and two reranking strategies to facilitate and further improve the action localization. Experimental evaluation on four action datasets demonstrates the effectiveness of our approach for localization from weak supervision. Moreover, we show how to incorporate other supervision levels and mixtures, as a step towards determining optimal supervision strategies for action localization.
Tasks	Action Localization, Multiple Instance Learning, Spatio-Temporal Action Localization, Temporal Action Localization, Weakly Supervised Action Localization
Published	2018-07-08
URL	http://arxiv.org/abs/1807.02800v2
PDF	http://arxiv.org/pdf/1807.02800v2.pdf
PWC	https://paperswithcode.com/paper/spatio-temporal-instance-learning-action
Repo
Framework

Self Configuration in Machine Learning


Title	Self Configuration in Machine Learning
Authors	Eugene Wong
Abstract	In this paper we first present a class of algorithms for training multi-level neural networks with a quadratic cost function one layer at a time starting from the input layer. The algorithm is based on the fact that for any layer to be trained, the effect of a direct connection to an optimized linear output layer can be computed without the connection being made. Thus, starting from the input layer, we can train each layer in succession in isolation from the other layers. Once trained, the weights are kept fixed and the outputs of the trained layer then serve as the inputs to the next layer to be trained. The result is a very fast algorithm. The simplicity of this training arrangement allows the activation function and step size in weight adjustment to be adaptive and self-adjusting. Furthermore, the stability of the training process allows relatively large steps to be taken, thereby achieving even greater speeds. Finally, in our context configuring the network means determining the number of outputs for each layer. By decomposing the overall cost function into separate components related to approximation and estimation, we obtain an optimization formula for determining the number of outputs for each layer. With the ability to self-configure and set parameters, we now have more than a fast training algorithm: we have the ability to automatically build a fully trained deep neural network starting with nothing more than data.
Tasks
Published	2018-09-17
URL	http://arxiv.org/abs/1809.06463v1
PDF	http://arxiv.org/pdf/1809.06463v1.pdf
PWC	https://paperswithcode.com/paper/self-configuration-in-machine-learning
Repo
Framework

Deep Choice Model Using Pointer Networks for Airline Itinerary Prediction


Title	Deep Choice Model Using Pointer Networks for Airline Itinerary Prediction
Authors	Alejandro Mottini, Rodrigo Acuna-Agost
Abstract	Travel providers such as airlines and on-line travel agents are becoming more and more interested in understanding how passengers choose among alternative itineraries when searching for flights. This knowledge helps them better display and adapt their offer, taking into account market conditions and customer needs. Some common applications are not only filtering and sorting alternatives, but also changing certain attributes in real-time (e.g., changing the price). In this paper, we concentrate on the problem of modeling air passenger choices of flight itineraries. This problem has historically been tackled using classical Discrete Choice Modelling techniques. Traditional statistical approaches, in particular the Multinomial Logit model (MNL), are widely used in industrial applications due to their simplicity and generally good performance. However, MNL models present several shortcomings and assumptions that might not hold in real applications. To overcome these difficulties, we present a new choice model based on Pointer Networks. Given an input sequence, this type of deep neural architecture combines Recurrent Neural Networks with the Attention Mechanism to learn the conditional probability of an output whose values correspond to positions in an input sequence. Therefore, given a sequence of different alternatives presented to a customer, the model can learn to point to the one most likely to be chosen by the customer. The proposed method was evaluated on a real dataset that combines on-line user search logs and airline flight bookings. Experimental results show that the proposed model outperforms the traditional MNL model on several metrics.
Tasks
Published	2018-03-15
URL	http://arxiv.org/abs/1803.05976v1
PDF	http://arxiv.org/pdf/1803.05976v1.pdf
PWC	https://paperswithcode.com/paper/deep-choice-model-using-pointer-networks-for
Repo
Framework
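
For readers unfamiliar with the MNL baseline mentioned in the abstract above, the following is a minimal sketch of how a Multinomial Logit model turns per-alternative utilities into choice probabilities (a softmax over the choice set). The feature layout, the weights, and the function names are hypothetical and chosen only for illustration; this is not the authors' model or data.

```python
import math

def linear_utilities(alternatives, weights):
    """Utility of each alternative as a weighted sum of its features."""
    return [sum(w * x for w, x in zip(weights, alt)) for alt in alternatives]

def mnl_choice_probabilities(utilities):
    """MNL choice probabilities: softmax of the utilities within one choice set."""
    # Subtracting the maximum keeps exp() numerically stable and leaves the result unchanged.
    m = max(utilities)
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical choice set: (price in hundreds, number of stops) per itinerary.
itineraries = [(2.0, 0), (1.5, 1), (3.0, 0)]
# Hypothetical taste coefficients: higher price and more stops lower the utility.
weights = (-1.0, -0.8)

probs = mnl_choice_probabilities(linear_utilities(itineraries, weights))
# Index of the itinerary the model considers most likely to be chosen.
predicted = max(range(len(probs)), key=probs.__getitem__)
```

A pointer-network choice model, as described in the abstract, also outputs a distribution over positions in the presented sequence; the difference is that its scores come from a recurrent encoder with attention rather than from a fixed linear utility.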

Training LSTM Networks with Resistive Cross-Point Devices


Title	Training LSTM Networks with Resistive Cross-Point Devices
Authors	Tayfun Gokmen, Malte Rasch, Wilfried Haensch
Abstract	In our previous work we have shown that resistive cross point devices, so-called Resistive Processing Unit (RPU) devices, can provide significant power and speed benefits when training deep fully connected networks as well as convolutional neural networks. In this work, we further extend the RPU concept for training recurrent neural networks (RNNs), namely LSTMs. We show that the mapping of recurrent layers is very similar to the mapping of fully connected layers and therefore the RPU concept can potentially provide large acceleration factors for RNNs as well. In addition, we study the effect of various device imperfections and system parameters on training performance. Symmetry of updates becomes even more crucial for RNNs; already a few percent asymmetry results in an increase in the test error compared to the ideal case trained with floating point numbers. Furthermore, the input signal resolution to device arrays needs to be at least 7 bits for successful training. However, we show that a stochastic rounding scheme can reduce the input signal resolution back to 5 bits. Further, we find that RPU device variations and hardware noise are enough to mitigate overfitting, so that there is less need for using dropout. We note that the models trained here are roughly 1500 times larger than the fully connected network trained on the MNIST dataset in terms of the total number of multiplication and summation operations performed per epoch. Thus, here we attempt to study the validity of the RPU approach for large scale networks.
Tasks
Published	2018-06-01
URL	http://arxiv.org/abs/1806.00166v1
PDF	http://arxiv.org/pdf/1806.00166v1.pdf
PWC	https://paperswithcode.com/paper/training-lstm-networks-with-resistive-cross
Repo
Framework
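
The abstract above mentions a stochastic rounding scheme for the input signals. The sketch below shows the generic idea of stochastic rounding (round up with probability equal to the fractional remainder, so the rounded value is unbiased in expectation). It is an illustration of the general technique under assumed parameters, not the specific scheme used in the paper; the 5-bit step size over an assumed input range of [-1, 1] is a simplification.

```python
import math
import random

def stochastic_round(x, step, rng):
    """Round x to a multiple of step; round up with probability equal to the remainder."""
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # in [0, 1): distance above the lower grid point
    round_up = rng.random() < frac
    return (lower + (1 if round_up else 0)) * step

# Assumption for illustration: inputs lie in [-1, 1] and are quantized with a 5-bit step.
bits = 5
step = 2.0 / 2 ** bits  # 0.0625

rng = random.Random(0)  # fixed seed so the demo is reproducible
samples = [stochastic_round(0.3, step, rng) for _ in range(10000)]
mean = sum(samples) / len(samples)
# Each sample is one of the two neighbouring grid points (0.25 or 0.3125),
# while the average over many samples stays close to the original value 0.3.
```

Deterministic round-to-nearest would map 0.3 to 0.3125 every time, introducing a fixed bias; the stochastic variant trades that bias for zero-mean noise, which is why it can tolerate a coarser grid.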

Selectivity or Invariance: Boundary-aware Salient Object Detection


Title	Selectivity or Invariance: Boundary-aware Salient Object Detection
Authors	Jinming Su, Jia Li, Yu Zhang, Changqun Xia, Yonghong Tian
Abstract	Typically, a salient object detection (SOD) model faces opposite requirements in processing object interiors and boundaries. The features of interiors should be invariant to strong appearance change so as to pop out the salient object as a whole, while the features of boundaries should be selective to slight appearance change to distinguish salient objects and background. To address this selectivity-invariance dilemma, we propose a novel boundary-aware network with successive dilation for image-based SOD. In this network, the feature selectivity at boundaries is enhanced by incorporating a boundary localization stream, while the feature invariance at interiors is guaranteed with a complex interior perception stream. Moreover, a transition compensation stream is adopted to amend the probable failures in transitional regions between interiors and boundaries. In particular, an integrated successive dilation module is proposed to enhance the feature invariance at interiors and transitional regions. Extensive experiments on six datasets show that the proposed approach outperforms 16 state-of-the-art methods.
Tasks	Object Detection, Salient Object Detection
Published	2018-12-25
URL	https://arxiv.org/abs/1812.10066v3
PDF	https://arxiv.org/pdf/1812.10066v3.pdf
PWC	https://paperswithcode.com/paper/selectivity-or-invariance-boundary-aware
Repo
Framework

Performance Analysis of Plug-and-Play ADMM: A Graph Signal Processing Perspective


Title	Performance Analysis of Plug-and-Play ADMM: A Graph Signal Processing Perspective
Authors	Stanley H. Chan
Abstract	The Plug-and-Play (PnP) ADMM algorithm is a powerful image restoration framework that allows advanced image denoising priors to be integrated into physical forward models to generate high quality image restoration results. However, despite the enormous number of applications and several theoretical studies trying to prove the convergence by leveraging tools in convex analysis, very little is known about why the algorithm is doing so well. The goal of this paper is to fill the gap by discussing the performance of PnP ADMM. By restricting the denoisers to the class of graph filters under a linearity assumption, or more specifically the symmetric smoothing filters, we offer three contributions: (1) We show conditions under which an equivalent maximum-a-posteriori (MAP) optimization exists, (2) we present a geometric interpretation and show that the performance gain is due to an intrinsic pre-denoising characteristic of the PnP prior, (3) we introduce a new analysis technique via the concept of consensus equilibrium, and provide interpretations to problems involving multiple priors.
Tasks	Denoising, Image Denoising, Image Restoration
Published	2018-08-31
URL	https://arxiv.org/abs/1809.00020v3
PDF	https://arxiv.org/pdf/1809.00020v3.pdf
PWC	https://paperswithcode.com/paper/performance-analysis-of-plug-and-play-admm-a
Repo
Framework
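
To make the Plug-and-Play ADMM iteration discussed in the abstract above concrete, here is a minimal sketch for the simplest possible case: pure denoising (identity forward model) of a 1-D signal, with the prior's proximal step replaced by a linear denoiser. The denoiser is a hypothetical circular [0.25, 0.5, 0.25] filter, picked because its matrix is symmetric and smoothing; the function names, `rho`, and the iteration count are assumptions for illustration and are not taken from the paper.

```python
def smooth(z):
    """Hypothetical denoiser: circular [0.25, 0.5, 0.25] filter (a symmetric smoothing matrix)."""
    n = len(z)
    # z[i - 1] wraps around for i == 0, giving circular boundary handling.
    return [0.25 * z[i - 1] + 0.5 * z[i] + 0.25 * z[(i + 1) % n] for i in range(n)]

def pnp_admm(y, denoiser, rho=1.0, iterations=50):
    """PnP ADMM sketch for the data term 0.5 * ||x - y||^2 with a plug-in denoiser."""
    x = list(y)
    v = list(y)
    u = [0.0] * len(y)  # scaled dual variable
    for _ in range(iterations):
        # x-update: closed-form minimizer of 0.5*||x - y||^2 + (rho/2)*||x - (v - u)||^2.
        x = [(yi + rho * (vi - ui)) / (1.0 + rho) for yi, vi, ui in zip(y, v, u)]
        # v-update: the proximal operator of the prior is replaced by the denoiser.
        v = denoiser([xi + ui for xi, ui in zip(x, u)])
        # u-update: accumulate the residual between the two primal variables.
        u = [ui + xi - vi for ui, xi, vi in zip(u, x, v)]
    return x

# A constant signal is left unchanged by this filter and by the iteration.
flat = pnp_admm([1.0] * 8, smooth)
# A maximally oscillating signal is annihilated by this filter, so the estimate shrinks to zero.
zigzag = pnp_admm([1.0, -1.0] * 4, smooth)
```

The two demo signals are eigenvectors of the filter with eigenvalues 1 and 0, which is the kind of per-frequency behaviour that the graph-filter view in the abstract makes it possible to analyse.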

Open Domain Suggestion Mining: Problem Definition and Datasets


Title	Open Domain Suggestion Mining: Problem Definition and Datasets
Authors	Sapna Negi, Maarten de Rijke, Paul Buitelaar
Abstract	We propose a formal definition for the task of suggestion mining in the context of a wide range of open domain applications. Human perception of the term "suggestion" is subjective and this affects the preparation of hand labeled datasets for the task of suggestion mining. Existing work either lacks a formal problem definition and annotation procedure, or provides domain and application specific definitions. Moreover, many previously used manually labeled datasets remain proprietary. We first present an annotation study, and based on our observations propose a formal task definition and annotation procedure for creating benchmark datasets for suggestion mining. With this study, we also provide publicly available labeled datasets for suggestion mining in multiple domains.
Tasks
Published	2018-06-06
URL	http://arxiv.org/abs/1806.02179v2
PDF	http://arxiv.org/pdf/1806.02179v2.pdf
PWC	https://paperswithcode.com/paper/open-domain-suggestion-mining-problem
Repo
Framework

Self-Attentive Neural Collaborative Filtering


Title	Self-Attentive Neural Collaborative Filtering
Authors	Yi Tay, Shuai Zhang, Luu Anh Tuan, Siu Cheung Hui
Abstract	This paper has been withdrawn as we discovered a bug in our TensorFlow implementation that involved accidental mixing of vectors across batches. This led to different inference results given different batch sizes, which is completely strange. The performance scores still remain the same but we concluded that it was not the self-attention that contributed to the performance. We are withdrawing the paper because this renders the main claim of the paper false. Thanks to Guan Xinyu from NUS for discovering this issue in our previously open source code.
Tasks
Published	2018-06-17
URL	http://arxiv.org/abs/1806.06446v2
PDF	http://arxiv.org/pdf/1806.06446v2.pdf
PWC	https://paperswithcode.com/paper/self-attentive-neural-collaborative-filtering
Repo
Framework

A Progressive Batching L-BFGS Method for Machine Learning


Title	A Progressive Batching L-BFGS Method for Machine Learning
Authors	Raghu Bollapragada, Dheevatsa Mudigere, Jorge Nocedal, Hao-Jun Michael Shi, Ping Tak Peter Tang
Abstract	The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full batch approach, but since small batch sizes give rise to faster algorithms with better generalization properties, L-BFGS is currently not considered an algorithm of choice for large-scale machine learning applications. One need not, however, choose between the two extremes represented by the full batch or highly stochastic regimes, and may instead follow a progressive batching approach in which the sample size increases during the course of the optimization. In this paper, we present a new version of the L-BFGS algorithm that combines three basic components - progressive batching, a stochastic line search, and stable quasi-Newton updating - and that performs well on training logistic regression and deep neural networks. We provide supporting convergence theory for the method.
Tasks
Published	2018-02-15
URL	http://arxiv.org/abs/1802.05374v2
PDF	http://arxiv.org/pdf/1802.05374v2.pdf
PWC	https://paperswithcode.com/paper/a-progressive-batching-l-bfgs-method-for
Repo
Framework

3D Normal Coordinate Systems for Cortical Areas


Title	3D Normal Coordinate Systems for Cortical Areas
Authors	J. Tilak Ratnanather, Sylvain Arguillère, Kwame S. Kutten, Peter Hubka, Andrej Kral, Laurent Younes
Abstract	A surface-based diffeomorphic algorithm to generate 3D coordinate grids in the cortical ribbon is described. In the grid, normal coordinate lines are generated by the diffeomorphic evolution from the grey/white (inner) surface to the grey/csf (outer) surface. Specifically, the cortical ribbon is described by two triangulated surfaces with open boundaries. Conceptually, the inner surface sits on top of the white matter structure and the outer on top of the gray matter. It is assumed that the cortical ribbon consists of cortical columns which are orthogonal to the white matter surface. This might be viewed as a consequence of the development of the columns in the embryo. It is also assumed that the columns are orthogonal to the outer surface so that the resultant vector field is orthogonal to the evolving surface. Then the distance of the normal lines from the vector field such that the inner surface evolves diffeomorphically towards the outer one can be construed as a measure of thickness. Applications are described for the auditory cortices in human adults and cats with normal hearing or hearing loss. The approach offers great potential for cortical morphometry.
Tasks
Published	2018-06-28
URL	https://arxiv.org/abs/1806.11169v3
PDF	https://arxiv.org/pdf/1806.11169v3.pdf
PWC	https://paperswithcode.com/paper/3d-normal-coordinate-systems-for-cortical
Repo
Framework

Graph based Question Answering System


Title	Graph based Question Answering System
Authors	Piyush Mital, Saurabh Agarwal, Bhargavi Neti, Yashodhara Haribhakta, Vibhavari Kamble, Krishnanjan Bhattacharjee, Debashri Das, Swati Mehta, Ajai Kumar
Abstract	In today’s digital age, in the dawning era of big data analytics, it is not the information but the linking of information through entities and actions which defines the discourse. Any textual data, whether available on the Internet or off-line (like newspaper data, Wikipedia dump, etc.), is basically connected information which cannot be treated in isolation if its full semantics are to be captured. There is a need for an automated retrieval process with proper information extraction to structure the data for relevant and fast text analytics. The first big challenge is the conversion of unstructured textual data to structured data. Unlike other databases, graph databases handle relationships and connections elegantly. Our project aims at developing a graph-based information extraction and retrieval system.
Tasks	Question Answering
Published	2018-12-05
URL	http://arxiv.org/abs/1812.01828v1
PDF	http://arxiv.org/pdf/1812.01828v1.pdf
PWC	https://paperswithcode.com/paper/graph-based-question-answering-system
Repo
Framework

Neural language representations predict outcomes of scientific research


Title	Neural language representations predict outcomes of scientific research
Authors	James P. Bagrow, Daniel Berenberg, Joshua Bongard
Abstract	Many research fields codify their findings in standard formats, often by reporting correlations between quantities of interest. But the space of all testable correlates is far larger than scientific resources can currently address, so the ability to accurately predict correlations would be useful to plan research and allocate resources. Using a dataset of approximately 170,000 correlational findings extracted from leading social science journals, we show that a trained neural network can accurately predict the reported correlations using only the text descriptions of the correlates. Accurate predictive models such as these can guide scientists towards promising untested correlates, better quantify the information gained from new findings, and have implications for moving artificial intelligence systems from predicting structures to predicting relationships in the real world.
Tasks
Published	2018-05-17
URL	http://arxiv.org/abs/1805.06879v1
PDF	http://arxiv.org/pdf/1805.06879v1.pdf
PWC	https://paperswithcode.com/paper/neural-language-representations-predict
Repo
Framework

SD-CNN: a Shallow-Deep CNN for Improved Breast Cancer Diagnosis


Title	SD-CNN: a Shallow-Deep CNN for Improved Breast Cancer Diagnosis
Authors	Fei Gao, Teresa Wu, Jing Li, Bin Zheng, Lingxiang Ruan, Desheng Shang, Bhavika Patel
Abstract	Breast cancer is the second leading cause of cancer death among women worldwide. Nevertheless, it is also one of the most treatable malignancies if detected early. Screening for breast cancer with digital mammography (DM) has been widely used. However, it demonstrates limited sensitivity for women with dense breasts. An emerging technology in the field is contrast-enhanced digital mammography (CEDM), which includes a low energy (LE) image similar to DM, and a recombined image leveraging tumor neoangiogenesis similar to breast magnetic resonance imaging (MRI). CEDM has shown better diagnostic accuracy than DM. While promising, CEDM is not yet widely available across medical centers. In this research, we propose a Shallow-Deep Convolutional Neural Network (SD-CNN) where a shallow CNN is developed to derive “virtual” recombined images from LE images, and a deep CNN is employed to extract novel features from LE, recombined or “virtual” recombined images for ensemble models to classify the cases as benign vs. cancer. To evaluate the validity of our approach, we first develop a deep-CNN using 49 CEDM cases collected from Mayo Clinic to prove the contributions from recombined images for improved breast cancer diagnosis (0.86 in accuracy using LE imaging vs. 0.90 in accuracy using both LE and recombined imaging). We then develop a shallow-CNN using the same 49 CEDM cases to learn the nonlinear mapping from LE to recombined images. Next, we use 69 DM cases collected from the hospital located at Zhejiang University, China to generate “virtual” recombined images. Using DM alone provides 0.91 in accuracy, whereas SD-CNN improves the diagnostic accuracy to 0.95.
Tasks
Published	2018-03-01
URL	http://arxiv.org/abs/1803.00663v2
PDF	http://arxiv.org/pdf/1803.00663v2.pdf
PWC	https://paperswithcode.com/paper/sd-cnn-a-shallow-deep-cnn-for-improved-breast
Repo
Framework

Zeta Distribution and Transfer Learning Problem


Title	Zeta Distribution and Transfer Learning Problem
Authors	Eray Özkural
Abstract	We explore the relations between the zeta distribution and algorithmic information theory via a new model of the transfer learning problem. The program distribution is approximated by a zeta distribution with parameter near 1. We model the training sequence as a stochastic process. We analyze the upper temporal bound for learning a training sequence and its entropy rates, assuming an oracle for the transfer learning problem. We argue from empirical evidence that power-law models are suitable for natural processes. Four sequence models are proposed. The random typing model is like no-free-lunch, where transfer learning does not work. The zeta process independently samples programs from the zeta distribution. A model of common sub-programs inspired by genetics uses a database of sub-programs. An evolutionary zeta process samples mutations from the zeta distribution. The analysis of stochastic processes inspired by evolution suggests that AI may be feasible in nature, countering no-free-lunch sorts of arguments.
Tasks	Transfer Learning
Published	2018-06-23
URL	http://arxiv.org/abs/1806.08908v1
PDF	http://arxiv.org/pdf/1806.08908v1.pdf
PWC	https://paperswithcode.com/paper/zeta-distribution-and-transfer-learning
Repo
Framework
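
For reference, the zeta distribution named in the abstract above has the standard probability mass function below, stated here in its textbook form over the positive integers with parameter s > 1 (the symbols k and s are the usual notation, not necessarily the paper's). As s approaches 1 from above, the normalizing constant grows without bound and the tail becomes heavier, which is the "parameter near 1" regime the abstract refers to.

```latex
P(X = k) = \frac{k^{-s}}{\zeta(s)}, \qquad k = 1, 2, 3, \dots, \qquad
\zeta(s) = \sum_{n=1}^{\infty} n^{-s}, \quad s > 1 .
```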

Time-Agnostic Prediction: Predicting Predictable Video Frames


Title	Time-Agnostic Prediction: Predicting Predictable Video Frames
Authors	Dinesh Jayaraman, Frederik Ebert, Alexei A. Efros, Sergey Levine
Abstract	Prediction is arguably one of the most basic functions of an intelligent system. In general, the problem of predicting events in the future or between two waypoints is exceedingly difficult. However, most phenomena naturally pass through relatively predictable bottlenecks—while we cannot predict the precise trajectory of a robot arm between being at rest and holding an object up, we can be certain that it must have picked the object up. To exploit this, we decouple visual prediction from a rigid notion of time. While conventional approaches predict frames at regularly spaced temporal intervals, our time-agnostic predictors (TAP) are not tied to specific times so that they may instead discover predictable “bottleneck” frames no matter when they occur. We evaluate our approach for future and intermediate frame prediction across three robotic manipulation tasks. Our predictions are not only of higher visual quality, but also correspond to coherent semantic subgoals in temporally extended tasks.
Tasks
Published	2018-08-23
URL	http://arxiv.org/abs/1808.07784v3
PDF	http://arxiv.org/pdf/1808.07784v3.pdf
PWC	https://paperswithcode.com/paper/time-agnostic-prediction-predicting
Repo
Framework