Paper Group ANR 489
MEC: Memory-efficient Convolution for Deep Neural Network. Transferring Semantic Roles Using Translation and Syntactic Information. Iteratively-Reweighted Least-Squares Fitting of Support Vector Machines: A Majorization–Minimization Algorithm Approach. WMRB: Learning to Rank in a Scalable Batch Training Approach. Probabilistic Rule Realization and …
MEC: Memory-efficient Convolution for Deep Neural Network
Title | MEC: Memory-efficient Convolution for Deep Neural Network |
Authors | Minsik Cho, Daniel Brand |
Abstract | Convolution is a critical component in modern deep neural networks, so several algorithms for convolution have been developed. Direct convolution is simple but suffers from poor performance. As an alternative, multiple indirect methods have been proposed, including im2col-based convolution, FFT-based convolution, and Winograd-based algorithms. However, all these indirect methods have high memory overhead, which degrades performance and offers a poor trade-off between performance and memory consumption. In this work, we propose memory-efficient convolution (MEC) with compact lowering, which reduces memory overhead substantially and accelerates the convolution process. MEC lowers the input matrix in a simple yet compact way (i.e., with much less memory overhead), and then executes multiple small matrix multiplications in parallel to complete the convolution. Additionally, the reduced memory footprint improves memory sub-system efficiency, which further improves performance. Our experimental results show that MEC reduces memory consumption significantly with good speedup on both mobile and server platforms, compared with other indirect convolution algorithms. |
Tasks | |
Published | 2017-06-21 |
URL | http://arxiv.org/abs/1706.06873v1, http://arxiv.org/pdf/1706.06873v1.pdf |
PWC | https://paperswithcode.com/paper/mec-memory-efficient-convolution-for-deep |
Repo | |
Framework | |
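To make the memory overhead mentioned in the MEC abstract concrete, here is a minimal pure-Python sketch of the im2col baseline that the abstract names (not MEC itself). It assumes a single-channel image, stride 1, and no padding; the function names are illustrative.

```python
def im2col(image, k_h, k_w):
    """Lower a 2-D image into rows of flattened k_h x k_w patches (stride 1, no padding)."""
    i_h, i_w = len(image), len(image[0])
    o_h, o_w = i_h - k_h + 1, i_w - k_w + 1
    rows = []
    for r in range(o_h):
        for c in range(o_w):
            rows.append([image[r + dr][c + dc] for dr in range(k_h) for dc in range(k_w)])
    return rows, (o_h, o_w)


def conv2d_im2col(image, kernel):
    """Cross-correlation computed as one matrix-vector product over the lowered matrix."""
    k_h, k_w = len(kernel), len(kernel[0])
    lowered, (o_h, o_w) = im2col(image, k_h, k_w)
    flat_kernel = [w for row in kernel for w in row]
    flat_out = [sum(a * b for a, b in zip(patch, flat_kernel)) for patch in lowered]
    return [flat_out[r * o_w:(r + 1) * o_w] for r in range(o_h)]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, 1]]

lowered, _ = im2col(image, 2, 2)
input_elems = len(image) * len(image[0])         # elements in the original input
lowered_elems = len(lowered) * len(lowered[0])   # elements after lowering
output = conv2d_im2col(image, kernel)
```

Because overlapping patches are copied in full, the lowered matrix here holds more than twice as many elements as the input, and the ratio grows with kernel size. This duplication is the overhead that a more compact lowering aims to reduce.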
Transferring Semantic Roles Using Translation and Syntactic Information
Title | Transferring Semantic Roles Using Translation and Syntactic Information |
Authors | Maryam Aminian, Mohammad Sadegh Rasooli, Mona Diab |
Abstract | Our paper addresses the problem of annotation projection for semantic role labeling for resource-poor languages using supervised annotations from a resource-rich language through parallel data. We propose a transfer method that employs information from source and target syntactic dependencies as well as word alignment density to improve the quality of an iterative bootstrapping method. Our experiments yield a $3.5$ absolute labeled F-score improvement over a standard annotation projection method. |
Tasks | Semantic Role Labeling, Word Alignment |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01411v1, http://arxiv.org/pdf/1710.01411v1.pdf |
PWC | https://paperswithcode.com/paper/transferring-semantic-roles-using-translation |
Repo | |
Framework | |
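The standard annotation projection baseline mentioned in the abstract above can be sketched as copying role labels across word alignment links. This is a minimal illustration, not the authors' transfer method; the data layout and the `alignment_density` definition (fraction of target tokens with at least one link) are assumptions made for the example.

```python
def project_roles(source_roles, alignments, target_length):
    """Copy role labels from source tokens to their aligned target tokens.

    source_roles: dict mapping source token index -> role label
    alignments:   list of (source_index, target_index) pairs
    Returns one label per target token, None where nothing was projected.
    """
    projected = [None] * target_length
    for src, tgt in alignments:
        label = source_roles.get(src)
        if label is not None and projected[tgt] is None:
            projected[tgt] = label
    return projected


def alignment_density(alignments, target_length):
    """Assumed definition: fraction of target tokens with at least one alignment link."""
    aligned = {tgt for _, tgt in alignments}
    return len(aligned) / target_length


# Source: "She bought a book"; target: "Sie hat ein Buch gekauft"
source_roles = {0: "A0", 1: "PRED", 3: "A1"}
alignments = [(0, 0), (1, 4), (2, 2), (3, 3)]

projected = project_roles(source_roles, alignments, target_length=5)
density = alignment_density(alignments, target_length=5)
```

A density score of this kind could be used to filter out sentence pairs whose alignments are too sparse to project reliably, which is one plausible reading of how alignment density helps a bootstrapping loop.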
Iteratively-Reweighted Least-Squares Fitting of Support Vector Machines: A Majorization–Minimization Algorithm Approach
Title | Iteratively-Reweighted Least-Squares Fitting of Support Vector Machines: A Majorization–Minimization Algorithm Approach |
Authors | Hien D. Nguyen, Geoffrey J. McLachlan |
Abstract | Support vector machines (SVMs) are an important tool in modern data analysis. Traditionally, support vector machines have been fitted via quadratic programming, either using purpose-built or off-the-shelf algorithms. We present an alternative approach to SVM fitting via the majorization–minimization (MM) paradigm. Algorithms that are derived via MM algorithm constructions can be shown to monotonically decrease their objectives at each iteration, as well as be globally convergent to stationary points. We demonstrate the construction of iteratively-reweighted least-squares (IRLS) algorithms, via the MM paradigm, for SVM risk minimization problems involving the hinge, least-square, squared-hinge, and logistic losses, and 1-norm, 2-norm, and elastic net penalizations. Successful implementations of our algorithms are presented via some numerical examples. |
Tasks | |
Published | 2017-05-12 |
URL | http://arxiv.org/abs/1705.04651v1, http://arxiv.org/pdf/1705.04651v1.pdf |
PWC | https://paperswithcode.com/paper/iteratively-reweighted-least-squares-fitting |
Repo | |
Framework | |
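The link between MM and IRLS in the abstract above can be illustrated for the hinge loss. What follows is a standard majorization argument given as a sketch; the symbols $u$, $u_k$, $y_i$, $x_i$, and $\beta$ are notation assumed here, and the paper's own construction may differ in detail. Write the hinge loss as $\max(0,u) = \tfrac{1}{2}(u + |u|)$ with $u = 1 - y_i x_i^\top \beta$. For any current value $u_k \neq 0$, expanding $(|u| - |u_k|)^2 \ge 0$ gives

$$
|u| \le \frac{u^2}{2|u_k|} + \frac{|u_k|}{2},
\qquad
\max(0,u) \le \frac{u}{2} + \frac{u^2}{4|u_k|} + \frac{|u_k|}{4},
$$

with equality when $|u| = |u_k|$. The right-hand side is quadratic in $\beta$, so minimizing its sum over the samples (plus, for example, a 2-norm penalty) is a weighted least-squares problem whose weights are proportional to $1/|u_k|$ and are recomputed at every iteration.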
WMRB: Learning to Rank in a Scalable Batch Training Approach
Title | WMRB: Learning to Rank in a Scalable Batch Training Approach |
Authors | Kuan Liu, Prem Natarajan |
Abstract | We propose a new learning to rank algorithm, named Weighted Margin-Rank Batch loss (WMRB), to extend the popular Weighted Approximate-Rank Pairwise loss (WARP). WMRB uses a new rank estimator and an efficient batch training algorithm. The approach allows more accurate item rank approximation and explicit utilization of parallel computation to accelerate training. In three item recommendation tasks, WMRB consistently outperforms WARP and other baselines. Moreover, WMRB shows clear time efficiency advantages as data scale increases. |
Tasks | Learning-To-Rank |
Published | 2017-11-10 |
URL | http://arxiv.org/abs/1711.04015v1, http://arxiv.org/pdf/1711.04015v1.pdf |
PWC | https://paperswithcode.com/paper/wmrb-learning-to-rank-in-a-scalable-batch |
Repo | |
Framework | |
Probabilistic Rule Realization and Selection
Title | Probabilistic Rule Realization and Selection |
Authors | Haizi Yu, Tianxi Li, Lav R. Varshney |
Abstract | Abstraction and realization are bilateral processes that are key in deriving intelligence and creativity. In many domains, the two processes are approached through rules: high-level principles that reveal invariances within similar yet diverse examples. Under a probabilistic setting for discrete input spaces, we focus on the rule realization problem, which generates input sample distributions that follow the given rules. More ambitiously, we go beyond a mechanical realization that takes whatever is given and instead ask how to proactively select reasonable rules to realize. This goal is demanding in practice, since the initial rule set may not always be consistent and thus intelligent compromises are needed. We formulate both rule realization and selection as two strongly connected components within a single and symmetric bi-convex problem, and derive an efficient algorithm that works at large scale. Taking music compositional rules as the main example throughout the paper, we demonstrate our model’s efficiency in not only music realization (composition) but also music interpretation and understanding (analysis). |
Tasks | |
Published | 2017-09-06 |
URL | http://arxiv.org/abs/1709.01674v3, http://arxiv.org/pdf/1709.01674v3.pdf |
PWC | https://paperswithcode.com/paper/probabilistic-rule-realization-and-selection |
Repo | |
Framework | |
Parameter-free online learning via model selection
Title | Parameter-free online learning via model selection |
Authors | Dylan J. Foster, Satyen Kale, Mehryar Mohri, Karthik Sridharan |
Abstract | We introduce an efficient algorithmic framework for model selection in online learning, also known as parameter-free online learning. Departing from previous work, which has focused on highly structured function classes such as nested balls in Hilbert space, we propose a generic meta-algorithm framework that achieves online model selection oracle inequalities under minimal structural assumptions. We give the first computationally efficient parameter-free algorithms that work in arbitrary Banach spaces under mild smoothness assumptions; previous results applied only to Hilbert spaces. We further derive new oracle inequalities for matrix classes, non-nested convex sets, and $\mathbb{R}^{d}$ with generic regularizers. Finally, we generalize these results by providing oracle inequalities for arbitrary non-linear classes in the online supervised learning model. These results are all derived through a unified meta-algorithm scheme using a novel “multi-scale” algorithm for prediction with expert advice based on random playout, which may be of independent interest. |
Tasks | Model Selection |
Published | 2017-12-30 |
URL | http://arxiv.org/abs/1801.00101v2, http://arxiv.org/pdf/1801.00101v2.pdf |
PWC | https://paperswithcode.com/paper/parameter-free-online-learning-via-model |
Repo | |
Framework | |
Robust Lane Tracking with Multi-mode Observation Model and Particle Filtering
Title | Robust Lane Tracking with Multi-mode Observation Model and Particle Filtering |
Authors | Jiawei Huang, Zhaowen Wang |
Abstract | Automatic lane tracking involves estimating the underlying signal from a sequence of noisy signal observations. Many models and methods have been proposed for lane tracking, and for dynamic target tracking in general. The Kalman Filter is a widely used method that works well on linear Gaussian models. However, this paper shows that the Kalman Filter is not suitable for lane tracking, because its Gaussian observation model cannot faithfully represent the procured observations. We propose using a Particle Filter on top of a novel multi-mode observation model. Experiments show that our method outperforms a conventional Kalman Filter. |
Tasks | |
Published | 2017-06-28 |
URL | http://arxiv.org/abs/1706.09119v1, http://arxiv.org/pdf/1706.09119v1.pdf |
PWC | https://paperswithcode.com/paper/robust-lane-tracking-with-multi-mode |
Repo | |
Framework | |
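The idea in the abstract above, a particle filter paired with a non-Gaussian observation model, can be sketched with a one-dimensional bootstrap particle filter. The two-mode likelihood used here (a Gaussian around the state plus uniform clutter) and all parameter values are hypothetical stand-ins, not the observation model from the paper.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def multimode_likelihood(observation, state, std=0.2, clutter_weight=0.3, clutter_range=10.0):
    """Hypothetical two-mode model: Gaussian around the state plus uniform clutter."""
    inlier = (1.0 - clutter_weight) * gaussian_pdf(observation, state, std)
    outlier = clutter_weight / clutter_range
    return inlier + outlier


def particle_filter_step(particles, observation, rng, process_std=0.1):
    """One predict, weight, and resample step of a bootstrap particle filter."""
    # Predict: random-walk motion model.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]
    # Weight: evaluate the non-Gaussian likelihood at each particle.
    weights = [multimode_likelihood(observation, p) for p in predicted]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Estimate: weighted mean of the predicted particles.
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate, weights


rng = random.Random(0)
particles = [rng.uniform(-1.0, 1.0) for _ in range(500)]
observations = [0.0, 0.05, 4.0, 0.1, 0.0]  # the 4.0 reading is an outlier
estimates = []
for z in observations:
    particles, estimate, weights = particle_filter_step(particles, z, rng)
    estimates.append(estimate)
```

Because the clutter term puts a floor under the likelihood, an outlier reading leaves the particle weights nearly uniform instead of dragging the estimate toward it, which a single Gaussian observation model cannot express.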
Gradient Sparsification for Communication-Efficient Distributed Optimization
Title | Gradient Sparsification for Communication-Efficient Distributed Optimization |
Authors | Jianqiao Wangni, Jialei Wang, Ji Liu, Tong Zhang |
Abstract | Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed computational architectures. A key bottleneck is the communication overhead for exchanging information such as stochastic gradients among different workers. In this paper, to reduce the communication cost, we propose a convex optimization formulation to minimize the coding length of stochastic gradients. To solve the optimal sparsification efficiently, several simple and fast algorithms are proposed for approximate solutions, with theoretical guarantees for sparseness. Experiments on $\ell_2$ regularized logistic regression, support vector machines, and convolutional neural networks validate our sparsification approaches. |
Tasks | Distributed Optimization, Stochastic Optimization |
Published | 2017-10-26 |
URL | http://arxiv.org/abs/1710.09854v1, http://arxiv.org/pdf/1710.09854v1.pdf |
PWC | https://paperswithcode.com/paper/gradient-sparsification-for-communication |
Repo | |
Framework | |
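A common way to sparsify a stochastic gradient without biasing it is to drop each coordinate at random and rescale the survivors. The sketch below illustrates that idea for the abstract above; the magnitude-proportional keep probabilities are a hypothetical heuristic chosen for the example, not the solution of the paper's convex formulation.

```python
import random


def sparsify(gradient, keep_probs, rng):
    """Keep coordinate i with probability p_i and rescale it by 1 / p_i.

    The rescaling makes the sparse vector an unbiased estimate of the gradient.
    """
    sparse = []
    for g, p in zip(gradient, keep_probs):
        if p > 0.0 and rng.random() < p:
            sparse.append(g / p)
        else:
            sparse.append(0.0)
    return sparse


def magnitude_probs(gradient, scale):
    """Hypothetical heuristic: keep probability proportional to |g_i|, capped at 1."""
    return [min(1.0, scale * abs(g)) for g in gradient]


rng = random.Random(0)
gradient = [0.9, -0.05, 0.02, -1.2, 0.01]
probs = magnitude_probs(gradient, scale=1.0)

# Average many sparsified copies to check unbiasedness empirically.
trials = 20000
mean = [0.0] * len(gradient)
for _ in range(trials):
    for i, value in enumerate(sparsify(gradient, probs, rng)):
        mean[i] += value / trials
```

Small coordinates are transmitted rarely, so most messages contain only a few nonzero entries, while the average over many rounds still matches the dense gradient.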
Image-based immersed boundary model of the aortic root
Title | Image-based immersed boundary model of the aortic root |
Authors | Ali Hasan, Ebrahim M. Kolahdouz, Andinet Enquobahrie, Thomas G. Caranasos, John P. Vavalle, Boyce E. Griffith |
Abstract | Each year, approximately 300,000 heart valve repair or replacement procedures are performed worldwide, including approximately 70,000 aortic valve replacement surgeries in the United States alone. This paper describes progress in constructing anatomically and physiologically realistic immersed boundary (IB) models of the dynamics of the aortic root and ascending aorta. This work builds on earlier IB models of fluid-structure interaction (FSI) in the aortic root, which previously achieved realistic hemodynamics over multiple cardiac cycles, but which also were limited to simplified aortic geometries and idealized descriptions of the biomechanics of the aortic valve cusps. By contrast, the model described herein uses an anatomical geometry reconstructed from patient-specific computed tomography angiography (CTA) data, and employs a description of the elasticity of the aortic valve leaflets based on a fiber-reinforced constitutive model fit to experimental tensile test data. Numerical tests show that the model is able to resolve the leaflet biomechanics in diastole and early systole at practical grid spacings. The model is also used to examine differences in the mechanics and fluid dynamics yielded by fresh valve leaflets and glutaraldehyde-fixed leaflets similar to those used in bioprosthetic heart valves. Although there are large differences in the leaflet deformations during diastole, the differences in the open configurations of the valve models are relatively small, and nearly identical hemodynamics are obtained in all cases considered. |
Tasks | |
Published | 2017-05-04 |
URL | http://arxiv.org/abs/1705.04279v1, http://arxiv.org/pdf/1705.04279v1.pdf |
PWC | https://paperswithcode.com/paper/image-based-immersed-boundary-model-of-the |
Repo | |
Framework | |
Predicting non-linear dynamics by stable local learning in a recurrent spiking neural network
Title | Predicting non-linear dynamics by stable local learning in a recurrent spiking neural network |
Authors | Aditya Gilra, Wulfram Gerstner |
Abstract | Brains need to predict how the body reacts to motor commands. It is an open question how networks of spiking neurons can learn to reproduce the non-linear body dynamics caused by motor commands, using local, online and stable learning rules. Here, we present a supervised learning scheme for the feedforward and recurrent connections in a network of heterogeneous spiking neurons. The error in the output is fed back through fixed random connections with a negative gain, causing the network to follow the desired dynamics, while an online and local rule changes the weights. The rule for Feedback-based Online Local Learning Of Weights (FOLLOW) is local in the sense that weight changes depend on the presynaptic activity and the error signal projected onto the postsynaptic neuron. We provide examples of learning linear, non-linear and chaotic dynamics, as well as the dynamics of a two-link arm. Using the Lyapunov method, and under reasonable assumptions and approximations, we show that FOLLOW learning is stable uniformly, with the error going to zero asymptotically. |
Tasks | |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06463v2, http://arxiv.org/pdf/1702.06463v2.pdf |
PWC | https://paperswithcode.com/paper/predicting-non-linear-dynamics-by-stable |
Repo | |
Framework | |
AutoMode: Relational Learning With Less Black Magic
Title | AutoMode: Relational Learning With Less Black Magic |
Authors | Jose Picado, Sudhanshu Pathak, Arash Termehchy, Alan Fern |
Abstract | Relational databases are valuable resources for learning novel and interesting relations and concepts. Relational learning algorithms learn the Datalog definition of new relations in terms of the existing relations in the database. In order to constrain the search through the large space of candidate definitions, users must tune the algorithm by specifying a language bias. Unfortunately, specifying the language bias is done via trial and error and is guided by the expert’s intuitions. Hence, it normally takes a great deal of time and effort to use these algorithms effectively. In particular, it is hard to find a user who knows computer science concepts, such as database schema, and has a reasonable intuition about the target relation in special domains, such as biology. We propose AutoMode, a system that leverages information in the schema and content of the database to automatically induce the language bias used by popular relational learning systems. We show that AutoMode delivers the same accuracy as using manually written language bias while imposing only a slight overhead on the running time of the learning algorithm. |
Tasks | Relational Reasoning |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01420v1, http://arxiv.org/pdf/1710.01420v1.pdf |
PWC | https://paperswithcode.com/paper/automode-relational-learning-with-less-black |
Repo | |
Framework | |
Differentiable Scheduled Sampling for Credit Assignment
Title | Differentiable Scheduled Sampling for Credit Assignment |
Authors | Kartik Goyal, Chris Dyer, Taylor Berg-Kirkpatrick |
Abstract | We demonstrate that a continuous relaxation of the argmax operation can be used to create a differentiable approximation to greedy decoding for sequence-to-sequence (seq2seq) models. By incorporating this approximation into the scheduled sampling training procedure (Bengio et al., 2015), a well-known technique for correcting exposure bias, we introduce a new training objective that is continuous and differentiable everywhere and that can provide informative gradients near points where previous decoding decisions change their value. In addition, by using a related approximation, we demonstrate a similar approach to sampling-based training. Finally, we show that our approach outperforms cross-entropy training and scheduled sampling procedures in two sequence prediction tasks: named entity recognition and machine translation. |
Tasks | Machine Translation, Named Entity Recognition |
Published | 2017-04-23 |
URL | http://arxiv.org/abs/1704.06970v1, http://arxiv.org/pdf/1704.06970v1.pdf |
PWC | https://paperswithcode.com/paper/differentiable-scheduled-sampling-for-credit |
Repo | |
Framework | |
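A continuous relaxation of argmax, as described in the abstract above, is commonly built from a softmax with a large inverse temperature. The sketch below shows that construction in plain Python; the parameter name `alpha` and the toy scores and embeddings are assumptions for illustration, and the paper's exact formulation may differ.

```python
import math


def peaked_softmax(scores, alpha):
    """Softmax with inverse temperature alpha; approaches a one-hot argmax as alpha grows."""
    top = max(scores)
    exps = [math.exp(alpha * (s - top)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_argmax_embedding(scores, embeddings, alpha):
    """Weighted average of token embeddings: a differentiable stand-in for embedding the argmax token."""
    weights = peaked_softmax(scores, alpha)
    dim = len(embeddings[0])
    return [sum(w * emb[d] for w, emb in zip(weights, embeddings)) for d in range(dim)]


scores = [1.0, 3.0, 2.0]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

soft = soft_argmax_embedding(scores, embeddings, alpha=1.0)    # blends all three tokens
sharp = soft_argmax_embedding(scores, embeddings, alpha=50.0)  # nearly the embedding of token 1
```

Feeding the blended embedding to the next decoding step keeps the whole computation differentiable with respect to the scores, whereas a hard argmax has zero gradient almost everywhere.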
Fake News Mitigation via Point Process Based Intervention
Title | Fake News Mitigation via Point Process Based Intervention |
Authors | Mehrdad Farajtabar, Jiachen Yang, Xiaojing Ye, Huan Xu, Rakshit Trivedi, Elias Khalil, Shuang Li, Le Song, Hongyuan Zha |
Abstract | We propose the first multistage intervention framework that tackles fake news in social networks by combining reinforcement learning with a point process network activity model. The spread of fake news and mitigation events within the network is modeled by a multivariate Hawkes process with additional exogenous control terms. By choosing a feature representation of states, defining mitigation actions and constructing reward functions to measure the effectiveness of mitigation activities, we map the problem of fake news mitigation into the reinforcement learning framework. We develop a policy iteration method unique to the multivariate networked point process, with the goal of optimizing the actions for maximal total reward under budget constraints. Our method shows promising performance in real-time intervention experiments on a Twitter network to mitigate a surrogate fake news campaign, and outperforms alternatives on synthetic datasets. |
Tasks | |
Published | 2017-03-22 |
URL | http://arxiv.org/abs/1703.07823v2, http://arxiv.org/pdf/1703.07823v2.pdf |
PWC | https://paperswithcode.com/paper/fake-news-mitigation-via-point-process-based |
Repo | |
Framework | |
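The multivariate Hawkes process with exogenous control terms mentioned in the abstract above can be illustrated by computing its conditional intensity. This is a minimal sketch assuming an exponential decay kernel; the function name, argument layout, and the way the control term is added to the base rate are assumptions for the example, not the paper's parameterization.

```python
import math


def hawkes_intensity(t, base_rates, influence, events, decay=1.0, control=None):
    """Intensity of each node at time t for an exponential-kernel multivariate Hawkes process.

    base_rates: baseline rate mu_i of each node
    influence:  influence[j][i] is the excitation an event at node j adds to node i
    events:     list of (time, node) pairs observed so far
    control:    optional exogenous intensity per node (the intervention term)
    """
    n = len(base_rates)
    rates = list(base_rates)
    if control is not None:
        rates = [r + c for r, c in zip(rates, control)]
    for event_time, j in events:
        if event_time < t:  # only strictly earlier events excite the process
            kernel = math.exp(-decay * (t - event_time))
            for i in range(n):
                rates[i] += influence[j][i] * kernel
    return rates


base_rates = [0.1, 0.2]
influence = [[0.0, 0.5],
             [0.3, 0.0]]
events = [(1.0, 0)]

before = hawkes_intensity(1.0, base_rates, influence, events)
after = hawkes_intensity(2.0, base_rates, influence, events)
boosted = hawkes_intensity(2.0, base_rates, influence, events, control=[0.4, 0.0])
```

In a reinforcement learning setting, the control vector would play the role of the action: raising the exogenous intensity of mitigation events at chosen nodes, subject to a budget.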
Network Essence: PageRank Completion and Centrality-Conforming Markov Chains
Title | Network Essence: PageRank Completion and Centrality-Conforming Markov Chains |
Authors | Shang-Hua Teng |
Abstract | Jiří Matoušek (1963-2015) had many breakthrough contributions in mathematics and algorithm design. His milestone results are not only profound but also elegant. By going beyond the original objects (such as Euclidean spaces or linear programs), Jirka found the essence of the challenging mathematical/algorithmic problems as well as beautiful solutions that were natural to him, but were surprising discoveries to the field. In this short exploration article, I will first share with readers my initial encounter with Jirka and discuss one of his fundamental geometric results from the early 1990s. In the age of social and information networks, I will then turn the discussion from geometric structures to network structures, attempting to take a humble step towards the holy grail of network science, that is to understand the network essence that underlies the observed sparse-and-multifaceted network data. I will discuss a simple result which summarizes some basic algebraic properties of personalized PageRank matrices. Unlike the traditional transitive closure of binary relations, the personalized PageRank matrices take “accumulated Markovian closure” of network data. Some of these algebraic properties are known in various contexts. But I hope featuring them together in a broader context will help to illustrate the desirable properties of this Markovian completion of networks, and motivate systematic developments of a network theory for understanding vast and ubiquitous multifaceted network data. |
Tasks | |
Published | 2017-08-25 |
URL | http://arxiv.org/abs/1708.07906v1, http://arxiv.org/pdf/1708.07906v1.pdf |
PWC | https://paperswithcode.com/paper/network-essence-pagerank-completion-and |
Repo | |
Framework | |
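The personalized PageRank matrix discussed in the abstract above is usually defined as a geometric sum of random-walk powers. The sketch below computes it by truncating that sum; the restart probability `alpha`, the iteration count, and the row-vector convention are assumptions for the example and may not match the article's notation.

```python
def personalized_pagerank_matrix(adjacency, alpha=0.15, iterations=200):
    """Approximate M = alpha * sum_k (1 - alpha)^k W^k, with W the row-normalized adjacency matrix.

    Row u of M is the personalized PageRank vector of a walk that restarts at u
    with probability alpha at every step.
    """
    n = len(adjacency)
    walk = []
    for row in adjacency:
        degree = sum(row)
        if degree == 0:
            raise ValueError("every node needs at least one outgoing edge")
        walk.append([a / degree for a in row])

    matrix = [[0.0] * n for _ in range(n)]
    for u in range(n):
        dist = [1.0 if v == u else 0.0 for v in range(n)]  # walk distribution after k steps
        coeff = alpha                                       # alpha * (1 - alpha)^k
        for _ in range(iterations):
            for v in range(n):
                matrix[u][v] += coeff * dist[v]
            dist = [sum(dist[x] * walk[x][v] for x in range(n)) for v in range(n)]
            coeff *= 1.0 - alpha
    return matrix


# Path graph 0 - 1 - 2
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
ppr = personalized_pagerank_matrix(adjacency)
```

Each row of the result is a probability distribution, and the matrix is dense even though the input graph is sparse, which is the sense in which it acts as a "completion" of the network.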
Projection based advanced motion model for cubic mapping for 360-degree video
Title | Projection based advanced motion model for cubic mapping for 360-degree video |
Authors | Li Li, Zhu Li, Madhukar Budagavi, Houqiang Li |
Abstract | This paper proposes a novel advanced motion model to handle the irregular motion in the cubic map projection of 360-degree video. Since the irregular motion is mainly caused by the projection from the sphere to the cube map, we first project the pixels in both the current picture and the reference picture from the unfolded cube back to the sphere. Then, exploiting the fact that most motion on the sphere is uniform, we derive the relationship between the motion vectors of different pixels in the unfolded cube. The proposed advanced motion model is implemented in the High Efficiency Video Coding reference software. Experimental results demonstrate that a clear performance improvement can be achieved for sequences with pronounced motion. |
Tasks | |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06277v1, http://arxiv.org/pdf/1702.06277v1.pdf |
PWC | https://paperswithcode.com/paper/projection-based-advanced-motion-model-for |
Repo | |
Framework | |
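The projection from a cube face back to the sphere, which the abstract above relies on, can be sketched by normalizing a face point to unit length. The face names and axis orientation below are illustrative conventions, not those of any codec or of the paper.

```python
import math


def cube_face_to_sphere(face, u, v):
    """Map a point (u, v) in [-1, 1]^2 on a cube face to a unit vector on the sphere."""
    points = {
        "+x": (1.0, u, v),
        "-x": (-1.0, u, v),
        "+y": (u, 1.0, v),
        "-y": (u, -1.0, v),
        "+z": (u, v, 1.0),
        "-z": (u, v, -1.0),
    }
    x, y, z = points[face]
    norm = math.sqrt(x * x + y * y + z * z)
    return (x / norm, y / norm, z / norm)


def angular_step(face, u, v, du):
    """Angle on the sphere covered by a small horizontal step du on the cube face."""
    a = cube_face_to_sphere(face, u, v)
    b = cube_face_to_sphere(face, u + du, v)
    dot = max(-1.0, min(1.0, sum(p * q for p, q in zip(a, b))))
    return math.acos(dot)


center_step = angular_step("+z", 0.0, 0.0, 0.01)  # step near the face center
edge_step = angular_step("+z", 0.98, 0.0, 0.01)   # same step near the face edge
```

The same pixel displacement covers a smaller angle near the edge of a face than at its center, so motion that is uniform on the sphere looks non-uniform on the cube map. That mismatch is the kind of irregular motion the abstract refers to.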