Paper Group ANR 501
Feature-Area Optimization: A Novel SAR Image Registration Method. Fast Domain Adaptation for Neural Machine Translation. Message Passing Multi-Agent GANs. Evolutionary Image Transition Based on Theoretical Insights of Random Processes. What Do Recurrent Neural Network Grammars Learn About Syntax?. Enhanced Factored Three-Way Restricted Boltzmann Ma …
Feature-Area Optimization: A Novel SAR Image Registration Method
Title | Feature-Area Optimization: A Novel SAR Image Registration Method |
Authors | Fuqiang Liu, Fukun Bi, Liang Chen, Hao Shi, Wei Liu |
Abstract | This letter proposes a synthetic aperture radar (SAR) image registration method named Feature-Area Optimization (FAO). First, the traditional area-based optimization model is reconstructed and decomposed into three key but uncertain factors: initialization, slice set and regularization. Next, structural features are extracted by scale invariant feature transform (SIFT) in dual-resolution space (SIFT-DRS), a novel SIFT-like method dedicated to FAO. Then, the three key factors are determined based on these features. Finally, solving the factor-determined optimization model yields the registration result. A series of experiments demonstrate that the proposed method can register multi-temporal SAR images accurately and efficiently. |
Tasks | Image Registration |
Published | 2016-02-18 |
URL | http://arxiv.org/abs/1602.05660v1 |
http://arxiv.org/pdf/1602.05660v1.pdf | |
PWC | https://paperswithcode.com/paper/feature-area-optimization-a-novel-sar-image |
Repo | |
Framework | |
Fast Domain Adaptation for Neural Machine Translation
Title | Fast Domain Adaptation for Neural Machine Translation |
Authors | Markus Freitag, Yaser Al-Onaizan |
Abstract | Neural Machine Translation (NMT) is a new approach for automatic translation of text from one human language into another. The basic concept in NMT is to train a large Neural Network that maximizes the translation performance on a given parallel corpus. NMT is gaining popularity in the research community because it outperformed traditional SMT approaches in several translation tasks at WMT and other evaluation tasks/benchmarks, at least for some language pairs. However, many of the enhancements in SMT over the years have not been incorporated into the NMT framework. In this paper, we focus on one such enhancement, namely domain adaptation. We propose an approach for adapting an NMT system to a new domain. The main idea behind domain adaptation is to exploit the availability of large out-of-domain training data and a small amount of in-domain training data. We report significant gains with our proposed method in both automatic metrics and a human subjective evaluation metric on two language pairs. With our adaptation method, we show large improvement on the new domain while the performance of our general domain only degrades slightly. In addition, our approach is fast enough to adapt an already trained system to a new domain within a few hours without the need to retrain the NMT model on the combined data, which usually takes several days/weeks depending on the volume of the data. |
Tasks | Domain Adaptation, Machine Translation |
Published | 2016-12-20 |
URL | http://arxiv.org/abs/1612.06897v1 |
http://arxiv.org/pdf/1612.06897v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-domain-adaptation-for-neural-machine |
Repo | |
Framework | |
Message Passing Multi-Agent GANs
Title | Message Passing Multi-Agent GANs |
Authors | Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri |
Abstract | Communicating and sharing intelligence among agents is an important facet of achieving Artificial General Intelligence. As a first step towards this challenge, we introduce a novel framework for image generation: Message Passing Multi-Agent Generative Adversarial Networks (MPM GANs). While GANs have recently been shown to be very effective for image generation and other tasks, these networks have been limited to mostly single generator-discriminator networks. We show that we can obtain multi-agent GANs that communicate through message passing to achieve better image generation. The objectives of the individual agents in this framework are twofold: a co-operation objective and a competing objective. The co-operation objective ensures that the message sharing mechanism guides the other generator to generate better than itself while the competing objective encourages each generator to generate better than its counterpart. We analyze and visualize the messages that these GANs share among themselves in various scenarios. We quantitatively show that the message sharing formulation serves as a regularizer for the adversarial training. Qualitatively, we show that the different generators capture different traits of the underlying data distribution. |
Tasks | Image Generation |
Published | 2016-12-05 |
URL | http://arxiv.org/abs/1612.01294v1 |
http://arxiv.org/pdf/1612.01294v1.pdf | |
PWC | https://paperswithcode.com/paper/message-passing-multi-agent-gans |
Repo | |
Framework | |
Evolutionary Image Transition Based on Theoretical Insights of Random Processes
Title | Evolutionary Image Transition Based on Theoretical Insights of Random Processes |
Authors | Aneta Neumann, Bradley Alexander, Frank Neumann |
Abstract | Evolutionary algorithms have been widely studied from a theoretical perspective. In particular, the area of runtime analysis has contributed significantly to a theoretical understanding and provided insights into the working behaviour of these algorithms. We study how these insights into evolutionary processes can be used for evolutionary art. We introduce the notion of evolutionary image transition which transfers a given starting image into a target image through an evolutionary process. Combining standard mutation effects known from the optimization of the classical benchmark function OneMax and different variants of random walks, we present ways of performing evolutionary image transition with different artistic effects. |
Tasks | |
Published | 2016-04-21 |
URL | http://arxiv.org/abs/1604.06187v1 |
http://arxiv.org/pdf/1604.06187v1.pdf | |
PWC | https://paperswithcode.com/paper/evolutionary-image-transition-based-on |
Repo | |
Framework | |
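The transition idea in the abstract above (moving a starting image towards a target image through small random changes) can be illustrated with a toy sketch. This is not the authors' algorithm: the function names, the flattened-list image representation, and the "copy one random pixel from the target" mutation are all assumptions chosen for illustration, loosely in the spirit of OneMax, where the number of matching positions never decreases.

```python
import random


def transition_step(current, target, rng):
    """Pick one random pixel position and copy the target's value there."""
    i = rng.randrange(len(current))
    current[i] = target[i]


def evolve(start, target, seed=0):
    """Run random single-pixel steps until the image equals the target.

    Images are flat lists of pixel values (an illustrative assumption).
    Returns the final image and the number of steps taken.
    """
    rng = random.Random(seed)
    current = list(start)
    steps = 0
    while current != target:
        transition_step(current, target, rng)
        steps += 1
    return current, steps


if __name__ == "__main__":
    final, steps = evolve([0, 0, 0, 0], [9, 8, 7, 6], seed=1)
    print(final, "reached after", steps, "steps")
```

Every intermediate state of `current` is a blend of the two images, which is what makes the process usable as a visual transition; different mutation rules would give different visual effects.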
What Do Recurrent Neural Network Grammars Learn About Syntax?
Title | What Do Recurrent Neural Network Grammars Learn About Syntax? |
Authors | Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Graham Neubig, Noah A. Smith |
Abstract | Recurrent neural network grammars (RNNG) are a recently proposed probabilistic generative modeling family for natural language. They show state-of-the-art language modeling and parsing performance. We investigate what information they learn, from a linguistic perspective, through various ablations to the model and the data, and by augmenting the model with an attention mechanism (GA-RNNG) to enable closer inspection. We find that explicit modeling of composition is crucial for achieving the best performance. Through the attention mechanism, we find that headedness plays a central role in phrasal representation (with the model’s latent attention largely agreeing with predictions made by hand-crafted head rules, albeit with some important differences). By training grammars without nonterminal labels, we find that phrasal representations depend minimally on nonterminals, providing support for the endocentricity hypothesis. |
Tasks | Constituency Parsing, Dependency Parsing, Language Modelling |
Published | 2016-11-17 |
URL | http://arxiv.org/abs/1611.05774v2 |
http://arxiv.org/pdf/1611.05774v2.pdf | |
PWC | https://paperswithcode.com/paper/what-do-recurrent-neural-network-grammars |
Repo | |
Framework | |
Enhanced Factored Three-Way Restricted Boltzmann Machines for Speech Detection
Title | Enhanced Factored Three-Way Restricted Boltzmann Machines for Speech Detection |
Authors | Pengfei Sun, Jun Qin |
Abstract | In this letter, we propose enhanced factored three-way restricted Boltzmann machines (EFTW-RBMs) for speech detection. The proposed model incorporates conditional feature learning by multiplying the dynamical state of the third unit, which allows a modulation over the visible-hidden node pairs. Instead of stacking previous frames of speech as the third unit in a recursive manner, the correlation-related weighting coefficients are assigned to the contextual neighboring frames. Specifically, a threshold function is designed to capture the long-term features and blend the globally stored speech structure. A factored low rank approximation is introduced to reduce the parameters of the three-dimensional interaction tensor, on which a non-negative constraint is imposed to address the sparsity characteristic. The validations through the area-under-ROC-curve (AUC) and signal distortion ratio (SDR) show that our approach outperforms several existing 1D and 2D (i.e., time and time-frequency domain) speech detection algorithms in various noisy environments. |
Tasks | |
Published | 2016-11-01 |
URL | http://arxiv.org/abs/1611.00326v3 |
http://arxiv.org/pdf/1611.00326v3.pdf | |
PWC | https://paperswithcode.com/paper/enhanced-factored-three-way-restricted |
Repo | |
Framework | |
Polling-systems-based Autonomous Vehicle Coordination in Traffic Intersections with No Traffic Signals
Title | Polling-systems-based Autonomous Vehicle Coordination in Traffic Intersections with No Traffic Signals |
Authors | David Miculescu, Sertac Karaman |
Abstract | The rapid development of autonomous vehicles spurred a careful investigation of the potential benefits of all-autonomous transportation networks. Most studies conclude that autonomous systems can enable drastic improvements in performance. A widely studied concept is all-autonomous, collision-free intersections, where vehicles arriving in a traffic intersection with no traffic light adjust their speeds to cross safely through the intersection as quickly as possible. In this paper, we propose a coordination control algorithm for this problem, assuming stochastic models for the arrival times of the vehicles. The proposed algorithm provides provable guarantees on safety and performance. More precisely, it is shown that no collisions occur surely, and moreover a rigorous upper bound is provided for the expected wait time. The algorithm is also demonstrated in simulations. The proposed algorithms are inspired by polling systems. In fact, the problem studied in this paper leads to a new polling system where customers are subject to differential constraints, which may be interesting in its own right. |
Tasks | Autonomous Vehicles |
Published | 2016-07-26 |
URL | http://arxiv.org/abs/1607.07896v1 |
http://arxiv.org/pdf/1607.07896v1.pdf | |
PWC | https://paperswithcode.com/paper/polling-systems-based-autonomous-vehicle |
Repo | |
Framework | |
Analyzing Games with Ambiguous Player Types using the ${\rm MINthenMAX}$ Decision Model
Title | Analyzing Games with Ambiguous Player Types using the ${\rm MINthenMAX}$ Decision Model |
Authors | Ilan Nehama |
Abstract | In many common interactive scenarios, participants lack information about other participants, and specifically about the preferences of other participants. In this work, we model an extreme case of incomplete information, which we term games with type ambiguity, where a participant lacks even information enabling him to form a belief on the preferences of others. Under type ambiguity, one cannot analyze the scenario using the commonly used Bayesian framework, and therefore he needs to model the participants using a different decision model. In this work, we present the ${\rm MINthenMAX}$ decision model under ambiguity. This model is a refinement of Wald’s MiniMax principle, which we show to be too coarse for games with type ambiguity. We characterize ${\rm MINthenMAX}$ as the finest refinement of the MiniMax principle that satisfies three properties we claim are necessary for games with type ambiguity. This prior-less approach we present here also follows the common practice in computer science of worst-case analysis. Finally, we define and analyze the corresponding equilibrium concept assuming all players follow ${\rm MINthenMAX}$. We demonstrate this equilibrium by applying it to two common economic scenarios: coordination games and bilateral trade. We show that in both scenarios, an equilibrium in pure strategies always exists and we analyze the equilibria. |
Tasks | |
Published | 2016-03-04 |
URL | http://arxiv.org/abs/1603.01524v3 |
http://arxiv.org/pdf/1603.01524v3.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-games-with-ambiguous-player-types |
Repo | |
Framework | |
Modern Physiognomy: An Investigation on Predicting Personality Traits and Intelligence from the Human Face
Title | Modern Physiognomy: An Investigation on Predicting Personality Traits and Intelligence from the Human Face |
Authors | Rizhen Qin, Wei Gao, Huarong Xu, Zhanyi Hu |
Abstract | The human behavior of evaluating other individuals with respect to their personality traits and intelligence by evaluating their faces plays a crucial role in human relations. These trait judgments might influence important social outcomes in our lives such as elections and court sentences. Previous studies have reported that humans can make valid inferences for at least four personality traits. In addition, some studies have demonstrated that facial trait evaluation can be learned accurately using machine learning methods. In this work, we experimentally explore whether self-reported personality traits and intelligence can be predicted reliably from a facial image. More specifically, the prediction problem is separately cast in two parts: a classification task and a regression task. A facial structural feature is constructed from the relations among facial salient points, and an appearance feature is built by five texture descriptors. In addition, a minutia-based fingerprint feature from a fingerprint image is also explored. The classification results show that the personality traits “Rule-consciousness” and “Vigilance” can be predicted reliably, and that the traits of females can be predicted more accurately than those of males. However, the regression experiments show that it is difficult to predict scores for individual personality traits and intelligence. The residual plots and the correlation results indicate no evident linear correlation between the measured scores and the predicted scores. Both the classification and the regression results reveal that “Rule-consciousness” and “Tension” can be reliably predicted from the facial features, while “Social boldness” gets the worst prediction results. The experimental results show that it is difficult to predict intelligence from either the facial features or the fingerprint feature, a finding that is in agreement with previous studies. |
Tasks | |
Published | 2016-04-26 |
URL | http://arxiv.org/abs/1604.07499v1 |
http://arxiv.org/pdf/1604.07499v1.pdf | |
PWC | https://paperswithcode.com/paper/modern-physiognomy-an-investigation-on |
Repo | |
Framework | |
Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading
Title | Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading |
Authors | Xiao Yang, Dafang He, Wenyi Huang, Zihan Zhou, Alex Ororbia, Dan Kifer, C. Lee Giles |
Abstract | Physical library collections are valuable and long-standing resources for knowledge and learning. However, managing books in a large bookshelf and finding books on it often lead to tedious manual work, especially for large book collections where books might be missing or misplaced. Recently, deep neural models, such as Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), have achieved great success for scene text detection and recognition. Motivated by these recent successes, we aim to investigate their viability in facilitating book management, a task that introduces further challenges including large amounts of cluttered scene text, distortion, and varied lighting conditions. In this paper, we present a library inventory building and retrieval system based on scene text reading methods. We specifically design our scene text recognition model using rich supervision to accelerate training and achieve state-of-the-art performance on several benchmark datasets. Our proposed system has the potential to greatly reduce the amount of human labor required in managing book inventories as well as the space needed to store book information. |
Tasks | Scene Text Detection, Scene Text Recognition |
Published | 2016-11-22 |
URL | http://arxiv.org/abs/1611.07385v1 |
http://arxiv.org/pdf/1611.07385v1.pdf | |
PWC | https://paperswithcode.com/paper/smart-library-identifying-books-in-a-library |
Repo | |
Framework | |
Low-rank Bilinear Pooling for Fine-Grained Classification
Title | Low-rank Bilinear Pooling for Fine-Grained Classification |
Authors | Shu Kong, Charless Fowlkes |
Abstract | Pooling second-order local feature statistics to form a high-dimensional bilinear feature has been shown to achieve state-of-the-art performance on a variety of fine-grained classification tasks. To address the computational demands of high feature dimensionality, we propose to represent the covariance features as a matrix and apply a low-rank bilinear classifier. The resulting classifier can be evaluated without explicitly computing the bilinear feature map which allows for a large reduction in the compute time as well as decreasing the effective number of parameters to be learned. To further compress the model, we propose classifier co-decomposition that factorizes the collection of bilinear classifiers into a common factor and compact per-class terms. The co-decomposition idea can be deployed through two convolutional layers and trained in an end-to-end architecture. We suggest a simple yet effective initialization that avoids explicitly first training and factorizing the larger bilinear classifiers. Through extensive experiments, we show that our model achieves state-of-the-art performance on several public datasets for fine-grained classification trained with only category labels. Importantly, our final model is an order of magnitude smaller than the recently proposed compact bilinear model, and three orders smaller than the standard bilinear CNN model. |
Tasks | |
Published | 2016-11-16 |
URL | http://arxiv.org/abs/1611.05109v2 |
http://arxiv.org/pdf/1611.05109v2.pdf | |
PWC | https://paperswithcode.com/paper/low-rank-bilinear-pooling-for-fine-grained |
Repo | |
Framework | |
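The claim in the abstract above, that a low-rank bilinear classifier can be evaluated without explicitly forming the bilinear feature, rests on a simple trace identity. The sketch below checks it numerically in plain Python. The parameterization W = U Uᵀ − V Vᵀ, the variable names, and the tiny matrices are illustrative assumptions, not details confirmed by the abstract: with local features X (channels × locations) and bilinear feature B = X Xᵀ, the score ⟨W, B⟩ equals ‖Uᵀ X‖² − ‖Vᵀ X‖² (squared Frobenius norms), so B is never needed.

```python
def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def frob_sq(A):
    """Squared Frobenius norm."""
    return sum(v * v for row in A for v in row)


def score_lowrank(U, V, X):
    """Score via the factors only: ||U^T X||_F^2 - ||V^T X||_F^2."""
    return frob_sq(matmul(transpose(U), X)) - frob_sq(matmul(transpose(V), X))


def score_explicit(U, V, X):
    """Reference score: build B = X X^T and W = U U^T - V V^T, then <W, B>."""
    B = matmul(X, transpose(X))
    UUt = matmul(U, transpose(U))
    VVt = matmul(V, transpose(V))
    return sum(
        (UUt[i][j] - VVt[i][j]) * B[i][j]
        for i in range(len(B))
        for j in range(len(B))
    )


if __name__ == "__main__":
    X = [[1, 2], [3, 4], [0, 1]]  # 3 channels, 2 spatial locations
    U = [[1], [0], [2]]           # rank-1 positive factor
    V = [[0], [1], [1]]           # rank-1 negative factor
    print(score_lowrank(U, V, X), score_explicit(U, V, X))
```

For c channels and rank r, the factored form touches c × r parameters per factor instead of a c × c matrix, which is where the savings in compute and parameters come from.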
Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations
Title | Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations |
Authors | Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio |
Abstract | We introduce a method to train Quantized Neural Networks (QNNs), neural networks with extremely low precision (e.g., 1-bit) weights and activations, at run-time. At train-time the quantized weights and activations are used for computing the parameter gradients. During the forward pass, QNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations. As a result, power consumption is expected to be drastically reduced. We trained QNNs over the MNIST, CIFAR-10, SVHN and ImageNet datasets. The resulting QNNs achieve prediction accuracy comparable to their 32-bit counterparts. For example, our quantized version of AlexNet with 1-bit weights and 2-bit activations achieves $51\%$ top-1 accuracy. Moreover, we quantize the parameter gradients to 6 bits as well, which enables gradient computation using only bit-wise operations. Quantized recurrent neural networks were tested over the Penn Treebank dataset, and achieved accuracy comparable to their 32-bit counterparts using only 4 bits. Last but not least, we programmed a binary matrix multiplication GPU kernel with which it is possible to run our MNIST QNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The QNN code is available online. |
Tasks | |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.07061v1 |
http://arxiv.org/pdf/1609.07061v1.pdf | |
PWC | https://paperswithcode.com/paper/quantized-neural-networks-training-neural |
Repo | |
Framework | |
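To make the "bit-wise operations" remark in the abstract above concrete, here is a minimal sketch of how a dot product of two ±1 vectors can be computed with XOR and a population count. It is a generic illustration of 1-bit arithmetic, not the paper's GPU kernel; the helper names and the bit-packing convention (bit set means +1) are assumptions.

```python
def binarize(values):
    """Deterministic 1-bit quantization: map each value to +1 or -1 by sign."""
    return [1 if v >= 0 else -1 for v in values]


def pack(signs):
    """Pack a list of +1/-1 values into an int; bit i is set when signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s > 0:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    Positions where the bits differ contribute -1, matching positions
    contribute +1, so the result is n - 2 * (number of differing bits).
    """
    disagreements = bin(a_bits ^ b_bits).count("1")
    return n - 2 * disagreements


if __name__ == "__main__":
    a = binarize([0.3, -1.2, 0.7, 2.0])
    b = binarize([-0.5, -0.1, 0.9, -3.0])
    naive = sum(x * y for x, y in zip(a, b))
    fast = binary_dot(pack(a), pack(b), len(a))
    print(naive, fast)
```

On real hardware the same idea is applied to whole machine words at once, which is why binary matrix multiplication can replace many multiply-accumulate operations.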
Learning may need only a few bits of synaptic precision
Title | Learning may need only a few bits of synaptic precision |
Authors | Carlo Baldassi, Federica Gerace, Carlo Lucibello, Luca Saglietti, Riccardo Zecchina |
Abstract | Learning in neural networks poses peculiar challenges when using discretized rather than continuous synaptic states. The choice of discrete synapses is motivated by biological reasoning and experiments, and possibly by hardware implementation considerations as well. In this paper we extend a previous large deviations analysis which unveiled the existence of peculiar dense regions in the space of synaptic states which accounts for the possibility of learning efficiently in networks with binary synapses. We extend the analysis to synapses with multiple states and generally more plausible biological features. The results clearly indicate that the overall qualitative picture is unchanged with respect to the binary case, and very robust to variation of the details of the model. We also provide quantitative results which suggest that the advantages of increasing the synaptic precision (i.e., the number of internal synaptic states) rapidly vanish after the first few bits, and therefore that, for practical applications, only a few bits may be needed for near-optimal performance, consistent with recent biological findings. Finally, we demonstrate how the theoretical analysis can be exploited to design efficient algorithmic search strategies. |
Tasks | |
Published | 2016-02-12 |
URL | http://arxiv.org/abs/1602.04129v2 |
http://arxiv.org/pdf/1602.04129v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-may-need-only-a-few-bits-of-synaptic |
Repo | |
Framework | |
UberNet: Training a ‘Universal’ Convolutional Neural Network for Low-, Mid-, and High-Level Vision using Diverse Datasets and Limited Memory
Title | UberNet: Training a ‘Universal’ Convolutional Neural Network for Low-, Mid-, and High-Level Vision using Diverse Datasets and Limited Memory |
Authors | Iasonas Kokkinos |
Abstract | In this work we introduce a convolutional neural network (CNN) that jointly handles low-, mid-, and high-level vision tasks in a unified architecture that is trained end-to-end. Such a universal network can act like a ‘swiss knife’ for vision tasks; we call this architecture an UberNet to indicate its overarching nature. We address two main technical challenges that emerge when broadening the range of tasks handled by a single CNN: (i) training a deep architecture while relying on diverse training sets and (ii) training many (potentially unlimited) tasks with a limited memory budget. Properly addressing these two problems allows us to train accurate predictors for a host of tasks, without compromising accuracy. Through these advances we train in an end-to-end manner a CNN that simultaneously addresses (a) boundary detection, (b) normal estimation, (c) saliency estimation, (d) semantic segmentation, (e) human part segmentation, (f) semantic boundary detection, (g) region proposal generation and object detection. We obtain competitive performance while jointly addressing all of these tasks in 0.7 seconds per frame on a single GPU. A demonstration of this system can be found at http://cvn.ecp.fr/ubernet/. |
Tasks | Boundary Detection, Human Part Segmentation, Object Detection, Saliency Prediction, Semantic Segmentation |
Published | 2016-09-07 |
URL | http://arxiv.org/abs/1609.02132v1 |
http://arxiv.org/pdf/1609.02132v1.pdf | |
PWC | https://paperswithcode.com/paper/ubernet-training-a-universal-convolutional |
Repo | |
Framework | |
Can Active Learning Experience Be Transferred?
Title | Can Active Learning Experience Be Transferred? |
Authors | Hong-Min Chu, Hsuan-Tien Lin |
Abstract | Active learning is an important machine learning problem in reducing the human labeling effort. Current active learning strategies are designed from human knowledge, and are applied on each dataset in an immutable manner. In other words, experience about the usefulness of strategies cannot be updated and transferred to improve active learning on other datasets. This paper initiates a pioneering study on whether active learning experience can be transferred. We first propose a novel active learning model that linearly aggregates existing strategies. The linear weights can then be used to represent the active learning experience. We equip the model with the popular linear upper-confidence-bound (LinUCB) algorithm for contextual bandits to update the weights. Finally, we extend our model to transfer the experience across datasets with the technique of biased regularization. Empirical studies demonstrate that the learned experience not only is competitive with existing strategies on most single datasets, but also can be transferred across datasets to improve the performance on future learning tasks. |
Tasks | Active Learning |
Published | 2016-08-02 |
URL | http://arxiv.org/abs/1608.00667v1 |
http://arxiv.org/pdf/1608.00667v1.pdf | |
PWC | https://paperswithcode.com/paper/can-active-learning-experience-be-transferred |
Repo | |
Framework | |
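The abstract above describes linearly aggregating existing strategies and updating the weights with LinUCB. A minimal sketch of a shared-weight LinUCB learner follows, under stated assumptions: each candidate is described by a context vector `x` whose entries are the scores that the individual strategies assign to it, and the reward signal is left to the caller. The class name, the `alpha` default, and this use of contexts are illustrative choices, not the authors' exact model, and the biased-regularization transfer step is not shown.

```python
import math


class LinUCB:
    """Shared-parameter LinUCB with an incrementally maintained inverse."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha
        # A = I (ridge term) so its inverse also starts as the identity.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    @staticmethod
    def _matvec(M, v):
        return [sum(m * x for m, x in zip(row, v)) for row in M]

    def theta(self):
        """Current weight estimate A^-1 b (the 'experience' being learned)."""
        return self._matvec(self.A_inv, self.b)

    def ucb(self, x):
        """Estimated reward for context x plus an exploration bonus."""
        Ax = self._matvec(self.A_inv, x)
        mean = sum(t * xi for t, xi in zip(self.theta(), x))
        width = math.sqrt(max(sum(a * xi for a, xi in zip(Ax, x)), 0.0))
        return mean + self.alpha * width

    def update(self, x, reward):
        """Add x x^T to A (Sherman-Morrison on A^-1) and reward * x to b."""
        Ax = self._matvec(self.A_inv, x)
        denom = 1.0 + sum(a * xi for a, xi in zip(Ax, x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


if __name__ == "__main__":
    bandit = LinUCB(dim=2)
    bandit.update([1.0, 0.0], reward=1.0)
    candidates = [[1.0, 0.0], [0.0, 1.0]]
    best = max(candidates, key=bandit.ucb)
    print(bandit.theta(), best)
```

In use, one would score every unlabeled candidate with `ucb`, query the label of the highest-scoring one, and call `update` with whatever reward measures the usefulness of that query.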