Paper Group ANR 333
Decoupling Learning Rules from Representations
Title | Decoupling Learning Rules from Representations |
Authors | Philip S. Thomas, Christoph Dann, Emma Brunskill |
Abstract | In the artificial intelligence field, learning often corresponds to changing the parameters of a parameterized function. A learning rule is an algorithm or mathematical expression that specifies precisely how the parameters should be changed. When creating an artificial intelligence system, we must make two decisions: what representation should be used (i.e., what parameterized function should be used) and what learning rule should be used to search through the resulting set of representable functions. Using most learning rules, these two decisions are coupled in a subtle (and often unintentional) way. That is, using the same learning rule with two different representations that can represent the same sets of functions can result in two different outcomes. After arguing that this coupling is undesirable, particularly when using artificial neural networks, we present a method for partially decoupling these two decisions for a broad class of learning rules that span unsupervised learning, reinforcement learning, and supervised learning. |
Tasks | |
Published | 2017-06-09 |
URL | http://arxiv.org/abs/1706.03100v1 |
http://arxiv.org/pdf/1706.03100v1.pdf | |
PWC | https://paperswithcode.com/paper/decoupling-learning-rules-from |
Repo | |
Framework | |
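The coupling described in this abstract can be seen in a toy case. The sketch below is an illustration only, not the paper's method: it assumes a scalar squared-error loss and two hypothetical parameterizations, `f = w` and `f = 2 * v`, which can represent exactly the same set of function values. The names `TARGET` and `LEARNING_RATE` are likewise made up for the example.

```python
# Toy illustration (assumed setup, not taken from the paper):
# two parameterizations of the same scalar function value f.
#   Representation A: f = w
#   Representation B: f = 2 * v

TARGET = 3.0
LEARNING_RATE = 0.1


def loss_grad_wrt_output(f):
    """Derivative of the squared error (f - TARGET)**2 with respect to f."""
    return 2.0 * (f - TARGET)


def step_representation_a(w):
    """One gradient-descent step on w, where f = w (df/dw = 1)."""
    grad_w = loss_grad_wrt_output(w) * 1.0
    return w - LEARNING_RATE * grad_w


def step_representation_b(v):
    """One gradient-descent step on v, where f = 2 * v (df/dv = 2)."""
    grad_v = loss_grad_wrt_output(2.0 * v) * 2.0
    return v - LEARNING_RATE * grad_v


# Both representations start at the same function value f = 1.0.
w_new = step_representation_a(1.0)
v_new = step_representation_b(0.5)

f_after_a = w_new          # function value under representation A
f_after_b = 2.0 * v_new    # function value under representation B
```

With these numbers the same learning rule and step size move the function value from 1.0 to 1.4 under representation A and to 2.6 under representation B, which is the kind of representation-dependent outcome the abstract refers to.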
One-shot and few-shot learning of word embeddings
Title | One-shot and few-shot learning of word embeddings |
Authors | Andrew K. Lampinen, James L. McClelland |
Abstract | Standard deep learning systems require thousands or millions of examples to learn a concept, and cannot integrate new concepts easily. By contrast, humans have an incredible ability to do one-shot or few-shot learning. For instance, from just hearing a word used in a sentence, humans can infer a great deal about it, by leveraging what the syntax and semantics of the surrounding words tell us. Here, we draw inspiration from this to highlight a simple technique by which deep recurrent networks can similarly exploit their prior knowledge to learn a useful representation for a new word from little data. This could make natural language processing systems much more flexible, by allowing them to learn continually from the new words they encounter. |
Tasks | Few-Shot Learning, Word Embeddings |
Published | 2017-10-27 |
URL | http://arxiv.org/abs/1710.10280v2 |
http://arxiv.org/pdf/1710.10280v2.pdf | |
PWC | https://paperswithcode.com/paper/one-shot-and-few-shot-learning-of-word |
Repo | |
Framework | |
Adversarial nets with perceptual losses for text-to-image synthesis
Title | Adversarial nets with perceptual losses for text-to-image synthesis |
Authors | Miriam Cha, Youngjune Gwon, H. T. Kung |
Abstract | Recent approaches in generative adversarial networks (GANs) can automatically synthesize realistic images from descriptive text. Despite the overall fair quality, the generated images often expose visible flaws that lack structural definition for an object of interest. In this paper, we aim to extend the state of the art for GAN-based text-to-image synthesis by improving the perceptual quality of generated images. Differentiated from previous work, our synthetic image generator optimizes on perceptual loss functions that measure pixel, feature activation, and texture differences against a natural image. We present visually more compelling synthetic images of birds and flowers generated from text descriptions in comparison to some of the most prominent existing work. |
Tasks | Image Generation |
Published | 2017-08-30 |
URL | http://arxiv.org/abs/1708.09321v1 |
http://arxiv.org/pdf/1708.09321v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-nets-with-perceptual-losses-for |
Repo | |
Framework | |
Natural Language Generation for Spoken Dialogue System using RNN Encoder-Decoder Networks
Title | Natural Language Generation for Spoken Dialogue System using RNN Encoder-Decoder Networks |
Authors | Van-Khanh Tran, Le-Minh Nguyen |
Abstract | Natural language generation (NLG) is a critical component in a spoken dialogue system. This paper presents a Recurrent Neural Network based Encoder-Decoder architecture, in which an LSTM-based decoder is introduced to select and aggregate semantic elements produced by an attention mechanism over the input elements, and to produce the required utterances. The proposed generator can be jointly trained on both sentence planning and surface realization to produce natural language sentences. The proposed model was extensively evaluated on four different NLG datasets. The experimental results showed that the proposed generators not only consistently outperform the previous methods across all the NLG domains but also show an ability to generalize from a new, unseen domain and learn from multi-domain datasets. |
Tasks | Text Generation |
Published | 2017-06-01 |
URL | http://arxiv.org/abs/1706.00139v3 |
http://arxiv.org/pdf/1706.00139v3.pdf | |
PWC | https://paperswithcode.com/paper/natural-language-generation-for-spoken |
Repo | |
Framework | |
Learning to Generate Posters of Scientific Papers by Probabilistic Graphical Models
Title | Learning to Generate Posters of Scientific Papers by Probabilistic Graphical Models |
Authors | Yu-ting Qiang, Yanwei Fu, Xiao Yu, Yanwen Guo, Zhi-Hua Zhou, Leonid Sigal |
Abstract | Researchers often summarize their work in the form of scientific posters. Posters provide a coherent and efficient way to convey core ideas expressed in scientific papers. Generating a good scientific poster, however, is a complex and time-consuming cognitive task, since such posters need to be readable, informative, and visually aesthetic. In this paper, for the first time, we study the challenging problem of learning to generate posters from scientific papers. To this end, a data-driven framework that utilizes graphical models is proposed. Specifically, given content to display, the key elements of a good poster, including attributes of each panel and arrangements of graphical elements, are learned and inferred from data. During the inference stage, an MAP inference framework is employed to incorporate some design principles. In order to bridge the gap between panel attributes and the composition within each panel, we also propose a recursive page splitting algorithm to generate the panel layout for a poster. To learn and validate our model, we collect and release a new benchmark dataset, called NJU-Fudan Paper-Poster dataset, which consists of scientific papers and corresponding posters with exhaustively labelled panels and attributes. Qualitative and quantitative results indicate the effectiveness of our approach. |
Tasks | |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06228v1 |
http://arxiv.org/pdf/1702.06228v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-generate-posters-of-scientific-1 |
Repo | |
Framework | |
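The abstract mentions a recursive page splitting algorithm but gives no details, so the following is only a generic sketch of the idea and should not be read as the authors' procedure. It assumes each panel has a relative size, cuts the panel list in half at every level, divides the rectangle in proportion to each half's total size, and alternates the cut direction. The function name `split_page` and the rectangle format `(x, y, width, height)` are hypothetical.

```python
def split_page(rect, panel_sizes, split_along_width=True):
    """Recursively split rect = (x, y, width, height) among panels.

    panel_sizes are relative areas. The list is cut in half, the rectangle
    is divided in proportion to each half's total size, and the split
    direction alternates at each level of recursion.
    """
    if len(panel_sizes) == 1:
        return [rect]
    mid = len(panel_sizes) // 2
    first, second = panel_sizes[:mid], panel_sizes[mid:]
    ratio = sum(first) / sum(panel_sizes)
    x, y, w, h = rect
    if split_along_width:   # left part and right part
        rect_a = (x, y, w * ratio, h)
        rect_b = (x + w * ratio, y, w * (1.0 - ratio), h)
    else:                   # top part and bottom part
        rect_a = (x, y, w, h * ratio)
        rect_b = (x, y + h * ratio, w, h * (1.0 - ratio))
    return (split_page(rect_a, first, not split_along_width)
            + split_page(rect_b, second, not split_along_width))


# Hypothetical poster of 100 x 60 units with four panels of relative size 2:1:1:2.
panels = split_page((0.0, 0.0, 100.0, 60.0), [2.0, 1.0, 1.0, 2.0])
```

Each returned rectangle has an area proportional to its panel's relative size, and the rectangles tile the page without overlap.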
Anomaly Detection and Modeling in 802.11 Wireless Networks
Title | Anomaly Detection and Modeling in 802.11 Wireless Networks |
Authors | Anisa Allahdadi, Ricardo Morla |
Abstract | IEEE 802.11 Wireless Networks are getting more and more popular at university campuses, enterprises, shopping centers, airports and in so many other public places, providing Internet access to a large crowd openly and quickly. The wireless users are also getting more dependent on WiFi technology and therefore demanding more reliability and higher performance for this vital technology. However, due to unstable radio conditions, faulty equipment, and dynamic user behavior among other reasons, there are always unpredictable performance problems in a wireless covered area. Detection and prediction of such problems is of great significance to network managers if they are to alleviate the connectivity issues of the mobile users and provide a higher quality wireless service. This paper aims to improve the management of the 802.11 wireless networks by characterizing and modeling wireless usage patterns in a set of anomalous scenarios that can occur in such networks. We apply time-invariant (Gaussian Mixture Models) and time-variant (Hidden Markov Models) modeling approaches to a dataset generated from a large production network and describe how we use these models for anomaly detection. We then generate several common anomalies on a Testbed network and evaluate the proposed anomaly detection methodologies in a controlled environment. The experimental results of the Testbed show that HMM outperforms GMM and yields a higher anomaly detection ratio and a lower false alarm rate. |
Tasks | Anomaly Detection |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.00948v1 |
http://arxiv.org/pdf/1707.00948v1.pdf | |
PWC | https://paperswithcode.com/paper/anomaly-detection-and-modeling-in-80211 |
Repo | |
Framework | |
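The abstract does not spell out how the fitted models are turned into an anomaly detector. One common approach with a Gaussian mixture, shown here purely as an assumed sketch, is to score each observation by its log-likelihood under the mixture and flag values that fall below a threshold. The one-dimensional feature, the mixture parameters, and the threshold below are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def gmm_log_likelihood(x, components):
    """Log-density of x under a mixture of (weight, mean, std) components."""
    density = sum(w * gaussian_pdf(x, m, s) for w, m, s in components)
    return math.log(density) if density > 0.0 else float("-inf")


def is_anomalous(x, components, threshold):
    """Flag x when its log-likelihood falls below the threshold."""
    return gmm_log_likelihood(x, components) < threshold


# Hypothetical mixture over a single usage feature (e.g. sessions per hour).
components = [(0.6, 10.0, 2.0), (0.4, 30.0, 5.0)]
threshold = -8.0

normal_flag = is_anomalous(11.0, components, threshold)
outlier_flag = is_anomalous(80.0, components, threshold)
```

A time-variant model such as an HMM would score whole sequences rather than single observations, which this sketch does not attempt.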
A machine learning approach for efficient uncertainty quantification using multiscale methods
Title | A machine learning approach for efficient uncertainty quantification using multiscale methods |
Authors | Shing Chan, Ahmed H. Elsheikh |
Abstract | Several multiscale methods account for sub-grid scale features using coarse scale basis functions. For example, in the Multiscale Finite Volume method the coarse scale basis functions are obtained by solving a set of local problems over dual-grid cells. We introduce a data-driven approach for the estimation of these coarse scale basis functions. Specifically, we employ a neural network predictor fitted using a set of solution samples from which it learns to generate subsequent basis functions at a lower computational cost than solving the local problems. The computational advantage of this approach is realized for uncertainty quantification tasks where a large number of realizations has to be evaluated. We attribute the ability to learn these basis functions to the modularity of the local problems and the redundancy of the permeability patches between samples. The proposed method is evaluated on elliptic problems yielding very promising results. |
Tasks | |
Published | 2017-11-12 |
URL | http://arxiv.org/abs/1711.04315v1 |
http://arxiv.org/pdf/1711.04315v1.pdf | |
PWC | https://paperswithcode.com/paper/a-machine-learning-approach-for-efficient |
Repo | |
Framework | |
Automatic Understanding of Image and Video Advertisements
Title | Automatic Understanding of Image and Video Advertisements |
Authors | Zaeem Hussain, Mingda Zhang, Xiaozhong Zhang, Keren Ye, Christopher Thomas, Zuha Agha, Nathan Ong, Adriana Kovashka |
Abstract | There is more to images than their objective physical content: for example, advertisements are created to persuade a viewer to take a certain action. We propose the novel problem of automatic advertisement understanding. To enable research on this problem, we create two datasets: an image dataset of 64,832 image ads, and a video dataset of 3,477 ads. Our data contains rich annotations encompassing the topic and sentiment of the ads, questions and answers describing what actions the viewer is prompted to take and the reasoning that the ad presents to persuade the viewer (“What should I do according to this ad, and why should I do it?”), and symbolic references ads make (e.g. a dove symbolizes peace). We also analyze the most common persuasive strategies ads use, and the capabilities that computer vision systems should have to understand these strategies. We present baseline classification results for several prediction tasks, including automatically answering questions about the messages of the ads. |
Tasks | |
Published | 2017-07-10 |
URL | http://arxiv.org/abs/1707.03067v1 |
http://arxiv.org/pdf/1707.03067v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-understanding-of-image-and-video |
Repo | |
Framework | |
Neural networks catching up with finite differences in solving partial differential equations in higher dimensions
Title | Neural networks catching up with finite differences in solving partial differential equations in higher dimensions |
Authors | V. I. Avrutskiy |
Abstract | Fully connected multilayer perceptrons are used for obtaining numerical solutions of partial differential equations in various dimensions. Independent variables are fed into the input layer, and the output is taken as the solution’s value. To train such a network one can use the square of the equation’s residual as a cost function and minimize it with respect to the weights by gradient descent. Following a previously developed method, derivatives of the equation’s residual along random directions in the space of independent variables are also added to the cost function. A similar procedure is known to produce nearly machine precision results using less than 8 grid points per dimension for the 2D case. The same effect is observed here for higher dimensions: solutions are obtained on low density grids, but maintain their precision in the entire region. Boundary value problems for linear and nonlinear Poisson equations are solved inside 2, 3, 4, and 5 dimensional balls. Grids for linear cases have 40, 159, 512 and 1536 points and for nonlinear 64, 350, 1536 and 6528 points respectively. In all cases maximum error is less than $8.8\cdot10^{-6}$, and median error is less than $2.4\cdot10^{-6}$. Very weak grid requirements enable neural networks to obtain the solution of the 5D linear problem within 22 minutes, whereas the projected solving time for finite differences on the same hardware is 50 minutes. The method is applied to a second order equation, but requires little to no modification to solve systems or higher order PDEs. |
Tasks | |
Published | 2017-12-14 |
URL | http://arxiv.org/abs/1712.05067v1 |
http://arxiv.org/pdf/1712.05067v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-networks-catching-up-with-finite |
Repo | |
Framework | |
Multiscale dictionary of rat locomotion
Title | Multiscale dictionary of rat locomotion |
Authors | Haozhe Shan, Peggy Mason |
Abstract | To effectively connect animal behaviors to activities and patterns in the nervous system, it is ideal to have a precise, accurate, and complete description of stereotyped modules and their dynamics in behaviors. In the case of rodent behaviors, observers have identified and described several stereotyped behaviors, such as grooming and lateral threat. Discovering behavioral repertoires in this way is imprecise, slow and contaminated with biases and individual differences. As a replacement, we propose a framework for unbiased, efficient and precise investigation of rat locomotor activities. We propose that locomotion possesses multiscale dynamics that can be well approximated by multiple Markov processes running in parallel at different spatial-temporal scales. To capture motifs and transition dynamics on multiple scales, we developed a segmentation-decomposition procedure, which imposes explicit constraints on timescales on parallel Hidden Markov Models (HMM). Each HMM describes the motifs and transition dynamics at its respective timescale. We showed that the motifs discovered across timescales have experimental significance and space-dependent heterogeneity. Through statistical tests, we show that locomotor dynamics largely conforms to the Markov property across scales. Finally, using layered HMMs, we showed that motif assembly is strongly constrained to a few fixed sequences. The motifs potentially reflect outputs of canonical underlying behavioral output motifs. Our approach and results for the first time capture behavioral dynamics at different spatial-temporal scales, painting a more complete picture of how behaviors are organized. |
Tasks | |
Published | 2017-07-11 |
URL | http://arxiv.org/abs/1707.03360v2 |
http://arxiv.org/pdf/1707.03360v2.pdf | |
PWC | https://paperswithcode.com/paper/multiscale-dictionary-of-rat-locomotion |
Repo | |
Framework | |
Exploring epoch-dependent stochastic residual networks
Title | Exploring epoch-dependent stochastic residual networks |
Authors | Fabio Carrara, Andrea Esuli, Fabrizio Falchi, Alejandro Moreo Fernández |
Abstract | The recently proposed stochastic residual networks selectively activate or bypass the layers during training, based on independent stochastic choices, each of which follows a probability distribution that is fixed in advance. In this paper we present a first exploration of the use of an epoch-dependent distribution, starting with a higher probability of bypassing deeper layers and then activating them more frequently as training progresses. Preliminary results are mixed, yet they show some potential of adding an epoch-dependent management of distributions, worthy of further investigation. |
Tasks | |
Published | 2017-04-20 |
URL | http://arxiv.org/abs/1704.06178v1 |
http://arxiv.org/pdf/1704.06178v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-epoch-dependent-stochastic-residual |
Repo | |
Framework | |
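The epoch-dependent idea in this abstract can be sketched as a schedule that maps a layer's depth and the current epoch to a bypass probability. The abstract does not give the actual distribution, so the linear form below, the `max_bypass` parameter, and the function names are assumptions made for illustration.

```python
import random


def bypass_probability(layer_index, num_layers, epoch, num_epochs, max_bypass=0.5):
    """Hypothetical schedule: deeper layers are bypassed more often,
    and every bypass probability decays linearly to zero over training."""
    depth_fraction = (layer_index + 1) / num_layers      # in (0, 1]
    remaining = 1.0 - epoch / max(num_epochs - 1, 1)     # 1 at start, 0 at end
    return max_bypass * depth_fraction * remaining


def sample_active_layers(num_layers, epoch, num_epochs, rng):
    """Independently decide, per layer, whether it is active for this step."""
    return [
        rng.random() >= bypass_probability(i, num_layers, epoch, num_epochs)
        for i in range(num_layers)
    ]


rng = random.Random(0)
first_epoch_mask = sample_active_layers(10, 0, 20, rng)
last_epoch_mask = sample_active_layers(10, 19, 20, rng)
```

Under this schedule the deepest layer starts with a bypass probability of `max_bypass`, and by the final epoch every layer is always active. A fixed-in-advance distribution, as in the original stochastic residual networks, corresponds to dropping the dependence on `epoch`.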
Costate-focused models for reinforcement learning
Title | Costate-focused models for reinforcement learning |
Authors | Bita Behrouzi, Xuefei Liu, Douglas Tweed |
Abstract | Many recent algorithms for reinforcement learning are model-free and founded on the Bellman equation. Here we present a method founded on the costate equation and models of the state dynamics. We use the costate – the gradient of cost with respect to state – to improve the policy and also to “focus” the model, training it to detect and mimic those features of the environment that are most relevant to its task. We show that this method can handle difficult time-optimal control problems, driving deterministic or stochastic mechanical systems quickly to a target. On these tasks it works well compared to deep deterministic policy gradient, a recent Bellman method. And because it creates a model, the costate method can also learn from mental practice. |
Tasks | |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.05817v5 |
http://arxiv.org/pdf/1711.05817v5.pdf | |
PWC | https://paperswithcode.com/paper/costate-focused-models-for-reinforcement |
Repo | |
Framework | |
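The abstract defines the costate as the gradient of cost with respect to state. As a small worked example of that definition, and not of the paper's algorithm, the sketch below assumes scalar linear dynamics `x[t+1] = a * x[t] + b * u[t]` with a sum-of-squares cost, computes the costate by the standard discrete-time backward recursion, and compares it with a finite-difference gradient. All names and numbers are hypothetical.

```python
def rollout(x0, controls, a, b):
    """Simulate x[t+1] = a * x[t] + b * u[t] and return all states."""
    states = [x0]
    for u in controls:
        states.append(a * states[-1] + b * u)
    return states


def total_cost(states):
    """Sum of squared states along the trajectory."""
    return sum(x * x for x in states)


def costates(states, a):
    """Backward recursion lam[t] = 2 * x[t] + a * lam[t+1], lam[T] = 2 * x[T].

    lam[t] is the gradient of the cost accumulated from time t onward
    with respect to the state x[t].
    """
    lam = [0.0] * len(states)
    lam[-1] = 2.0 * states[-1]
    for t in range(len(states) - 2, -1, -1):
        lam[t] = 2.0 * states[t] + a * lam[t + 1]
    return lam


a, b = 0.9, 0.5
controls = [0.1, -0.2, 0.3]
x0 = 1.0

lam = costates(rollout(x0, controls, a, b), a)

# Central finite difference of the total cost with respect to the initial state.
eps = 1e-6
cost_plus = total_cost(rollout(x0 + eps, controls, a, b))
cost_minus = total_cost(rollout(x0 - eps, controls, a, b))
finite_difference = (cost_plus - cost_minus) / (2.0 * eps)
```

Here `lam[0]` and `finite_difference` agree up to rounding error, since both measure how the total cost changes with the initial state.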
DS*: Tighter Lifting-Free Convex Relaxations for Quadratic Matching Problems
Title | DS*: Tighter Lifting-Free Convex Relaxations for Quadratic Matching Problems |
Authors | Florian Bernard, Christian Theobalt, Michael Moeller |
Abstract | In this work we study convex relaxations of quadratic optimisation problems over permutation matrices. While existing semidefinite programming approaches can achieve remarkably tight relaxations, they have the strong disadvantage that they lift the original $n {\times} n$-dimensional variable to an $n^2 {\times} n^2$-dimensional variable, which limits their practical applicability. In contrast, here we present a lifting-free convex relaxation that is provably at least as tight as existing (lifting-free) convex relaxations. We demonstrate experimentally that our approach is superior to existing convex and non-convex methods for various problems, including image arrangement and multi-graph matching. |
Tasks | Graph Matching |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.10733v2 |
http://arxiv.org/pdf/1711.10733v2.pdf | |
PWC | https://paperswithcode.com/paper/ds-tighter-lifting-free-convex-relaxations |
Repo | |
Framework | |
Sentiment Classification with Word Attention based on Weakly Supervised Learning with a Convolutional Neural Network
Title | Sentiment Classification with Word Attention based on Weakly Supervised Learning with a Convolutional Neural Network |
Authors | Gichang Lee, Jaeyun Jeong, Seungwan Seo, CzangYeob Kim, Pilsung Kang |
Abstract | In order to maximize the applicability of sentiment analysis results, it is necessary to not only classify the overall sentiment (positive/negative) of a given document but also to identify the main words that contribute to the classification. However, most datasets for sentiment analysis only have the sentiment label for each document or sentence. In other words, there is no information about which words play an important role in sentiment classification. In this paper, we propose a method for identifying key words discriminating positive and negative sentences by using a weakly supervised learning method based on a convolutional neural network (CNN). In our model, each word is represented as a continuous-valued vector and each sentence is represented as a matrix whose rows correspond to the word vectors used in the sentence. Then, the CNN model is trained using these sentence matrices as inputs and the sentiment labels as the output. Once the CNN model is trained, we implement the word attention mechanism that identifies high-contributing words to classification results with a class activation map, using the weights from the fully connected layer at the end of the learned CNN model. In order to verify the proposed methodology, we evaluated the classification accuracy and inclusion rate of polarity words using two movie review datasets. Experimental results show that the proposed model can not only correctly classify the sentence polarity but also successfully identify the corresponding words with high polarity scores. |
Tasks | Sentiment Analysis |
Published | 2017-09-28 |
URL | http://arxiv.org/abs/1709.09885v2 |
http://arxiv.org/pdf/1709.09885v2.pdf | |
PWC | https://paperswithcode.com/paper/sentiment-classification-with-word-attention |
Repo | |
Framework | |
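The class activation map step mentioned in this abstract can be sketched as a weighted sum of convolutional feature maps, using the fully connected weights of the class of interest. The version below is a minimal illustration under assumptions the abstract does not confirm: feature maps are aligned one-to-one with word positions, and each filter has a single fully connected weight per class. The sentence, activations, and weights are made-up numbers.

```python
def word_attention_scores(feature_maps, class_weights):
    """Class-activation-map score per word position.

    feature_maps[k][i] is the activation of convolutional filter k at
    word position i; class_weights[k] is the fully connected weight
    linking filter k to the class of interest.
    """
    num_positions = len(feature_maps[0])
    return [
        sum(w * fmap[i] for w, fmap in zip(class_weights, feature_maps))
        for i in range(num_positions)
    ]


# Hypothetical numbers for a four-word sentence and two filters.
words = ["the", "movie", "was", "wonderful"]
feature_maps = [
    [0.1, 0.2, 0.0, 0.9],   # filter 0
    [0.3, 0.1, 0.2, 0.1],   # filter 1
]
positive_class_weights = [1.5, -0.5]

scores = word_attention_scores(feature_maps, positive_class_weights)
top_word = words[scores.index(max(scores))]
```

In this toy input the highest score falls on "wonderful", which is the kind of high-polarity word the method is meant to surface from sentence-level labels alone.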
Material Classification in the Wild: Do Synthesized Training Data Generalise Better than Real-World Training Data?
Title | Material Classification in the Wild: Do Synthesized Training Data Generalise Better than Real-World Training Data? |
Authors | Grigorios Kalliatakis, Anca Sticlaru, George Stamatiadis, Shoaib Ehsan, Ales Leonardis, Juergen Gall, Klaus D. McDonald-Maier |
Abstract | We question the dominant role of real-world training images in the field of material classification by investigating whether synthesized data can generalise more effectively than real-world data. Experimental results on three challenging real-world material databases show that the best performing pre-trained convolutional neural network (CNN) architectures can achieve up to 91.03% mean average precision when classifying materials in cross-dataset scenarios. We demonstrate that synthesized data achieve an improvement on mean average precision when used as training data and in conjunction with pre-trained CNN architectures, which spans from ~5% to ~19% across three widely used material databases of real-world images. |
Tasks | Material Classification |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03874v1 |
http://arxiv.org/pdf/1711.03874v1.pdf | |
PWC | https://paperswithcode.com/paper/material-classification-in-the-wild-do |
Repo | |
Framework | |