Paper Group ANR 444
Finding Efficient Swimming Strategies in a Three Dimensional Chaotic Flow by Reinforcement Learning
Title | Finding Efficient Swimming Strategies in a Three Dimensional Chaotic Flow by Reinforcement Learning |
Authors | K. Gustavsson, L. Biferale, A. Celani, S. Colabrese |
Abstract | We apply a reinforcement learning algorithm to show how smart particles can learn approximately optimal strategies to navigate in complex flows. In this paper we consider microswimmers in a paradigmatic three-dimensional case given by a stationary superposition of two Arnold-Beltrami-Childress flows with chaotic advection along streamlines. In such a flow, we study the evolution of point-like particles which can decide in which direction to swim, while keeping the velocity amplitude constant. We show that it is sufficient to endow the swimmers with a very restricted set of actions (six fixed swimming directions in our case) to have enough freedom to find efficient strategies to move upward and escape local fluid traps. The key ingredient is the learning-from-experience structure of the algorithm, which assigns positive or negative rewards depending on whether the taken action is, or is not, profitable for the predetermined goal in the long term horizon. This is another example supporting the efficiency of the reinforcement learning approach to learn how to accomplish difficult tasks in complex fluid environments. |
Tasks | |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.05826v2 |
http://arxiv.org/pdf/1711.05826v2.pdf | |
PWC | https://paperswithcode.com/paper/finding-efficient-swimming-strategies-in-a |
Repo | |
Framework | |
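
The swimmer kinematics described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it uses a single classical ABC flow rather than the superposition of two flows studied in the paper, and the coefficients `A`, `B`, `C`, the swimming speed, the time step, and the choice of the six directions as the positive and negative coordinate axes are all assumptions made for illustration.

```python
import numpy as np

# Six candidate swimming directions. Assumed here to be the +/- coordinate
# axes; the abstract only states that there are six fixed directions.
ACTIONS = np.array([
    [1, 0, 0], [-1, 0, 0],
    [0, 1, 0], [0, -1, 0],
    [0, 0, 1], [0, 0, -1],
], dtype=float)


def abc_flow(pos, A=1.0, B=1.0, C=1.0):
    """Velocity of a single classical ABC flow at pos = (x, y, z).

    The coefficients are illustrative assumptions, not values from the paper.
    """
    x, y, z = pos
    return np.array([
        A * np.sin(z) + C * np.cos(y),
        B * np.sin(x) + A * np.cos(z),
        C * np.sin(y) + B * np.cos(x),
    ])


def step(pos, action, swim_speed=0.5, dt=0.01):
    """One explicit Euler step: flow advection plus constant-speed swimming."""
    velocity = abc_flow(pos) + swim_speed * ACTIONS[action]
    return pos + dt * velocity
```

A learning agent would call `step` repeatedly, choosing `action` from its current policy and receiving a reward tied to the vertical displacement; that learning loop is left out here.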
End-to-end Network for Twitter Geolocation Prediction and Hashing
Title | End-to-end Network for Twitter Geolocation Prediction and Hashing |
Authors | Jey Han Lau, Lianhua Chi, Khoi-Nguyen Tran, Trevor Cohn |
Abstract | We propose an end-to-end neural network to predict the geolocation of a tweet. The network takes as input several raw Twitter metadata fields, such as the tweet message and associated user account information. Our model is language independent, and despite minimal feature engineering, it is interpretable and capable of learning location-indicative words and timing patterns. It outperforms state-of-the-art systems by 2%-6%. Additionally, we propose extensions to the model that compress the representation learnt by the network into binary codes. Experiments show that it produces compact codes compared to benchmark hashing algorithms. An implementation of the model is released publicly. |
Tasks | Feature Engineering |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.04802v1 |
http://arxiv.org/pdf/1710.04802v1.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-network-for-twitter-geolocation |
Repo | |
Framework | |
A Fully-Automated Pipeline for Detection and Segmentation of Liver Lesions and Pathological Lymph Nodes
Title | A Fully-Automated Pipeline for Detection and Segmentation of Liver Lesions and Pathological Lymph Nodes |
Authors | Assaf Hoogi, John W. Lambert, Yefeng Zheng, Dorin Comaniciu, Daniel L. Rubin |
Abstract | We propose a fully-automated method for accurate and robust detection and segmentation of potentially cancerous lesions found in the liver and in lymph nodes. The process is performed in three steps, including organ detection, lesion detection and lesion segmentation. Our method applies machine learning techniques such as marginal space learning and convolutional neural networks, as well as active contour models. The method proves to be robust in its handling of extremely high lesion diversity. We tested our method on volumetric computed tomography (CT) images, including 42 volumes containing liver lesions and 86 volumes containing 595 pathological lymph nodes. Preliminary results under 10-fold cross validation show that for both the liver lesions and the lymph nodes, a total detection sensitivity of 0.53 and average Dice score of $0.71 \pm 0.15$ for segmentation were obtained. |
Tasks | Computed Tomography (CT), Lesion Segmentation, Organ Detection |
Published | 2017-03-19 |
URL | http://arxiv.org/abs/1703.06418v1 |
http://arxiv.org/pdf/1703.06418v1.pdf | |
PWC | https://paperswithcode.com/paper/a-fully-automated-pipeline-for-detection-and |
Repo | |
Framework | |
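
The Dice score quoted in the abstract measures overlap between a predicted and a reference segmentation mask. The sketch below uses the standard definition, 2|A ∩ B| / (|A| + |B|); the function name and the convention for two empty masks are assumptions, not details taken from the paper.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks are empty: treated as perfect agreement (a convention).
        return 1.0
    overlap = np.logical_and(pred, target).sum()
    return 2.0 * overlap / total
```

For volumetric CT data the same function applies unchanged, since it only counts voxels and does not depend on the array's dimensionality.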
Fuzzy Based Implicit Sentiment Analysis on Quantitative Sentences
Title | Fuzzy Based Implicit Sentiment Analysis on Quantitative Sentences |
Authors | Amir Hossein Yazdavar, Monireh Ebrahimi, Naomie Salim |
Abstract | With the rapid growth of social media on the web, emotional polarity computation has become a flourishing frontier in the text mining community. However, the diversity and size of social media data make it challenging to follow the latest trends and to summarize the state of, or general opinions about, products, which creates the need for automated, real-time opinion extraction and mining. Moreover, the bulk of current research has been devoted to studying subjective sentences that contain opinion keywords, and limited work has been reported on objective statements that imply sentiment. In this paper, a fuzzy-based knowledge engineering model is developed for sentiment classification of a special group of such sentences: those describing a change or deviation from a desired range or value. Drug reviews are a rich source of such statements. Therefore, in this research, experiments were carried out on patients' reviews of several different cholesterol-lowering drugs to determine their sentiment polarity. The main conclusion of this study is that, in order to increase the accuracy of existing drug opinion mining systems, objective sentences that imply opinion should be taken into account. Our experimental results demonstrate that the proposed model obtains an F1 value of over 72 percent. |
Tasks | Opinion Mining, Sentiment Analysis |
Published | 2017-01-03 |
URL | http://arxiv.org/abs/1701.00798v1 |
http://arxiv.org/pdf/1701.00798v1.pdf | |
PWC | https://paperswithcode.com/paper/fuzzy-based-implicit-sentiment-analysis-on |
Repo | |
Framework | |
A Novel Framework for Robustness Analysis of Visual QA Models
Title | A Novel Framework for Robustness Analysis of Visual QA Models |
Authors | Jia-Hong Huang, Cuong Duc Dao, Modar Alfadly, Bernard Ghanem |
Abstract | Deep neural networks have been playing an essential role in many computer vision tasks including Visual Question Answering (VQA). Until recently, the study of their accuracy was the main focus of research, but now there is a trend toward assessing the robustness of these models against adversarial attacks by evaluating their tolerance to varying noise levels. In VQA, adversarial attacks can target the image and/or the proposed main question, and yet there is a lack of proper analysis of the latter. In this work, we propose a flexible framework that focuses on the language part of VQA and uses semantically relevant questions, dubbed basic questions, as controllable noise to evaluate the robustness of VQA models. We hypothesize that the level of noise is positively correlated to the similarity of a basic question to the main question. Hence, to apply noise on any given main question, we rank a pool of basic questions based on their similarity by casting this ranking task as a LASSO optimization problem. Then, we propose a novel robustness measure, R_score, and two large-scale basic question datasets (BQDs) in order to standardize robustness analysis for VQA models. |
Tasks | Question Answering, Visual Question Answering |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.06232v3 |
http://arxiv.org/pdf/1711.06232v3.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-framework-for-robustness-analysis-of |
Repo | |
Framework | |
SEARNN: Training RNNs with Global-Local Losses
Title | SEARNN: Training RNNs with Global-Local Losses |
Authors | Rémi Leblond, Jean-Baptiste Alayrac, Anton Osokin, Simon Lacoste-Julien |
Abstract | We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the “learning to search” (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this training loss is not always an appropriate surrogate for the test error: by only maximizing the ground truth probability, it fails to exploit the wealth of information offered by structured losses. Further, it introduces discrepancies between training and predicting (such as exposure bias) that may hurt test performance. Instead, SEARNN leverages test-alike search space exploration to introduce global-local losses that are closer to the test error. We first demonstrate improved performance over MLE on two different tasks: OCR and spelling correction. Then, we propose a subsampling strategy to enable SEARNN to scale to large vocabulary sizes. This allows us to validate the benefits of our approach on a machine translation task. |
Tasks | Machine Translation, Optical Character Recognition, Spelling Correction, Structured Prediction |
Published | 2017-06-14 |
URL | http://arxiv.org/abs/1706.04499v3 |
http://arxiv.org/pdf/1706.04499v3.pdf | |
PWC | https://paperswithcode.com/paper/searnn-training-rnns-with-global-local-losses |
Repo | |
Framework | |
Hierarchical Multi-scale Attention Networks for Action Recognition
Title | Hierarchical Multi-scale Attention Networks for Action Recognition |
Authors | Shiyang Yan, Jeremy S. Smith, Wenjin Lu, Bailing Zhang |
Abstract | Recurrent Neural Networks (RNNs) have been widely used in natural language processing and computer vision. Among them, the Hierarchical Multi-scale RNN (HM-RNN), a recently proposed multi-scale hierarchical RNN, can learn the hierarchical temporal structure from data automatically. In this paper, we extend this work to the computer vision task of action recognition. However, in sequence-to-sequence models like RNNs, it is normally very hard to discover the relationships between inputs and outputs given static inputs. As a solution, an attention mechanism can be applied to extract the relevant information from the input, thus facilitating the modeling of input-output relationships. Based on these considerations, we propose a novel attention network, namely the Hierarchical Multi-scale Attention Network (HM-AN), which combines the HM-RNN and the attention mechanism, and apply it to action recognition. A recently proposed gradient estimation method for stochastic neurons, namely Gumbel-softmax, is exploited to implement the temporal boundary detectors and the stochastic hard attention mechanism. To ameliorate the negative effect of the Gumbel-softmax's sensitivity to temperature, an adaptive temperature training method is applied to improve system performance. The experimental results demonstrate the improvement of HM-AN over LSTM with attention on the vision task. Through visualization of what has been learnt by the networks, it can be observed that both the attention regions of images and the hierarchical temporal structure can be captured by HM-AN. |
Tasks | Temporal Action Localization |
Published | 2017-08-25 |
URL | http://arxiv.org/abs/1708.07590v2 |
http://arxiv.org/pdf/1708.07590v2.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-multi-scale-attention-networks |
Repo | |
Framework | |
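
The Gumbel-softmax relaxation mentioned in the abstract can be sketched in a few lines of NumPy. This is a generic illustration of the sampling step only, not the HM-AN implementation: the function name, the 1-D restriction, and the example logits and temperatures are assumptions, and the adaptive temperature training described in the abstract is not reproduced.

```python
import numpy as np


def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed one-hot sample from a 1-D vector of logits.

    Lower temperatures push the sample toward a hard one-hot vector;
    higher temperatures make it smoother.
    """
    logits = np.asarray(logits, dtype=float)
    # Gumbel(0, 1) noise via the inverse-CDF trick; the lower bound on u
    # avoids taking log(0).
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))
    scores = (logits + gumbel_noise) / temperature
    scores = scores - scores.max()  # subtract max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()


rng = np.random.default_rng(0)
soft_sample = gumbel_softmax_sample([1.0, 2.0, 3.0], temperature=5.0, rng=rng)
sharp_sample = gumbel_softmax_sample([1.0, 2.0, 3.0], temperature=0.1, rng=rng)
```

Because the output varies strongly with `temperature`, a fixed value can be hard to tune, which is the sensitivity the abstract's adaptive scheme is meant to address.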
Large-Scale YouTube-8M Video Understanding with Deep Neural Networks
Title | Large-Scale YouTube-8M Video Understanding with Deep Neural Networks |
Authors | Manuk Akopyan, Eshsou Khashba |
Abstract | The video classification problem has been studied for many years. The success of Convolutional Neural Networks (CNNs) in image recognition tasks gives researchers a powerful incentive to create more advanced video classification approaches. Because video has temporal content, Long Short-Term Memory (LSTM) networks become a handy tool for modeling long-term temporal clues. Both approaches need a large dataset of input data. In this paper, three models are provided to address video classification using the recently announced YouTube-8M large-scale dataset. The first model is based on a frame pooling approach. The two other models are based on LSTM networks. A Mixture of Experts intermediate layer is used in the third model, which increases model capacity without dramatically increasing computation. A set of experiments on handling imbalanced training data has been conducted. |
Tasks | Video Classification, Video Understanding |
Published | 2017-06-14 |
URL | http://arxiv.org/abs/1706.04488v1 |
http://arxiv.org/pdf/1706.04488v1.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-youtube-8m-video-understanding |
Repo | |
Framework | |
Curriculum Q-Learning for Visual Vocabulary Acquisition
Title | Curriculum Q-Learning for Visual Vocabulary Acquisition |
Authors | Ahmed H. Zaidi, Russell Moore, Ted Briscoe |
Abstract | The structure of a curriculum plays a vital role in our learning process, both as children and adults. Presenting material in ascending order of difficulty that also exploits prior knowledge can have a significant impact on the rate of learning. However, the notion of difficulty and prior knowledge differs from person to person. Motivated by the need for a personalised curriculum, we present a novel method of curriculum learning for vocabulary words in the form of visual prompts. We employ a reinforcement learning model grounded in pedagogical theories that emulates the actions of a tutor. We simulate three students with different levels of vocabulary knowledge in order to evaluate how well our model adapts to the environment. The results of the simulation reveal that through interaction, the model is able to identify the areas of weakness, as well as push students to the edge of their ZPD. We hypothesise that these methods can also be effective in training agents to learn language representations in a simulated environment where it has previously been shown that order of words and prior knowledge play an important role in the efficacy of language learning. |
Tasks | Q-Learning |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.10837v1 |
http://arxiv.org/pdf/1711.10837v1.pdf | |
PWC | https://paperswithcode.com/paper/curriculum-q-learning-for-visual-vocabulary |
Repo | |
Framework | |
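
The paper's title refers to Q-learning, whose core is a one-line temporal-difference update. The sketch below shows that standard tabular update only. How states and actions map onto students and visual prompts is not specified in the abstract, so the integer `state` and `action` indices, the table shape, and the `alpha` and `gamma` values are hypothetical.

```python
import numpy as np


def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the table.

    Q is a (num_states, num_actions) array; alpha is the learning rate and
    gamma the discount factor (both illustrative defaults).
    """
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])
    return Q


# Hypothetical setting: 3 knowledge states, 2 prompts the tutor can show.
Q = np.zeros((3, 2))
q_update(Q, state=0, action=1, reward=1.0, next_state=2, alpha=0.5)
```

In a tutoring simulation, `reward` would encode whether the simulated student answered the prompt correctly, and the tutor would pick prompts with an exploration policy such as epsilon-greedy over the rows of `Q`.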
Manifold Constrained Low-Rank Decomposition
Title | Manifold Constrained Low-Rank Decomposition |
Authors | Chen Chen, Baochang Zhang, Alessio Del Bue, Vittorio Murino |
Abstract | Low-rank decomposition (LRD) is a state-of-the-art method for visual data reconstruction and modelling. However, it is a very challenging problem when the image data contains significant occlusion, noise, illumination variation, and misalignment from rotation or viewpoint changes. We leverage the specific structure of data in order to improve the performance of LRD when the data are not ideal. To this end, we propose a new framework that embeds manifold priors into LRD. To implement the framework, we design an alternating direction method of multipliers (ADMM) method which efficiently integrates the manifold constraints during the optimization process. The proposed approach is successfully used to calculate low-rank models from face images, hand-written digits and planar surface images. The results show a consistent increase of performance when compared to the state-of-the-art over a wide range of realistic image misalignments and corruptions. |
Tasks | |
Published | 2017-08-06 |
URL | http://arxiv.org/abs/1708.01846v1 |
http://arxiv.org/pdf/1708.01846v1.pdf | |
PWC | https://paperswithcode.com/paper/manifold-constrained-low-rank-decomposition |
Repo | |
Framework | |
Certified Defenses for Data Poisoning Attacks
Title | Certified Defenses for Data Poisoning Attacks |
Authors | Jacob Steinhardt, Pang Wei Koh, Percy Liang |
Abstract | Machine learning systems trained on user-provided data are susceptible to data poisoning attacks, whereby malicious users inject false training data with the aim of corrupting the learned model. While recent work has proposed a number of attacks and defenses, little is understood about the worst-case loss of a defense in the face of a determined attacker. We address this by constructing approximate upper bounds on the loss across a broad family of attacks, for defenders that first perform outlier removal followed by empirical risk minimization. Our approximation relies on two assumptions: (1) that the dataset is large enough for statistical concentration between train and test error to hold, and (2) that outliers within the clean (non-poisoned) data do not have a strong effect on the model. Our bound comes paired with a candidate attack that often nearly matches the upper bound, giving us a powerful tool for quickly assessing defenses on a given dataset. Empirically, we find that even under a simple defense, the MNIST-1-7 and Dogfish datasets are resilient to attack, while in contrast the IMDB sentiment dataset can be driven from 12% to 23% test error by adding only 3% poisoned data. |
Tasks | Data Poisoning |
Published | 2017-06-09 |
URL | http://arxiv.org/abs/1706.03691v2 |
http://arxiv.org/pdf/1706.03691v2.pdf | |
PWC | https://paperswithcode.com/paper/certified-defenses-for-data-poisoning-attacks |
Repo | |
Framework | |
BiSeg: Simultaneous Instance Segmentation and Semantic Segmentation with Fully Convolutional Networks
Title | BiSeg: Simultaneous Instance Segmentation and Semantic Segmentation with Fully Convolutional Networks |
Authors | Viet-Quoc Pham, Satoshi Ito, Tatsuo Kozakaya |
Abstract | We present a simple and effective framework for simultaneous semantic segmentation and instance segmentation with Fully Convolutional Networks (FCNs). The method, called BiSeg, predicts instance segmentation as a posterior in Bayesian inference, where semantic segmentation is used as a prior. We extend the idea of position-sensitive score maps used in recent methods to a fusion of multiple score maps at different scales and partition modes, and adopt it as a robust likelihood for instance segmentation inference. As both Bayesian inference and map fusion are performed per pixel, BiSeg is a fully convolutional end-to-end solution that inherits all the advantages of FCNs. We demonstrate state-of-the-art instance segmentation accuracy on PASCAL VOC. |
Tasks | Bayesian Inference, Instance Segmentation, Semantic Segmentation |
Published | 2017-06-07 |
URL | http://arxiv.org/abs/1706.02135v2 |
http://arxiv.org/pdf/1706.02135v2.pdf | |
PWC | https://paperswithcode.com/paper/biseg-simultaneous-instance-segmentation-and |
Repo | |
Framework | |
The Sup-norm Perturbation of HOSVD and Low Rank Tensor Denoising
Title | The Sup-norm Perturbation of HOSVD and Low Rank Tensor Denoising |
Authors | Dong Xia, Fan Zhou |
Abstract | The higher order singular value decomposition (HOSVD) of tensors is a generalization of matrix SVD. The perturbation analysis of HOSVD under random noise is more delicate than its matrix counterpart. Recently, polynomial time algorithms have been proposed where statistically optimal estimates of the singular subspaces and the low rank tensors are attainable in the Euclidean norm. In this article, we analyze the sup-norm perturbation bounds of HOSVD and introduce estimators of the singular subspaces with sharp deviation bounds in the sup-norm. We also investigate a low rank tensor denoising estimator and demonstrate its fast convergence rate with respect to the entry-wise errors. The sup-norm perturbation bounds reveal unconventional phase transitions for statistical learning applications such as the exact clustering in high dimensional Gaussian mixture model and the exact support recovery in sub-tensor localizations. In addition, the bounds established for HOSVD also elaborate the one-sided sup-norm perturbation bounds for the singular subspaces of unbalanced (or fat) matrices. |
Tasks | Denoising |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01207v5 |
http://arxiv.org/pdf/1707.01207v5.pdf | |
PWC | https://paperswithcode.com/paper/the-sup-norm-perturbation-of-hosvd-and-low |
Repo | |
Framework | |
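
For readers unfamiliar with HOSVD, the decomposition itself can be sketched with NumPy: each factor matrix comes from the SVD of a mode unfolding, and the core tensor is the input projected onto those factors. This is the textbook truncated HOSVD, not the sup-norm estimators analyzed in the paper; all function names are chosen here for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize a tensor so that the given mode indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply a tensor by a matrix along the given mode."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def hosvd(tensor, ranks):
    """Truncated HOSVD: return the core tensor and one factor per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(U[:, :rank])  # leading left singular vectors
    core = tensor
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Map a core tensor back to the original space."""
    out = core
    for mode, U in enumerate(factors):
        out = mode_product(out, U, mode)
    return out
```

In a denoising setting, one would apply `hosvd` with small `ranks` to a noisy observation and use `reconstruct` as the estimate; the entry-wise error of such estimates is what sup-norm bounds control.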
Deep Network Guided Proof Search
Title | Deep Network Guided Proof Search |
Authors | Sarah Loos, Geoffrey Irving, Christian Szegedy, Cezary Kaliszyk |
Abstract | Deep learning techniques lie at the heart of several significant AI advances in recent years including object recognition and detection, image captioning, machine translation, speech recognition and synthesis, and playing the game of Go. Automated first-order theorem provers can aid in the formalization and verification of mathematical theorems and play a crucial role in program analysis, theory reasoning, security, interpolation, and system verification. Here we suggest deep learning based guidance in the proof search of the theorem prover E. We train and compare several deep neural network models on the traces of existing ATP proofs of Mizar statements and use them to select processed clauses during proof search. We give experimental evidence that with a hybrid, two-phase approach, deep learning based guidance can significantly reduce the average number of proof search steps while increasing the number of theorems proved. Using a few proof guidance strategies that leverage deep neural networks, we have found first-order proofs of 7.36% of the first-order logic translations of the Mizar Mathematical Library theorems that did not previously have ATP generated proofs. This increases the ratio of statements in the corpus with ATP generated proofs from 56% to 59%. |
Tasks | Game of Go, Image Captioning, Machine Translation, Object Recognition, Speech Recognition |
Published | 2017-01-24 |
URL | http://arxiv.org/abs/1701.06972v1 |
http://arxiv.org/pdf/1701.06972v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-network-guided-proof-search |
Repo | |
Framework | |
Hyper-dimensional computing for a visual question-answering system that is trainable end-to-end
Title | Hyper-dimensional computing for a visual question-answering system that is trainable end-to-end |
Authors | Guglielmo Montone, J. Kevin O’Regan, Alexander V. Terekhov |
Abstract | In this work we propose a system for visual question answering. Our architecture is composed of two parts: the first part creates the logical knowledge base given the image, and the second part evaluates questions against the knowledge base. Unlike previous work, the knowledge base is represented using hyper-dimensional computing. This choice has the advantage that all the operations in the system, namely creating the knowledge base and evaluating the questions against it, are differentiable, thereby making the system easily trainable in an end-to-end fashion. |
Tasks | Question Answering, Visual Question Answering |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10185v1 |
http://arxiv.org/pdf/1711.10185v1.pdf | |
PWC | https://paperswithcode.com/paper/hyper-dimensional-computing-for-a-visual |
Repo | |
Framework | |
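
A minimal sketch of how a knowledge base might be stored and queried with hyper-dimensional vectors is given below. It assumes random bipolar vectors, element-wise multiplication for binding, and summation for bundling, which is one common encoding and not necessarily the one the authors use; the role and filler names (`color`, `red`, and so on) are hypothetical.

```python
import numpy as np

DIM = 10_000  # dimensionality of the hypervectors (illustrative choice)
rng = np.random.default_rng(0)


def random_hv():
    """Random bipolar hypervector with entries in {-1, +1}."""
    return rng.choice([-1.0, 1.0], size=DIM)


def bind(a, b):
    """Bind two hypervectors; self-inverse for bipolar vectors."""
    return a * b


def bundle(*vectors):
    """Superimpose several hypervectors into one record."""
    return np.sum(vectors, axis=0)


def cosine(a, b):
    """Cosine similarity between two hypervectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical facts about one object: color = red, shape = square.
color, shape = random_hv(), random_hv()
red, square = random_hv(), random_hv()
record = bundle(bind(color, red), bind(shape, square))

# Query "what is the color?" by unbinding the role vector.
answer = bind(record, color)
```

Here `answer` is far more similar to `red` than to `square`, so a nearest-neighbour lookup recovers the fact. Binding and bundling are plain multiplications and additions, which is consistent with the abstract's point that such operations are differentiable.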