Paper Group ANR 1403
MULE: Multimodal Universal Language Embedding. DAmageNet: A Universal Adversarial Dataset. Automatic Generation of Level Maps with the Do What’s Possible Representation. Dynamic Portfolio Management with Reinforcement Learning. Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference. The Universal Approximation Property: Characterization …
MULE: Multimodal Universal Language Embedding
Title | MULE: Multimodal Universal Language Embedding |
Authors | Donghyun Kim, Kuniaki Saito, Kate Saenko, Stan Sclaroff, Bryan A. Plummer |
Abstract | Existing vision-language methods typically support two languages at a time at most. In this paper, we present a modular approach which can easily be incorporated into existing vision-language methods in order to support many languages. We accomplish this by learning a single shared Multimodal Universal Language Embedding (MULE) which has been visually-semantically aligned across all languages. Then we learn to relate MULE to visual data as if it were a single language. Our method is not architecture specific, unlike prior work which typically learned separate branches for each language, enabling our approach to easily be adapted to many vision-language methods and tasks. Since MULE learns a single language branch in the multimodal model, we can also scale to support many languages, and languages with fewer annotations can take advantage of the good representation learned from other (more abundant) language data. We demonstrate the effectiveness of MULE on the bidirectional image-sentence retrieval task, supporting up to four languages in a single model. In addition, we show that Machine Translation can be used for data augmentation in multilingual learning, which, combined with MULE, improves mean recall by up to 21.9% on a single language compared to prior work, with the most significant gains seen on languages with relatively few annotations. Our code is publicly available. |
Tasks | Data Augmentation, Machine Translation |
Published | 2019-09-08 |
URL | https://arxiv.org/abs/1909.03493v2 |
https://arxiv.org/pdf/1909.03493v2.pdf | |
PWC | https://paperswithcode.com/paper/mule-multimodal-universal-language-embedding |
Repo | |
Framework | |
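The entry above describes per-language embeddings mapped into one shared MULE space, which a single language branch then relates to visual features. The sketch below is a minimal, hypothetical PyTorch rendering of that idea; the projection sizes, the GRU language branch, and the triplet-style alignment loss are all assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MULELikeModel(nn.Module):
    """Hypothetical sketch: per-language projections into a shared embedding,
    plus a single language branch shared across all languages (MULE-style)."""

    def __init__(self, vocab_sizes, word_dim=300, mule_dim=512, visual_dim=2048, joint_dim=512):
        super().__init__()
        # One word-embedding table per language, each projected into the shared MULE space.
        self.word_embs = nn.ModuleDict({
            lang: nn.Embedding(n, word_dim) for lang, n in vocab_sizes.items()
        })
        self.to_mule = nn.ModuleDict({
            lang: nn.Linear(word_dim, mule_dim) for lang in vocab_sizes
        })
        # A single language branch consumes MULE features regardless of source language.
        self.language_branch = nn.GRU(mule_dim, joint_dim, batch_first=True)
        self.visual_branch = nn.Linear(visual_dim, joint_dim)

    def encode_sentence(self, token_ids, lang):
        words = self.word_embs[lang](token_ids)      # (B, T, word_dim)
        mule = self.to_mule[lang](words)              # shared MULE space
        _, h = self.language_branch(mule)             # (1, B, joint_dim)
        return F.normalize(h.squeeze(0), dim=-1)

    def encode_image(self, visual_feats):
        return F.normalize(self.visual_branch(visual_feats), dim=-1)

def alignment_loss(img, txt, margin=0.2):
    """Simple triplet-style image-sentence alignment loss (an assumed choice)."""
    sim = img @ txt.t()                               # (B, B) cosine similarities
    pos = sim.diag().unsqueeze(1)
    cost = (margin + sim - pos).clamp(min=0)
    cost = cost * (1 - torch.eye(cost.size(0), device=cost.device))  # drop positives
    return cost.mean()
```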
DAmageNet: A Universal Adversarial Dataset
Title | DAmageNet: A Universal Adversarial Dataset |
Authors | Sizhe Chen, Xiaolin Huang, Zhengbao He, Chengjin Sun |
Abstract | It is now well known that deep neural networks (DNNs) are vulnerable to adversarial attack. Adversarial samples are similar to the clean ones, but are able to cheat the attacked DNN into producing incorrect predictions with high confidence. However, most existing adversarial attacks achieve a high success rate only when the information of the attacked DNN is well known or can be estimated through massive queries. A promising alternative is to generate adversarial samples with high transferability. In this way, we generate 96020 transferable adversarial samples from original ImageNet images. The difference from the originals, measured by root mean squared deviation, is only around 3.8 on average. Nevertheless, the adversarial samples are misclassified by various models with an error rate of up to 90%. Since the images are generated independently of the attacked DNNs, this is essentially a zero-query adversarial attack. We call the dataset *DAmageNet*, the first universal adversarial dataset that beats many models trained on ImageNet. By exposing such drawbacks, DAmageNet can serve as a benchmark to study and improve the robustness of DNNs. DAmageNet can be downloaded at http://www.pami.sjtu.edu.cn/Show/56/122. |
Tasks | Adversarial Attack |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07160v1 |
https://arxiv.org/pdf/1912.07160v1.pdf | |
PWC | https://paperswithcode.com/paper/damagenet-a-universal-adversarial-dataset |
Repo | |
Framework | |
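The entry above reports zero-query transfer attacks and an average root-mean-squared deviation of about 3.8. The abstract does not describe the specific attack used to build DAmageNet, so the sketch below uses a plain FGSM step on a surrogate model as a stand-in, together with the RMSD measure and a transfer evaluation against a separately trained target model; the choice of ResNet-50 and VGG-16 is arbitrary, and input normalization is omitted for brevity.

```python
import torch
import torch.nn.functional as F
from torchvision import models

def fgsm_on_surrogate(surrogate, x, y, eps=8 / 255):
    """Stand-in attack: one FGSM step on a surrogate model (not the paper's method)."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(surrogate(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def rmsd(x_adv, x, pixel_scale=255.0):
    """Root mean squared deviation between adversarial and clean images, in [0, 255] units."""
    return ((pixel_scale * (x_adv - x)) ** 2).mean().sqrt().item()

def transfer_error_rate(target, x_adv, y):
    """Zero-query evaluation: the target model never participated in crafting x_adv."""
    with torch.no_grad():
        pred = target(x_adv).argmax(dim=1)
    return (pred != y).float().mean().item()

# Usage sketch: craft on ResNet-50, evaluate transfer on VGG-16, with placeholder data.
surrogate = models.resnet50(weights="IMAGENET1K_V1").eval()
target = models.vgg16(weights="IMAGENET1K_V1").eval()
x, y = torch.rand(4, 3, 224, 224), torch.randint(0, 1000, (4,))
x_adv = fgsm_on_surrogate(surrogate, x, y)
print(rmsd(x_adv, x), transfer_error_rate(target, x_adv, y))
```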
Automatic Generation of Level Maps with the Do What’s Possible Representation
Title | Automatic Generation of Level Maps with the Do What’s Possible Representation |
Authors | Daniel Ashlock, Christoph Salge |
Abstract | Automatic generation of level maps is a popular form of automatic content generation. In this study, a recently developed technique employing the *do what’s possible* representation is used to create open-ended level maps. Generation of the map can continue indefinitely, yielding a highly scalable representation. A parameter study is performed to find good parameters for the evolutionary algorithm used to locate high-quality map generators. Variations on the technique are presented, demonstrating its versatility, and an algorithmic variant is given that both improves performance and changes the character of the maps located. The ability of the map to adapt to different regions where the map is permitted to occupy space is also tested. |
Tasks | |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09618v1 |
https://arxiv.org/pdf/1905.09618v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-generation-of-level-maps-with-the |
Repo | |
Framework | |
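The abstract describes evolving map generators with an evolutionary algorithm whose settings were tuned in a parameter study. The snippet below is only a generic (mu + lambda) evolutionary loop over real-valued generator parameters with a placeholder fitness function; it illustrates the search procedure, not the do-what's-possible representation itself.

```python
import random

def fitness(genome):
    """Placeholder: score the level map produced by a generator with these parameters."""
    return -sum((g - 0.5) ** 2 for g in genome)  # dummy objective for illustration

def mutate(genome, sigma=0.1):
    return [min(1.0, max(0.0, g + random.gauss(0, sigma))) for g in genome]

def evolve(genome_len=32, mu=10, lam=40, generations=100):
    """(mu + lambda) evolutionary algorithm over generator parameters."""
    population = [[random.random() for _ in range(genome_len)] for _ in range(mu)]
    for _ in range(generations):
        offspring = [mutate(random.choice(population)) for _ in range(lam)]
        population = sorted(population + offspring, key=fitness, reverse=True)[:mu]
    return population[0]

best_generator_params = evolve()
```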
Dynamic Portfolio Management with Reinforcement Learning
Title | Dynamic Portfolio Management with Reinforcement Learning |
Authors | Junhao Wang, Yinheng Li, Yijie Cao |
Abstract | Dynamic Portfolio Management is a domain that concerns the continuous redistribution of assets within a portfolio to maximize the total return in a given period of time. With the recent advancement in machine learning and artificial intelligence, much effort has been put into designing and discovering efficient algorithmic ways to manage the portfolio. This paper presents two different reinforcement learning agents, policy gradient actor-critic and evolution strategy. The performance of the two agents is compared during backtesting. We also discuss the problem setup, from state-space design to the state-value function approximator and policy control design. We include short positions to give the agent more flexibility during asset redistribution, as well as a constant trading cost of 0.25%. The agent is able to achieve a 5% return over 10 days of daily trading despite the 0.25% trading cost. |
Tasks | |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11880v1 |
https://arxiv.org/pdf/1911.11880v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-portfolio-management-with |
Repo | |
Framework | |
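The entry above fixes a constant 0.25% trading cost and allows short positions. The helper below shows one way the per-step portfolio return with that cost might be computed during backtesting; the exact accounting convention is not given in the abstract, so charging the cost on turnover is an assumption.

```python
import numpy as np

TRADING_COST = 0.0025  # 0.25% per unit of turnover, as stated in the abstract

def step_return(weights_new, weights_old, price_relatives):
    """One rebalancing step.

    Weights can be negative (short positions); price_relatives[i] = p_t[i] / p_{t-1}[i].
    Cost is charged on turnover |w_new - w_old| (an assumed convention).
    """
    gross = float(np.dot(weights_new, price_relatives - 1.0))
    turnover = float(np.abs(weights_new - weights_old).sum())
    return gross - TRADING_COST * turnover

# Example: move from an equal split into a long/short tilt over one trading day.
w_old = np.array([0.5, 0.5, 0.0])
w_new = np.array([0.8, 0.4, -0.2])
rel = np.array([1.01, 0.99, 1.02])
print(step_return(w_new, w_old, rel))
```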
Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference
Title | Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference |
Authors | Nikita Kitaev, Dan Klein |
Abstract | We present a constituency parsing algorithm that maps from word-aligned contextualized feature vectors to parse trees. Our algorithm proceeds strictly left-to-right, processing one word at a time by assigning it a label from a small vocabulary. We show that, with mild assumptions, our inference procedure requires constant computation time per word. Our method gets 95.4 F1 on the WSJ test set. |
Tasks | Constituency Parsing |
Published | 2019-04-22 |
URL | http://arxiv.org/abs/1904.09745v1 |
http://arxiv.org/pdf/1904.09745v1.pdf | |
PWC | https://paperswithcode.com/paper/tetra-tagging-word-synchronous-parsing-with |
Repo | |
Framework | |
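The abstract's key claims are strictly left-to-right processing, one label per word from a small vocabulary, and constant computation per word. The loop below illustrates only that inference pattern (a per-word argmax over a fixed label set); the actual mapping from tetra-tags to parse trees is defined in the paper and not reproduced here.

```python
import numpy as np

def tag_left_to_right(word_vectors, label_weights):
    """Assign one label per word with O(1) work per word.

    word_vectors: (T, d) contextualized features; label_weights: (d, K) scorer
    for a small, fixed label vocabulary of size K.
    """
    labels = []
    for h in word_vectors:              # strictly left-to-right, one word at a time
        scores = h @ label_weights      # constant-size score vector per word
        labels.append(int(scores.argmax()))
    return labels

tags = tag_left_to_right(np.random.randn(7, 768), np.random.randn(768, 8))
```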
The Universal Approximation Property: Characterizations, Existence, and a Canonical Topology for Deep-Learning
Title | The Universal Approximation Property: Characterizations, Existence, and a Canonical Topology for Deep-Learning |
Authors | Anastasis Kratsios |
Abstract | The universal approximation property (UAP) of feed-forward neural networks is systematically studied for arbitrary families of functions in general function spaces. Two characterizations of the UAP are found, conditions for the existence of a small family of functions with the UAP are given, and a canonical topology guaranteeing that a set of functions has the UAP is explicitly constructed. These general results are applied to two concrete problems in learning theory. First, it is shown that neural network architectures with a sigmoid activation function achieving the values 0 and 1 are capable of approximating any set function between two Euclidean spaces for the canonical topology. As a second application of our results, it is shown that any continuous function accepting an arbitrary number of inputs can be approximated by a neural network receiving an arbitrary number of inputs. This makes these networks suitable for learning problems where the dimension of the data is diverging, such as in ultra-high dimensional situations. |
Tasks | |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03344v2 |
https://arxiv.org/pdf/1910.03344v2.pdf | |
PWC | https://paperswithcode.com/paper/universal-approximation-theorems |
Repo | |
Framework | |
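For reference, the universal approximation property studied above amounts, informally, to density of the architecture's realizable functions in the target function space. A standard way to state it (an informal gloss, not the paper's exact formulation) is:

```latex
% A family \mathcal{F} \subseteq C(X, Y) has the UAP for a topology \tau on C(X, Y)
% iff its closure is everything: \overline{\mathcal{F}}^{\,\tau} = C(X, Y).
% For the topology of uniform convergence on compact sets this reads:
\forall f \in C(X,Y),\ \forall \varepsilon > 0,\ \forall K \subseteq X \text{ compact},\quad
\exists g \in \mathcal{F} \ \text{such that} \ \sup_{x \in K} d_Y\big(g(x), f(x)\big) < \varepsilon .
```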
Enabling Highly Efficient Capsule Networks Processing Through A PIM-Based Architecture Design
Title | Enabling Highly Efficient Capsule Networks Processing Through A PIM-Based Architecture Design |
Authors | Xingyao Zhang, Shuaiwen Leon Song, Chenhao Xie, Jing Wang, Weigong Zhang, Xin Fu |
Abstract | In recent years, CNNs have achieved great success in image processing tasks, e.g., image recognition and object detection. Unfortunately, traditional CNN classification is easily misled by increasingly complex image features due to the usage of pooling operations, and hence unable to preserve accurate position and pose information of the objects. To address this challenge, a novel neural network structure called Capsule Network has been proposed, which introduces equivariance through capsules to significantly enhance the learning ability for image segmentation and object detection. Because they require a high volume of matrix operations, CapsNets are generally accelerated on modern GPU platforms that provide highly optimized software libraries for common deep learning tasks. However, based on our performance characterization on modern GPUs, CapsNets exhibit low efficiency due to the special program and execution features of their routing procedure, including massive unshareable intermediate variables and intensive synchronizations, which are very difficult to optimize at the software level. To address these challenges, we propose a hybrid computing architecture design named *PIM-CapsNet*. It preserves the GPU’s on-chip computing capability for accelerating the CNN-type layers in CapsNet, while pipelining with an off-chip in-memory acceleration solution that effectively tackles the routing procedure’s inefficiency by leveraging the processing-in-memory capability of today’s 3D stacked memory. Using the routing procedure’s inherent parallelization feature, our design enables hierarchical improvements in CapsNet inference efficiency by minimizing data movement and maximizing parallel processing in memory. |
Tasks | Object Detection, Semantic Segmentation |
Published | 2019-11-07 |
URL | https://arxiv.org/abs/1911.03451v1 |
https://arxiv.org/pdf/1911.03451v1.pdf | |
PWC | https://paperswithcode.com/paper/enabling-highly-efficient-capsule-networks |
Repo | |
Framework | |
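The bottleneck this design targets is the capsule routing procedure. As a point of reference, the sketch below implements the widely used dynamic routing-by-agreement (Sabour et al.) in NumPy, one common instance of the routing procedure whose synchronizations and unshareable intermediate variables the abstract describes; the paper's hardware mapping itself is not shown.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    norm2 = (s ** 2).sum(axis=axis, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def dynamic_routing(u_hat, iterations=3):
    """Routing-by-agreement between two capsule layers.

    u_hat: (num_in, num_out, dim_out) prediction vectors from lower capsules.
    Returns v: (num_out, dim_out) output capsule vectors.
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                  # routing logits (per-example intermediates)
    for _ in range(iterations):                      # each iteration is a global synchronization
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)   # softmax over output capsules
        s = (c[..., None] * u_hat).sum(axis=0)       # weighted sum over input capsules
        v = squash(s)
        b = b + (u_hat * v[None, ...]).sum(axis=-1)  # agreement update
    return v

v = dynamic_routing(np.random.randn(1152, 10, 16))
```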
Relative Hausdorff Distance for Network Analysis
Title | Relative Hausdorff Distance for Network Analysis |
Authors | Sinan G. Aksoy, Kathleen E. Nowak, Emilie Purvine, Stephen J. Young |
Abstract | Similarity measures are used extensively in machine learning and data science algorithms. The newly proposed graph Relative Hausdorff (RH) distance is a lightweight yet nuanced similarity measure for quantifying the closeness of two graphs. In this work we study the effectiveness of RH distance as a tool for detecting anomalies in time-evolving graph sequences. We apply RH to cyber data with given red team events, as well as to synthetically generated sequences of graphs with planted attacks. In our experiments, the performance of RH distance is at times comparable, and sometimes superior, to graph edit distance in detecting anomalous phenomena. Our results suggest that in appropriate contexts, RH distance has advantages over more computationally intensive similarity measures. |
Tasks | |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.04936v1 |
https://arxiv.org/pdf/1906.04936v1.pdf | |
PWC | https://paperswithcode.com/paper/relative-hausdorff-distance-for-network |
Repo | |
Framework | |
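The use case above is flagging anomalies in a time-evolving graph sequence. The loop below shows that detection pattern with NetworkX: compute a distance between consecutive snapshots and flag time steps where it spikes. The RH distance itself is replaced by a crude degree-profile stand-in (its real definition is in the paper); graph edit distance is mentioned in the abstract only as a comparison baseline.

```python
import numpy as np
import networkx as nx

def degree_profile_distance(g1, g2):
    """Stand-in for the graph Relative Hausdorff distance (see the paper for the real
    definition): an L1 gap between complementary cumulative degree histograms."""
    def ccdh(g):
        degs = np.array([d for _, d in g.degree()], dtype=int)
        top = int(degs.max()) if degs.size else 1
        return np.array([(degs >= k).sum() for k in range(1, top + 1)], dtype=float)
    a, b = ccdh(g1), ccdh(g2)
    n = max(len(a), len(b), 1)
    a = np.pad(a, (0, n - len(a)))
    b = np.pad(b, (0, n - len(b)))
    return np.abs(a - b).sum() / n

def flag_anomalies(snapshots, z_thresh=3.0):
    """Flag time steps whose snapshot-to-snapshot distance spikes."""
    dists = np.array([degree_profile_distance(snapshots[t - 1], snapshots[t])
                      for t in range(1, len(snapshots))])
    z = (dists - dists.mean()) / (dists.std() + 1e-12)
    return [t + 1 for t, score in enumerate(z) if score > z_thresh]

graphs = [nx.gnp_random_graph(200, 0.03, seed=s) for s in range(20)]
print(flag_anomalies(graphs))
```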
Semi-Supervised Learning using Differentiable Reasoning
Title | Semi-Supervised Learning using Differentiable Reasoning |
Authors | Emile van Krieken, Erman Acar, Frank van Harmelen |
Abstract | We introduce Differentiable Reasoning (DR), a novel semi-supervised learning technique which uses relational background knowledge to benefit from unlabeled data. We apply it to the Semantic Image Interpretation (SII) task and show that background knowledge provides significant improvement. We find that there is a strong but interesting imbalance between the contributions of updates from Modus Ponens (MP) and its logical equivalent Modus Tollens (MT) to the learning process, suggesting that our approach is very sensitive to a phenomenon called the Raven Paradox. We propose a solution to overcome this situation. |
Tasks | |
Published | 2019-08-13 |
URL | https://arxiv.org/abs/1908.04700v1 |
https://arxiv.org/pdf/1908.04700v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-learning-using-differentiable |
Repo | |
Framework | |
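The core idea above is turning relational background knowledge into a differentiable loss on unlabeled data. The snippet below sketches one common way to do that, scoring a rule A(x) → B(x) with the Reichenbach fuzzy implication 1 - a + a·b and penalizing its violation; this operator is a stand-in assumption, and the paper's Modus Ponens / Modus Tollens analysis is richer than this sketch.

```python
import torch

def implication_loss(p_antecedent, p_consequent):
    """Differentiable penalty for violating A -> B on unlabeled examples.

    Uses the Reichenbach implication I(a, b) = 1 - a + a * b (an assumed choice);
    the loss 1 - I is zero when the rule is fully satisfied.
    """
    i = 1.0 - p_antecedent + p_antecedent * p_consequent
    return (1.0 - i).mean()

# Example with made-up predicted probabilities on unlabeled data;
# in practice this term is added to the supervised loss with some weight.
p_a = torch.tensor([0.9, 0.2, 0.7])
p_b = torch.tensor([0.8, 0.9, 0.1])
loss = implication_loss(p_a, p_b)
```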
Regularizing linear inverse problems with convolutional neural networks
Title | Regularizing linear inverse problems with convolutional neural networks |
Authors | Reinhard Heckel |
Abstract | Deep convolutional neural networks trained on large datasets have emerged as an intriguing alternative for compressing images and solving inverse problems such as denoising and compressive sensing. However, it has only recently been realized that even without training, convolutional networks can function as concise image models, and thus regularize inverse problems. In this paper, we provide further evidence for this finding by studying variations of convolutional neural networks that map few weight parameters to an image. The networks we consider consist only of convolutional operations, with either fixed or parameterized filters followed by ReLU non-linearities. We demonstrate that, with both fixed and parameterized convolutional filters, those networks can represent images with few coefficients. What is more, the underparameterization enables regularization of inverse problems, in particular recovering an image from few observations. We show that, similar to standard compressive sensing guarantees, a number of measurements on the order of the number of model parameters suffices for recovering an image from compressive measurements. Finally, we demonstrate that signal recovery with an untrained convolutional network outperforms standard l1 and total variation minimization for magnetic resonance imaging (MRI). |
Tasks | Compressive Sensing, Denoising |
Published | 2019-07-06 |
URL | https://arxiv.org/abs/1907.03100v1 |
https://arxiv.org/pdf/1907.03100v1.pdf | |
PWC | https://paperswithcode.com/paper/regularizing-linear-inverse-problems-with |
Repo | |
Framework | |
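The recovery procedure described above fits an untrained convolutional network directly to the observations. The sketch below shows that pattern in PyTorch for compressive sensing: a small upsampling/convolution generator G with a fixed input, optimized so that A·G(w) matches the measurements y. The specific architecture here is a generic deep-decoder-style stand-in, not necessarily the paper's network.

```python
import torch
import torch.nn as nn

class TinyConvGenerator(nn.Module):
    """Underparameterized, untrained convolutional image model (generic stand-in)."""
    def __init__(self, channels=64, out_size=64):
        super().__init__()
        layers = []
        for _ in range(4):
            layers += [nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                       nn.Conv2d(channels, channels, 1), nn.ReLU()]
        layers += [nn.Conv2d(channels, 1, 1), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)
        # Fixed random input: only the convolution weights are optimized.
        self.register_buffer("z", torch.randn(1, channels, out_size // 16, out_size // 16))

    def forward(self):
        return self.net(self.z)

def recover(A, y, steps=2000, lr=1e-2):
    """Fit the network weights so that the forward model applied to G(w) matches y."""
    g = TinyConvGenerator()
    opt = torch.optim.Adam(g.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = g().flatten()
        loss = ((A @ x - y) ** 2).mean()
        loss.backward()
        opt.step()
    return g().detach()

# Usage with random measurements of a random "image" (placeholder data):
n = 64 * 64
A = torch.randn(n // 8, n) / (n ** 0.5)
x_true = torch.rand(n)
x_hat = recover(A, A @ x_true, steps=200)
```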
Towards Partial Supervision for Generic Object Counting in Natural Scenes
Title | Towards Partial Supervision for Generic Object Counting in Natural Scenes |
Authors | Hisham Cholakkal, Guolei Sun, Salman Khan, Fahad Shahbaz Khan, Ling Shao, Luc Van Gool |
Abstract | Generic object counting in natural scenes is a challenging computer vision problem. Existing approaches either rely on instance-level supervision or absolute count information to train a generic object counter. We introduce a partially supervised setting that significantly reduces the supervision level required for generic object counting. We propose two novel frameworks, named lower-count (LC) and reduced lower-count (RLC), to enable object counting under this setting. Our frameworks are built on a novel dual-branch architecture that has an image classification and a density branch. Our LC framework reduces the annotation cost due to multiple instances in an image by using only lower-count supervision for all object categories. Our RLC framework further reduces the annotation cost arising from large numbers of object categories in a dataset by only using lower-count supervision for a subset of categories and class-labels for the remaining ones. The RLC framework extends our dual-branch LC framework with a novel weight modulation layer and a category-independent density map prediction. Experiments are performed on COCO, Visual Genome and PASCAL 2007 datasets. Our frameworks perform on par with state-of-the-art approaches using higher levels of supervision. Additionally, we demonstrate the applicability of our LC supervised density map for image-level supervised instance segmentation. |
Tasks | Image Classification, Instance Segmentation, Object Counting, Semantic Segmentation |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06448v1 |
https://arxiv.org/pdf/1912.06448v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-partial-supervision-for-generic |
Repo | |
Framework | |
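The architecture outlined above is a dual-branch network: a shared backbone feeding an image-classification branch and a per-category density branch whose spatial sum gives the count. The sketch below renders only that forward-pass structure in PyTorch; the lower-count loss construction and the RLC weight-modulation layer are specific to the paper and omitted here, and the ResNet-18 backbone is an arbitrary choice.

```python
import torch
import torch.nn as nn
from torchvision import models

class DualBranchCounter(nn.Module):
    """Generic dual-branch counter: classification branch + density branch (a sketch)."""
    def __init__(self, num_categories=80):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # (B, 512, H/32, W/32)
        self.cls_head = nn.Conv2d(512, num_categories, 1)               # image classification branch
        self.density_head = nn.Conv2d(512, num_categories, 1)           # per-category density branch

    def forward(self, x):
        f = self.features(x)
        cls_logits = self.cls_head(f).amax(dim=(2, 3))   # per-category presence logits
        density = torch.relu(self.density_head(f))        # non-negative density maps
        counts = density.sum(dim=(2, 3))                   # predicted per-category counts
        return cls_logits, density, counts

model = DualBranchCounter()
logits, density, counts = model(torch.rand(2, 3, 224, 224))
```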
Reservoirs learn to learn
Title | Reservoirs learn to learn |
Authors | Anand Subramoney, Franz Scherr, Wolfgang Maass |
Abstract | We consider reservoirs in the form of liquid state machines, i.e., recurrently connected networks of spiking neurons with randomly chosen weights. So far only the weights of a linear readout were adapted for a specific task. We wondered whether the performance of liquid state machines can be improved if the recurrent weights are chosen with a purpose, rather than randomly. After all, weights of recurrent connections in the brain are also not assumed to be randomly chosen. Rather, these weights were probably optimized during evolution, development, and prior learning experiences for specific task domains. In order to examine the benefits of choosing recurrent weights within a liquid with a purpose, we applied the Learning-to-Learn (L2L) paradigm to our model: We optimized the weights of the recurrent connections – and hence the dynamics of the liquid state machine – for a large family of potential learning tasks, which the network might have to learn later through modification of the weights of readout neurons. We found that this two-tiered process substantially improves the learning speed of liquid state machines for specific tasks. In fact, this learning speed increases further if one does not train the weights of linear readouts at all, and relies instead on the internal dynamics and fading memory of the network for remembering salient information that it could extract from preceding examples for the current learning task. This second type of learning has recently been proposed to underlie fast learning in the prefrontal cortex and motor cortex, and hence it is of interest to explore its performance also in models. Since liquid state machines share many properties with other types of reservoirs, our results raise the question whether L2L conveys similar benefits also to these other reservoirs. |
Tasks | |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07486v2 |
https://arxiv.org/pdf/1909.07486v2.pdf | |
PWC | https://paperswithcode.com/paper/reservoirs-learn-to-learn |
Repo | |
Framework | |
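The two-tiered procedure described above has an outer loop that tunes the recurrent (reservoir) weights across a family of tasks and an inner loop that learns each individual task, either through the readout or purely through the network dynamics. The sketch below shows only that control flow with a rate-based toy reservoir: the spiking simulation and the actual outer-loop optimizer (not specified in the abstract) are replaced by placeholders.

```python
import numpy as np

def simulate_reservoir(recurrent_w, inputs):
    """Placeholder for the liquid state machine: returns a state trajectory."""
    state = np.zeros(recurrent_w.shape[0])
    states = []
    for x in inputs:
        state = np.tanh(recurrent_w @ state + x)
        states.append(state.copy())
    return np.array(states)

def inner_loop_loss(recurrent_w, task):
    """Inner loop: fit a linear readout on one task and report its error."""
    states = simulate_reservoir(recurrent_w, task["inputs"])
    readout, *_ = np.linalg.lstsq(states, task["targets"], rcond=None)
    return float(((states @ readout - task["targets"]) ** 2).mean())

def outer_loop(sample_task, n_neurons=100, iters=200, sigma=0.02):
    """Outer loop (L2L): adapt recurrent weights across a task family.
    Simple random search stands in for the real optimizer."""
    w = np.random.randn(n_neurons, n_neurons) / np.sqrt(n_neurons)
    best = np.mean([inner_loop_loss(w, sample_task()) for _ in range(3)])
    for _ in range(iters):
        cand = w + sigma * np.random.randn(*w.shape)
        score = np.mean([inner_loop_loss(cand, sample_task()) for _ in range(3)])
        if score < best:
            w, best = cand, score
    return w

def sample_task(T=50, n_neurons=100):
    inputs = np.random.randn(T, n_neurons) * 0.5
    targets = np.cumsum(inputs[:, 0])[:, None]   # toy memory task
    return {"inputs": inputs, "targets": targets}

w_optimized = outer_loop(sample_task, iters=20)
```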
Mapping Spiking Neural Networks to Neuromorphic Hardware
Title | Mapping Spiking Neural Networks to Neuromorphic Hardware |
Authors | Adarsha Balaji, Anup Das, Yuefeng Wu, Khanh Huynh, Francesco Dell’Anna, Giacomo Indiveri, Jeffrey L. Krichmar, Nikil Dutt, Siebren Schaafsma, Francky Catthoor |
Abstract | Neuromorphic hardware platforms implement biological neurons and synapses to execute spiking neural networks (SNNs) in an energy-efficient manner. We present SpiNeMap, a design methodology to map SNNs to crossbar-based neuromorphic hardware, minimizing spike latency and energy consumption. SpiNeMap operates in two steps: SpiNeCluster and SpiNePlacer. SpiNeCluster is a heuristic-based clustering technique to partition SNNs into clusters of synapses, where intracluster local synapses are mapped within crossbars of the hardware and inter-cluster global synapses are mapped to the shared interconnect. SpiNeCluster minimizes the number of spikes on global synapses, which reduces spike congestion on the shared interconnect, improving application performance. SpiNePlacer then finds the best placement of local and global synapses on the hardware using a meta-heuristic-based approach to minimize energy consumption and spike latency. We evaluate SpiNeMap using synthetic and realistic SNNs on the DynapSE neuromorphic hardware. We show that SpiNeMap reduces average energy consumption by 45% and average spike latency by 21%, compared to state-of-the-art techniques. |
Tasks | |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.01843v1 |
https://arxiv.org/pdf/1909.01843v1.pdf | |
PWC | https://paperswithcode.com/paper/mapping-spiking-neural-networks-to |
Repo | |
Framework | |
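SpiNeCluster's stated goal is to partition an SNN so that as few spikes as possible cross cluster boundaries, since each crossing uses the shared interconnect. The snippet below is a small greedy illustration of that objective on a per-synapse spike-count map; it is not the paper's heuristic, and SpiNePlacer's meta-heuristic placement is not reproduced.

```python
def greedy_cluster(spike_counts, num_neurons, cluster_capacity):
    """Greedy stand-in for minimizing inter-cluster spike traffic.

    spike_counts: dict {(src, dst): spikes on that synapse}. Each neuron joins the
    capacity-limited cluster with which it already exchanges the most spikes.
    """
    clusters = [set()]
    for n in range(num_neurons):
        best, best_gain = None, -1
        for idx, members in enumerate(clusters):
            if len(members) >= cluster_capacity:
                continue
            gain = sum(c for (a, b), c in spike_counts.items()
                       if (a == n and b in members) or (b == n and a in members))
            if gain > best_gain:
                best, best_gain = idx, gain
        if best is None:
            clusters.append({n})       # all clusters full: open a new crossbar cluster
        else:
            clusters[best].add(n)
    return clusters

# Toy example: 6 neurons forming two obvious communities.
counts = {(0, 1): 50, (1, 2): 40, (3, 4): 60, (4, 5): 45, (2, 3): 2}
print(greedy_cluster(counts, num_neurons=6, cluster_capacity=3))
```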
Gated CRF Loss for Weakly Supervised Semantic Image Segmentation
Title | Gated CRF Loss for Weakly Supervised Semantic Image Segmentation |
Authors | Anton Obukhov, Stamatios Georgoulis, Dengxin Dai, Luc Van Gool |
Abstract | State-of-the-art approaches for semantic segmentation rely on deep convolutional neural networks trained on fully annotated datasets, which have been shown to be notoriously expensive to collect, both in terms of time and money. To remedy this situation, weakly supervised methods leverage other forms of supervision that require substantially less annotation effort, but they typically present an inability to predict precise object boundaries due to the approximate nature of the supervisory signals in those regions. While great progress has been made in improving the performance, many of these weakly supervised methods are highly tailored to their own specific settings. This raises challenges in reusing algorithms and making steady progress. In this paper, we intentionally avoid such practices when tackling weakly supervised semantic segmentation. In particular, we train standard neural networks with a partial cross-entropy loss function for the labeled pixels and our proposed Gated CRF loss for the unlabeled pixels. The Gated CRF loss is designed to deliver several important assets: 1) it enables flexibility in the kernel construction to mask out influence from undesired pixel positions; 2) it offloads learning contextual relations to the CNN and concentrates on semantic boundaries; 3) it does not rely on high-dimensional filtering and thus has a simple implementation. Throughout the paper we present the advantages of the loss function, analyze several aspects of weakly supervised training, and show that our ‘purist’ approach achieves state-of-the-art performance for both click-based and scribble-based annotations. |
Tasks | Semantic Segmentation, Weakly-Supervised Semantic Segmentation |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04651v2 |
https://arxiv.org/pdf/1906.04651v2.pdf | |
PWC | https://paperswithcode.com/paper/gated-crf-loss-for-weakly-supervised-semantic |
Repo | |
Framework | |
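The training objective described above combines a partial cross-entropy over the sparsely labeled pixels (clicks or scribbles) with the Gated CRF loss over unlabeled pixels. The partial cross-entropy part below is standard; the second term is only a heavily simplified pairwise-smoothness stand-in with a gate mask, since the full kernel construction is defined in the paper.

```python
import torch
import torch.nn.functional as F

def partial_cross_entropy(logits, labels, ignore_index=255):
    """Cross-entropy computed only on the pixels that carry click/scribble labels."""
    return F.cross_entropy(logits, labels, ignore_index=ignore_index)

def gated_pairwise_term(probs, image, gate, sigma_rgb=0.1):
    """Heavily simplified stand-in for the Gated CRF loss: penalize label disagreement
    between horizontally adjacent pixels with similar colors, masked by `gate`
    (gate = 0 removes a pixel position's influence, as the abstract describes)."""
    color_aff = torch.exp(-((image[..., :, 1:] - image[..., :, :-1]) ** 2)
                          .sum(dim=1, keepdim=True) / (2 * sigma_rgb ** 2))
    disagreement = (probs[..., :, 1:] - probs[..., :, :-1]).abs().sum(dim=1, keepdim=True)
    g = gate[..., :, 1:] * gate[..., :, :-1]
    return (g * color_aff * disagreement).mean()

# Usage sketch (shapes only): logits (B, C, H, W), sparse labels (B, H, W) with 255 = unlabeled.
logits = torch.randn(2, 21, 64, 64, requires_grad=True)
labels = torch.full((2, 64, 64), 255, dtype=torch.long)
labels[:, 30, 30] = 5
image = torch.rand(2, 3, 64, 64)
gate = torch.ones(2, 1, 64, 64)
loss = partial_cross_entropy(logits, labels) + 0.1 * gated_pairwise_term(logits.softmax(1), image, gate)
```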
To Beta or Not To Beta: Information Bottleneck for Digital Image Forensics
Title | To Beta or Not To Beta: Information Bottleneck for Digital Image Forensics |
Authors | Aurobrata Ghosh, Zheng Zhong, Steve Cruz, Subbu Veeravasarapu, Terrance E Boult, Maneesh Singh |
Abstract | We consider an information theoretic approach to address the problem of identifying fake digital images. We propose an innovative method to formulate the issue of localizing manipulated regions in an image as a deep representation learning problem using the Information Bottleneck (IB), which has recently gained popularity as a framework for interpreting deep neural networks. Tampered images pose a serious predicament since digitized media is a ubiquitous part of our lives. Such manipulations are facilitated by the easy availability of image editing software and aggravated by recent advances in deep generative models such as GANs. We propose InfoPrint, a computationally efficient solution to the IB formulation using approximate variational inference, and compare it to a numerical solution that is computationally expensive. Testing on a number of standard datasets, we demonstrate that InfoPrint outperforms the state-of-the-art and the numerical solution. Additionally, it also has the ability to detect alterations made by inpainting GANs. |
Tasks | Representation Learning |
Published | 2019-08-11 |
URL | https://arxiv.org/abs/1908.03864v1 |
https://arxiv.org/pdf/1908.03864v1.pdf | |
PWC | https://paperswithcode.com/paper/to-beta-or-not-to-beta-information-bottleneck |
Repo | |
Framework | |
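The formulation above casts manipulation localization as an Information Bottleneck problem solved with approximate variational inference. The sketch below shows the generic variational-IB objective such an approach typically optimizes: an encoder producing a Gaussian latent, a beta-weighted KL compression term, and a per-pixel prediction term. InfoPrint's actual encoder, decoder, and forensic features are not described in the abstract and are not modeled here.

```python
import torch
import torch.nn as nn

class VariationalIB(nn.Module):
    """Generic variational Information Bottleneck head (a sketch, not InfoPrint itself)."""
    def __init__(self, in_ch=3, latent_ch=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(32, 2 * latent_ch, 3, padding=1))
        self.decoder = nn.Conv2d(latent_ch, 1, 1)   # per-pixel "manipulated?" logit

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()     # reparameterization trick
        kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).mean()  # KL to a standard normal
        return self.decoder(z), kl

def ib_loss(logits, mask, kl, beta=1e-3):
    """Prediction term plus a beta-weighted compression term (the 'to beta or not to beta' knob)."""
    return nn.functional.binary_cross_entropy_with_logits(logits, mask) + beta * kl

model = VariationalIB()
logits, kl = model(torch.rand(2, 3, 64, 64))
loss = ib_loss(logits, torch.randint(0, 2, (2, 1, 64, 64)).float(), kl)
```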