Paper Group ANR 474
Chinese Typeface Transformation with Hierarchical Adversarial Network
Title | Chinese Typeface Transformation with Hierarchical Adversarial Network |
Authors | Jie Chang, Yujun Gu, Ya Zhang |
Abstract | In this paper, we explore automated typeface generation through image style transfer, which has shown great promise in natural image generation. Existing style transfer methods for natural images generally assume that the source and target images share similar high-frequency features. However, this assumption is no longer true in typeface transformation. Inspired by the recent advancement in Generative Adversarial Networks (GANs), we propose a Hierarchical Adversarial Network (HAN) for typeface transformation. The proposed HAN consists of two sub-networks: a transfer network and a hierarchical adversarial discriminator. The transfer network maps characters from one typeface to another. A unique characteristic of typefaces is that the same radicals may have quite different appearances in different characters even under the same typeface. Hence, a stage-decoder is employed by the transfer network to leverage multiple feature layers, aiming to capture both the global and local features. The hierarchical adversarial discriminator implicitly measures data discrepancy between the generated domain and the target domain. To leverage the complementary discriminating capability of different feature layers, a hierarchical structure is proposed for the discriminator. We have experimentally demonstrated that HAN is an effective framework for typeface transfer and character restoration. |
Tasks | Image Generation, Style Transfer |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06448v1 |
PDF | http://arxiv.org/pdf/1711.06448v1.pdf |
PWC | https://paperswithcode.com/paper/chinese-typeface-transformation-with |
Repo | |
Framework | |
Rejection-Cascade of Gaussians: Real-time adaptive background subtraction framework
Title | Rejection-Cascade of Gaussians: Real-time adaptive background subtraction framework |
Authors | B Ravi Kiran, Arindam Das, Senthil Yogamani |
Abstract | Background-Foreground classification is a well-studied problem in computer vision. Due to the pixel-wise nature of modeling and processing in the algorithm, it is usually difficult to satisfy real-time constraints. There is a trade-off between speed (because of model complexity) and accuracy. Inspired by the rejection cascade of the Viola-Jones classifier, we decompose the Gaussian Mixture Model (GMM) into an adaptive cascade of Gaussians (CoG). We achieve a good improvement in speed without compromising the accuracy with respect to the baseline GMM model. We demonstrate a speed-up factor of 4-5x and 17 percent average improvement in accuracy over Wallflowers surveillance datasets. The CoG is then demonstrated over the latent space representation of images of a convolutional variational autoencoder (VAE). We provide initial results over the CDW-2014 dataset, which could speed up background subtraction for deep architectures. |
Tasks | |
Published | 2017-05-25 |
URL | https://arxiv.org/abs/1705.09339v2 |
PDF | https://arxiv.org/pdf/1705.09339v2.pdf |
PWC | https://paperswithcode.com/paper/real-time-background-subtraction-using |
Repo | |
Framework | |
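The rejection-cascade idea in the abstract above can be illustrated with a minimal per-pixel sketch. This is not the authors' CoG algorithm: the function name `classify_pixel`, the ordering of Gaussians by how likely they are to explain the background, and the match threshold `k = 2.5` (a common choice in GMM background subtraction) are all assumptions made for illustration. The point is only that a pixel matched by an early Gaussian is rejected as background without evaluating the rest.

```python
def classify_pixel(value, gaussians, k=2.5):
    """Test one pixel intensity against a cascade of Gaussians.

    gaussians: list of (mean, std) pairs, assumed to be ordered so that
    the most probable background mode comes first (hypothetical layout).
    Returns (is_background, tests_run).
    """
    for tests_run, (mean, std) in enumerate(gaussians, start=1):
        # Early exit: the first Gaussian that matches rejects the pixel
        # as background, so later Gaussians are never evaluated.
        if abs(value - mean) <= k * std:
            return True, tests_run
    # No Gaussian matched: treat the pixel as foreground.
    return False, len(gaussians)


# Two hypothetical background modes for one pixel.
modes = [(100.0, 5.0), (200.0, 10.0)]
print(classify_pixel(102.0, modes))  # matched by the first mode
print(classify_pixel(50.0, modes))   # matched by none
```

Most pixels in a static scene match the first mode, which is where a cascade of this kind would save work relative to evaluating every mixture component.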
Interpreting and Extending The Guided Filter Via Cyclic Coordinate Descent
Title | Interpreting and Extending The Guided Filter Via Cyclic Coordinate Descent |
Authors | Longquan Dai |
Abstract | In this paper, we will disclose that the Guided Filter (GF) can be interpreted as the Cyclic Coordinate Descent (CCD) solver of a Least Squares (LS) objective function. This discovery implies a possible way to extend GF because we can alter the objective function of GF and define new filters as the first-pass iteration of the CCD solver of modified objective functions. Moreover, referring to the iterative minimizing procedure of CCD, we can derive new rolling filtering schemes. Hence, under the guidance of this discovery, we not only propose new GF-like filters adapting to the specific requirements of applications but also offer thorough explanations for two rolling filtering schemes of GF as well as the way to extend them. Experiments show that our new filters and extensions produce state-of-the-art results. |
Tasks | |
Published | 2017-05-30 |
URL | http://arxiv.org/abs/1705.10552v1 |
PDF | http://arxiv.org/pdf/1705.10552v1.pdf |
PWC | https://paperswithcode.com/paper/interpreting-and-extending-the-guided-filter |
Repo | |
Framework | |
Dilated FCN for Multi-Agent 2D/3D Medical Image Registration
Title | Dilated FCN for Multi-Agent 2D/3D Medical Image Registration |
Authors | Shun Miao, Sebastien Piat, Peter Fischer, Ahmet Tuysuzoglu, Philip Mewes, Tommaso Mansi, Rui Liao |
Abstract | 2D/3D image registration to align a 3D volume and 2D X-ray images is a challenging problem due to its ill-posed nature and various artifacts present in 2D X-ray images. In this paper, we propose a multi-agent system with an auto attention mechanism for robust and efficient 2D/3D image registration. Specifically, an individual agent is trained with a dilated Fully Convolutional Network (FCN) to perform registration in a Markov Decision Process (MDP) by observing a local region, and the final action is then taken based on the proposals from multiple agents and weighted by their corresponding confidence levels. The contributions of this paper are threefold. First, we formulate 2D/3D registration as an MDP with observations, actions, and rewards properly defined with respect to X-ray imaging systems. Second, to handle various artifacts in 2D X-ray images, multiple local agents are employed efficiently via FCN-based structures, and an auto attention mechanism is proposed to favor the proposals from regions with more reliable visual cues. Third, a dilated FCN-based training mechanism is proposed to significantly reduce the degrees of freedom in the simulation of the registration environment, and drastically improve training efficiency by an order of magnitude compared to the standard CNN-based training method. We demonstrate that the proposed method achieves high robustness on both spine cone beam Computed Tomography data with a low signal-to-noise ratio and data from minimally invasive spine surgery where severe image artifacts and occlusions are present due to metal screws and guide wires, outperforming other state-of-the-art methods (single agent-based and optimization-based) by a large margin. |
Tasks | Image Registration, Medical Image Registration |
Published | 2017-11-22 |
URL | http://arxiv.org/abs/1712.01651v1 |
PDF | http://arxiv.org/pdf/1712.01651v1.pdf |
PWC | https://paperswithcode.com/paper/dilated-fcn-for-multi-agent-2d3d-medical |
Repo | |
Framework | |
Interpretable Active Learning
Title | Interpretable Active Learning |
Authors | Richard L. Phillips, Kyu Hyun Chang, Sorelle A. Friedler |
Abstract | Active learning has long been a topic of study in machine learning. However, as increasingly complex and opaque models have become standard practice, the process of active learning, too, has become more opaque. There has been little investigation into interpreting what specific trends and patterns an active learning strategy may be exploring. This work expands on the Local Interpretable Model-agnostic Explanations framework (LIME) to provide explanations for active learning recommendations. We demonstrate how LIME can be used to generate locally faithful explanations for an active learning strategy, and how these explanations can be used to understand how different models and datasets explore a problem space over time. In order to quantify the per-subgroup differences in how an active learning strategy queries spatial regions, we introduce a notion of uncertainty bias (based on disparate impact) to measure the discrepancy in the confidence for a model’s predictions between one subgroup and another. Using the uncertainty bias measure, we show that our query explanations accurately reflect the subgroup focus of the active learning queries, allowing for an interpretable explanation of what is being learned as points with similar sources of uncertainty have their uncertainty bias resolved. We demonstrate that this technique can be applied to track uncertainty bias over user-defined clusters or automatically generated clusters based on the source of uncertainty. |
Tasks | Active Learning |
Published | 2017-07-31 |
URL | http://arxiv.org/abs/1708.00049v2 |
PDF | http://arxiv.org/pdf/1708.00049v2.pdf |
PWC | https://paperswithcode.com/paper/interpretable-active-learning |
Repo | |
Framework | |
A Useful Motif for Flexible Task Learning in an Embodied Two-Dimensional Visual Environment
Title | A Useful Motif for Flexible Task Learning in an Embodied Two-Dimensional Visual Environment |
Authors | Kevin T. Feigelis, Daniel L. K. Yamins |
Abstract | Animals (especially humans) have an amazing ability to learn new tasks quickly, and switch between them flexibly. How brains support this ability is largely unknown, both neuroscientifically and algorithmically. One reasonable supposition is that modules drawing on an underlying general-purpose sensory representation are dynamically allocated on a per-task basis. Recent results from neuroscience and artificial intelligence suggest the role of the general purpose visual representation may be played by a deep convolutional neural network, and give some clues how task modules based on such a representation might be discovered and constructed. In this work, we investigate module architectures in an embodied two-dimensional touchscreen environment, in which an agent’s learning must occur via interactions with an environment that emits images and rewards, and accepts touches as input. This environment is designed to capture the physical structure of the task environments that are commonly deployed in visual neuroscience and psychophysics. We show that in this context, very simple changes in the nonlinear activations used by such a module can significantly influence how fast it is at learning visual tasks and how suitable it is for switching to new tasks. |
Tasks | |
Published | 2017-06-22 |
URL | http://arxiv.org/abs/1706.07147v1 |
PDF | http://arxiv.org/pdf/1706.07147v1.pdf |
PWC | https://paperswithcode.com/paper/a-useful-motif-for-flexible-task-learning-in |
Repo | |
Framework | |
Aspect Extraction and Sentiment Classification of Mobile Apps using App-Store Reviews
Title | Aspect Extraction and Sentiment Classification of Mobile Apps using App-Store Reviews |
Authors | Sharmistha Dey |
Abstract | Understanding customer sentiment can be useful for product development. On top of that, if the priorities for the development order are known, the development procedure becomes simpler. This work tries to address this issue in the mobile app domain. Along with aspect and opinion extraction, this work also categorizes the extracted aspects according to their importance. This can help developers focus their time and energy in the right place. |
Tasks | Aspect Extraction, Sentiment Analysis |
Published | 2017-12-09 |
URL | http://arxiv.org/abs/1712.03430v1 |
PDF | http://arxiv.org/pdf/1712.03430v1.pdf |
PWC | https://paperswithcode.com/paper/aspect-extraction-and-sentiment |
Repo | |
Framework | |
Approximations of the Restless Bandit Problem
Title | Approximations of the Restless Bandit Problem |
Authors | Steffen Grunewalder, Azadeh Khaleghi |
Abstract | The multi-armed restless bandit problem is studied in the case where the pay-off distributions are stationary $\varphi$-mixing. This version of the problem provides a more realistic model for most real-world applications, but cannot be optimally solved in practice, since it is known to be PSPACE-hard. The objective of this paper is to characterize a sub-class of the problem where good approximate solutions can be found using tractable approaches. Specifically, it is shown that under some conditions on the $\varphi$-mixing coefficients, a modified version of UCB can prove effective. The main challenge is that, unlike in the i.i.d. setting, the distributions of the sampled pay-offs may not have the same characteristics as those of the original bandit arms. In particular, the $\varphi$-mixing property does not necessarily carry over. This is overcome by carefully controlling the effect of a sampling policy on the pay-off distributions. Some of the proof techniques developed in this paper can be more generally used in the context of online sampling under dependence. Proposed algorithms are accompanied by corresponding regret analysis. |
Tasks | |
Published | 2017-02-22 |
URL | http://arxiv.org/abs/1702.06972v3 |
PDF | http://arxiv.org/pdf/1702.06972v3.pdf |
PWC | https://paperswithcode.com/paper/approximations-of-the-restless-bandit-problem |
Repo | |
Framework | |
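For readers unfamiliar with UCB, the following is a minimal sketch of the classic UCB1 selection rule for the i.i.d. setting, i.e. the baseline that the abstract above says is modified. It is not the paper's modified algorithm, and the function name `ucb1_select` and its argument layout are assumptions for illustration. Each arm is scored by its empirical mean plus an exploration bonus $\sqrt{2 \ln t / n_a}$, where $n_a$ is the number of pulls of arm $a$ and $t$ is the current round.

```python
import math


def ucb1_select(counts, reward_sums, t):
    """Return the index of the arm to pull at round t (classic UCB1).

    counts[a]      -- number of times arm a has been pulled so far
    reward_sums[a] -- total reward collected from arm a so far
    t              -- current round number (t >= 1)
    """
    # Pull every arm once before using the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Empirical mean plus exploration bonus for each arm.
    scores = [
        reward_sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# A rarely pulled arm gets a large bonus and is explored next.
print(ucb1_select([100, 1], [60.0, 0.5], t=101))
```

Under dependence between samples, the confidence bonus above is no longer justified as is, which is the difficulty the abstract points to.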
Real time ridge orientation estimation for fingerprint images
Title | Real time ridge orientation estimation for fingerprint images |
Authors | Eman Alibeigi, Shadrokh Samavi, Shahram Shirani, Zahra Rahmani |
Abstract | Fingerprint verification is an important biometric technique for personal identification. Most automatic verification systems are based on matching fingerprint minutiae. Extraction of minutiae is an essential process which requires estimating the orientation of the lines in an image. Most existing methods involve intense mathematical computations and hence are implemented in software. In this paper, a hardware scheme for real-time orientation estimation is presented, based on a pipelined architecture. Synthesized circuits proved the functionality and accuracy of the suggested method. |
Tasks | |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.05027v1 |
PDF | http://arxiv.org/pdf/1710.05027v1.pdf |
PWC | https://paperswithcode.com/paper/real-time-ridge-orientation-estimation-for |
Repo | |
Framework | |
Deceiving Google’s Perspective API Built for Detecting Toxic Comments
Title | Deceiving Google’s Perspective API Built for Detecting Toxic Comments |
Authors | Hossein Hosseini, Sreeram Kannan, Baosen Zhang, Radha Poovendran |
Abstract | Social media platforms provide an environment where people can freely engage in discussions. Unfortunately, they also enable several problems, such as online harassment. Recently, Google and Jigsaw started a project called Perspective, which uses machine learning to automatically detect toxic language. A demonstration website has also been launched, which allows anyone to type a phrase in the interface and instantaneously see the toxicity score [1]. In this paper, we propose an attack on the Perspective toxic detection system based on adversarial examples. We show that an adversary can subtly modify a highly toxic phrase in a way that the system assigns a significantly lower toxicity score to it. We apply the attack on the sample phrases provided in the Perspective website and show that we can consistently reduce the toxicity scores to the level of the non-toxic phrases. The existence of such adversarial examples is very harmful for toxic detection systems and seriously undermines their usability. |
Tasks | |
Published | 2017-02-27 |
URL | http://arxiv.org/abs/1702.08138v1 |
PDF | http://arxiv.org/pdf/1702.08138v1.pdf |
PWC | https://paperswithcode.com/paper/deceiving-googles-perspective-api-built-for |
Repo | |
Framework | |
Targeted Advertising Based on Browsing History
Title | Targeted Advertising Based on Browsing History |
Authors | Yong Zhang, Hongming Zhou, Nganmeng Tan, Saeed Bagheri, Meng Joo Er |
Abstract | Audience interest, demography, purchase behavior and other possible classifications are extremely important factors to be carefully studied in a targeting campaign. This information can help advertisers and publishers deliver advertisements to the right audience group. However, it is not easy to collect such information, especially for the online audience with whom we have limited interaction and minimum deterministic knowledge. In this paper, we propose a predictive framework that can estimate online audience demographic attributes based on their browsing histories. Under the proposed framework, first, we retrieve the content of the websites visited by the audience, and represent the content as website feature vectors; second, we aggregate the vectors of websites that the audience has visited and arrive at feature vectors representing the users; finally, the support vector machine is exploited to predict the audience demographic attributes. The key to achieving good prediction performance is preparing representative features of the audience. Word Embedding, a widely used technique in natural language processing tasks, together with the term frequency-inverse document frequency weighting scheme, is used in the proposed method. This new representation approach is unsupervised and very easy to implement. The experimental results demonstrate that the new audience feature representation method is more powerful than existing baseline methods, leading to a great improvement in prediction accuracy. |
Tasks | |
Published | 2017-11-13 |
URL | http://arxiv.org/abs/1711.04498v1 |
PDF | http://arxiv.org/pdf/1711.04498v1.pdf |
PWC | https://paperswithcode.com/paper/targeted-advertising-based-on-browsing |
Repo | |
Framework | |
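The first two steps of the pipeline described in the abstract above (website vectors from TF-IDF-weighted word embeddings, then user vectors by aggregating visited websites) might be sketched as follows. This is an illustrative reading, not the authors' implementation: the function names, the smoothed IDF formula `log(N / df) + 1`, the normalization by total weight, and the use of a plain mean to aggregate websites are all assumptions. The final SVM step is omitted; the resulting user vectors would be passed to any standard SVM classifier.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Smoothed inverse document frequency per word (assumed formula)."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    return {w: math.log(n_docs / c) + 1.0 for w, c in doc_freq.items()}


def site_vector(doc, idf, embeddings):
    """TF-IDF-weighted average of the word embeddings in one website."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word, count in Counter(doc).items():
        if word not in embeddings or word not in idf:
            continue  # skip words without an embedding or IDF weight
        weight = count * idf[word]
        for i, x in enumerate(embeddings[word]):
            total[i] += weight * x
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # no known words: zero vector
    return [x / weight_sum for x in total]


def user_vector(site_vectors):
    """Aggregate a user's visited websites by a plain mean (assumption)."""
    return [sum(col) / len(site_vectors) for col in zip(*site_vectors)]


# Toy data: two websites and 2-dimensional hypothetical embeddings.
embeddings = {"sports": [1.0, 0.0], "finance": [0.0, 1.0]}
sites = [["sports", "sports"], ["finance"]]
idf = idf_weights(sites)
vectors = [site_vector(doc, idf, embeddings) for doc in sites]
print(user_vector(vectors))
```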
Tracing Linguistic Relations in Winning and Losing Sides of Explicit Opposing Groups
Title | Tracing Linguistic Relations in Winning and Losing Sides of Explicit Opposing Groups |
Authors | Ceyda Sanli, Anupam Mondal, Erik Cambria |
Abstract | Linguistic relations in oral conversations present how opinions are constructed and developed in a restricted time. The relations bond ideas, arguments, thoughts, and feelings, re-shape them during a speech, and finally build knowledge out of all information provided in the conversation. Speakers share a common interest to discuss. It is expected that each speaker’s reply includes duplicated forms of words from previous speakers. However, linguistic adaptation is observed and evolves in a more complex path than just transferring slightly modified versions of common concepts. A conversation aiming at a benefit at the end shows an emergent cooperation inducing the adaptation. Not only cooperation, but also competition drives the adaptation or an opposite scenario, and one can capture the dynamic process by tracking how the concepts are linguistically linked. To uncover salient complex dynamic events in verbal communications, we attempt to discover self-organized linguistic relations hidden in a conversation with explicitly stated winners and losers. We examine open access data of the United States Supreme Court. Our understanding is crucial in big data research to guide how transition states in opinion mining and decision-making should be modeled and how this required knowledge to guide the model should be pinpointed, by filtering large amounts of data. |
Tasks | Decision Making, Opinion Mining |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00317v1 |
PDF | http://arxiv.org/pdf/1703.00317v1.pdf |
PWC | https://paperswithcode.com/paper/tracing-linguistic-relations-in-winning-and |
Repo | |
Framework | |
(Machine) Learning to Do More with Less
Title | (Machine) Learning to Do More with Less |
Authors | Timothy Cohen, Marat Freytsis, Bryan Ostdiek |
Abstract | Determining the best method for training a machine learning algorithm is critical to maximizing its ability to classify data. In this paper, we compare the standard “fully supervised” approach (that relies on knowledge of event-by-event truth-level labels) with a recent proposal that instead utilizes class ratios as the only discriminating information provided during training. This so-called “weakly supervised” technique has access to less information than the fully supervised method and yet is still able to yield impressive discriminating power. In addition, weak supervision seems particularly well suited to particle physics since quantum mechanics is incompatible with the notion of mapping an individual event onto any single Feynman diagram. We examine the technique in detail – both analytically and numerically – with a focus on the robustness to issues of mischaracterizing the training samples. Weakly supervised networks turn out to be remarkably insensitive to systematic mismodeling. Furthermore, we demonstrate that the event level outputs for weakly versus fully supervised networks are probing different kinematics, even though the numerical quality metrics are essentially identical. This implies that it should be possible to improve the overall classification ability by combining the output from the two types of networks. For concreteness, we apply this technology to a signature of beyond the Standard Model physics to demonstrate that all these impressive features continue to hold in a scenario of relevance to the LHC. |
Tasks | |
Published | 2017-06-28 |
URL | http://arxiv.org/abs/1706.09451v3 |
PDF | http://arxiv.org/pdf/1706.09451v3.pdf |
PWC | https://paperswithcode.com/paper/machine-learning-to-do-more-with-less |
Repo | |
Framework | |
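The idea of training on class ratios alone, as described in the abstract above, can be made concrete with a minimal sketch of a proportion-matching loss: the classifier's mean output over a batch is compared with the known signal fraction of that batch, and no per-event labels are used. The function name `weak_supervision_loss` and the absolute-difference form are assumptions for illustration; the abstract does not state the exact loss used in the paper.

```python
def weak_supervision_loss(predictions, signal_fraction):
    """Loss that uses only the class ratio of a batch.

    predictions     -- classifier outputs in [0, 1], one per event
    signal_fraction -- known fraction of signal events in the batch
    """
    mean_prediction = sum(predictions) / len(predictions)
    # Penalize the gap between predicted and known signal fraction.
    return abs(mean_prediction - signal_fraction)


# A batch known to be 50% signal; no per-event labels are needed.
print(weak_supervision_loss([0.9, 0.8, 0.7, 0.6], 0.5))
```

Because a single batch with a single ratio admits many solutions, training of this kind typically relies on several batches with different known fractions.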
Methods for Interpreting and Understanding Deep Neural Networks
Title | Methods for Interpreting and Understanding Deep Neural Networks |
Authors | Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller |
Abstract | This paper provides an entry point to the problem of interpreting a deep neural network model and explaining its predictions. It is based on a tutorial given at ICASSP 2017. It introduces some recently proposed techniques of interpretation, along with theory, tricks and recommendations, to make the most efficient use of these techniques on real data. It also discusses a number of practical applications. |
Tasks | |
Published | 2017-06-24 |
URL | http://arxiv.org/abs/1706.07979v1 |
PDF | http://arxiv.org/pdf/1706.07979v1.pdf |
PWC | https://paperswithcode.com/paper/methods-for-interpreting-and-understanding |
Repo | |
Framework | |
Active Expansion Sampling for Learning Feasible Domains in an Unbounded Input Space
Title | Active Expansion Sampling for Learning Feasible Domains in an Unbounded Input Space |
Authors | Wei Chen, Mark Fuge |
Abstract | Many engineering problems require identifying feasible domains under implicit constraints. One example is finding acceptable car body styling designs based on constraints like aesthetics and functionality. Current active-learning based methods learn feasible domains for bounded input spaces. However, we usually lack prior knowledge about how to set those input variable bounds. Bounds that are too small will fail to cover all feasible domains; while bounds that are too large will waste query budget. To avoid this problem, we introduce Active Expansion Sampling (AES), a method that identifies (possibly disconnected) feasible domains over an unbounded input space. AES progressively expands our knowledge of the input space, and uses successive exploitation and exploration stages to switch between learning the decision boundary and searching for new feasible domains. We show that AES has a misclassification loss guarantee within the explored region, independent of the number of iterations or labeled samples. Thus it can be used for real-time prediction of samples’ feasibility within the explored region. We evaluate AES on three test examples and compare AES with two adaptive sampling methods – the Neighborhood-Voronoi algorithm and the straddle heuristic – that operate over fixed input variable bounds. |
Tasks | Active Learning |
Published | 2017-08-25 |
URL | http://arxiv.org/abs/1708.07888v3 |
PDF | http://arxiv.org/pdf/1708.07888v3.pdf |
PWC | https://paperswithcode.com/paper/active-expansion-sampling-for-learning |
Repo | |
Framework | |
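Of the two baselines named in the abstract above, the straddle heuristic is simple enough to sketch. It scores each candidate point by $1.96\,\sigma(x) - |\mu(x) - h|$, where $\mu$ and $\sigma$ are a surrogate model's predictive mean and standard deviation and $h$ is the decision threshold, so points that are both uncertain and close to the boundary score highest. The function names and the assumption that predictive means and standard deviations are already available (for example from a Gaussian process) are illustrative; this is not AES itself.

```python
def straddle_scores(means, stds, threshold=0.0):
    """Straddle score for each candidate: 1.96 * std - |mean - threshold|."""
    return [1.96 * s - abs(m - threshold) for m, s in zip(means, stds)]


def pick_next_query(means, stds, threshold=0.0):
    """Index of the candidate with the highest straddle score."""
    scores = straddle_scores(means, stds, threshold)
    return max(range(len(scores)), key=scores.__getitem__)


# Candidate 1 is farther from the boundary but far more uncertain.
print(pick_next_query([0.0, 0.5], [0.1, 1.0]))
```

The heuristic ranks a fixed candidate set, which is why it needs input variable bounds chosen in advance, the limitation the abstract contrasts AES against.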