October 18, 2019

Paper Group ANR 519

A Roadmap for Robust End-to-End Alignment


Title	A Roadmap for Robust End-to-End Alignment
Authors	Lê Nguyên Hoang
Abstract	This paper discussed the robust alignment problem, that is, the problem of aligning the goals of algorithms with human preferences. It presented a general roadmap to tackle this issue. Interestingly, this roadmap identifies 5 critical steps, as well as many relevant aspects of these 5 steps. In other words, we have presented a large number of hopefully more tractable subproblems that readers are highly encouraged to tackle. Hopefully, this combination makes it possible to better highlight the most pressing problems, how each area of expertise can best be used, and how combining the solutions to subproblems might add up to solve robust alignment.
Tasks
Published	2018-09-04
URL	https://arxiv.org/abs/1809.01036v4
PDF	https://arxiv.org/pdf/1809.01036v4.pdf
PWC	https://paperswithcode.com/paper/a-roadmap-for-robust-end-to-end-alignment
Repo
Framework

A novel improved fuzzy support vector machine based stock price trend forecast model


Title	A novel improved fuzzy support vector machine based stock price trend forecast model
Authors	Shuheng Wang, Guohao Li, Yifan Bao
Abstract	Application of fuzzy support vector machines to stock price forecasting. The support vector machine is a machine learning method proposed in the 1990s. It can deal with classification and regression problems very successfully. Due to its excellent learning performance, the technology has become a hot research topic in the field of machine learning, and it has been successfully applied in many fields. However, support vector machines still have many limitations. There is a large amount of fuzzy information in the objective world. If the training data of a support vector machine contains noise and fuzzy information, its performance degrades severely. Because many interacting factors influence stock prices, the predictions of a traditional support vector machine are not precise enough; this study therefore proposes an improved fuzzy support vector machine prediction algorithm to raise the model's precision. The NASDAQ and Standard & Poor’s (S&P) stock markets are considered. The proposed methodology is a novel advanced fuzzy support vector machine (NA-FSVM).
Tasks	Stock Price Prediction
Published	2018-01-02
URL	http://arxiv.org/abs/1801.00681v1
PDF	http://arxiv.org/pdf/1801.00681v1.pdf
PWC	https://paperswithcode.com/paper/a-novel-improved-fuzzy-support-vector-machine
Repo
Framework

Exploring Sentence Vector Spaces through Automatic Summarization


Title	Exploring Sentence Vector Spaces through Automatic Summarization
Authors	Adly Templeton, Jugal Kalita
Abstract	Given vector representations for individual words, it is necessary to compute vector representations of sentences for many applications in a compositional manner, often using artificial neural networks. Relatively little work has explored the internal structure and properties of such sentence vectors. In this paper, we explore the properties of sentence vectors in the context of automatic summarization. In particular, we show that cosine similarity between sentence vectors and document vectors is strongly correlated with sentence importance and that vector semantics can identify and correct gaps between the sentences chosen so far and the document. In addition, we identify specific dimensions which are linked to effective summaries. To our knowledge, this is the first time specific dimensions of sentence embeddings have been connected to sentence properties. We also compare the features of different methods of sentence embeddings. Many of these insights have applications in uses of sentence embeddings far beyond summarization.
Tasks	Sentence Embeddings
Published	2018-10-16
URL	http://arxiv.org/abs/1810.07320v1
PDF	http://arxiv.org/pdf/1810.07320v1.pdf
PWC	https://paperswithcode.com/paper/exploring-sentence-vector-spaces-through
Repo
Framework
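
The abstract above reports that cosine similarity between sentence vectors and a document vector correlates with sentence importance. A minimal sketch of that scoring idea follows, in plain Python. It assumes the document vector is the mean of the sentence vectors, which is an illustrative choice and not necessarily the authors' construction; the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise mean; used here as an ASSUMED document vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rank_sentences(sentence_vectors):
    """Return sentence indices ordered from most to least similar to the document."""
    doc = mean_vector(sentence_vectors)
    scores = [cosine(s, doc) for s in sentence_vectors]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

An extractive summarizer built on this would take the first few indices returned by `rank_sentences`; the gap-correction step mentioned in the abstract is not sketched here.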

MQGrad: Reinforcement Learning of Gradient Quantization in Parameter Server


Title	MQGrad: Reinforcement Learning of Gradient Quantization in Parameter Server
Authors	Guoxin Cui, Jun Xu, Wei Zeng, Yanyan Lan, Jiafeng Guo, Xueqi Cheng
Abstract	One of the most significant bottlenecks in training large scale machine learning models on a parameter server (PS) is the communication overhead, because the model gradients must be exchanged frequently between the workers and servers during the training iterations. Gradient quantization has been proposed as an effective approach to reducing the communication volume. One key issue in gradient quantization is setting the number of bits for quantizing the gradients. A small number of bits can significantly reduce the communication overhead but hurts the gradient accuracy, and vice versa. An ideal quantization method would dynamically balance the communication overhead and model accuracy by adjusting the number of bits according to the knowledge learned from the immediate past training iterations. Existing methods, however, quantize the gradients either with a fixed number of bits or with predefined heuristic rules. In this paper we propose a novel adaptive quantization method within the framework of reinforcement learning. The method, referred to as MQGrad, formalizes the selection of quantization bits as actions in a Markov decision process (MDP) where the MDP state records the information collected from the past optimization iterations (e.g., the sequence of the loss function values). During the training iterations of a machine learning algorithm, MQGrad continuously updates the MDP state according to the changes of the loss function. Based on this information, the MDP learns to select the optimal actions (number of bits) to quantize the gradients. Experimental results based on a benchmark dataset showed that MQGrad can accelerate the learning of a large scale deep neural network while keeping its prediction accuracy.
Tasks	Quantization
Published	2018-04-22
URL	http://arxiv.org/abs/1804.08066v1
PDF	http://arxiv.org/pdf/1804.08066v1.pdf
PWC	https://paperswithcode.com/paper/mqgrad-reinforcement-learning-of-gradient
Repo
Framework
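
The trade-off described in the abstract above (fewer bits means less communication but less accurate gradients) can be illustrated with a small uniform quantizer. This is only a sketch under assumptions: the min/max uniform scheme and all function names are hypothetical, and the reinforcement-learning policy that MQGrad uses to choose the number of bits is not shown.

```python
def quantize(gradient, num_bits):
    """Map each value to an integer code in [0, 2**num_bits - 1] (uniform grid)."""
    levels = 2 ** num_bits - 1
    lo, hi = min(gradient), max(gradient)
    if hi == lo:
        # Constant gradient: every value maps to code 0 and is restored exactly.
        return [0] * len(gradient), lo, 0.0
    scale = (hi - lo) / levels
    codes = [round((g - lo) / scale) for g in gradient]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate gradient values from integer codes."""
    return [lo + c * scale for c in codes]


def max_error(gradient, num_bits):
    """Largest absolute reconstruction error for a given bit width."""
    codes, lo, scale = quantize(gradient, num_bits)
    restored = dequantize(codes, lo, scale)
    return max(abs(g - r) for g, r in zip(gradient, restored))
```

Comparing `max_error(g, 2)` with `max_error(g, 8)` on the same gradient shows the accuracy side of the trade-off; the communication side is simply `num_bits` per value plus the two floats `lo` and `scale`.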

Simulated Tempering Langevin Monte Carlo II: An Improved Proof using Soft Markov Chain Decomposition


Title	Simulated Tempering Langevin Monte Carlo II: An Improved Proof using Soft Markov Chain Decomposition
Authors	Rong Ge, Holden Lee, Andrej Risteski
Abstract	A key task in Bayesian machine learning is sampling from distributions that are only specified up to a partition function (i.e., constant of proportionality). One prevalent example of this is sampling posteriors in parametric distributions, such as latent-variable generative models. However, sampling (even very approximately) can be #P-hard. Classical results going back to Bakry and Émery (1985) on sampling focus on log-concave distributions, and show a natural Markov chain called Langevin diffusion mixes in polynomial time. However, all log-concave distributions are uni-modal, while in practice it is very common for the distribution of interest to have multiple modes. In this case, Langevin diffusion suffers from torpid mixing. We address this problem by combining Langevin diffusion with simulated tempering. The result is a Markov chain that mixes more rapidly by transitioning between different temperatures of the distribution. We analyze this Markov chain for a mixture of (strongly) log-concave distributions of the same shape. In particular, our technique applies to the canonical multi-modal distribution: a mixture of Gaussians (of equal variance). Our algorithm efficiently samples from these distributions given only access to the gradient of the log-pdf. For the analysis, we introduce novel techniques for proving spectral gaps based on decomposing the action of the generator of the diffusion. Previous approaches rely on decomposing the state space as a partition of sets, while our approach can be thought of as decomposing the stationary measure as a mixture of distributions (a “soft partition”). Additional materials for the paper can be found at http://tiny.cc/glr17. The proof and results have been improved and generalized from the precursor at www.arxiv.org/abs/1710.02736.
Tasks
Published	2018-11-29
URL	https://arxiv.org/abs/1812.00793v2
PDF	https://arxiv.org/pdf/1812.00793v2.pdf
PWC	https://paperswithcode.com/paper/simulated-tempering-langevin-monte-carlo-ii
Repo
Framework
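
To make the combination of Langevin diffusion and simulated tempering in the abstract above concrete, here is a rough one-dimensional sketch on an equal-variance mixture of two Gaussians. Several things are assumptions made for illustration and are not taken from the paper: the discretized (unadjusted) Langevin update, the temperature ladder `betas`, the step size `eta`, the mode locations, and above all the swap rule, which treats the per-level partition functions as equal and uses the unnormalized log-density.

```python
import math
import random

CENTERS = (-4.0, 4.0)  # hypothetical mode locations, unit variance each


def log_p(x):
    """Unnormalized log-density of the equal-weight Gaussian mixture."""
    terms = [-0.5 * (x - c) ** 2 for c in CENTERS]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def grad_log_p(x):
    """Gradient of log_p: responsibility-weighted pull toward each center."""
    terms = [-0.5 * (x - c) ** 2 for c in CENTERS]
    m = max(terms)
    weights = [math.exp(t - m) for t in terms]
    return sum(w * (c - x) for w, c in zip(weights, CENTERS)) / sum(weights)


def tempered_langevin(steps, betas=(0.05, 0.2, 0.5, 1.0), eta=0.05, seed=0):
    """Collect samples visited at the coldest level (beta = 1)."""
    rng = random.Random(seed)
    x, level = CENTERS[0], len(betas) - 1
    samples = []
    for _ in range(steps):
        beta = betas[level]
        # Discretized Langevin step targeting p(x) ** beta.
        x += eta * beta * grad_log_p(x) + math.sqrt(2.0 * eta) * rng.gauss(0.0, 1.0)
        # Propose moving to a neighbouring temperature level.
        proposal = level + rng.choice((-1, 1))
        if 0 <= proposal < len(betas):
            # ASSUMPTION: equal partition-function weights across levels.
            log_ratio = (betas[proposal] - beta) * log_p(x)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                level = proposal
        if level == len(betas) - 1:
            samples.append(x)
    return samples
```

At the hottest level the two modes blur together, so the chain can cross between them and carry that move back down to `beta = 1`. With equal level weights the chain spends more time at hot levels than a tuned scheme would.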

Unsupervised Classification of Galaxies. I. ICA feature selection


Title	Unsupervised Classification of Galaxies. I. ICA feature selection
Authors	Didier Fraix-Burnet, Tanuka Chattopadhyay, Saptarshi Mondal
Abstract	Subjective classification of galaxies can mislead us in the quest of the origin regarding formation and evolution of galaxies since this is necessarily limited to a few features. The human mind is not able to apprehend the complex correlations in a manyfold parameter space, and multivariate analyses are the best tools to understand the differences among various kinds of objects. In this series of papers, an objective classification of 362,923 galaxies from the Value Added Galaxy Catalogue (VAGC) is carried out with the help of two methods of multivariate analysis. First, Independent Component Analysis (ICA) is used to determine a set of derived independent components that are linear combinations of 47 observed features (viz. ionized lines, Lick indices, photometric and morphological properties, star formation rates etc.) of the galaxies. Subsequently, a K-means cluster analysis is applied on the nine independent components to obtain ten distinct and homogeneous groups. In this first paper, we describe the methods and the main results. It appears that the nine Independent Components represent a complete physical description of galaxies (velocity dispersion, ionisation, metallicity, surface brightness and structure). We find that our ten groups can be essentially placed into traditional and empirical classes (from colour-magnitude and emission-line diagnostic diagrams, early- vs late-types) despite the classical corresponding features (colour, line ratios and morphology) being not significantly correlated with the nine Independent Components. More detailed physical interpretation of the groups will be performed in subsequent papers.
Tasks	Feature Selection
Published	2018-02-08
URL	http://arxiv.org/abs/1802.02856v2
PDF	http://arxiv.org/pdf/1802.02856v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-classification-of-galaxies-i-ica
Repo
Framework

The Challenge of Crafting Intelligible Intelligence


Title	The Challenge of Crafting Intelligible Intelligence
Authors	Daniel S. Weld, Gagan Bansal
Abstract	Since Artificial Intelligence (AI) software uses techniques like deep lookahead search and stochastic optimization of huge neural networks to fit mammoth datasets, it often results in complex behavior that is difficult for people to understand. Yet organizations are deploying AI algorithms in many mission-critical settings. To trust their behavior, we must make AI intelligible, either by using inherently interpretable models or by developing new methods for explaining and controlling otherwise overwhelmingly complex decisions using local approximation, vocabulary alignment, and interactive explanation. This paper argues that intelligibility is essential, surveys recent work on building such systems, and highlights key directions for research.
Tasks	Stochastic Optimization
Published	2018-03-09
URL	http://arxiv.org/abs/1803.04263v3
PDF	http://arxiv.org/pdf/1803.04263v3.pdf
PWC	https://paperswithcode.com/paper/the-challenge-of-crafting-intelligible
Repo
Framework

A Sentiment Analysis of Breast Cancer Treatment Experiences and Healthcare Perceptions Across Twitter


Title	A Sentiment Analysis of Breast Cancer Treatment Experiences and Healthcare Perceptions Across Twitter
Authors	Eric M. Clark, Ted James, Chris A. Jones, Amulya Alapati, Promise Ukandu, Christopher M. Danforth, Peter Sheridan Dodds
Abstract	Background: Social media has the capacity to afford the healthcare industry with valuable feedback from patients who reveal and express their medical decision-making process, as well as self-reported quality of life indicators both during and post treatment. In prior work [Crannell et al.], we studied an active cancer patient population on Twitter and compiled a set of tweets describing their experience with this disease. We refer to these online public testimonies as “Invisible Patient Reported Outcomes” (iPROs), because they carry relevant indicators, yet are difficult to capture by conventional means of self-report. Methods: Our present study aims to identify tweets related to the patient experience as an additional informative tool for monitoring public health. Using Twitter’s public streaming API, we compiled over 5.3 million “breast cancer” related tweets spanning September 2016 until mid-December 2017. We combined supervised machine learning methods with natural language processing to sift tweets relevant to breast cancer patient experiences. We analyzed a sample of 845 breast cancer patient and survivor accounts, responsible for over 48,000 posts. We investigated tweet content with a hedonometric sentiment analysis to quantitatively extract emotionally charged topics. Results: We found that positive experiences were shared regarding patient treatment, raising support, and spreading awareness. Further discussions related to healthcare were prevalent and largely negative, focusing on fear of political legislation that could result in loss of coverage. Conclusions: Social media can provide a positive outlet for patients to discuss their needs and concerns regarding their healthcare coverage and treatment needs. Capturing iPROs from online communication can help inform healthcare professionals and lead to more connected and personalized treatment regimens.
Tasks	Decision Making, Sentiment Analysis
Published	2018-05-25
URL	http://arxiv.org/abs/1805.09959v2
PDF	http://arxiv.org/pdf/1805.09959v2.pdf
PWC	https://paperswithcode.com/paper/a-sentiment-analysis-of-breast-cancer
Repo
Framework

Multi-labeled Relation Extraction with Attentive Capsule Network


Title	Multi-labeled Relation Extraction with Attentive Capsule Network
Authors	Xinsong Zhang, Pengshuai Li, Weijia Jia, Hai Zhao
Abstract	Extracting multiple overlapping relations from a sentence remains challenging. Most current neural models inconveniently assume that each sentence is explicitly mapped to a single relation label, and so cannot handle multiple relations properly, as the overlapped features of the relations are either ignored or very difficult to identify. To tackle this issue, we propose a novel approach for multi-labeled relation extraction with a capsule network, which performs considerably better than current convolutional or recurrent nets in identifying the highly overlapped relations within an individual sentence. To better cluster the features and precisely extract the relations, we further devise an attention-based routing algorithm and a sliding-margin loss function, and embed them into our capsule network. The experimental results show that the proposed approach can indeed extract the highly overlapped features and achieve significant performance improvement for relation extraction compared to state-of-the-art works.
Tasks	Multi-Labeled Relation Extraction, Relation Extraction
Published	2018-11-11
URL	http://arxiv.org/abs/1811.04354v1
PDF	http://arxiv.org/pdf/1811.04354v1.pdf
PWC	https://paperswithcode.com/paper/multi-labeled-relation-extraction-with
Repo
Framework

Exploiting Class Learnability in Noisy Data


Title	Exploiting Class Learnability in Noisy Data
Authors	Matthew Klawonn, Eric Heim, James Hendler
Abstract	In many domains, collecting sufficient labeled training data for supervised machine learning requires easily accessible but noisy sources, such as crowdsourcing services or tagged Web data. Noisy labels occur frequently in data sets harvested via these means, sometimes resulting in entire classes of data on which learned classifiers generalize poorly. For real world applications, we argue that it can be beneficial to avoid training on such classes entirely. In this work, we aim to explore the classes in a given data set, and guide supervised training to spend time on a class proportional to its learnability. By focusing the training process, we aim to improve model generalization on classes with a strong signal. To that end, we develop an online algorithm that works in conjunction with a classifier and training algorithm, iteratively selecting training data for the classifier based on how well it appears to generalize on each class. Testing our approach on a variety of data sets, we show that, relative to strong baselines, our algorithm learns to focus on classes for which the model has low generalization error, yielding a classifier with good performance on learnable classes.
Tasks
Published	2018-11-15
URL	http://arxiv.org/abs/1811.06524v1
PDF	http://arxiv.org/pdf/1811.06524v1.pdf
PWC	https://paperswithcode.com/paper/exploiting-class-learnability-in-noisy-data
Repo
Framework

Detecting Dead Weights and Units in Neural Networks


Title	Detecting Dead Weights and Units in Neural Networks
Authors	Utku Evci
Abstract	Deep Neural Networks are highly over-parameterized and the size of the neural networks can be reduced significantly after training without any decrease in performance. One can clearly see this phenomenon in a wide range of architectures trained for various problems. Weight/channel pruning, distillation, quantization, and matrix factorization are some of the main methods one can use to remove the redundancy and arrive at smaller and faster models. This work starts with a short informative chapter, where we motivate the pruning idea and provide the necessary notation. In the second chapter, we compare various saliency scores in the context of parameter pruning. Using the insights obtained from this comparison and stating the problems it brings, we motivate why pruning units instead of the individual parameters might be a better idea. We propose a set of definitions to quantify and analyze units that neither learn nor create any useful information. We propose an efficient way of detecting dead units and use it to select which units to prune. We get a 5x model size reduction through unit-wise pruning on MNIST.
Tasks	Quantization
Published	2018-06-15
URL	http://arxiv.org/abs/1806.06068v1
PDF	http://arxiv.org/pdf/1806.06068v1.pdf
PWC	https://paperswithcode.com/paper/detecting-dead-weights-and-units-in-neural
Repo
Framework
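
The unit-wise pruning idea in the abstract above can be sketched for a single ReLU layer. The criterion used here (a unit is "dead" if its activation never exceeds a small tolerance on any input in a sample) is an assumption chosen for illustration; the paper proposes its own definitions, which are not reproduced here. All names are hypothetical.

```python
def relu_layer(x, weights, biases):
    """Activations of one dense ReLU layer; weights holds one row per unit."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def find_dead_units(inputs, weights, biases, tol=1e-8):
    """Indices of units whose activation stays at or below tol on every input."""
    peak = [0.0] * len(weights)
    for x in inputs:
        for j, activation in enumerate(relu_layer(x, weights, biases)):
            peak[j] = max(peak[j], activation)
    return [j for j, p in enumerate(peak) if p <= tol]


def prune_units(weights, biases, dead):
    """Drop the rows and biases of the listed units."""
    dead_set = set(dead)
    keep = [j for j in range(len(weights)) if j not in dead_set]
    return [weights[j] for j in keep], [biases[j] for j in keep]
```

In a full network, the matching input columns of the next layer would also have to be removed; that step is omitted to keep the sketch short.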

Learning to Route with Sparse Trajectory Sets—Extended Version


Title	Learning to Route with Sparse Trajectory Sets—Extended Version
Authors	Chenjuan Guo, Bin Yang, Jilin Hu, Christian S. Jensen
Abstract	Motivated by the increasing availability of vehicle trajectory data, we propose learn-to-route, a comprehensive trajectory-based routing solution. Specifically, we first construct a graph-like structure from trajectories as the routing infrastructure. Second, we enable trajectory-based routing given an arbitrary (source, destination) pair. In the first step, given a road network and a collection of trajectories, we propose a trajectory-based clustering method that identifies regions in a road network. If a pair of regions are connected by trajectories, we maintain the paths used by these trajectories and learn a routing preference for travel between the regions. As trajectories are skewed and sparse, many region pairs are not connected by trajectories. We thus transfer routing preferences from region pairs with sufficient trajectories to such region pairs and then use the transferred preferences to identify paths between the regions. In the second step, we exploit the above graph-like structure to achieve a comprehensive trajectory-based routing solution. Empirical studies with two substantial trajectory data sets offer insight into the proposed solution, indicating that it is practical. A comparison with a leading routing service offers evidence that the paper’s proposal is able to enhance routing quality. This is an extended version of “Learning to Route with Sparse Trajectory Sets” [1], to appear in IEEE ICDE 2018.
Tasks
Published	2018-02-22
URL	http://arxiv.org/abs/1802.07980v1
PDF	http://arxiv.org/pdf/1802.07980v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-route-with-sparse-trajectory-sets
Repo
Framework

Towards Automatic Identification of Elephants in the Wild


Title	Towards Automatic Identification of Elephants in the Wild
Authors	Matthias Körschens, Björn Barz, Joachim Denzler
Abstract	Identifying animals from a large group of possible individuals is very important for biodiversity monitoring and especially for collecting data on a small number of particularly interesting individuals, as these have to be identified first before this can be done. Identifying them can be a very time-consuming task. This is especially true if the animals look very similar and have only a small number of distinctive features, as elephants do. In most cases the animals stay in one place only for a short period of time, during which the animal needs to be identified in order to know whether it is important to collect new data on it. For this reason, a system supporting the researchers in identifying elephants to speed up this process would be of great benefit. In this paper, we present such a system for identifying elephants in the face of a large number of individuals with only few training images per individual. For that purpose, we combine object part localization, off-the-shelf CNN features, and support vector machine classification to provide field researchers with proposals of possible individuals given new images of an elephant. The performance of our system is demonstrated on a dataset comprising a total of 2078 images of 276 individual elephants, where we achieve 56% top-1 test accuracy and 80% top-10 accuracy. To deal with occlusion, varying viewpoints, and different poses present in the dataset, we furthermore enable the analysts to provide the system with multiple images of the same elephant to be identified and aggregate confidence values generated by the classifier. With that, our system achieves a top-1 accuracy of 74% and a top-10 accuracy of 88% on the held-out test dataset.
Tasks
Published	2018-12-11
URL	http://arxiv.org/abs/1812.04418v1
PDF	http://arxiv.org/pdf/1812.04418v1.pdf
PWC	https://paperswithcode.com/paper/towards-automatic-identification-of-elephants
Repo
Framework
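
The abstract above mentions aggregating classifier confidence values over several images of the same elephant and reporting top-k proposals. A minimal sketch of that step follows. It assumes the aggregation is a plain average of per-class confidences, which the abstract does not specify; the function names and the example labels are hypothetical.

```python
def aggregate_confidences(per_image_scores):
    """Average per-class confidences over several images of one animal.

    per_image_scores: list of dicts mapping an individual's label to a
    confidence value. A class missing from an image counts as 0 for it.
    """
    n = len(per_image_scores)
    totals = {}
    for scores in per_image_scores:
        for label, value in scores.items():
            totals[label] = totals.get(label, 0.0) + value
    return {label: total / n for label, total in totals.items()}


def top_k(scores, k):
    """The k labels with the highest confidence (ties broken alphabetically)."""
    return sorted(scores, key=lambda label: (-scores[label], label))[:k]
```

For example, if one image slightly favours individual "A" while two others favour "B", the averaged scores rank "B" first, which is the kind of correction that multiple views are meant to provide.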

Hallucinated-IQA: No-Reference Image Quality Assessment via Adversarial Learning


Title	Hallucinated-IQA: No-Reference Image Quality Assessment via Adversarial Learning
Authors	Kwan-Yee Lin, Guanxiang Wang
Abstract	No-reference image quality assessment (NR-IQA) is a fundamental yet challenging task in the low-level computer vision community. The difficulty is particularly pronounced because of the limited information available, as the corresponding reference for comparison is typically absent. Although various feature extraction mechanisms, from natural scene statistics to deep neural networks, have been leveraged in previous methods, the performance bottleneck still exists. In this work, we propose a hallucination-guided quality regression network to address the issue. We first generate a hallucinated reference constrained on the distorted image, to compensate for the absence of the true reference. Then, we pair the information of the hallucinated reference with the distorted image, and forward them to the regressor to learn the perceptual discrepancy with the guidance of an implicit ranking relationship within the generator, and thereby produce a precise quality prediction. To demonstrate the effectiveness of our approach, comprehensive experiments are conducted on four popular image quality assessment benchmarks. Our method significantly outperforms all the previous state-of-the-art methods by large margins. The code and model will be publicly available on the project page https://kwanyeelin.github.io/projects/HIQA/HIQA.html.
Tasks	Image Quality Assessment, No-Reference Image Quality Assessment
Published	2018-04-05
URL	http://arxiv.org/abs/1804.01681v1
PDF	http://arxiv.org/pdf/1804.01681v1.pdf
PWC	https://paperswithcode.com/paper/hallucinated-iqa-no-reference-image-quality
Repo
Framework

DeepKey: Towards End-to-End Physical Key Replication From a Single Photograph


Title	DeepKey: Towards End-to-End Physical Key Replication From a Single Photograph
Authors	Rory Smith, Tilo Burghardt
Abstract	This paper describes DeepKey, an end-to-end deep neural architecture capable of taking a digital RGB image of an ‘everyday’ scene containing a pin tumbler key (e.g. lying on a table or carpet) and fully automatically inferring a printable 3D key model. We report on the key detection performance and describe how candidates can be transformed into physical prints. We show an example opening a real-world lock. Our system is described in detail, providing a breakdown of all components including key detection, pose normalisation, bitting segmentation and 3D model inference. We provide an in-depth evaluation and conclude by reflecting on limitations, applications, potential security risks and societal impact. We contribute the DeepKey Datasets of 5,300+ images covering a few test keys with bounding boxes, pose and unaligned mask data.
Tasks
Published	2018-11-04
URL	http://arxiv.org/abs/1811.01405v1
PDF	http://arxiv.org/pdf/1811.01405v1.pdf
PWC	https://paperswithcode.com/paper/deepkey-towards-end-to-end-physical-key
Repo
Framework