Paper Group ANR 595
Domain Aware Neural Dialog System
Learning to select data for transfer learning with Bayesian Optimization
Machine Learning Based Fast Power Integrity Classifier
DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization
Copula Index for Detecting Dependence and Monotonicity between Stochastic Signals
…
Domain Aware Neural Dialog System
Title | Domain Aware Neural Dialog System |
Authors | Sajal Choudhary, Prerna Srivastava, Lyle Ungar, João Sedoc |
Abstract | We investigate the task of building a domain-aware chat system which generates intelligent responses in a conversation comprising different domains. The domain, in this case, is the topic or theme of the conversation. To achieve this, we present DOM-Seq2Seq, a domain-aware neural network model based on the novel technique of using domain-targeted sequence-to-sequence models (Sutskever et al., 2014) and a domain classifier. The model captures features from the current utterance and the domains of the previous utterances to facilitate the formation of relevant responses. We evaluate our model on automatic metrics and compare our performance with that of the Seq2Seq model. |
Tasks | |
Published | 2017-08-02 |
URL | http://arxiv.org/abs/1708.00897v1 |
http://arxiv.org/pdf/1708.00897v1.pdf | |
PWC | https://paperswithcode.com/paper/domain-aware-neural-dialog-system |
Repo | |
Framework | |
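The control flow the abstract describes (classify the domain of the current utterance, falling back on the domains of previous utterances, then route to a domain-targeted responder) can be sketched as below. The keyword classifier and canned responders are stand-ins for the paper's trained classifier and seq2seq models; every name and keyword here is a hypothetical illustration, not the paper's implementation.

```python
# Hypothetical sketch of DOM-Seq2Seq-style routing: a domain
# classifier picks a domain for each utterance, and the response comes
# from that domain's dedicated responder (a seq2seq model in the paper).

DOMAIN_KEYWORDS = {
    "food":   {"pizza", "restaurant", "eat"},
    "travel": {"flight", "hotel", "trip"},
}

def classify_domain(utterance, history=()):
    """Pick the domain whose keywords best match the current utterance;
    fall back to the most recent domain seen in the conversation history."""
    tokens = set(utterance.lower().split())
    scores = {d: len(tokens & kw) for d, kw in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0 and history:
        return history[-1]          # stay in the previous domain
    return best if scores[best] > 0 else "general"

def respond(utterance, history=()):
    """Route the utterance to the responder of its classified domain."""
    domain = classify_domain(utterance, history)
    responders = {
        "food":    lambda u: "Which cuisine are you in the mood for?",
        "travel":  lambda u: "Where would you like to go?",
        "general": lambda u: "Tell me more.",
    }
    return domain, responders[domain](utterance)
```

In the paper the classifier also conditions on previous utterances; the `history` fallback above is a crude analogue of that.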
Learning to select data for transfer learning with Bayesian Optimization
Title | Learning to select data for transfer learning with Bayesian Optimization |
Authors | Sebastian Ruder, Barbara Plank |
Abstract | Domain similarity measures can be used to gauge adaptability and select suitable data for transfer learning, but existing approaches define ad hoc measures that are deemed suitable for respective tasks. Inspired by work on curriculum learning, we propose to \emph{learn} data selection measures using Bayesian Optimization and evaluate them across models, domains and tasks. Our learned measures outperform existing domain similarity measures significantly on three tasks: sentiment analysis, part-of-speech tagging, and parsing. We show the importance of complementing similarity with diversity, and that learned measures are – to some degree – transferable across models, domains, and even tasks. |
Tasks | Part-Of-Speech Tagging, Sentiment Analysis, Transfer Learning |
Published | 2017-07-17 |
URL | http://arxiv.org/abs/1707.05246v1 |
http://arxiv.org/pdf/1707.05246v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-select-data-for-transfer-learning |
Repo | |
Framework | |
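The abstract's core loop (score source data by a weighted combination of similarity features, train on the top-scoring examples, and let an outer optimizer tune the weights against downstream performance) can be sketched as below. Random search stands in for Bayesian Optimization, and a toy usefulness score stands in for actually training and evaluating a model; the feature vectors and data are invented.

```python
import random

# Sketch of learned data selection: each source example carries a
# vector of similarity/diversity features, and an outer optimizer
# tunes the weights that combine them. Random search is a stand-in
# for the paper's Bayesian Optimization.

random.seed(0)

# Each source example: (feature vector of similarity measures, usefulness)
SOURCE = [((0.9, 0.1), 1.0), ((0.2, 0.8), 0.2),
          ((0.7, 0.3), 0.9), ((0.1, 0.9), 0.1)]

def select_top_k(weights, k=2):
    """Score each example by a weighted sum of its features and keep k."""
    scored = sorted(SOURCE,
                    key=lambda ex: -sum(w * f for w, f in zip(weights, ex[0])))
    return scored[:k]

def downstream_score(weights):
    """Proxy for dev-set performance after training on the selection."""
    return sum(u for _, u in select_top_k(weights))

def tune_weights(trials=200):
    """Outer loop: search weight space for the best selection measure."""
    best_w, best_s = None, float("-inf")
    for _ in range(trials):
        w = (random.uniform(-1, 1), random.uniform(-1, 1))
        s = downstream_score(w)
        if s > best_s:
            best_w, best_s = w, s
    return best_w, best_s
```

A real Bayesian Optimization loop would replace the uniform draws with posterior-guided proposals, but the interface (weights in, downstream score out) is the same.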
Machine Learning Based Fast Power Integrity Classifier
Title | Machine Learning Based Fast Power Integrity Classifier |
Authors | HuaChun Zhang, Lynden Kagan, Chen Zheng |
Abstract | In this paper, we propose a new machine learning based fast power integrity classifier that quickly flags EM/IR hotspots. We discuss the features extracted to describe the power grid, cell power density, routing impact, and controlled collapse chip connection (C4) bumps, among others. The continuous and discontinuous cases are identified and treated using different machine learning models. Nearest neighbors, random forest, and neural network models are compared to select the best-performing candidates. Experiments are run on an open-source benchmark, and the results show promising prediction accuracy. |
Tasks | |
Published | 2017-11-08 |
URL | http://arxiv.org/abs/1711.03406v1 |
http://arxiv.org/pdf/1711.03406v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-based-fast-power-integrity |
Repo | |
Framework | |
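As a minimal illustration of the simplest model family the paper compares, here is a one-nearest-neighbour hotspot classifier over hand-made tile features. The feature choices (e.g. power density, distance to a C4 bump) and the data are assumptions for the sketch, not the paper's extracted feature set.

```python
# Sketch of the nearest-neighbour baseline among the compared models:
# flag a power-grid tile as an EM/IR hotspot if its closest training
# tile in feature space was a hotspot. Features and labels are invented.

def one_nn(train, query):
    """train: list of (feature_vector, label); return the label of the
    training example nearest to `query` in squared Euclidean distance."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(train, key=lambda ex: dist2(ex[0], query))[1]
```

The random-forest and neural-network candidates in the paper share this interface (features in, hotspot flag out), differing only in the decision function.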
DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization
Title | DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization |
Authors | Lin Xiao, Adams Wei Yu, Qihang Lin, Weizhu Chen |
Abstract | Machine learning with big data often involves large optimization models. For distributed optimization over a cluster of machines, frequent communication and synchronization of all model parameters (optimization variables) can be very costly. A promising solution is to use parameter servers to store different subsets of the model parameters, and update them asynchronously at different machines using local datasets. In this paper, we focus on distributed optimization of large linear models with convex loss functions, and propose a family of randomized primal-dual block coordinate algorithms that are especially suitable for asynchronous distributed implementation with parameter servers. In particular, we work with the saddle-point formulation of such problems which allows simultaneous data and model partitioning, and exploit its structure by doubly stochastic coordinate optimization with variance reduction (DSCOVR). Compared with other first-order distributed algorithms, we show that DSCOVR may require less overall computation and communication, and less or no synchronization. We discuss the implementation details of the DSCOVR algorithms, and present numerical experiments on an industrial distributed computing system. |
Tasks | Distributed Optimization |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.05080v1 |
http://arxiv.org/pdf/1710.05080v1.pdf | |
PWC | https://paperswithcode.com/paper/dscovr-randomized-primal-dual-block |
Repo | |
Framework | |
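The doubly stochastic sampling pattern, one random data block and one random model block per update, can be sketched on the saddle-point form of regularized least squares. The variance-reduction correction that defines the actual DSCOVR algorithms is omitted here, and the step sizes and toy problem are illustrative assumptions.

```python
import random

# Structural sketch of DSCOVR-style doubly stochastic updates on the
# saddle-point formulation of regularized least squares:
#   min_w max_a (1/n) * sum_i [ a_i*(x_i . w) - a_i*y_i - a_i^2/2 ]
#               + (lam/2) * ||w||^2
# Each iteration samples one data block i AND one model block k, which
# is what lets machines holding different data/model partitions update
# asynchronously. The 1/n scaling is absorbed into the step sizes, and
# the variance-reduction terms of the real algorithms are left out.

def dscovr_sketch(X, y, lam=0.0, sigma=0.5, eta=0.5, iters=200, seed=0):
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d          # primal model coordinates (blocks)
    a = [0.0] * n          # one dual variable per data block
    for _ in range(iters):
        i = rng.randrange(n)   # random data block
        k = rng.randrange(d)   # random model block
        # dual ascent on a_i: touches only row i of the data
        a[i] += sigma * (sum(X[i][j] * w[j] for j in range(d)) - y[i] - a[i])
        # primal descent on w_k: touches only column k
        w[k] -= eta * (a[i] * X[i][k] + lam * w[k])
    return w
```

On a trivial one-sample, one-feature problem this iteration contracts to the least-squares solution, which is what the test below checks.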
Copula Index for Detecting Dependence and Monotonicity between Stochastic Signals
Title | Copula Index for Detecting Dependence and Monotonicity between Stochastic Signals |
Authors | Kiran Karra, Lamine Mili |
Abstract | This paper introduces a nonparametric copula-based index for detecting the strength and monotonicity structure of linear and nonlinear statistical dependence between pairs of random variables or stochastic signals. Our index, termed Copula Index for Detecting Dependence and Monotonicity (CIM), satisfies several desirable properties of measures of association, including Renyi’s properties, the data processing inequality (DPI), and consequently self-equitability. Synthetic data simulations reveal that the statistical power of CIM compares favorably to other state-of-the-art measures of association that are proven to satisfy the DPI. Simulation results with real-world data reveal the CIM’s unique ability to detect the monotonicity structure among stochastic signals to find interesting dependencies in large datasets. Additionally, simulations show that the CIM compares favorably to estimators of mutual information when discovering Markov network structure. |
Tasks | |
Published | 2017-03-20 |
URL | http://arxiv.org/abs/1703.06686v5 |
http://arxiv.org/pdf/1703.06686v5.pdf | |
PWC | https://paperswithcode.com/paper/copula-index-for-detecting-dependence-and |
Repo | |
Framework | |
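The empirical-copula (rank) transform that underlies copula-based indices like the CIM, together with a simple concordance measure (Kendall's tau) as a stand-in for the CIM itself, can be sketched as below. Both functions assume samples with distinct values; tie handling is omitted.

```python
# Sketch of the rank-based machinery behind copula indices. The
# empirical-copula transform maps each margin to normalized ranks,
# which is what makes such indices invariant to monotone
# transformations of the marginals. Kendall's tau here is only a
# stand-in for the CIM, which additionally detects monotonicity
# *structure* (regions of different monotone behaviour).

def empirical_copula(xs, ys):
    """Map a paired sample (distinct values) to pseudo-observations."""
    n = len(xs)
    rx = {v: r for r, v in enumerate(sorted(xs), 1)}
    ry = {v: r for r, v in enumerate(sorted(ys), 1)}
    return [(rx[x] / n, ry[y] / n) for x, y in zip(xs, ys)]

def kendall_tau(xs, ys):
    """Concordance-based association: +1 for co-monotone samples,
    -1 for anti-monotone (no ties assumed)."""
    pairs = [(i, j) for i in range(len(xs)) for j in range(i + 1, len(xs))]
    s = sum(1 if (xs[i] - xs[j]) * (ys[i] - ys[j]) > 0 else -1
            for i, j in pairs)
    return s / len(pairs)
```

Because the transform only uses ranks, `kendall_tau(xs, ys)` is unchanged if `xs` is replaced by, say, its cubes, which is the invariance the DPI-style properties build on.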
Neural system identification for large populations separating “what” and “where”
Title | Neural system identification for large populations separating “what” and “where” |
Authors | David A. Klindt, Alexander S. Ecker, Thomas Euler, Matthias Bethge |
Abstract | Neuroscientists classify neurons into different types that perform similar computations at different locations in the visual field. Traditional methods for neural system identification do not capitalize on this separation of ‘what’ and ‘where’. Learning deep convolutional feature spaces that are shared among many neurons provides an exciting path forward, but the architectural design needs to account for data limitations: While new experimental techniques enable recordings from thousands of neurons, experimental time is limited so that one can sample only a small fraction of each neuron’s response space. Here, we show that a major bottleneck for fitting convolutional neural networks (CNNs) to neural data is the estimation of the individual receptive field locations, a problem that has been scratched only at the surface thus far. We propose a CNN architecture with a sparse readout layer factorizing the spatial (where) and feature (what) dimensions. Our network scales well to thousands of neurons and short recordings and can be trained end-to-end. We evaluate this architecture on ground-truth data to explore the challenges and limitations of CNN-based system identification. Moreover, we show that our network model outperforms current state-of-the-art system identification models of mouse primary visual cortex. |
Tasks | |
Published | 2017-11-07 |
URL | http://arxiv.org/abs/1711.02653v2 |
http://arxiv.org/pdf/1711.02653v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-system-identification-for-large |
Repo | |
Framework | |
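The factorized readout itself is easy to state in code: a neuron's predicted response is a spatial "where" mask applied to a "what"-weighted sum of feature maps shared across neurons. The nested-list feature maps and all values below are made up for illustration; in the paper they come from the shared CNN, and the mask is regularized to be sparse.

```python
# Sketch of the paper's factorized readout. Factorizing an H x W x C
# readout tensor into an H x W mask plus a length-C weight vector cuts
# the per-neuron parameter count from H*W*C to H*W + C.

def factorized_readout(feature_maps, spatial_mask, feature_weights):
    """feature_maps: C feature maps, each H x W (shared across neurons).
    spatial_mask: H x W 'where' weights for ONE neuron (ideally sparse).
    feature_weights: length-C 'what' weights for the same neuron."""
    H, W = len(spatial_mask), len(spatial_mask[0])
    response = 0.0
    for y in range(H):
        for x in range(W):
            # 'what': combine channels at this location
            what = sum(wc * fm[y][x]
                       for wc, fm in zip(feature_weights, feature_maps))
            # 'where': weight the location by the neuron's mask
            response += spatial_mask[y][x] * what
    return response
```

The parameter saving is why the architecture scales to thousands of neurons with short recordings: only the small mask and weight vector are per-neuron.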
Image operator learning coupled with CNN classification and its application to staff line removal
Title | Image operator learning coupled with CNN classification and its application to staff line removal |
Authors | Frank D. Julca-Aguilar, Nina S. T. Hirata |
Abstract | Many image transformations can be modeled by image operators that are characterized by pixel-wise local functions defined on a finite support window. In image operator learning, these functions are estimated from training data using machine learning techniques. Input size is usually a critical issue when using learning algorithms, and it limits the size of practicable windows. We propose the use of convolutional neural networks (CNNs) to overcome this limitation. The problem of removing staff-lines in music score images is chosen to evaluate the effects of window and convolutional mask sizes on the learned image operator performance. Results show that the CNN based solution outperforms previous ones obtained using conventional learning algorithms or heuristic algorithms, indicating the potential of CNNs as base classifiers in image operator learning. The implementations will be made available on the TRIOSlib project site. |
Tasks | |
Published | 2017-09-19 |
URL | http://arxiv.org/abs/1709.06476v1 |
http://arxiv.org/pdf/1709.06476v1.pdf | |
PWC | https://paperswithcode.com/paper/image-operator-learning-coupled-with-cnn |
Repo | |
Framework | |
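The windowed formulation behind image operator learning can be sketched as a patch extractor: each output pixel's label is predicted from the pixels inside a finite window centred on it, so training data is (patch, label) pairs. Zero padding at the border is an illustrative choice, not necessarily the paper's.

```python
# Sketch of the local-window formulation: a learned image operator is a
# pixel-wise function of the win x win neighbourhood, so the first step
# of training is turning an image into one flattened patch per pixel.
# A CNN (as in the paper) or any other classifier then maps each patch
# to an output pixel value, e.g. staff-line vs. non-staff-line.

def extract_windows(image, win=3):
    """Return the flattened win x win patch around each pixel of a
    2-D list-of-lists image, zero-padded outside the border."""
    h, w = len(image), len(image[0])
    r = win // 2
    patches = []
    for y in range(h):
        for x in range(w):
            patch = []
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < h and 0 <= xx < w
                    patch.append(image[yy][xx] if inside else 0)
            patches.append(patch)
    return patches
```

The paper's point is that a CNN base classifier keeps this practicable as `win` grows, where conventional learners hit the input-size wall.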
Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples
Title | Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples |
Authors | Amit Sheth, Sujan Perera, Sanjaya Wijeratne, Krishnaprasad Thirunarayan |
Abstract | Machine Learning has been a big success story during the AI resurgence. One particular stand-out success relates to learning from a massive amount of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition of the value of utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data and continued incorporation of knowledge in learning techniques. |
Tasks | |
Published | 2017-07-14 |
URL | http://arxiv.org/abs/1707.05308v1 |
http://arxiv.org/pdf/1707.05308v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-will-propel-machine-understanding-1 |
Repo | |
Framework | |
Breeding electric zebras in the fields of Medicine
Title | Breeding electric zebras in the fields of Medicine |
Authors | Federico Cabitza |
Abstract | A few notes on the use of machine learning in medicine and the related unintended consequences. |
Tasks | |
Published | 2017-01-15 |
URL | http://arxiv.org/abs/1701.04077v3 |
http://arxiv.org/pdf/1701.04077v3.pdf | |
PWC | https://paperswithcode.com/paper/breeding-electric-zebras-in-the-fields-of |
Repo | |
Framework | |
DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks
Title | DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks |
Authors | Yu Xiang, Dieter Fox |
Abstract | 3D scene understanding is important for robots to interact with the 3D world in a meaningful way. Most previous works on 3D scene understanding focus on recognizing geometrical or semantic properties of the scene independently. In this work, we introduce Data Associated Recurrent Neural Networks (DA-RNNs), a novel framework for joint 3D scene mapping and semantic labeling. DA-RNNs use a new recurrent neural network architecture for semantic labeling on RGB-D videos. The output of the network is integrated with mapping techniques such as KinectFusion in order to inject semantic information into the reconstructed 3D scene. Experiments conducted on a real world dataset and a synthetic dataset with RGB-D videos demonstrate the ability of our method in semantic 3D scene mapping. |
Tasks | Scene Understanding |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03098v2 |
http://arxiv.org/pdf/1703.03098v2.pdf | |
PWC | https://paperswithcode.com/paper/da-rnn-semantic-mapping-with-data-associated |
Repo | |
Framework | |
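The integration step the abstract describes, injecting per-frame semantic predictions into a reconstructed map, can be sketched as per-voxel evidence accumulation. The voxel keying and additive fusion rule are assumptions for illustration, not the paper's exact coupling with KinectFusion.

```python
from collections import defaultdict

# Sketch of semantic fusion into a 3-D map: every reconstructed voxel
# accumulates the per-class probabilities the network predicts for the
# pixels that project onto it, so labels from many RGB-D frames are
# fused over time instead of being decided per frame.

class SemanticMap:
    def __init__(self, num_classes):
        self.counts = defaultdict(lambda: [0.0] * num_classes)

    def integrate(self, voxel, class_probs):
        """Fuse one frame's class probabilities into a voxel (keyed by
        its integer grid coordinates)."""
        acc = self.counts[voxel]
        for c, p in enumerate(class_probs):
            acc[c] += p

    def label(self, voxel):
        """Current most-supported semantic label for the voxel."""
        acc = self.counts[voxel]
        return max(range(len(acc)), key=acc.__getitem__)
```

In the paper the recurrent network additionally carries data associations between frames; this sketch covers only the map-side accumulation.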
Posterior sampling for reinforcement learning: worst-case regret bounds
Title | Posterior sampling for reinforcement learning: worst-case regret bounds |
Authors | Shipra Agrawal, Randy Jia |
Abstract | We present an algorithm based on posterior sampling (aka Thompson sampling) that achieves near-optimal worst-case regret bounds when the underlying Markov Decision Process (MDP) is communicating with a finite, though unknown, diameter. Our main result is a high probability regret upper bound of $\tilde{O}(DS\sqrt{AT})$ for any communicating MDP with $S$ states, $A$ actions and diameter $D$. Here, regret compares the total reward achieved by the algorithm to the total expected reward of an optimal infinite-horizon undiscounted average reward policy, in time horizon $T$. This result closely matches the known lower bound of $\Omega(\sqrt{DSAT})$. Our techniques involve proving some novel results about the anti-concentration of the Dirichlet distribution, which may be of independent interest. |
Tasks | |
Published | 2017-05-19 |
URL | https://arxiv.org/abs/1705.07041v3 |
https://arxiv.org/pdf/1705.07041v3.pdf | |
PWC | https://paperswithcode.com/paper/posterior-sampling-for-reinforcement-learning |
Repo | |
Framework | |
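The posterior-sampling loop is simple to sketch: maintain Dirichlet pseudo-counts over transitions, draw a transition model from the posterior, and act greedily in the drawn model. The one-step lookahead below stands in for the paper's full planning in the sampled MDP, and the toy counts and rewards are invented.

```python
import random

# Minimal sketch of posterior (Thompson) sampling for RL. A Dirichlet
# can be sampled as normalized independent Gamma draws, which is what
# sample_dirichlet does with the stdlib's random.gammavariate.

def sample_dirichlet(counts, rng):
    draws = [rng.gammavariate(c, 1.0) for c in counts]
    total = sum(draws)
    return [d / total for d in draws]

def thompson_step(state, counts, reward, rng):
    """counts[s][a]: Dirichlet pseudo-counts over next states for
    taking action a in state s. Sample a model per action and act
    greedily (one-step lookahead) under the sampled model."""
    best_a, best_val = 0, float("-inf")
    for a in range(len(counts[state])):
        p = sample_dirichlet(counts[state][a], rng)  # sampled model
        val = sum(pi * reward[s2] for s2, pi in enumerate(p))
        if val > best_val:
            best_a, best_val = a, val
    return best_a
```

After observing a transition (s, a, s'), the agent increments `counts[s][a][s']`; the paper's analysis concerns exactly how fast this posterior concentrates.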
Stochastic, Distributed and Federated Optimization for Machine Learning
Title | Stochastic, Distributed and Federated Optimization for Machine Learning |
Authors | Jakub Konečný |
Abstract | We study optimization algorithms for the finite sum problems frequently arising in machine learning applications. First, we propose novel variants of stochastic gradient descent with a variance reduction property that enables linear convergence for strongly convex objectives. Second, we study the distributed setting, in which the data describing the optimization problem does not fit into a single computing node. In this case, traditional methods are inefficient, as the communication costs inherent in distributed optimization become the bottleneck. We propose a communication-efficient framework which iteratively forms local subproblems that can be solved with arbitrary local optimization algorithms. Finally, we introduce the concept of Federated Optimization/Learning, where we try to solve machine learning problems without having the data stored in any centralized manner. The main motivation comes from industry when handling user-generated data. The current prevalent practice is that companies collect vast amounts of user data and store them in datacenters. An alternative we propose is not to collect the data in the first place, and instead occasionally use the computational power of users’ devices to solve the very same optimization problems, while alleviating privacy concerns at the same time. In such a setting, minimization of communication rounds is the primary goal, and we demonstrate that solving the optimization problems in such circumstances is conceptually tractable. |
Tasks | Distributed Optimization |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.01155v1 |
http://arxiv.org/pdf/1707.01155v1.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-distributed-and-federated |
Repo | |
Framework | |
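The variance-reduced stochastic gradient at the heart of the first contribution can be sketched in the SVRG style on a toy one-dimensional problem: minimize the average of f_i(w) = (w - y_i)^2 / 2, whose optimum is the mean of y. The epoch structure and step size are illustrative choices, not the thesis' exact variants.

```python
import random

# Sketch of SVRG-style variance reduction: occasionally compute the
# full gradient at a snapshot, then correct each cheap stochastic
# gradient with it. The corrected gradient stays unbiased while its
# variance shrinks as the iterate approaches the optimum, enabling
# linear convergence with a constant step size.

def svrg(y, eta=0.1, epochs=30, inner=20, seed=0):
    rng = random.Random(seed)
    n = len(y)
    w = 0.0
    for _ in range(epochs):
        snapshot = w
        # full gradient at the snapshot (the expensive, occasional pass)
        mu = sum(snapshot - yi for yi in y) / n
        for _ in range(inner):
            i = rng.randrange(n)
            # stochastic gradient of f_i, corrected by the snapshot term
            g = (w - y[i]) - (snapshot - y[i]) + mu
            w -= eta * g
    return w
```

Plain SGD on the same problem needs a decaying step size to converge; the snapshot correction is what removes that requirement.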
Towards an unanimous international regulatory body for responsible use of Artificial Intelligence [UIRB-AI]
Title | Towards an unanimous international regulatory body for responsible use of Artificial Intelligence [UIRB-AI] |
Authors | Rajesh Chidambaram |
Abstract | Artificial Intelligence (AI) is once again in a phase of drastic advancement. Unarguably, the technology itself can revolutionize the way we live our everyday lives. But the exponential growth of the technology poses a daunting task for policy researchers and lawmakers in making amendments to existing norms. In addition, not everyone in society is studying the potential socio-economic intricacies and cultural drifts that AI can bring about. It is prudent to reflect on our historical past to propel the development of the technology in the right direction. To benefit society, present and future, I scientifically explore the societal impact of AI. While there are many public and private partnerships working on similar aspects, here I describe the necessity of a Unanimous International Regulatory Body for all applications of AI (UIRB-AI). I also discuss the benefits and drawbacks of such an organization. To combat any drawbacks in the formation of a UIRB-AI, both idealistic and pragmatic perspectives are discussed in turn. The paper further advances the discussion by proposing novel policies on how such an organization should be structured and how it can bring about a win-win situation for everyone in society. |
Tasks | |
Published | 2017-12-21 |
URL | http://arxiv.org/abs/1712.07752v3 |
http://arxiv.org/pdf/1712.07752v3.pdf | |
PWC | https://paperswithcode.com/paper/towards-an-unanimous-international-regulatory |
Repo | |
Framework | |
Crafting Adversarial Examples For Speech Paralinguistics Applications
Title | Crafting Adversarial Examples For Speech Paralinguistics Applications |
Authors | Yuan Gong, Christian Poellabauer |
Abstract | Computational paralinguistic analysis is increasingly being used in a wide range of cyber applications, including security-sensitive applications such as speaker verification, deceptive speech detection, and medical diagnostics. While state-of-the-art machine learning techniques, such as deep neural networks, can provide robust and accurate speech analysis, they are susceptible to adversarial attacks. In this work, we propose an end-to-end scheme to generate adversarial examples for computational paralinguistic applications by perturbing directly the raw waveform of an audio recording rather than specific acoustic features. Our experiments show that the proposed adversarial perturbation can lead to a significant performance drop of state-of-the-art deep neural networks, while only minimally impairing the audio quality. |
Tasks | Medical Diagnosis, Speaker Verification |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03280v2 |
http://arxiv.org/pdf/1711.03280v2.pdf | |
PWC | https://paperswithcode.com/paper/crafting-adversarial-examples-for-speech |
Repo | |
Framework | |
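Perturbing the raw waveform in the direction that increases the classifier's loss can be sketched with a fast-gradient-sign-style step against a toy linear scorer. The real targets in the paper are deep networks operating end-to-end on audio; the model, epsilon, and waveform below are all invented for illustration.

```python
# Sketch of crafting an adversarial raw-waveform perturbation: move
# every audio sample by epsilon in the sign of the loss gradient, so
# the perturbation is bounded (inaudible for small epsilon) yet pushes
# the classifier's score the wrong way.

def linear_score(waveform, weights):
    """Toy classifier: positive score = class +1, negative = class -1."""
    return sum(w * x for w, x in zip(weights, waveform))

def fgsm_raw_audio(waveform, weights, label, eps=0.01):
    """For this linear scorer the loss gradient w.r.t. sample x_j has
    the sign of -label * w_j (label in {-1, +1}), so stepping that way
    maximally decreases the margin within the epsilon budget."""
    sign = lambda v: (v > 0) - (v < 0)
    return [x + eps * sign(-label * w) for x, w in zip(waveform, weights)]
```

For a deep network the per-sample gradient would come from backpropagation rather than a closed form, but the epsilon-bounded sign step is the same.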
Attend and Predict: Understanding Gene Regulation by Selective Attention on Chromatin
Title | Attend and Predict: Understanding Gene Regulation by Selective Attention on Chromatin |
Authors | Ritambhara Singh, Jack Lanchantin, Arshdeep Sekhon, Yanjun Qi |
Abstract | The past decade has seen a revolution in genomic technologies that enables a flood of genome-wide profiling of chromatin marks. Recent literature has tried to understand gene regulation by predicting gene expression from large-scale chromatin measurements. Two fundamental challenges exist for such learning tasks: (1) genome-wide chromatin signals are spatially structured, high-dimensional and highly modular; and (2) the core aim is to understand which factors are relevant and how they work together. Previous studies either failed to model complex dependencies among input signals or relied on separate feature analysis to explain the decisions. This paper presents an attention-based deep learning approach, called AttentiveChrome, that uses a unified architecture to model and to interpret dependencies among chromatin factors for controlling gene regulation. AttentiveChrome uses a hierarchy of multiple Long Short-Term Memory (LSTM) modules to encode the input signals and to model how various chromatin marks cooperate automatically. AttentiveChrome trains two levels of attention jointly with the target prediction, enabling it to attend differentially to relevant marks and to locate important positions per mark. We evaluate the model across 56 different cell types (tasks) in humans. Not only is the proposed architecture more accurate, but its attention scores also provide a better interpretation than state-of-the-art feature visualization methods such as saliency maps. Code and data are shared at www.deepchrome.org |
Tasks | |
Published | 2017-08-01 |
URL | http://arxiv.org/abs/1708.00339v3 |
http://arxiv.org/pdf/1708.00339v3.pdf | |
PWC | https://paperswithcode.com/paper/attend-and-predict-understanding-gene |
Repo | |
Framework | |
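Each of the two attention levels the abstract describes reduces to the same primitive: softmax-normalize relevance scores into weights, then take the weighted sum of the encodings. The toy encodings and scores below are stand-ins for the LSTM outputs over chromatin-mark bins; the scoring function that produces them is omitted.

```python
import math

# Sketch of the attention primitive AttentiveChrome stacks twice
# (over positions within a mark, then over marks): the learned weights
# are exactly what the paper reads off as interpretations.

def softmax(scores):
    m = max(scores)                      # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(encodings, scores):
    """encodings: one vector per bin; scores: one relevance score per
    bin. Returns (attention weights, context vector)."""
    weights = softmax(scores)
    dim = len(encodings[0])
    context = [sum(w * e[d] for w, e in zip(weights, encodings))
               for d in range(dim)]
    return weights, context
```

Training the scores jointly with the prediction target is what makes the weights behave as importance estimates rather than post-hoc visualizations.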