Paper Group ANR 595
Domain Aware Neural Dialog System
Learning to select data for transfer learning with Bayesian Optimization
Machine Learning Based Fast Power Integrity Classifier
DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization
Copula Index for Detecting Dependence and Monotonicity between Stochastic Signals
…
Domain Aware Neural Dialog System
Title | Domain Aware Neural Dialog System |
Authors | Sajal Choudhary, Prerna Srivastava, Lyle Ungar, João Sedoc |
Abstract | We investigate the task of building a domain-aware chat system which generates intelligent responses in a conversation comprising different domains. The domain, in this case, is the topic or theme of the conversation. To achieve this, we present DOM-Seq2Seq, a domain-aware neural network model based on the novel technique of using domain-targeted sequence-to-sequence models (Sutskever et al., 2014) and a domain classifier. The model captures features from the current utterance and the domains of the previous utterances to facilitate the formation of relevant responses. We evaluate our model on automatic metrics and compare our performance with that of the Seq2Seq model. |
Tasks | |
Published | 2017-08-02 |
URL | http://arxiv.org/abs/1708.00897v1 |
http://arxiv.org/pdf/1708.00897v1.pdf | |
PWC | https://paperswithcode.com/paper/domain-aware-neural-dialog-system |
Repo | |
Framework | |
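The control flow the abstract describes (classify the domain of the current utterance, falling back on the domains of previous utterances, then route to a domain-targeted responder) can be sketched as below. The keyword classifier and canned responders are stand-ins for the paper's trained classifier and seq2seq models; every name and keyword here is a hypothetical illustration, not the paper's implementation.

```python
# Hypothetical sketch of DOM-Seq2Seq-style routing: a domain
# classifier picks a domain for each utterance, and the response comes
# from that domain's dedicated responder (a seq2seq model in the paper).

DOMAIN_KEYWORDS = {
    "food":   {"pizza", "restaurant", "eat"},
    "travel": {"flight", "hotel", "trip"},
}

def classify_domain(utterance, history=()):
    """Pick the domain whose keywords best match the current utterance;
    fall back to the most recent domain seen in the conversation history."""
    tokens = set(utterance.lower().split())
    scores = {d: len(tokens & kw) for d, kw in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0 and history:
        return history[-1]          # stay in the previous domain
    return best if scores[best] > 0 else "general"

def respond(utterance, history=()):
    """Route the utterance to the responder of its classified domain."""
    domain = classify_domain(utterance, history)
    responders = {
        "food":    lambda u: "Which cuisine are you in the mood for?",
        "travel":  lambda u: "Where would you like to go?",
        "general": lambda u: "Tell me more.",
    }
    return domain, responders[domain](utterance)
```

In the paper the classifier also conditions on previous utterances; the `history` fallback above is a crude analogue of that.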
Learning to select data for transfer learning with Bayesian Optimization
Title | Learning to select data for transfer learning with Bayesian Optimization |
Authors | Sebastian Ruder, Barbara Plank |
Abstract | Domain similarity measures can be used to gauge adaptability and select suitable data for transfer learning, but existing approaches define ad hoc measures that are deemed suitable for respective tasks. Inspired by work on curriculum learning, we propose to \emph{learn} data selection measures using Bayesian Optimization and evaluate them across models, domains and tasks. Our learned measures outperform existing domain similarity measures significantly on three tasks: sentiment analysis, part-of-speech tagging, and parsing. We show the importance of complementing similarity with diversity, and that learned measures are – to some degree – transferable across models, domains, and even tasks. |
Tasks | Part-Of-Speech Tagging, Sentiment Analysis, Transfer Learning |
Published | 2017-07-17 |
URL | http://arxiv.org/abs/1707.05246v1 |
http://arxiv.org/pdf/1707.05246v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-select-data-for-transfer-learning |
Repo | |
Framework | |
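The abstract's core loop (score source data by a weighted combination of similarity features, train on the top-scoring examples, and let an outer optimizer tune the weights against downstream performance) can be sketched as below. Random search stands in for Bayesian Optimization, and a toy usefulness score stands in for actually training and evaluating a model; the feature vectors and data are invented.

```python
import random

# Sketch of learned data selection: each source example carries a
# vector of similarity/diversity features, and an outer optimizer
# tunes the weights that combine them. Random search is a stand-in
# for the paper's Bayesian Optimization.

random.seed(0)

# Each source example: (feature vector of similarity measures, usefulness)
SOURCE = [((0.9, 0.1), 1.0), ((0.2, 0.8), 0.2),
          ((0.7, 0.3), 0.9), ((0.1, 0.9), 0.1)]

def select_top_k(weights, k=2):
    """Score each example by a weighted sum of its features and keep k."""
    scored = sorted(SOURCE,
                    key=lambda ex: -sum(w * f for w, f in zip(weights, ex[0])))
    return scored[:k]

def downstream_score(weights):
    """Proxy for dev-set performance after training on the selection."""
    return sum(u for _, u in select_top_k(weights))

def tune_weights(trials=200):
    """Outer loop: search weight space for the best selection measure."""
    best_w, best_s = None, float("-inf")
    for _ in range(trials):
        w = (random.uniform(-1, 1), random.uniform(-1, 1))
        s = downstream_score(w)
        if s > best_s:
            best_w, best_s = w, s
    return best_w, best_s
```

A real Bayesian Optimization loop would replace the uniform draws with posterior-guided proposals, but the interface (weights in, downstream score out) is the same.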
Machine Learning Based Fast Power Integrity Classifier
Title | Machine Learning Based Fast Power Integrity Classifier |
Authors | HuaChun Zhang, Lynden Kagan, Chen Zheng |
Abstract | In this paper, we propose a new machine learning based fast power integrity classifier that quickly flags EM/IR hotspots. We discuss the features extracted to describe the power grid, cell power density, routing impact, and controlled collapse chip connection (C4) bumps, among others. The continuous and discontinuous cases are identified and treated using different machine learning models. Nearest neighbors, random forest, and neural network models are compared to select the best-performing candidates. Experiments are run on an open-source benchmark, and the results show promising prediction accuracy. |
Tasks | |
Published | 2017-11-08 |
URL | http://arxiv.org/abs/1711.03406v1 |
http://arxiv.org/pdf/1711.03406v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-based-fast-power-integrity |
Repo | |
Framework | |
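As a minimal illustration of the simplest model family the paper compares, here is a one-nearest-neighbour hotspot classifier over hand-made tile features. The feature choices (e.g. power density, distance to a C4 bump) and the data are assumptions for the sketch, not the paper's extracted feature set.

```python
# Sketch of the nearest-neighbour baseline among the compared models:
# flag a power-grid tile as an EM/IR hotspot if its closest training
# tile in feature space was a hotspot. Features and labels are invented.

def one_nn(train, query):
    """train: list of (feature_vector, label); return the label of the
    training example nearest to `query` in squared Euclidean distance."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(train, key=lambda ex: dist2(ex[0], query))[1]
```

The random-forest and neural-network candidates in the paper share this interface (features in, hotspot flag out), differing only in the decision function.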
DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization
Title | DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization |
Authors | Lin Xiao, Adams Wei Yu, Qihang Lin, Weizhu Chen |
Abstract | Machine learning with big data often involves large optimization models. For distributed optimization over a cluster of machines, frequent communication and synchronization of all model parameters (optimization variables) can be very costly. A promising solution is to use parameter servers to store different subsets of the model parameters, and update them asynchronously at different machines using local datasets. In this paper, we focus on distributed optimization of large linear models with convex loss functions, and propose a family of randomized primal-dual block coordinate algorithms that are especially suitable for asynchronous distributed implementation with parameter servers. In particular, we work with the saddle-point formulation of such problems which allows simultaneous data and model partitioning, and exploit its structure by doubly stochastic coordinate optimization with variance reduction (DSCOVR). Compared with other first-order distributed algorithms, we show that DSCOVR may require less overall computation and communication, and less or no synchronization. We discuss the implementation details of the DSCOVR algorithms, and present numerical experiments on an industrial distributed computing system. |
Tasks | Distributed Optimization |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.05080v1 |
http://arxiv.org/pdf/1710.05080v1.pdf | |
PWC | https://paperswithcode.com/paper/dscovr-randomized-primal-dual-block |
Repo | |
Framework | |
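The doubly stochastic sampling pattern, one random data block and one random model block per update, can be sketched on the saddle-point form of regularized least squares. The variance-reduction correction that defines the actual DSCOVR algorithms is omitted here, and the step sizes and toy problem are illustrative assumptions.

```python
import random

# Structural sketch of DSCOVR-style doubly stochastic updates on the
# saddle-point formulation of regularized least squares:
#   min_w max_a (1/n) * sum_i [ a_i*(x_i . w) - a_i*y_i - a_i^2/2 ]
#               + (lam/2) * ||w||^2
# Each iteration samples one data block i AND one model block k, which
# is what lets machines holding different data/model partitions update
# asynchronously. The 1/n scaling is absorbed into the step sizes, and
# the variance-reduction terms of the real algorithms are left out.

def dscovr_sketch(X, y, lam=0.0, sigma=0.5, eta=0.5, iters=200, seed=0):
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d          # primal model coordinates (blocks)
    a = [0.0] * n          # one dual variable per data block
    for _ in range(iters):
        i = rng.randrange(n)   # random data block
        k = rng.randrange(d)   # random model block
        # dual ascent on a_i: touches only row i of the data
        a[i] += sigma * (sum(X[i][j] * w[j] for j in range(d)) - y[i] - a[i])
        # primal descent on w_k: touches only column k
        w[k] -= eta * (a[i] * X[i][k] + lam * w[k])
    return w
```

On a trivial one-sample, one-feature problem this iteration contracts to the least-squares solution, which is what the test below checks.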
Copula Index for Detecting Dependence and Monotonicity between Stochastic Signals
Title | Copula Index for Detecting Dependence and Monotonicity between Stochastic Signals |
Authors | Kiran Karra, Lamine Mili |
Abstract | This paper introduces a nonparametric copula-based index for detecting the strength and monotonicity structure of linear and nonlinear statistical dependence between pairs of random variables or stochastic signals. Our index, termed Copula Index for Detecting Dependence and Monotonicity (CIM), satisfies several desirable properties of measures of association, including Renyi’s properties, the data processing inequality (DPI), and consequently self-equitability. Synthetic data simulations reveal that the statistical power of CIM compares favorably to other state-of-the-art measures of association that are proven to satisfy the DPI. Simulation results with real-world data reveal the CIM’s unique ability to detect the monotonicity structure among stochastic signals to find interesting dependencies in large datasets. Additionally, simulations show that the CIM compares favorably to estimators of mutual information when discovering Markov network structure. |
Tasks | |
Published | 2017-03-20 |
URL | http://arxiv.org/abs/1703.06686v5 |
http://arxiv.org/pdf/1703.06686v5.pdf | |
PWC | https://paperswithcode.com/paper/copula-index-for-detecting-dependence-and |
Repo | |
Framework | |
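The empirical-copula (rank) transform that underlies copula-based indices like the CIM, together with a simple concordance measure (Kendall's tau) as a stand-in for the CIM itself, can be sketched as below. Both functions assume samples with distinct values; tie handling is omitted.

```python
# Sketch of the rank-based machinery behind copula indices. The
# empirical-copula transform maps each margin to normalized ranks,
# which is what makes such indices invariant to monotone
# transformations of the marginals. Kendall's tau here is only a
# stand-in for the CIM, which additionally detects monotonicity
# *structure* (regions of different monotone behaviour).

def empirical_copula(xs, ys):
    """Map a paired sample (distinct values) to pseudo-observations."""
    n = len(xs)
    rx = {v: r for r, v in enumerate(sorted(xs), 1)}
    ry = {v: r for r, v in enumerate(sorted(ys), 1)}
    return [(rx[x] / n, ry[y] / n) for x, y in zip(xs, ys)]

def kendall_tau(xs, ys):
    """Concordance-based association: +1 for co-monotone samples,
    -1 for anti-monotone (no ties assumed)."""
    pairs = [(i, j) for i in range(len(xs)) for j in range(i + 1, len(xs))]
    s = sum(1 if (xs[i] - xs[j]) * (ys[i] - ys[j]) > 0 else -1
            for i, j in pairs)
    return s / len(pairs)
```

Because the transform only uses ranks, `kendall_tau(xs, ys)` is unchanged if `xs` is replaced by, say, its cubes, which is the invariance the DPI-style properties build on.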
Neural system identification for large populations separating “what” and “where”
Title | Neural system identification for large populations separating “what” and “where” |
Authors | David A. Klindt, Alexander S. Ecker, Thomas Euler, Matthias Bethge |
Abstract | Neuroscientists classify neurons into different types that perform similar computations at different locations in the visual field. Traditional methods for neural system identification do not capitalize on this separation of ‘what’ and ‘where’. Learning deep convolutional feature spaces that are shared among many neurons provides an exciting path forward, but the architectural design needs to account for data limitations: While new experimental techniques enable recordings from thousands of neurons, experimental time is limited so that one can sample only a small fraction of each neuron’s response space. Here, we show that a major bottleneck for fitting convolutional neural networks (CNNs) to neural data is the estimation of the individual receptive field locations, a problem that has been scratched only at the surface thus far. We propose a CNN architecture with a sparse readout layer factorizing the spatial (where) and feature (what) dimensions. Our network scales well to thousands of neurons and short recordings and can be trained end-to-end. We evaluate this architecture on ground-truth data to explore the challenges and limitations of CNN-based system identification. Moreover, we show that our network model outperforms current state-of-the-art system identification models of mouse primary visual cortex. |
Tasks | |
Published | 2017-11-07 |
URL | http://arxiv.org/abs/1711.02653v2 |
http://arxiv.org/pdf/1711.02653v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-system-identification-for-large |
Repo | |
Framework | |
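The factorized readout itself is easy to state in code: a neuron's predicted response is a spatial "where" mask applied to a "what"-weighted sum of feature maps shared across neurons. The nested-list feature maps and all values below are made up for illustration; in the paper they come from the shared CNN, and the mask is regularized to be sparse.

```python
# Sketch of the paper's factorized readout. Factorizing an H x W x C
# readout tensor into an H x W mask plus a length-C weight vector cuts
# the per-neuron parameter count from H*W*C to H*W + C.

def factorized_readout(feature_maps, spatial_mask, feature_weights):
    """feature_maps: C feature maps, each H x W (shared across neurons).
    spatial_mask: H x W 'where' weights for ONE neuron (ideally sparse).
    feature_weights: length-C 'what' weights for the same neuron."""
    H, W = len(spatial_mask), len(spatial_mask[0])
    response = 0.0
    for y in range(H):
        for x in range(W):
            # 'what': combine channels at this location
            what = sum(wc * fm[y][x]
                       for wc, fm in zip(feature_weights, feature_maps))
            # 'where': weight the location by the neuron's mask
            response += spatial_mask[y][x] * what
    return response
```

The parameter saving is why the architecture scales to thousands of neurons with short recordings: only the small mask and weight vector are per-neuron.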
Image operator learning coupled with CNN classification and its application to staff line removal
Title | Image operator learning coupled with CNN classification and its application to staff line removal |
Authors | Frank D. Julca-Aguilar, Nina S. T. Hirata |
Abstract | Many image transformations can be modeled by image operators that are characterized by pixel-wise local functions defined on a finite support window. In image operator learning, these functions are estimated from training data using machine learning techniques. Input size is usually a critical issue when using learning algorithms, and it limits the size of practicable windows. We propose the use of convolutional neural networks (CNNs) to overcome this limitation. The problem of removing staff-lines in music score images is chosen to evaluate the effects of window and convolutional mask sizes on the learned image operator performance. Results show that the CNN based solution outperforms previous ones obtained using conventional learning algorithms or heuristic algorithms, indicating the potential of CNNs as base classifiers in image operator learning. The implementations will be made available on the TRIOSlib project site. |
Tasks | |
Published | 2017-09-19 |
URL | http://arxiv.org/abs/1709.06476v1 |
http://arxiv.org/pdf/1709.06476v1.pdf | |
PWC | https://paperswithcode.com/paper/image-operator-learning-coupled-with-cnn |
Repo | |
Framework | |
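The windowed formulation behind image operator learning can be sketched as a patch extractor: each output pixel's label is predicted from the pixels inside a finite window centred on it, so training data is (patch, label) pairs. Zero padding at the border is an illustrative choice, not necessarily the paper's.

```python
# Sketch of the local-window formulation: a learned image operator is a
# pixel-wise function of the win x win neighbourhood, so the first step
# of training is turning an image into one flattened patch per pixel.
# A CNN (as in the paper) or any other classifier then maps each patch
# to an output pixel value, e.g. staff-line vs. non-staff-line.

def extract_windows(image, win=3):
    """Return the flattened win x win patch around each pixel of a
    2-D list-of-lists image, zero-padded outside the border."""
    h, w = len(image), len(image[0])
    r = win // 2
    patches = []
    for y in range(h):
        for x in range(w):
            patch = []
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < h and 0 <= xx < w
                    patch.append(image[yy][xx] if inside else 0)
            patches.append(patch)
    return patches
```

The paper's point is that a CNN base classifier keeps this practicable as `win` grows, where conventional learners hit the input-size wall.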
Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples
Title | Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples |
Authors | Amit Sheth, Sujan Perera, Sanjaya Wijeratne, Krishnaprasad Thirunarayan |
Abstract | Machine Learning has been a big success story during the AI resurgence. One particular stand-out success relates to learning from a massive amount of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition of the value of utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data and continued incorporation of knowledge in learning techniques. |
Tasks | |
Published | 2017-07-14 |
URL | http://arxiv.org/abs/1707.05308v1 |
http://arxiv.org/pdf/1707.05308v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-will-propel-machine-understanding-1 |
Repo | |
Framework | |
Breeding electric zebras in the fields of Medicine
Title | Breeding electric zebras in the fields of Medicine |
Authors | Federico Cabitza |
Abstract | A few notes on the use of machine learning in medicine and the related unintended consequences. |
Tasks | |
Published | 2017-01-15 |
URL | http://arxiv.org/abs/1701.04077v3 |
http://arxiv.org/pdf/1701.04077v3.pdf | |
PWC | https://paperswithcode.com/paper/breeding-electric-zebras-in-the-fields-of |
Repo | |
Framework | |
DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks
Title | DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks |
Authors | Yu Xiang, Dieter Fox |
Abstract | 3D scene understanding is important for robots to interact with the 3D world in a meaningful way. Most previous works on 3D scene understanding focus on recognizing geometrical or semantic properties of the scene independently. In this work, we introduce Data Associated Recurrent Neural Networks (DA-RNNs), a novel framework for joint 3D scene mapping and semantic labeling. DA-RNNs use a new recurrent neural network architecture for semantic labeling on RGB-D videos. The output of the network is integrated with mapping techniques such as KinectFusion in order to inject semantic information into the reconstructed 3D scene. Experiments conducted on a real world dataset and a synthetic dataset with RGB-D videos demonstrate the ability of our method in semantic 3D scene mapping. |
Tasks | Scene Understanding |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03098v2 |
http://arxiv.org/pdf/1703.03098v2.pdf | |
PWC | https://paperswithcode.com/paper/da-rnn-semantic-mapping-with-data-associated |
Repo | |
Framework | |
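The integration step the abstract describes, injecting per-frame semantic predictions into a reconstructed map, can be sketched as per-voxel evidence accumulation. The voxel keying and additive fusion rule are assumptions for illustration, not the paper's exact coupling with KinectFusion.

```python
from collections import defaultdict

# Sketch of semantic fusion into a 3-D map: every reconstructed voxel
# accumulates the per-class probabilities the network predicts for the
# pixels that project onto it, so labels from many RGB-D frames are
# fused over time instead of being decided per frame.

class SemanticMap:
    def __init__(self, num_classes):
        self.counts = defaultdict(lambda: [0.0] * num_classes)

    def integrate(self, voxel, class_probs):
        """Fuse one frame's class probabilities into a voxel (keyed by
        its integer grid coordinates)."""
        acc = self.counts[voxel]
        for c, p in enumerate(class_probs):
            acc[c] += p

    def label(self, voxel):
        """Current most-supported semantic label for the voxel."""
        acc = self.counts[voxel]
        return max(range(len(acc)), key=acc.__getitem__)
```

In the paper the recurrent network additionally carries data associations between frames; this sketch covers only the map-side accumulation.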
Posterior sampling for reinforcement learning: worst-case regret bounds
Title | Posterior sampling for reinforcement learning: worst-case regret bounds |
Authors | Shipra Agrawal, Randy Jia |
Abstract | We present an algorithm based on posterior sampling (aka Thompson sampling) that achieves near-optimal worst-case regret bounds when the underlying Markov Decision Process (MDP) is communicating with a finite, though unknown, diameter. Our main result is a high probability regret upper bound of $\tilde{O}(DS\sqrt{AT})$ for any communicating MDP with $S$ states, $A$ actions and diameter $D$. Here, regret compares the total reward achieved by the algorithm to the total expected reward of an optimal infinite-horizon undiscounted average reward policy, in time horizon $T$. This result closely matches the known lower bound of $\Omega(\sqrt{DSAT})$. Our techniques involve proving some novel results about the anti-concentration of the Dirichlet distribution, which may be of independent interest. |
Tasks | |
Published | 2017-05-19 |
URL | https://arxiv.org/abs/1705.07041v3 |
https://arxiv.org/pdf/1705.07041v3.pdf | |
PWC | https://paperswithcode.com/paper/posterior-sampling-for-reinforcement-learning |
Repo | |
Framework | |
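The posterior-sampling loop is simple to sketch: maintain Dirichlet pseudo-counts over transitions, draw a transition model from the posterior, and act greedily in the drawn model. The one-step lookahead below stands in for the paper's full planning in the sampled MDP, and the toy counts and rewards are invented.

```python
import random

# Minimal sketch of posterior (Thompson) sampling for RL. A Dirichlet
# can be sampled as normalized independent Gamma draws, which is what
# sample_dirichlet does with the stdlib's random.gammavariate.

def sample_dirichlet(counts, rng):
    draws = [rng.gammavariate(c, 1.0) for c in counts]
    total = sum(draws)
    return [d / total for d in draws]

def thompson_step(state, counts, reward, rng):
    """counts[s][a]: Dirichlet pseudo-counts over next states for
    taking action a in state s. Sample a model per action and act
    greedily (one-step lookahead) under the sampled model."""
    best_a, best_val = 0, float("-inf")
    for a in range(len(counts[state])):
        p = sample_dirichlet(counts[state][a], rng)  # sampled model
        val = sum(pi * reward[s2] for s2, pi in enumerate(p))
        if val > best_val:
            best_a, best_val = a, val
    return best_a
```

After observing a transition (s, a, s'), the agent increments `counts[s][a][s']`; the paper's analysis concerns exactly how fast this posterior concentrates.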
Stochastic, Distributed and Federated Optimization for Machine Learning
Title | Stochastic, Distributed and Federated Optimization for Machine Learning |
Authors | Jakub Konečný |
Abstract | We study optimization algorithms for the finite sum problems frequently arising in machine learning applications. First, we propose novel variants of stochastic gradient descent with a variance reduction property that enables linear convergence for strongly convex objectives. Second, we study the distributed setting, in which the data describing the optimization problem does not fit into a single computing node. In this case, traditional methods are inefficient, as the communication costs inherent in distributed optimization become the bottleneck. We propose a communication-efficient framework which iteratively forms local subproblems that can be solved with arbitrary local optimization algorithms. Finally, we introduce the concept of Federated Optimization/Learning, where we try to solve machine learning problems without having the data stored in any centralized manner. The main motivation comes from industry when handling user-generated data. The current prevalent practice is that companies collect vast amounts of user data and store them in datacenters. An alternative we propose is not to collect the data in the first place, and instead occasionally use the computational power of users’ devices to solve the very same optimization problems, while alleviating privacy concerns at the same time. In such a setting, minimization of communication rounds is the primary goal, and we demonstrate that solving the optimization problems in such circumstances is conceptually tractable. |
Tasks | Distributed Optimization |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.01155v1 |
http://arxiv.org/pdf/1707.01155v1.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-distributed-and-federated |
Repo | |
Framework | |
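The variance-reduced stochastic gradient at the heart of the first contribution can be sketched in the SVRG style on a toy one-dimensional problem: minimize the average of f_i(w) = (w - y_i)^2 / 2, whose optimum is the mean of y. The epoch structure and step size are illustrative choices, not the thesis' exact variants.

```python
import random

# Sketch of SVRG-style variance reduction: occasionally compute the
# full gradient at a snapshot, then correct each cheap stochastic
# gradient with it. The corrected gradient stays unbiased while its
# variance shrinks as the iterate approaches the optimum, enabling
# linear convergence with a constant step size.

def svrg(y, eta=0.1, epochs=30, inner=20, seed=0):
    rng = random.Random(seed)
    n = len(y)
    w = 0.0
    for _ in range(epochs):
        snapshot = w
        # full gradient at the snapshot (the expensive, occasional pass)
        mu = sum(snapshot - yi for yi in y) / n
        for _ in range(inner):
            i = rng.randrange(n)
            # stochastic gradient of f_i, corrected by the snapshot term
            g = (w - y[i]) - (snapshot - y[i]) + mu
            w -= eta * g
    return w
```

Plain SGD on the same problem needs a decaying step size to converge; the snapshot correction is what removes that requirement.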
Towards an unanimous international regulatory body for responsible use of Artificial Intelligence [UIRB-AI]
Title | Towards an unanimous international regulatory body for responsible use of Artificial Intelligence [UIRB-AI] |
Authors | Rajesh Chidambaram |
Abstract | Artificial Intelligence (AI) is once again in a phase of drastic advancement. Unarguably, the technology itself can revolutionize the way we live our everyday lives. But the exponential growth of the technology poses a daunting task for policy researchers and lawmakers in making amendments to existing norms. In addition, not everyone in society is studying the potential socio-economic intricacies and cultural drifts that AI can bring about. It is prudent to reflect on our historical past to propel the development of the technology in the right direction. To benefit society, present and future, I scientifically explore the societal impact of AI. While there are many public and private partnerships working on similar aspects, here I describe the necessity of a Unanimous International Regulatory Body for all applications of AI (UIRB-AI). I also discuss the benefits and drawbacks of such an organization. To combat any drawbacks in the formation of a UIRB-AI, both idealistic and pragmatic perspectives are discussed in turn. The paper further advances the discussion by proposing novel policies on how such an organization should be structured and how it can bring about a win-win situation for everyone in society. |
Tasks | |
Published | 2017-12-21 |
URL | http://arxiv.org/abs/1712.07752v3 |
http://arxiv.org/pdf/1712.07752v3.pdf | |
PWC | https://paperswithcode.com/paper/towards-an-unanimous-international-regulatory |
Repo | |
Framework | |
Crafting Adversarial Examples For Speech Paralinguistics Applications
Title | Crafting Adversarial Examples For Speech Paralinguistics Applications |
Authors | Yuan Gong, Christian Poellabauer |
Abstract | Computational paralinguistic analysis is increasingly being used in a wide range of cyber applications, including security-sensitive applications such as speaker verification, deceptive speech detection, and medical diagnostics. While state-of-the-art machine learning techniques, such as deep neural networks, can provide robust and accurate speech analysis, they are susceptible to adversarial attacks. In this work, we propose an end-to-end scheme to generate adversarial examples for computational paralinguistic applications by perturbing directly the raw waveform of an audio recording rather than specific acoustic features. Our experiments show that the proposed adversarial perturbation can lead to a significant performance drop of state-of-the-art deep neural networks, while only minimally impairing the audio quality. |
Tasks | Medical Diagnosis, Speaker Verification |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03280v2 |
http://arxiv.org/pdf/1711.03280v2.pdf | |
PWC | https://paperswithcode.com/paper/crafting-adversarial-examples-for-speech |
Repo | |
Framework | |
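Perturbing the raw waveform in the direction that increases the classifier's loss can be sketched with a fast-gradient-sign-style step against a toy linear scorer. The real targets in the paper are deep networks operating end-to-end on audio; the model, epsilon, and waveform below are all invented for illustration.

```python
# Sketch of crafting an adversarial raw-waveform perturbation: move
# every audio sample by epsilon in the sign of the loss gradient, so
# the perturbation is bounded (inaudible for small epsilon) yet pushes
# the classifier's score the wrong way.

def linear_score(waveform, weights):
    """Toy classifier: positive score = class +1, negative = class -1."""
    return sum(w * x for w, x in zip(weights, waveform))

def fgsm_raw_audio(waveform, weights, label, eps=0.01):
    """For this linear scorer the loss gradient w.r.t. sample x_j has
    the sign of -label * w_j (label in {-1, +1}), so stepping that way
    maximally decreases the margin within the epsilon budget."""
    sign = lambda v: (v > 0) - (v < 0)
    return [x + eps * sign(-label * w) for x, w in zip(waveform, weights)]
```

For a deep network the per-sample gradient would come from backpropagation rather than a closed form, but the epsilon-bounded sign step is the same.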
Attend and Predict: Understanding Gene Regulation by Selective Attention on Chromatin
Title | Attend and Predict: Understanding Gene Regulation by Selective Attention on Chromatin |
Authors | Ritambhara Singh, Jack Lanchantin, Arshdeep Sekhon, Yanjun Qi |
Abstract | The past decade has seen a revolution in genomic technologies that enables a flood of genome-wide profiling of chromatin marks. Recent literature has tried to understand gene regulation by predicting gene expression from large-scale chromatin measurements. Two fundamental challenges exist for such learning tasks: (1) genome-wide chromatin signals are spatially structured, high-dimensional and highly modular; and (2) the core aim is to understand which factors are relevant and how they work together. Previous studies either failed to model complex dependencies among input signals or relied on separate feature analysis to explain the decisions. This paper presents an attention-based deep learning approach, called AttentiveChrome, that uses a unified architecture to model and to interpret dependencies among chromatin factors for controlling gene regulation. AttentiveChrome uses a hierarchy of multiple Long Short-Term Memory (LSTM) modules to encode the input signals and to model how various chromatin marks cooperate automatically. AttentiveChrome trains two levels of attention jointly with the target prediction, enabling it to attend differentially to relevant marks and to locate important positions per mark. We evaluate the model across 56 different cell types (tasks) in humans. Not only is the proposed architecture more accurate, but its attention scores also provide a better interpretation than state-of-the-art feature visualization methods such as saliency maps. Code and data are shared at www.deepchrome.org |
Tasks | |
Published | 2017-08-01 |
URL | http://arxiv.org/abs/1708.00339v3 |
http://arxiv.org/pdf/1708.00339v3.pdf | |
PWC | https://paperswithcode.com/paper/attend-and-predict-understanding-gene |
Repo | |
Framework | |
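Each of the two attention levels the abstract describes reduces to the same primitive: softmax-normalize relevance scores into weights, then take the weighted sum of the encodings. The toy encodings and scores below are stand-ins for the LSTM outputs over chromatin-mark bins; the scoring function that produces them is omitted.

```python
import math

# Sketch of the attention primitive AttentiveChrome stacks twice
# (over positions within a mark, then over marks): the learned weights
# are exactly what the paper reads off as interpretations.

def softmax(scores):
    m = max(scores)                      # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(encodings, scores):
    """encodings: one vector per bin; scores: one relevance score per
    bin. Returns (attention weights, context vector)."""
    weights = softmax(scores)
    dim = len(encodings[0])
    context = [sum(w * e[d] for w, e in zip(weights, encodings))
               for d in range(dim)]
    return weights, context
```

Training the scores jointly with the prediction target is what makes the weights behave as importance estimates rather than post-hoc visualizations.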