Paper Group ANR 4
Detection of Block-Exchangeable Structure in Large-Scale Correlation Matrices
Title | Detection of Block-Exchangeable Structure in Large-Scale Correlation Matrices |
Authors | Samuel Perreault, Thierry Duchesne, Johanna G. Nešlehová |
Abstract | Correlation matrices are omnipresent in multivariate data analysis. When the number d of variables is large, the sample estimates of correlation matrices are typically noisy and conceal underlying dependence patterns. We consider the case when the variables can be grouped into K clusters with exchangeable dependence; this assumption is often made in applications, e.g., in finance and econometrics. Under this partial exchangeability condition, the corresponding correlation matrix has a block structure and the number of unknown parameters is reduced from d(d-1)/2 to at most K(K+1)/2. We propose a robust algorithm based on Kendall’s rank correlation to identify the clusters without assuming the knowledge of K a priori or anything about the margins except continuity. The corresponding block-structured estimator performs considerably better than the sample Kendall rank correlation matrix when K < d. The new estimator can also be much more efficient in finite samples even in the unstructured case K = d, although there is no gain asymptotically. When the distribution of the data is elliptical, the results extend to linear correlation matrices and their inverses. The procedure is illustrated on financial stock returns. |
Tasks | |
Published | 2017-06-19 |
URL | http://arxiv.org/abs/1706.05940v3 |
PDF | http://arxiv.org/pdf/1706.05940v3.pdf |
PWC | https://paperswithcode.com/paper/detection-of-block-exchangeable-structure-in |
Repo | |
Framework | |
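The parameter reduction described in the abstract above, from d(d-1)/2 to at most K(K+1)/2, can be illustrated with a minimal sketch. It assumes the cluster labels are already known, so it is not the paper's cluster-detection algorithm; the function names `kendall_tau`, `block_average` and `n_params` are hypothetical, and the tau estimator shown is the plain tau-a for data without ties.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences without ties."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return s / (n * (n - 1) / 2)


def block_average(tau, labels):
    """Replace each off-diagonal entry by the mean of its cluster-pair block.

    `labels[i]` is the (assumed known) cluster of variable i.
    """
    d = len(labels)
    sums, counts = {}, {}
    for i, j in combinations(range(d), 2):
        key = tuple(sorted((labels[i], labels[j])))
        sums[key] = sums.get(key, 0.0) + tau[i][j]
        counts[key] = counts.get(key, 0) + 1
    out = [[1.0] * d for _ in range(d)]
    for i, j in combinations(range(d), 2):
        key = tuple(sorted((labels[i], labels[j])))
        out[i][j] = out[j][i] = sums[key] / counts[key]
    return out


def n_params(d, K):
    """Free parameters: unstructured matrix vs. upper bound for K blocks."""
    return d * (d - 1) // 2, K * (K + 1) // 2
```

With d = 10 variables in K = 3 clusters, `n_params(10, 3)` gives 45 unstructured entries against at most 6 block parameters. The bound is "at most" because a cluster of size one has no within-cluster entry.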
Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments
Title | Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments |
Authors | Victor Chernozhukov, Mert Demirer, Esther Duflo, Iván Fernández-Val |
Abstract | We propose strategies to estimate and make inference on key features of heterogeneous effects in randomized experiments. These key features include best linear predictors of the effects using machine learning proxies, average effects sorted by impact groups, and average characteristics of most and least impacted units. The approach is valid in high dimensional settings, where the effects are proxied by machine learning methods. We post-process these proxies into the estimates of the key features. Our approach is generic: it can be used in conjunction with penalized methods, deep and shallow neural networks, canonical and new random forests, boosted trees, and ensemble methods. It does not rely on strong assumptions. In particular, we do not require conditions for consistency of the machine learning methods. Estimation and inference rely on repeated data splitting to avoid overfitting and achieve validity. For inference, we take medians of p-values and medians of confidence intervals, resulting from many different data splits, and then adjust their nominal level to guarantee uniform validity. This variational inference method is shown to be uniformly valid and quantifies the uncertainty coming from both parameter estimation and data splitting. We illustrate the use of the approach with two randomized experiments in development on the effects of microcredit and nudges to stimulate immunization demand. |
Tasks | |
Published | 2017-12-13 |
URL | https://arxiv.org/abs/1712.04802v4 |
PDF | https://arxiv.org/pdf/1712.04802v4.pdf |
PWC | https://paperswithcode.com/paper/generic-machine-learning-inference-on |
Repo | |
Framework | |
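The aggregation step in the abstract above, taking medians of p-values and confidence intervals over many data splits, can be sketched as follows. The abstract does not state the adjustment itself; this sketch assumes it amounts to doubling the median p-value (capped at 1), and the function names are hypothetical.

```python
from statistics import median


def split_adjusted_p_value(p_values):
    """Median p-value over data splits, doubled and capped at 1.

    The factor of 2 is an assumed form of the nominal-level adjustment.
    """
    return min(1.0, 2.0 * median(p_values))


def median_interval(intervals):
    """Median of lower bounds and median of upper bounds over splits.

    `intervals` is a list of (lower, upper) pairs, one per data split.
    """
    lowers = [lo for lo, _ in intervals]
    uppers = [hi for _, hi in intervals]
    return median(lowers), median(uppers)
```

Under the same assumption, per-split intervals computed at level 1 - α would be reported at the more conservative level 1 - 2α after aggregation.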
Where Classification Fails, Interpretation Rises
Title | Where Classification Fails, Interpretation Rises |
Authors | Chanh Nguyen, Georgi Georgiev, Yujie Ji, Ting Wang |
Abstract | An intriguing property of deep neural networks is their inherent vulnerability to adversarial inputs, which significantly hinders their application in security-critical domains. Most existing detection methods attempt to use carefully engineered patterns to distinguish adversarial inputs from their genuine counterparts; however, these can often be circumvented by adaptive adversaries. In this work, we take a completely different route by leveraging the definition of adversarial inputs: while deceptive to deep neural networks, they are barely discernible to human vision. Building upon recent advances in interpretable models, we construct a new detection framework that contrasts an input’s interpretation against its classification. We validate the efficacy of this framework through extensive experiments using benchmark datasets and attacks. We believe that this work opens a new direction for designing adversarial input detection methods. |
Tasks | |
Published | 2017-12-02 |
URL | http://arxiv.org/abs/1712.00558v1 |
PDF | http://arxiv.org/pdf/1712.00558v1.pdf |
PWC | https://paperswithcode.com/paper/where-classification-fails-interpretation |
Repo | |
Framework | |
Dynamic Fusion Networks for Machine Reading Comprehension
Title | Dynamic Fusion Networks for Machine Reading Comprehension |
Authors | Yichong Xu, Jingjing Liu, Jianfeng Gao, Yelong Shen, Xiaodong Liu |
Abstract | This paper presents a novel neural model, the Dynamic Fusion Network (DFN), for machine reading comprehension (MRC). DFNs differ from most state-of-the-art models in their use of a dynamic multi-strategy attention process, in which passages, questions and answer candidates are jointly fused into attention vectors, along with a dynamic multi-step reasoning module for generating answers. With the use of reinforcement learning, for each input sample that consists of a question, a passage and a list of candidate answers, an instance of DFN with a sample-specific network architecture can be dynamically constructed by determining what attention strategy to apply and how many reasoning steps to take. Experiments show that DFNs achieve the best result reported on RACE, a challenging MRC dataset that contains real human reading questions in a wide variety of types. A detailed empirical analysis also demonstrates that DFNs can produce attention vectors that summarize information from questions, passages and answer candidates more effectively than other popular MRC models. |
Tasks | Machine Reading Comprehension, Reading Comprehension |
Published | 2017-11-14 |
URL | http://arxiv.org/abs/1711.04964v2 |
PDF | http://arxiv.org/pdf/1711.04964v2.pdf |
PWC | https://paperswithcode.com/paper/dynamic-fusion-networks-for-machine-reading |
Repo | |
Framework | |
Towards Unsupervised Weed Scouting for Agricultural Robotics
Title | Towards Unsupervised Weed Scouting for Agricultural Robotics |
Authors | David Hall, Feras Dayoub, Jason Kulk, Chris McCool |
Abstract | Weed scouting is an important part of modern integrated weed management but can be time consuming and sparse when performed manually. Automated weed scouting and weed destruction have typically been performed using classification systems able to classify a set group of species known a priori. This greatly limits deployability, as classification systems must be retrained for any field with a different set of weed species present within it. In order to overcome this limitation, this paper works towards developing a clustering approach to weed scouting which can be utilized in any field without the need for prior species knowledge. We demonstrate our system using challenging data collected in the field from an agricultural robotics platform. We show that considerable improvements can be made by (i) learning low-dimensional (bottleneck) features using a deep convolutional neural network to represent plants in general and (ii) tying views of the same area (plant) together. Deploying this algorithm on in-field data collected by AgBotII, we are able to successfully cluster cotton plants from grasses without prior knowledge or training for the specific plants in the field. |
Tasks | |
Published | 2017-02-04 |
URL | http://arxiv.org/abs/1702.01247v2 |
PDF | http://arxiv.org/pdf/1702.01247v2.pdf |
PWC | https://paperswithcode.com/paper/towards-unsupervised-weed-scouting-for |
Repo | |
Framework | |
Network-based methods for outcome prediction in the “sample space”
Title | Network-based methods for outcome prediction in the “sample space” |
Authors | Jessica Gliozzo |
Abstract | In this thesis we present the novel semi-supervised network-based algorithm P-Net, which is able to rank and classify patients with respect to a specific phenotype or clinical outcome under study. The distinctive and innovative characteristic of this method is that it builds a network of samples/patients, where the nodes represent the samples and the edges are functional or genetic relationships between individuals (e.g. similarity of expression profiles), to predict the phenotype under study. In other words, it constructs the network in the “sample space” and not in the “biomarker space” (where nodes represent biomolecules (e.g. genes, proteins) and edges represent functional or genetic relationships between nodes), as is usual in state-of-the-art methods. To assess the performance of P-Net, we apply it to three publicly available datasets from patients afflicted with a specific type of tumor (pancreatic cancer, melanoma and ovarian cancer), using the data and following the experimental set-up proposed in two recently published papers [Barter et al., 2014, Winter et al., 2012]. We show that network-based methods in the “sample space” can achieve results competitive with classical supervised inductive systems. Moreover, the graph representation of the samples can be easily visualized through networks and can be used to gain visual clues about the relationships between samples, taking into account the phenotype associated or predicted for each sample. To our knowledge this is one of the first works that proposes graph-based algorithms working in the “sample space” of the biomolecular profiles of the patients to predict their phenotype or outcome, thus contributing to a novel research line within the framework of Network Medicine. |
Tasks | |
Published | 2017-02-04 |
URL | http://arxiv.org/abs/1702.01268v1 |
PDF | http://arxiv.org/pdf/1702.01268v1.pdf |
PWC | https://paperswithcode.com/paper/network-based-methods-for-outcome-prediction |
Repo | |
Framework | |
Employee turnover prediction and retention policies design: a case study
Title | Employee turnover prediction and retention policies design: a case study |
Authors | Edouard Ribes, Karim Touahri, Benoît Perthame |
Abstract | This paper illustrates the similarities between the problems of customer churn and employee turnover. An example of an employee turnover prediction model leveraging classical machine learning techniques is developed. Model outputs are then discussed to design and test employee retention policies. This type of retention discussion is, to our knowledge, innovative and constitutes the main value of this paper. |
Tasks | |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01377v1 |
PDF | http://arxiv.org/pdf/1707.01377v1.pdf |
PWC | https://paperswithcode.com/paper/employee-turnover-prediction-and-retention |
Repo | |
Framework | |
Optical Flow in Mostly Rigid Scenes
Title | Optical Flow in Mostly Rigid Scenes |
Authors | Jonas Wulff, Laura Sevilla-Lara, Michael J. Black |
Abstract | The optical flow of natural scenes is a combination of the motion of the observer and the independent motion of objects. Existing algorithms typically focus on either recovering motion and structure under the assumption of a purely static world or optical flow for general unconstrained scenes. We combine these approaches in an optical flow algorithm that estimates an explicit segmentation of moving objects from appearance and physical constraints. In static regions we take advantage of strong constraints to jointly estimate the camera motion and the 3D structure of the scene over multiple frames. This allows us to also regularize the structure instead of the motion. Our formulation uses a Plane+Parallax framework, which works even under small baselines, and reduces the motion estimation to a one-dimensional search problem, resulting in more accurate estimation. In moving regions the flow is treated as unconstrained, and computed with an existing optical flow method. The resulting Mostly-Rigid Flow (MR-Flow) method achieves state-of-the-art results on both the MPI-Sintel and KITTI-2015 benchmarks. |
Tasks | Motion Estimation, Optical Flow Estimation |
Published | 2017-05-03 |
URL | http://arxiv.org/abs/1705.01352v1 |
PDF | http://arxiv.org/pdf/1705.01352v1.pdf |
PWC | https://paperswithcode.com/paper/optical-flow-in-mostly-rigid-scenes |
Repo | |
Framework | |
Multiple Instance Learning with the Optimal Sub-Pattern Assignment Metric
Title | Multiple Instance Learning with the Optimal Sub-Pattern Assignment Metric |
Authors | Quang N. Tran, Ba-Ngu Vo, Dinh Phung, Ba-Tuong Vo, Thuong Nguyen |
Abstract | Multiple instance data are sets or multi-sets of unordered elements. Using metrics or distances for sets, we propose an approach to several multiple instance learning tasks, such as clustering (unsupervised learning), classification (supervised learning), and novelty detection (semi-supervised learning). In particular, we introduce the Optimal Sub-Pattern Assignment metric to multiple instance learning so as to provide versatile design choices. Numerical experiments on both simulated and real data are presented to illustrate the versatility of the proposed solution. |
Tasks | Multiple Instance Learning |
Published | 2017-03-27 |
URL | http://arxiv.org/abs/1703.08933v1 |
PDF | http://arxiv.org/pdf/1703.08933v1.pdf |
PWC | https://paperswithcode.com/paper/multiple-instance-learning-with-the-optimal |
Repo | |
Framework | |
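The abstract above names the Optimal Sub-Pattern Assignment (OSPA) metric without defining it. A minimal sketch of its standard definition follows, assuming a Euclidean base distance, a cutoff `c` and an order `p`; the brute-force search over assignments is chosen for readability and only suits small sets, and the function name `ospa` is hypothetical.

```python
from itertools import permutations
import math


def ospa(X, Y, c=1.0, p=1):
    """OSPA distance between two finite sets of points (lists of tuples).

    Distances are cut off at c; each unmatched point costs c.
    """
    if len(X) > len(Y):
        X, Y = Y, X  # ensure X is the smaller set
    m, n = len(X), len(Y)
    if n == 0:
        return 0.0  # both sets empty

    def cut(a, b):
        return min(c, math.dist(a, b))

    # Best assignment of the m points of X to distinct points of Y.
    best = min(
        sum(cut(x, Y[k]) ** p for x, k in zip(X, perm))
        for perm in permutations(range(n), m)
    )
    # Add the cardinality penalty, then normalise by the larger set size.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)
```

Because the distance is defined on whole sets, it can be plugged into any distance-based learner (for example nearest-neighbour classification or distance-based clustering) with bags of instances as the data points.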
A New Data-Driven Sparse-Learning Approach to Study Chemical Reaction Networks
Title | A New Data-Driven Sparse-Learning Approach to Study Chemical Reaction Networks |
Authors | Farshad Harirchi, Doohyun Kim, Omar A. Khalil, Sijia Liu, Paolo Elvati, Angela Violi, Alfred O. Hero |
Abstract | Chemical kinetic mechanisms can be represented by sets of elementary reactions that are easily translated into mathematical terms using physicochemical relationships. The schematic representation of reactions captures the interactions between reacting species and products. Determining the minimal chemical interactions underlying the dynamic behavior of systems is a major task. In this paper, we introduce a novel approach for the identification of the influential reactions in chemical reaction networks for combustion applications, using a data-driven sparse-learning technique. The proposed approach identifies a set of influential reactions using species concentrations and reaction rates, with minimal computational cost without requiring additional data or simulations. The new approach is applied to analyze the combustion chemistry of H2 and C3H8 in a constant-volume homogeneous reactor. The influential reactions identified by the sparse-learning method are consistent with the current kinetics knowledge of chemical mechanisms. Additionally, we show that a reduced version of the parent mechanism can be generated as a combination of the influential reactions identified at different times and conditions and that for both H2 and C3H8 this reduced mechanism performs closely to the parent mechanism as a function of ignition delay over a wide range of conditions. Our results demonstrate the potential of the sparse-learning approach as an effective and efficient tool for mechanism analysis and mechanism reduction. |
Tasks | Sparse Learning |
Published | 2017-12-18 |
URL | http://arxiv.org/abs/1712.06281v3 |
PDF | http://arxiv.org/pdf/1712.06281v3.pdf |
PWC | https://paperswithcode.com/paper/a-new-data-driven-sparse-learning-approach-to |
Repo | |
Framework | |
Duluth at Semeval-2017 Task 7 : Puns upon a midnight dreary, Lexical Semantics for the weak and weary
Title | Duluth at Semeval-2017 Task 7 : Puns upon a midnight dreary, Lexical Semantics for the weak and weary |
Authors | Ted Pedersen |
Abstract | This paper describes the Duluth systems that participated in SemEval-2017 Task 7 : Detection and Interpretation of English Puns. The Duluth systems participated in all three subtasks, and relied on methods that included word sense disambiguation and measures of semantic relatedness. |
Tasks | Word Sense Disambiguation |
Published | 2017-04-27 |
URL | http://arxiv.org/abs/1704.08388v2 |
PDF | http://arxiv.org/pdf/1704.08388v2.pdf |
PWC | https://paperswithcode.com/paper/duluth-at-semeval-2017-task-7-puns-upon-a |
Repo | |
Framework | |
An Attention Mechanism for Answer Selection Using a Combined Global and Local View
Title | An Attention Mechanism for Answer Selection Using a Combined Global and Local View |
Authors | Yoram Bachrach, Andrej Zukov-Gregoric, Sam Coope, Ed Tovell, Bogdan Maksak, Jose Rodriguez, Conan McMurtie |
Abstract | We propose a new attention mechanism for neural-based question answering, which depends on varying granularities of the input. Previous work focused on augmenting recurrent neural networks with simple attention mechanisms which are a function of the similarity between a question embedding and an answer embedding across time. We extend this by making the attention mechanism dependent on a global embedding of the answer attained using a separate network. We evaluate our system on InsuranceQA, a large question answering dataset. Our model outperforms current state-of-the-art results on InsuranceQA. Further, we visualize which sections of text our attention mechanism focuses on, and explore its performance across different parameter settings. |
Tasks | Answer Selection, Question Answering |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01378v4 |
PDF | http://arxiv.org/pdf/1707.01378v4.pdf |
PWC | https://paperswithcode.com/paper/an-attention-mechanism-for-answer-selection |
Repo | |
Framework | |
Product recognition in store shelves as a sub-graph isomorphism problem
Title | Product recognition in store shelves as a sub-graph isomorphism problem |
Authors | Alessio Tonioni, Luigi Di Stefano |
Abstract | The arrangement of products on store shelves is carefully planned to maximize sales and keep customers happy. However, verifying compliance of real shelves with the ideal layout is a costly task routinely performed by the store personnel. In this paper, we propose a computer vision pipeline to recognize products on shelves and verify compliance with the planned layout. We deploy local invariant features together with a novel formulation of the product recognition problem as a sub-graph isomorphism between the items appearing in the given image and the ideal layout. This allows for auto-localizing the given image within the aisle or store and improving recognition dramatically. |
Tasks | |
Published | 2017-07-26 |
URL | http://arxiv.org/abs/1707.08378v2 |
PDF | http://arxiv.org/pdf/1707.08378v2.pdf |
PWC | https://paperswithcode.com/paper/product-recognition-in-store-shelves-as-a-sub |
Repo | |
Framework | |
Training Deep Convolutional Neural Networks with Resistive Cross-Point Devices
Title | Training Deep Convolutional Neural Networks with Resistive Cross-Point Devices |
Authors | Tayfun Gokmen, O. Murat Onen, Wilfried Haensch |
Abstract | In a previous work we detailed the requirements to obtain a maximal performance benefit by implementing fully connected deep neural networks (DNNs) in the form of arrays of resistive devices for deep learning. Here we extend this concept of Resistive Processing Unit (RPU) devices towards convolutional neural networks (CNNs). We show how to map the convolutional layers to RPU arrays such that the parallelism of the hardware can be fully utilized in all three cycles of the backpropagation algorithm. We find that the noise and bound limitations imposed by the analog nature of the computations performed on the arrays affect the training accuracy of the CNNs. Noise and bound management techniques are presented that mitigate these problems without introducing any additional complexity in the analog circuits and can be addressed by the digital circuits. In addition, we discuss digitally programmable update management and device variability reduction techniques that can be used selectively for some of the layers in a CNN. We show that a combination of all those techniques enables a successful application of the RPU concept for training CNNs. The techniques discussed here are more general and can be applied beyond CNN architectures, and therefore enable the applicability of the RPU approach to a large class of neural network architectures. |
Tasks | |
Published | 2017-05-22 |
URL | http://arxiv.org/abs/1705.08014v1 |
PDF | http://arxiv.org/pdf/1705.08014v1.pdf |
PWC | https://paperswithcode.com/paper/training-deep-convolutional-neural-networks |
Repo | |
Framework | |
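The abstract above mentions mapping convolutional layers onto arrays that perform matrix operations. One common way to express a convolution as a matrix product is to flatten every input patch into a vector and every kernel into a row of a weight matrix. The sketch below shows that general idea for a single-channel image with stride 1 and no padding; it is an assumption that the paper's mapping has this form, and the function names are hypothetical.

```python
def im2col(image, k):
    """Flatten every k-by-k patch of a 2D image (list of lists) to a vector."""
    h, w = len(image), len(image[0])
    patches = []
    for r in range(h - k + 1):
        for c in range(w - k + 1):
            patches.append(
                [image[r + i][c + j] for i in range(k) for j in range(k)]
            )
    return patches  # one vector of length k*k per output position


def conv_as_matmul(image, kernels, k):
    """Apply each flattened kernel (a weight-matrix row) to every patch.

    Returns one list of outputs per kernel, in row-major patch order.
    This is cross-correlation (no kernel flip), as is usual in CNNs.
    """
    patches = im2col(image, k)
    return [
        [sum(wi * xi for wi, xi in zip(kern, patch)) for patch in patches]
        for kern in kernels
    ]
```

In this form the kernel matrix stays fixed while patch vectors are fed in one at a time, which is the access pattern a fixed array of weights would need.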
Proceedings of the Workshop on Data Mining for Oil and Gas
Title | Proceedings of the Workshop on Data Mining for Oil and Gas |
Authors | Alipio Jorge, German Larrazabal, Pablo Guillen, Rui L. Lopes |
Abstract | The process of exploring and exploiting Oil and Gas (O&G) generates a lot of data that can bring more efficiency to the industry. The opportunities for using data mining techniques in the “digital oil-field” remain largely unexplored or uncharted. With the high rate of data expansion, companies are scrambling to develop near-real-time predictive analytics, data mining and machine learning capabilities, and are expanding their data storage infrastructure and resources. With these new goals come the challenges of managing data growth, integrating intelligence tools, and analyzing the data to glean useful insights. Oil and Gas companies need data solutions to economically extract value from very large volumes of a wide variety of data generated from exploration, well drilling and production devices and sensors. Data mining for the oil and gas industry throughout the lifecycle of the reservoir includes the following roles: locating hydrocarbons, managing geological data, drilling and formation evaluation, well construction, well completion, and optimizing production through the life of the oil field. In each of these phases of the oil field lifecycle, data mining plays a significant role. Whichever phase is under discussion, effective, productive, on-demand data insight, created through scientific models, data analytics and machine learning, is critical for decision making within the organization. The significant challenges posed by this complex and economically vital field justify a meeting of data scientists who are willing to share their experience and knowledge. Thus, the Workshop on Data Mining for Oil and Gas (DM4OG) aims to provide a quality forum for researchers who work on the significant challenges arising from the synergy between data science, machine learning, and the modeling and optimization problems in the O&G industry. |
Tasks | Decision Making |
Published | 2017-05-09 |
URL | http://arxiv.org/abs/1705.03451v2 |
PDF | http://arxiv.org/pdf/1705.03451v2.pdf |
PWC | https://paperswithcode.com/paper/proceedings-of-the-workshop-on-data-mining |
Repo | |
Framework | |