Paper Group ANR 393
Machine Teaching: A New Paradigm for Building Machine Learning Systems
Title | Machine Teaching: A New Paradigm for Building Machine Learning Systems |
Authors | Patrice Y. Simard, Saleema Amershi, David M. Chickering, Alicia Edelman Pelton, Soroush Ghorashi, Christopher Meek, Gonzalo Ramos, Jina Suh, Johan Verwey, Mo Wang, John Wernsing |
Abstract | The current processes for building machine learning systems require practitioners with deep knowledge of machine learning. This significantly limits the number of machine learning systems that can be created and has led to a mismatch between the demand for machine learning systems and the ability for organizations to build them. We believe that in order to meet this growing demand for machine learning systems we must significantly increase the number of individuals that can teach machines. We postulate that we can achieve this goal by making the process of teaching machines easy, fast and above all, universally accessible. While machine learning focuses on creating new algorithms and improving the accuracy of “learners”, the machine teaching discipline focuses on the efficacy of the “teachers”. Machine teaching as a discipline is a paradigm shift that follows and extends principles of software engineering and programming languages. We put a strong emphasis on the teacher and the teacher’s interaction with data, as well as crucial components such as techniques and design principles of interaction and visualization. In this paper, we present our position regarding the discipline of machine teaching and articulate fundamental machine teaching principles. We also describe how, by decoupling knowledge about machine learning algorithms from the process of teaching, we can accelerate innovation and empower millions of new uses for machine learning models. |
Tasks | |
Published | 2017-07-21 |
URL | http://arxiv.org/abs/1707.06742v3 |
http://arxiv.org/pdf/1707.06742v3.pdf | |
PWC | https://paperswithcode.com/paper/machine-teaching-a-new-paradigm-for-building |
Repo | |
Framework | |
Attention-Based Multimodal Fusion for Video Description
Title | Attention-Based Multimodal Fusion for Video Description |
Authors | Chiori Hori, Takaaki Hori, Teng-Yok Lee, Kazuhiro Sumi, John R. Hershey, Tim K. Marks |
Abstract | Currently successful methods for video description are based on encoder-decoder sentence generation using recurrent neural networks (RNNs). Recent work has shown the advantage of integrating temporal and/or spatial attention mechanisms into these models, in which the decoder network predicts each word in the description by selectively giving more weight to encoded features from specific time frames (temporal attention) or to features from specific spatial regions (spatial attention). In this paper, we propose to expand the attention model to selectively attend not just to specific times or spatial regions, but to specific modalities of input such as image features, motion features, and audio features. Our new modality-dependent attention mechanism, which we call multimodal attention, provides a natural way to fuse multimodal information for video description. We evaluate our method on the Youtube2Text dataset, achieving results that are competitive with current state of the art. More importantly, we demonstrate that our model incorporating multimodal attention as well as temporal attention significantly outperforms the model that uses temporal attention alone. |
Tasks | Video Description |
Published | 2017-01-11 |
URL | http://arxiv.org/abs/1701.03126v2 |
http://arxiv.org/pdf/1701.03126v2.pdf | |
PWC | https://paperswithcode.com/paper/attention-based-multimodal-fusion-for-video |
Repo | |
Framework | |
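The modality-level attention described in the abstract above can be illustrated with a small sketch. This is not the authors' model: it assumes the per-modality feature vectors already share one dimensionality and that a relevance score per modality is given (in a real decoder the scores would be computed from the decoder state). The names `softmax`, `fuse_modalities`, and the example values are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_modalities(features, scores):
    """Fuse same-sized feature vectors with attention weights.

    features: list of vectors, one per modality (assumed equal length)
    scores:   list of relevance scores, one per modality (assumed given)
    Returns (weights, fused) where fused is the weighted sum of the vectors.
    """
    weights = softmax(scores)
    dim = len(features[0])
    fused = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, fused

# Hypothetical image, motion, and audio features for one decoding step.
features = [[3.0, 0.0], [0.0, 3.0], [3.0, 3.0]]

# Equal scores give uniform weights, so the fused vector is the plain mean.
uniform_weights, uniform_fused = fuse_modalities(features, [0.0, 0.0, 0.0])

# A much higher score for the first modality pulls the result toward it.
peaked_weights, peaked_fused = fuse_modalities(features, [10.0, 0.0, 0.0])
```

The softmax keeps the weights positive and summing to one, so the fused vector always lies between the individual modality vectors.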
A Mathematical Programming Approach for Integrated Multiple Linear Regression Subset Selection and Validation
Title | A Mathematical Programming Approach for Integrated Multiple Linear Regression Subset Selection and Validation |
Authors | Seokhyun Chung, Young Woong Park, Taesu Cheong |
Abstract | Subset selection for multiple linear regression aims to construct a regression model that minimizes errors by selecting a small number of explanatory variables. Once a model is built, various statistical tests and diagnostics are conducted to validate the model and to determine whether regression assumptions are met. Most traditional approaches require human decisions at this step, for example, the user adding or removing a variable until a satisfactory model is obtained. However, this trial-and-error strategy cannot guarantee that a subset that minimizes the errors while satisfying all regression assumptions will be found. In this paper, we propose a fully automated model building procedure for multiple linear regression subset selection that integrates model building and validation based on mathematical programming. The proposed model minimizes mean squared errors while ensuring that the majority of the important regression assumptions are met. When no subset satisfies all of the considered regression assumptions, our model provides an alternative subset that satisfies most of these assumptions. Computational results show that our model yields better solutions (i.e., satisfying more regression assumptions) compared to benchmark models while maintaining similar explanatory power. |
Tasks | |
Published | 2017-12-12 |
URL | http://arxiv.org/abs/1712.04543v1 |
http://arxiv.org/pdf/1712.04543v1.pdf | |
PWC | https://paperswithcode.com/paper/a-mathematical-programming-approach-for |
Repo | |
Framework | |
Closed-Loop Policies for Operational Tests of Safety-Critical Systems
Title | Closed-Loop Policies for Operational Tests of Safety-Critical Systems |
Authors | Jeremy Morton, Tim A. Wheeler, Mykel J. Kochenderfer |
Abstract | Manufacturers of safety-critical systems must make the case that their product is sufficiently safe for public deployment. Much of this case often relies upon critical event outcomes from real-world testing, requiring manufacturers to be strategic about how they allocate testing resources in order to maximize their chances of demonstrating system safety. This work frames the partially observable and belief-dependent problem of test scheduling as a Markov decision process, which can be solved efficiently to yield closed-loop manufacturer testing policies. By solving for policies over a wide range of problem formulations, we are able to provide high-level guidance for manufacturers and regulators on issues relating to the testing of safety-critical systems. This guidance spans an array of topics, including circumstances under which manufacturers should continue testing despite observed incidents, when manufacturers should test aggressively, and when regulators should increase or reduce the real-world testing requirements for an autonomous vehicle. |
Tasks | |
Published | 2017-07-25 |
URL | http://arxiv.org/abs/1707.08234v3 |
http://arxiv.org/pdf/1707.08234v3.pdf | |
PWC | https://paperswithcode.com/paper/closed-loop-policies-for-operational-tests-of |
Repo | |
Framework | |
Extracting Core Claims from Scientific Articles
Title | Extracting Core Claims from Scientific Articles |
Authors | Tom Jansen, Tobias Kuhn |
Abstract | The number of scientific articles has grown rapidly over the years and there are no signs that this growth will slow down in the near future. Because of this, it becomes increasingly difficult to keep up with the latest developments in a scientific field. To address this problem, we present here an approach to help researchers learn about the latest developments and findings by extracting in a normalized form core claims from scientific articles. This normalized representation is a controlled natural language of English sentences called AIDA, which has been proposed in previous work as a method to formally structure and organize scientific findings and discourse. We show how such AIDA sentences can be automatically extracted by detecting the core claim of an article, checking for AIDA compliance, and - if necessary - transforming it into a compliant sentence. While our algorithm is still far from perfect, our results indicate that the different steps are feasible and they support the claim that AIDA sentences might be a promising approach to improve scientific communication in the future. |
Tasks | |
Published | 2017-07-24 |
URL | http://arxiv.org/abs/1707.07678v1 |
http://arxiv.org/pdf/1707.07678v1.pdf | |
PWC | https://paperswithcode.com/paper/extracting-core-claims-from-scientific |
Repo | |
Framework | |
Simulated Autonomous Driving on Realistic Road Networks using Deep Reinforcement Learning
Title | Simulated Autonomous Driving on Realistic Road Networks using Deep Reinforcement Learning |
Authors | Patrick Klose, Rudolf Mester |
Abstract | Using Deep Reinforcement Learning (DRL) can be a promising approach to handle various tasks in the field of (simulated) autonomous driving. However, recent publications mainly consider learning in unusual driving environments. This paper presents Driving School for Autonomous Agents (DSA^2), a software for validating DRL algorithms in more usual driving environments based on artificial and realistic road networks. We also present the results of applying DSA^2 for handling the task of driving on a straight road while regulating the velocity of one vehicle according to different speed limits. |
Tasks | Autonomous Driving |
Published | 2017-12-12 |
URL | http://arxiv.org/abs/1712.04363v2 |
http://arxiv.org/pdf/1712.04363v2.pdf | |
PWC | https://paperswithcode.com/paper/simulated-autonomous-driving-on-realistic |
Repo | |
Framework | |
Robust Surface Reconstruction from Gradients via Adaptive Dictionary Regularization
Title | Robust Surface Reconstruction from Gradients via Adaptive Dictionary Regularization |
Authors | Andrew J. Wagenmaker, Brian E. Moore, Raj Rao Nadakuditi |
Abstract | This paper introduces a novel approach to robust surface reconstruction from photometric stereo normal vector maps that is particularly well-suited for reconstructing surfaces from noisy gradients. Specifically, we propose an adaptive dictionary learning based approach that attempts to simultaneously integrate the gradient fields while sparsely representing the spatial patches of the reconstructed surface in an adaptive dictionary domain. We show that our formulation learns the underlying structure of the surface, effectively acting as an adaptive regularizer that enforces a smoothness constraint on the reconstructed surface. Our method is general and may be coupled with many existing approaches in the literature to improve the integrity of the reconstructed surfaces. We demonstrate the performance of our method on synthetic data as well as real photometric stereo data and evaluate its robustness to noise. |
Tasks | Dictionary Learning |
Published | 2017-09-30 |
URL | http://arxiv.org/abs/1710.00230v1 |
http://arxiv.org/pdf/1710.00230v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-surface-reconstruction-from-gradients |
Repo | |
Framework | |
An Energy Minimization Approach to 3D Non-Rigid Deformable Surface Estimation Using RGBD Data
Title | An Energy Minimization Approach to 3D Non-Rigid Deformable Surface Estimation Using RGBD Data |
Authors | Bryan Willimon, Steven Hickson, Ian Walker, Stan Birchfield |
Abstract | We propose an algorithm that uses energy minimization to estimate the current configuration of a non-rigid object. Our approach utilizes an RGBD image to calculate corresponding SURF features, depth, and boundary information. We do not use predetermined features, thus enabling our system to operate on unmodified objects. Our approach relies on a 3D nonlinear energy minimization framework to solve for the configuration using a semi-implicit scheme. Results show various scenarios of dynamic posters and shirts in different configurations to illustrate the performance of the method. In particular, we show that our method is able to estimate the configuration of a textureless nonrigid object with no correspondences available. |
Tasks | |
Published | 2017-08-02 |
URL | http://arxiv.org/abs/1708.00940v1 |
http://arxiv.org/pdf/1708.00940v1.pdf | |
PWC | https://paperswithcode.com/paper/an-energy-minimization-approach-to-3d-non |
Repo | |
Framework | |
Analysis of dropout learning regarded as ensemble learning
Title | Analysis of dropout learning regarded as ensemble learning |
Authors | Kazuyuki Hara, Daisuke Saitoh, Hayaru Shouno |
Abstract | Deep learning is the state-of-the-art in fields such as visual object recognition and speech recognition. This learning uses a large number of layers, huge number of units, and connections. Therefore, overfitting is a serious problem. To avoid this problem, dropout learning is proposed. Dropout learning neglects some inputs and hidden units in the learning process with a probability, p, and then, the neglected inputs and hidden units are combined with the learned network to express the final output. We find that the process of combining the neglected hidden units with the learned network can be regarded as ensemble learning, so we analyze dropout learning from this point of view. |
Tasks | Object Recognition, Speech Recognition |
Published | 2017-06-20 |
URL | http://arxiv.org/abs/1706.06859v1 |
http://arxiv.org/pdf/1706.06859v1.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-dropout-learning-regarded-as |
Repo | |
Framework | |
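The ensemble reading of dropout mentioned in the abstract above can be checked on a toy case. The sketch below is not the authors' analysis; it only verifies a standard identity for a single linear unit, assuming `p` is the probability of dropping an input (as in the abstract) and that inputs are dropped independently. All function names and values are hypothetical.

```python
from itertools import product

def unit_output(weights, inputs, mask):
    """Output of a linear unit when mask[i] == 0 drops input i."""
    return sum(w * x * m for w, x, m in zip(weights, inputs, mask))

def ensemble_average(weights, inputs, p):
    """Average the unit over every dropout mask, weighted by its probability."""
    n = len(inputs)
    total = 0.0
    for mask in product((0, 1), repeat=n):
        kept = sum(mask)
        prob = (1.0 - p) ** kept * p ** (n - kept)
        total += prob * unit_output(weights, inputs, mask)
    return total

def scaled_output(weights, inputs, p):
    """Full network output scaled by the keep probability (1 - p)."""
    return (1.0 - p) * unit_output(weights, inputs, [1] * len(inputs))

weights = [0.5, -1.0, 2.0]
inputs = [1.0, 2.0, 3.0]
p = 0.25  # drop probability

ens = ensemble_average(weights, inputs, p)
scl = scaled_output(weights, inputs, p)
```

For a linear unit the two quantities agree exactly, which is why a single scaled network can stand in for the average over all thinned networks; with nonlinear units the agreement is only approximate.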
White Matter Fiber Segmentation Using Functional Varifolds
Title | White Matter Fiber Segmentation Using Functional Varifolds |
Authors | Kuldeep Kumar, Pietro Gori, Benjamin Charlier, Stanley Durrleman, Olivier Colliot, Christian Desrosiers |
Abstract | The extraction of fibers from dMRI data typically produces a large number of fibers, it is common to group fibers into bundles. To this end, many specialized distance measures, such as MCP, have been used for fiber similarity. However, these distance based approaches require point-wise correspondence and focus only on the geometry of the fibers. Recent publications have highlighted that using microstructure measures along fibers improves tractography analysis. Also, many neurodegenerative diseases impacting white matter require the study of microstructure measures as well as the white matter geometry. Motivated by these, we propose to use a novel computational model for fibers, called functional varifolds, characterized by a metric that considers both the geometry and microstructure measure (e.g. GFA) along the fiber pathway. We use it to cluster fibers with a dictionary learning and sparse coding-based framework, and present a preliminary analysis using HCP data. |
Tasks | Dictionary Learning |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.06144v1 |
http://arxiv.org/pdf/1709.06144v1.pdf | |
PWC | https://paperswithcode.com/paper/white-matter-fiber-segmentation-using |
Repo | |
Framework | |
Capacity Releasing Diffusion for Speed and Locality
Title | Capacity Releasing Diffusion for Speed and Locality |
Authors | Di Wang, Kimon Fountoulakis, Monika Henzinger, Michael W. Mahoney, Satish Rao |
Abstract | Diffusions and related random walk procedures are of central importance in many areas of machine learning, data analysis, and applied mathematics. Because they spread mass agnostically at each step in an iterative manner, they can sometimes spread mass “too aggressively,” thereby failing to find the “right” clusters. We introduce a novel Capacity Releasing Diffusion (CRD) Process, which is both faster and stays more local than the classical spectral diffusion process. As an application, we use our CRD Process to develop an improved local algorithm for graph clustering. Our local graph clustering method can find local clusters in a model of clustering where one begins the CRD Process in a cluster whose vertices are connected better internally than externally by an $O(\log^2 n)$ factor, where $n$ is the number of nodes in the cluster. Thus, our CRD Process is the first local graph clustering algorithm that is not subject to the well-known quadratic Cheeger barrier. Our result requires a certain smoothness condition, which we expect to be an artifact of our analysis. Our empirical evaluation demonstrates improved results, in particular for realistic social graphs where there are moderately good—but not very good—clusters. |
Tasks | Graph Clustering |
Published | 2017-06-19 |
URL | http://arxiv.org/abs/1706.05826v2 |
http://arxiv.org/pdf/1706.05826v2.pdf | |
PWC | https://paperswithcode.com/paper/capacity-releasing-diffusion-for-speed-and-1 |
Repo | |
Framework | |
When Slepian Meets Fiedler: Putting a Focus on the Graph Spectrum
Title | When Slepian Meets Fiedler: Putting a Focus on the Graph Spectrum |
Authors | Dimitri Van De Ville, Robin Demesmaeker, Maria Giulia Preti |
Abstract | The study of complex systems benefits from graph models and their analysis. In particular, the eigendecomposition of the graph Laplacian lets emerge properties of global organization from local interactions; e.g., the Fiedler vector has the smallest non-zero eigenvalue and plays a key role for graph clustering. Graph signal processing focusses on the analysis of signals that are attributed to the graph nodes. The eigendecomposition of the graph Laplacian allows to define the graph Fourier transform and extend conventional signal-processing operations to graphs. Here, we introduce the design of Slepian graph signals, by maximizing energy concentration in a predefined subgraph for a graph spectral bandlimit. We establish a novel link with classical Laplacian embedding and graph clustering, which provides a meaning to localized graph frequencies. |
Tasks | Graph Clustering |
Published | 2017-01-29 |
URL | http://arxiv.org/abs/1701.08401v2 |
http://arxiv.org/pdf/1701.08401v2.pdf | |
PWC | https://paperswithcode.com/paper/when-slepian-meets-fiedler-putting-a-focus-on |
Repo | |
Framework | |
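The abstract above refers to the Fiedler vector, the Laplacian eigenvector belonging to the smallest non-zero eigenvalue, and its role in graph clustering. The following is a minimal sketch of that idea only, not of the Slepian design proposed in the paper. It assumes a small connected, unweighted graph and uses shifted power iteration with the constant vector projected out; the helper names and the example graph (two triangles joined by one edge) are hypothetical.

```python
def laplacian(n, edges):
    """Combinatorial graph Laplacian L = D - A for an undirected graph."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L

def fiedler_vector(L, iters=500):
    """Approximate the Fiedler vector by power iteration on (shift*I - L).

    Twice the maximum degree bounds the largest Laplacian eigenvalue, so
    after removing the constant component the dominant eigenvector of the
    shifted matrix is the Fiedler vector (assuming a connected graph).
    """
    n = len(L)
    shift = 2.0 * max(L[i][i] for i in range(n))
    v = [float(i) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the constant eigenvector
        w = [shift * v[i] - sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def rayleigh_quotient(L, v):
    """v^T L v / v^T v, which estimates the eigenvalue belonging to v."""
    n = len(L)
    num = sum(v[i] * L[i][j] * v[j] for i in range(n) for j in range(n))
    return num / sum(x * x for x in v)

# Two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
L = laplacian(6, edges)
v = fiedler_vector(L)
lam2 = rayleigh_quotient(L, v)
clusters = [0 if x < 0 else 1 for x in v]  # split nodes by sign
```

Splitting the nodes by the sign of their Fiedler entry separates the two triangles, which is the basic spectral bipartition the abstract alludes to.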
Adaptive Stochastic Dual Coordinate Ascent for Conditional Random Fields
Title | Adaptive Stochastic Dual Coordinate Ascent for Conditional Random Fields |
Authors | Rémi Le Priol, Alexandre Piché, Simon Lacoste-Julien |
Abstract | This work investigates the training of conditional random fields (CRFs) via the stochastic dual coordinate ascent (SDCA) algorithm of Shalev-Shwartz and Zhang (2016). SDCA enjoys a linear convergence rate and a strong empirical performance for binary classification problems. However, it has never been used to train CRFs. Yet it benefits from an ‘exact’ line search with a single marginalization oracle call, unlike previous approaches. In this paper, we adapt SDCA to train CRFs, and we enhance it with an adaptive non-uniform sampling strategy based on block duality gaps. We perform experiments on four standard sequence prediction tasks. SDCA demonstrates performances on par with the state of the art, and improves over it on three of the four datasets, which have in common the use of sparse features. |
Tasks | |
Published | 2017-12-22 |
URL | http://arxiv.org/abs/1712.08577v2 |
http://arxiv.org/pdf/1712.08577v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-stochastic-dual-coordinate-ascent |
Repo | |
Framework | |
Toward Controlled Generation of Text
Title | Toward Controlled Generation of Text |
Authors | Zhiting Hu, Zichao Yang, Xiaodan Liang, Ruslan Salakhutdinov, Eric P. Xing |
Abstract | Generic generation and manipulation of text is challenging and has limited success compared to recent deep generative modeling in visual domain. This paper aims at generating plausible natural language sentences, whose attributes are dynamically controlled by learning disentangled latent representations with designated semantics. We propose a new neural generative model which combines variational auto-encoders and holistic attribute discriminators for effective imposition of semantic structures. With differentiable approximation to discrete text samples, explicit constraints on independent attribute controls, and efficient collaborative learning of generator and discriminators, our model learns highly interpretable representations from even only word annotations, and produces realistic sentences with desired attributes. Quantitative evaluation validates the accuracy of sentence and attribute generation. |
Tasks | |
Published | 2017-03-02 |
URL | http://arxiv.org/abs/1703.00955v4 |
http://arxiv.org/pdf/1703.00955v4.pdf | |
PWC | https://paperswithcode.com/paper/toward-controlled-generation-of-text |
Repo | |
Framework | |
Learning crystal plasticity using digital image correlation: Examples from discrete dislocation dynamics
Title | Learning crystal plasticity using digital image correlation: Examples from discrete dislocation dynamics |
Authors | Stefanos Papanikolaou, Michail Tzimas, Andrew C. E. Reid, Stephen A. Langer |
Abstract | Digital image correlation (DIC) is a well-established, non-invasive technique for tracking and quantifying the deformation of mechanical samples under strain. While it provides an obvious way to observe incremental and aggregate displacement information, it seems likely that DIC data sets, which after all reflect the spatially-resolved response of a microstructure to loads, contain much richer information than has generally been extracted from them. In this paper, we demonstrate a machine-learning approach to quantifying the prior deformation history of a crystalline sample based on its response to a subsequent DIC test. This prior deformation history is encoded in the microstructure through the inhomogeneity of the dislocation microstructure, and in the spatial correlations of the dislocation patterns, which mediate the system’s response to the DIC test load. Our domain consists of deformed crystalline thin films generated by a discrete dislocation plasticity simulation. We explore the range of applicability of machine learning (ML) for typical experimental protocols, and as a function of possible size effects and stochasticity. Plasticity size effects may directly influence the data, rendering unsupervised techniques unable to distinguish different plasticity regimes. |
Tasks | |
Published | 2017-09-24 |
URL | http://arxiv.org/abs/1709.08225v2 |
http://arxiv.org/pdf/1709.08225v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-crystal-plasticity-using-digital |
Repo | |
Framework | |