Paper Group ANR 390
Teacher Improves Learning by Selecting a Training Subset. ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics. Multimodal Social Media Analysis for Gang Violence Prevention. Don’t get Lost in Negation: An Effective Negation Handled Dialogue Acts Prediction Algorithm for Twitter Customer Service Conversations. Evolutionary Architecture Search For Deep Multitask Networks. Towards Riemannian Accelerated Gradient Methods. Improved Semantic Stixels via Multimodal Sensor Fusion. FermiNets: Learning generative machines to generate efficient neural networks via generative synthesis. Leave-one-out Approach for Matrix Completion: Primal and Dual Analysis. Online Scoring with Delayed Information: A Convex Optimization Viewpoint. Orthogonally Regularized Deep Networks For Image Super-resolution. Joint Correction of Attenuation and Scatter Using Deep Convolutional Neural Networks (DCNN) for Time-of-Flight PET. Similarity-preserving Image-image Domain Adaptation for Person Re-identification. Automatic Exploration of Machine Learning Experiments on OpenML. A Tempt to Unify Heterogeneous Driving Databases using Traffic Primitives.
Teacher Improves Learning by Selecting a Training Subset
Title | Teacher Improves Learning by Selecting a Training Subset |
Authors | Yuzhe Ma, Robert Nowak, Philippe Rigollet, Xuezhou Zhang, Xiaojin Zhu |
Abstract | We call a learner super-teachable if a teacher can trim down an iid training set while making the learner learn even better. We provide sharp super-teaching guarantees on two learners: the maximum likelihood estimator for the mean of a Gaussian, and the large margin classifier in 1D. For general learners, we provide a mixed-integer nonlinear programming-based algorithm to find a super teaching set. Empirical experiments show that our algorithm is able to find good super-teaching sets for both regression and classification problems. |
Tasks | |
Published | 2018-02-25 |
URL | http://arxiv.org/abs/1802.08946v1 |
http://arxiv.org/pdf/1802.08946v1.pdf | |
PWC | https://paperswithcode.com/paper/teacher-improves-learning-by-selecting-a |
Repo | |
Framework | |
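
The subset-selection idea above can be illustrated with a tiny brute-force sketch for the Gaussian-mean learner: assuming the teacher knows the target mean (here `theta_star`), it searches every subset of a small iid sample for the one whose sample mean beats the full-sample MLE. This is a toy stand-in for the paper's mixed-integer nonlinear program, not the authors' implementation.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
theta_star = 0.0                       # assumed true mean the teacher knows
data = rng.normal(theta_star, 1.0, 8)  # small iid training set

full_mle = data.mean()                 # learner's estimate on the full set

# Teacher: brute-force search for the subset whose MLE (sample mean)
# lands closest to the true parameter -- a toy stand-in for the
# mixed-integer program used for general learners in the paper.
best_subset, best_err = None, abs(full_mle - theta_star)
for k in range(1, len(data) + 1):
    for idx in itertools.combinations(range(len(data)), k):
        err = abs(data[list(idx)].mean() - theta_star)
        if err < best_err:
            best_subset, best_err = idx, err

print("full-sample error :", abs(full_mle - theta_star))
print("super-teaching set:", best_subset, "error:", best_err)
```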
ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics
Title | ChainQueen: A Real-Time Differentiable Physical Simulator for Soft Robotics |
Authors | Yuanming Hu, Jiancheng Liu, Andrew Spielberg, Joshua B. Tenenbaum, William T. Freeman, Jiajun Wu, Daniela Rus, Wojciech Matusik |
Abstract | Physical simulators have been widely used in robot planning and control. Among them, differentiable simulators are particularly favored, as they can be incorporated into gradient-based optimization algorithms that are efficient in solving inverse problems such as optimal control and motion planning. Simulating deformable objects is, however, more challenging compared to rigid body dynamics. The underlying physical laws of deformable objects are more complex, and the resulting systems have orders of magnitude more degrees of freedom and therefore they are significantly more computationally expensive to simulate. Computing gradients with respect to physical design or controller parameters is typically even more computationally challenging. In this paper, we propose a real-time, differentiable hybrid Lagrangian-Eulerian physical simulator for deformable objects, ChainQueen, based on the Moving Least Squares Material Point Method (MLS-MPM). MLS-MPM can simulate deformable objects including contact and can be seamlessly incorporated into inference, control and co-design systems. We demonstrate that our simulator achieves high precision in both forward simulation and backward gradient computation. We have successfully employed it in a diverse set of control tasks for soft robots, including problems with nearly 3,000 decision variables. |
Tasks | Motion Planning |
Published | 2018-10-02 |
URL | http://arxiv.org/abs/1810.01054v1 |
http://arxiv.org/pdf/1810.01054v1.pdf | |
PWC | https://paperswithcode.com/paper/chainqueen-a-real-time-differentiable |
Repo | |
Framework | |
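
ChainQueen's key property is that task losses can be backpropagated through the simulator into control or design parameters. The sketch below illustrates that pattern with a toy explicit-Euler point-mass simulator in PyTorch; it is not MLS-MPM, and the time step, horizon, and target are illustrative assumptions.

```python
import torch

# Toy differentiable simulator: a point mass driven by a learnable force
# sequence, integrated with explicit Euler. This is *not* MLS-MPM, only an
# illustration of the pattern ChainQueen enables: backpropagating a task
# loss through the simulation into controller/design parameters.
dt, steps, target = 0.05, 40, torch.tensor([1.0, 0.5])
forces = torch.zeros(steps, 2, requires_grad=True)      # decision variables
opt = torch.optim.Adam([forces], lr=0.1)

for it in range(200):
    pos = torch.zeros(2)
    vel = torch.zeros(2)
    for t in range(steps):
        vel = vel + dt * forces[t]                       # a = F / m, with m = 1
        pos = pos + dt * vel
    loss = ((pos - target) ** 2).sum()                   # reach the target position
    opt.zero_grad()
    loss.backward()                                      # gradients through the rollout
    opt.step()

print("final position:", pos.detach(), "loss:", loss.item())
```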
Multimodal Social Media Analysis for Gang Violence Prevention
Title | Multimodal Social Media Analysis for Gang Violence Prevention |
Authors | Philipp Blandfort, Desmond Patton, William R. Frey, Svebor Karaman, Surabhi Bhargava, Fei-Tzin Lee, Siddharth Varia, Chris Kedzie, Michael B. Gaskell, Rossano Schifanella, Kathleen McKeown, Shih-Fu Chang |
Abstract | Gang violence is a severe issue in major cities across the U.S. and recent studies [Patton et al. 2017] have found evidence of social media communications that can be linked to such violence in communities with high rates of exposure to gang activity. In this paper we partnered computer scientists with social work researchers, who have domain expertise in gang violence, to analyze how public tweets with images posted by youth who mention gang associations on Twitter can be leveraged to automatically detect psychosocial factors and conditions that could potentially assist social workers and violence outreach workers in prevention and early intervention programs. To this end, we developed a rigorous methodology for collecting and annotating tweets. We gathered 1,851 tweets and accompanying annotations related to visual concepts and the psychosocial codes: aggression, loss, and substance use. These codes are relevant to social work interventions, as they represent possible pathways to violence on social media. We compare various methods for classifying tweets into these three classes, using only the text of the tweet, only the image of the tweet, or both modalities as input to the classifier. In particular, we analyze the usefulness of mid-level visual concepts and the role of different modalities for this tweet classification task. Our experiments show that individually, text information dominates classification performance of the loss class, while image information dominates the aggression and substance use classes. Our multimodal approach provides a very promising improvement (18% relative in mean average precision) over the best single modality approach. Finally, we also illustrate the complexity of understanding social media data and elaborate on open challenges. |
Tasks | |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.08465v1 |
http://arxiv.org/pdf/1807.08465v1.pdf | |
PWC | https://paperswithcode.com/paper/multimodal-social-media-analysis-for-gang |
Repo | |
Framework | |
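
A minimal sketch of the multimodal classification setup described above: fuse TF-IDF text features with precomputed image features and train a single classifier for one psychosocial code. The toy tweets, random image features, and logistic-regression model are placeholders, not the authors' data or models.

```python
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Placeholder data: tweet texts, precomputed image features, and binary labels
# for one psychosocial code (e.g. "loss"). All values here are illustrative.
texts = ["rest easy bro miss you", "new kicks today", "we outside tonight", "rip my dawg"]
image_feats = np.random.RandomState(0).rand(4, 16)   # stand-in CNN image features
labels = np.array([1, 0, 0, 1])

text_vec = TfidfVectorizer()
X_text = text_vec.fit_transform(texts)

# Early fusion: concatenate text and image features, train one classifier.
X_fused = hstack([X_text, csr_matrix(image_feats)])
clf = LogisticRegression(max_iter=1000).fit(X_fused, labels)
print(clf.predict(X_fused))
```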
Don’t get Lost in Negation: An Effective Negation Handled Dialogue Acts Prediction Algorithm for Twitter Customer Service Conversations
Title | Don’t get Lost in Negation: An Effective Negation Handled Dialogue Acts Prediction Algorithm for Twitter Customer Service Conversations |
Authors | Mansurul Bhuiyan, Amita Misra, Saurabh Tripathy, Jalal Mahmud, Rama Akkiraju |
Abstract | In the last several years, Twitter has been adopted by companies as an alternative platform for interacting with customers to address their concerns. With the abundance of such unconventional conversation resources, the push for developing effective virtual agents is stronger than ever. To address this challenge, a better understanding of such customer service conversations is required. Lately, there have been several works proposing novel taxonomies of fine-grained dialogue acts as well as developing algorithms for automatic detection of these acts. The outcomes of these works provide stepping stones toward the ultimate goal of building efficient and effective virtual agents. But none of these works incorporate the notion of negation into the proposed algorithms. In this work, we developed an SVM-based dialogue act prediction algorithm for Twitter customer service conversations where negation handling is an integral part of the end-to-end solution. For negation handling, we propose several efficient heuristics as well as adopt recent state-of-the-art third-party machine learning based solutions. Empirically, we show the model’s performance gain when handling negation compared to when we do not. Our experiments show that for informal text such as tweets, the heuristic-based approach is more effective. |
Tasks | |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.06107v1 |
http://arxiv.org/pdf/1807.06107v1.pdf | |
PWC | https://paperswithcode.com/paper/dont-get-lost-in-negation-an-effective |
Repo | |
Framework | |
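
A minimal sketch of a heuristic negation-handling plus SVM pipeline of the kind described above: tokens following a negation cue are suffixed with `_NEG` until the next punctuation mark, and the marked text feeds a linear SVM on bag-of-words features. The cue list, toy tweets, and dialogue-act labels are illustrative assumptions, not the paper's exact heuristics or taxonomy.

```python
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

NEG_CUES = {"not", "no", "never", "don't", "dont", "can't", "cant", "won't", "wont"}

def mark_negation(text):
    """Heuristic: suffix tokens after a negation cue with _NEG until punctuation."""
    out, negating = [], False
    for tok in re.findall(r"[\w']+|[.,!?;]", text.lower()):
        if tok in NEG_CUES:
            negating = True
            out.append(tok)
        elif tok in ".,!?;":
            negating = False
        else:
            out.append(tok + "_NEG" if negating else tok)
    return " ".join(out)

# Toy training data; the dialogue-act labels are illustrative.
tweets = ["my order has not arrived yet!", "thanks, the issue is resolved",
          "i can't log into my account", "great service today"]
acts = ["complaint", "thanking", "complaint", "thanking"]

X = CountVectorizer(token_pattern=r"[^ ]+").fit_transform(mark_negation(t) for t in tweets)
clf = LinearSVC().fit(X, acts)
```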
Evolutionary Architecture Search For Deep Multitask Networks
Title | Evolutionary Architecture Search For Deep Multitask Networks |
Authors | Jason Liang, Elliot Meyerson, Risto Miikkulainen |
Abstract | Multitask learning, i.e. learning several tasks at once with the same neural network, can improve performance in each of the tasks. Designing deep neural network architectures for multitask learning is a challenge: There are many ways to tie the tasks together, and the design choices matter. The size and complexity of this problem exceeds human design ability, making it a compelling domain for evolutionary optimization. Using the existing state of the art soft ordering architecture as the starting point, methods for evolving the modules of this architecture and for evolving the overall topology or routing between modules are evaluated in this paper. A synergetic approach of evolving custom routings with evolved, shared modules for each task is found to be very powerful, significantly improving the state of the art in the Omniglot multitask, multialphabet character recognition domain. This result demonstrates how evolution can be instrumental in advancing deep neural network and complex system design in general. |
Tasks | Neural Architecture Search, Omniglot |
Published | 2018-03-10 |
URL | http://arxiv.org/abs/1803.03745v2 |
http://arxiv.org/pdf/1803.03745v2.pdf | |
PWC | https://paperswithcode.com/paper/evolutionary-architecture-search-for-deep |
Repo | |
Framework | |
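
The evolutionary loop itself is simple to sketch: a population of routing genotypes is selected, mutated, and re-evaluated. In the sketch below the fitness function is a placeholder; in the paper it would be the validation accuracy of the multitask network assembled from shared modules under that routing, so this is only the skeleton of the search, not the authors' method.

```python
import random

random.seed(0)
N_MODULES, N_TASKS = 4, 3

def random_routing():
    # Genotype: for each task, a binary vector saying which shared modules it uses.
    return [[random.randint(0, 1) for _ in range(N_MODULES)] for _ in range(N_TASKS)]

def fitness(routing):
    # Placeholder fitness. In the paper this would be the validation accuracy of
    # a multitask network assembled from shared modules under this routing.
    return sum(sum(task) for task in routing) - 0.5 * sum(task[0] for task in routing)

def mutate(routing):
    child = [row[:] for row in routing]
    t, m = random.randrange(N_TASKS), random.randrange(N_MODULES)
    child[t][m] ^= 1                       # flip one routing bit
    return child

population = [random_routing() for _ in range(20)]
for gen in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]               # truncation selection
    population = parents + [mutate(random.choice(parents)) for _ in range(15)]

print("best routing:", max(population, key=fitness))
```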
Towards Riemannian Accelerated Gradient Methods
Title | Towards Riemannian Accelerated Gradient Methods |
Authors | Hongyi Zhang, Suvrit Sra |
Abstract | We propose a Riemannian version of Nesterov’s Accelerated Gradient algorithm (RAGD), and show that for geodesically smooth and strongly convex problems, within a neighborhood of the minimizer whose radius depends on the condition number as well as the sectional curvature of the manifold, RAGD converges to the minimizer with acceleration. Unlike the algorithm in (Liu et al., 2017), which requires the exact solution to a nonlinear equation that in turn may be intractable, our algorithm is constructive and computationally tractable. Our proof exploits a new estimate sequence and a novel bound on the nonlinear metric distortion, both of which may be of independent interest. |
Tasks | |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02812v1 |
http://arxiv.org/pdf/1806.02812v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-riemannian-accelerated-gradient |
Repo | |
Framework | |
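
The two geometric ingredients RAGD builds on, tangent-space (Riemannian) gradients and the exponential map, can be shown with plain Riemannian gradient descent on the unit sphere; the accelerated scheme itself is not reproduced here, and the objective (a quadratic form, minimized at the smallest eigenvector) is an illustrative choice.

```python
import numpy as np

# Plain Riemannian gradient descent on the unit sphere (not the accelerated
# RAGD method itself) -- it only illustrates the geometric ingredients the
# paper builds on: tangent-space gradients and the exponential map.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
A = A + A.T                                    # symmetric; minimize x^T A x on the sphere

def exp_map(x, v):
    """Exponential map on the sphere: follow the geodesic from x in direction v."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return x
    return np.cos(nv) * x + np.sin(nv) * v / nv

x = rng.standard_normal(5)
x /= np.linalg.norm(x)
for _ in range(500):
    egrad = 2 * A @ x                          # Euclidean gradient of x^T A x
    rgrad = egrad - (egrad @ x) * x            # project onto the tangent space at x
    x = exp_map(x, -0.01 * rgrad)

print("objective ~", x @ A @ x, "vs smallest eigenvalue", np.linalg.eigvalsh(A)[0])
```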
Improved Semantic Stixels via Multimodal Sensor Fusion
Title | Improved Semantic Stixels via Multimodal Sensor Fusion |
Authors | Florian Piewak, Peter Pinggera, Markus Enzweiler, David Pfeiffer, Marius Zöllner |
Abstract | This paper presents a compact and accurate representation of 3D scenes that are observed by a LiDAR sensor and a monocular camera. The proposed method is based on the well-established Stixel model originally developed for stereo vision applications. We extend this Stixel concept to incorporate data from multiple sensor modalities. The resulting mid-level fusion scheme takes full advantage of the geometric accuracy of LiDAR measurements as well as the high resolution and semantic detail of RGB images. The obtained environment model provides a geometrically and semantically consistent representation of the 3D scene at a significantly reduced amount of data while minimizing information loss at the same time. Since the different sensor modalities are considered as input to a joint optimization problem, the solution is obtained with only minor computational overhead. We demonstrate the effectiveness of the proposed multimodal Stixel algorithm on a manually annotated ground truth dataset. Our results indicate that the proposed mid-level fusion of LiDAR and camera data improves both the geometric and semantic accuracy of the Stixel model significantly while reducing the computational overhead as well as the amount of generated data in comparison to using a single modality on its own. |
Tasks | Sensor Fusion |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.08993v2 |
http://arxiv.org/pdf/1809.08993v2.pdf | |
PWC | https://paperswithcode.com/paper/improved-semantic-stixels-via-multimodal |
Repo | |
Framework | |
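
A minimal sketch of the mid-level fusion input: LiDAR points are projected into the camera image with a pinhole model and paired with the per-pixel semantic class at that location. The intrinsics, points, and random semantic map are illustrative; the Stixel optimization itself is not shown.

```python
import numpy as np

# Project LiDAR points (already in the camera frame) into the image and attach
# the per-pixel semantic label to each 3D point. Intrinsics and points are
# illustrative; this is only how depth and semantics get paired, not the
# joint Stixel optimization.
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])                     # pinhole intrinsics
points_cam = np.array([[1.0, 0.2, 8.0],
                       [-2.0, 0.1, 15.0],
                       [0.5, -0.3, 5.0]])           # LiDAR points in camera frame
semantics = np.random.RandomState(0).randint(0, 19, (480, 640))  # per-pixel classes

uvw = (K @ points_cam.T).T
uv = (uvw[:, :2] / uvw[:, 2:3]).astype(int)         # perspective division

for (u, v), (x, y, z) in zip(uv, points_cam):
    if 0 <= u < 640 and 0 <= v < 480:
        print(f"point at depth {z:.1f} m -> pixel ({u},{v}), class {semantics[v, u]}")
```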
FermiNets: Learning generative machines to generate efficient neural networks via generative synthesis
Title | FermiNets: Learning generative machines to generate efficient neural networks via generative synthesis |
Authors | Alexander Wong, Mohammad Javad Shafiee, Brendan Chwyl, Francis Li |
Abstract | The tremendous potential exhibited by deep learning is often offset by architectural and computational complexity, making widespread deployment a challenge for edge scenarios such as mobile and other consumer devices. To tackle this challenge, we explore the following idea: Can we learn generative machines to automatically generate deep neural networks with efficient network architectures? In this study, we introduce the idea of generative synthesis, which is premised on the intricate interplay between a generator-inquisitor pair that work in tandem to garner insights and learn to generate highly efficient deep neural networks that best satisfy operational requirements. What is most interesting is that, once a generator has been learned through generative synthesis, it can be used to generate not just one but a large variety of unique, highly efficient deep neural networks that satisfy operational requirements. Experimental results for image classification, semantic segmentation, and object detection tasks illustrate the efficacy of generative synthesis in producing generators that automatically generate highly efficient deep neural networks (which we nickname FermiNets) with higher model efficiency and lower computational costs (reaching >10x greater efficiency and fewer multiply-accumulate operations than several tested state-of-the-art networks), as well as higher energy efficiency (reaching >4x improvements in image inferences per joule consumed on an Nvidia Tegra X2 mobile processor). As such, generative synthesis can be a powerful, generalized approach for accelerating and improving the building of deep neural networks for on-device edge scenarios. |
Tasks | Image Classification, Object Detection, Semantic Segmentation |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.05989v2 |
http://arxiv.org/pdf/1809.05989v2.pdf | |
PWC | https://paperswithcode.com/paper/ferminets-learning-generative-machines-to |
Repo | |
Framework | |
Leave-one-out Approach for Matrix Completion: Primal and Dual Analysis
Title | Leave-one-out Approach for Matrix Completion: Primal and Dual Analysis |
Authors | Lijun Ding, Yudong Chen |
Abstract | In this paper, we introduce a powerful technique based on Leave-one-out analysis to the study of low-rank matrix completion problems. Using this technique, we develop a general approach for obtaining fine-grained, entrywise bounds for iterative stochastic procedures in the presence of probabilistic dependency. We demonstrate the power of this approach in analyzing two of the most important algorithms for matrix completion: (i) the non-convex approach based on Projected Gradient Descent (PGD) for a rank-constrained formulation, also known as the Singular Value Projection algorithm, and (ii) the convex relaxation approach based on nuclear norm minimization (NNM). Using this approach, we establish the first convergence guarantee for the original form of PGD without regularization or sample splitting, and in particular show that it converges linearly in the infinity norm. For NNM, we use this approach to study a fictitious iterative procedure that arises in the dual analysis. Our results show that NNM recovers a $d$-by-$d$ rank-$r$ matrix with $\mathcal{O}(\mu r \log(\mu r) d \log d)$ observed entries. This bound has optimal dependence on the matrix dimension and is independent of the condition number. To the best of our knowledge, this is the first sample complexity result for a tractable matrix completion algorithm that satisfies these two properties simultaneously. |
Tasks | Low-Rank Matrix Completion, Matrix Completion |
Published | 2018-03-20 |
URL | https://arxiv.org/abs/1803.07554v2 |
https://arxiv.org/pdf/1803.07554v2.pdf | |
PWC | https://paperswithcode.com/paper/the-leave-one-out-approach-for-matrix |
Repo | |
Framework | |
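
The rank-constrained PGD analyzed in the paper is the Singular Value Projection algorithm: a gradient step on the observed entries followed by a truncated-SVD projection onto rank-$r$ matrices. A compact numpy sketch with illustrative sizes and the standard $1/p$ step size:

```python
import numpy as np

# Singular Value Projection: gradient step on the observed entries, then
# projection onto the set of rank-r matrices via a truncated SVD.
rng = np.random.default_rng(0)
d, r, p = 60, 3, 0.3
M = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))   # rank-r ground truth
mask = rng.random((d, d)) < p                                    # observed entries

X = np.zeros((d, d))
eta = 1.0 / p                                                    # standard SVP step size
for _ in range(100):
    G = mask * (M - X)                 # negative gradient of 0.5*||P_Omega(M - X)||_F^2
    Y = X + eta * G
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    X = U[:, :r] * s[:r] @ Vt[:r]      # best rank-r approximation of Y

print("relative error:", np.linalg.norm(X - M) / np.linalg.norm(M))
```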
Online Scoring with Delayed Information: A Convex Optimization Viewpoint
Title | Online Scoring with Delayed Information: A Convex Optimization Viewpoint |
Authors | Avishek Ghosh, Kannan Ramchandran |
Abstract | We consider a system where agents enter in an online fashion and are evaluated based on their attributes or context vectors. There can be practical situations where this context is partially observed, and the unobserved part arrives after some delay. We assume that an agent, once left, cannot re-enter the system. Therefore, the job of the system is to provide an estimated score for the agent based on her instantaneous score and possibly some inference of the instantaneous score over the delayed score. In this paper, we estimate the delayed context via an online convex game between the agent and the system. We argue that the error in the score estimate accumulated over $T$ iterations is small if the regret of the online convex game is small. Further, we leverage side information about the delayed context in the form of a correlation function with the known context. We consider the settings where the delay is fixed or arbitrarily chosen by an adversary. Furthermore, we extend the formulation to the setting where the contexts are drawn from some Banach space. Overall, we show that the average penalty for not knowing the delayed context while making a decision scales as $\mathcal{O}(\frac{1}{\sqrt{T}})$, which can be improved to $\mathcal{O}(\frac{\log T}{T})$ under a special setting. |
Tasks | |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03379v1 |
http://arxiv.org/pdf/1807.03379v1.pdf | |
PWC | https://paperswithcode.com/paper/online-scoring-with-delayed-information-a |
Repo | |
Framework | |
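
A generic online-gradient-descent loop of the kind such a regret analysis applies to: commit to a score using the observed context, suffer a convex loss once the delayed part arrives, and take a step with an $O(1/\sqrt{t})$ learning rate. The data model below is an assumption for illustration, not the paper's setting.

```python
import numpy as np

# Online gradient descent against a delayed signal: predict from the observed
# context, then update once the delayed context is revealed and the convex
# loss can be evaluated. The correlation structure here is illustrative.
rng = np.random.default_rng(1)
T, dim = 2000, 5
w = np.zeros(dim)                       # estimate of the delayed-context map
total_loss = 0.0

for t in range(1, T + 1):
    x_obs = rng.standard_normal(dim)                          # observed context
    x_delayed = 0.5 * x_obs + 0.1 * rng.standard_normal(dim)  # revealed later
    pred = w @ x_obs                            # score using the correlation estimate
    loss = 0.5 * (pred - x_delayed.sum()) ** 2  # convex loss once the delay resolves
    grad = (pred - x_delayed.sum()) * x_obs
    w -= grad / np.sqrt(t)                      # step size ~ 1/sqrt(t)
    total_loss += loss

print("average per-round loss:", total_loss / T)
```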
Orthogonally Regularized Deep Networks For Image Super-resolution
Title | Orthogonally Regularized Deep Networks For Image Super-resolution |
Authors | Tiantong Guo, Hojjat S. Mousavi, Vishal Monga |
Abstract | Deep learning methods, in particular trained Convolutional Neural Networks (CNNs) have recently been shown to produce compelling state-of-the-art results for single image Super-Resolution (SR). Invariably, a CNN is learned to map the low resolution (LR) image to its corresponding high resolution (HR) version in the spatial domain. Aiming for faster inference and more efficient solutions than solving the SR problem in the spatial domain, we propose a novel network structure for learning the SR mapping function in an image transform domain, specifically the Discrete Cosine Transform (DCT). As a first contribution, we show that DCT can be integrated into the network structure as a Convolutional DCT (CDCT) layer. We further extend the network to allow the CDCT layer to become trainable (i.e. optimizable). Because this layer represents an image transform, we enforce pairwise orthogonality constraints on the individual basis functions/filters. This Orthogonally Regularized Deep SR network (ORDSR) simplifies the SR task by taking advantage of image transform domain while adapting the design of transform basis to the training image set. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-02-06 |
URL | http://arxiv.org/abs/1802.02018v1 |
http://arxiv.org/pdf/1802.02018v1.pdf | |
PWC | https://paperswithcode.com/paper/orthogonally-regularized-deep-networks-for |
Repo | |
Framework | |
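
The CDCT idea can be sketched as a convolution layer whose filters are initialized to the 2D DCT-II basis and kept near-orthogonal by a pairwise-orthogonality penalty. The PyTorch snippet below shows that construction for an 8x8 block transform; the full ORDSR network around it is not reproduced, and the penalty weight and input are illustrative.

```python
import math
import torch
import torch.nn.functional as F

def dct_filters(n=8):
    """2D DCT-II basis as n*n conv filters of size n x n (CDCT initialization)."""
    k = torch.arange(n).float()
    basis = torch.cos(math.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    basis[0] *= 1.0 / math.sqrt(2)
    basis *= math.sqrt(2.0 / n)
    # outer products of the 1D bases -> (n*n, 1, n, n) filter bank
    return torch.einsum('ij,kl->ikjl', basis, basis).reshape(n * n, 1, n, n)

W = torch.nn.Parameter(dct_filters(8))            # trainable transform, DCT init
img = torch.rand(1, 1, 32, 32)
coeffs = F.conv2d(img, W, stride=8)               # block-wise transform coefficients

# Pairwise-orthogonality regularizer on the basis filters: ||W W^T - I||_F^2
flat = W.reshape(W.shape[0], -1)
ortho_penalty = ((flat @ flat.t() - torch.eye(flat.shape[0])) ** 2).sum()
print(coeffs.shape, ortho_penalty.item())         # penalty ~ 0 at the DCT init
```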
Joint Correction of Attenuation and Scatter Using Deep Convolutional Neural Networks (DCNN) for Time-of-Flight PET
Title | Joint Correction of Attenuation and Scatter Using Deep Convolutional Neural Networks (DCNN) for Time-of-Flight PET |
Authors | Jaewon Yang, Dookun Park, Jae Ho Sohn, Zhen Jane Wang, Grant T. Gullberg, Youngho Seo |
Abstract | Deep convolutional neural networks (DCNN) have demonstrated their capability to convert MR images to pseudo CT for PET attenuation correction in PET/MRI. Conventionally, attenuated events are corrected in sinogram space using attenuation maps derived from CT or MR-derived pseudo CT. Separately, scattered events are iteratively estimated by a 3D model-based simulation using down-sampled attenuation and emission sinograms. However, no studies have investigated joint correction of attenuation and scatter using DCNN in image space. Therefore, we aim to develop and optimize a DCNN model for attenuation and scatter correction (ASC) simultaneously in PET image space without additional anatomical imaging or time-consuming iterative scatter simulation. For the first time, we demonstrated the feasibility of directly producing PET images corrected for attenuation and scatter using DCNN (PET-DCNN) from non-corrected PET (PET-NC) images. |
Tasks | |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1811.11852v1 |
http://arxiv.org/pdf/1811.11852v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-correction-of-attenuation-and-scatter |
Repo | |
Framework | |
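
The core of the approach is an image-to-image CNN trained to map non-corrected PET to attenuation- and scatter-corrected PET. The sketch below uses a small generic convolutional stack, an L1 loss, and random tensors as placeholders for the PET-NC inputs and the corrected references; it is not the authors' architecture or data.

```python
import torch
import torch.nn as nn

# Minimal image-to-image CNN of the kind described above: non-corrected PET
# slices in, attenuation- and scatter-corrected slices out. The architecture
# and the random tensors are illustrative placeholders.
model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()

pet_nc = torch.rand(4, 1, 64, 64)       # non-corrected PET (placeholder)
pet_ref = torch.rand(4, 1, 64, 64)      # corrected reference images (placeholder)

for step in range(100):
    pred = model(pet_nc)
    loss = loss_fn(pred, pet_ref)
    opt.zero_grad()
    loss.backward()
    opt.step()
```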
Similarity-preserving Image-image Domain Adaptation for Person Re-identification
Title | Similarity-preserving Image-image Domain Adaptation for Person Re-identification |
Authors | Weijian Deng, Liang Zheng, Qixiang Ye, Yi Yang, Jianbin Jiao |
Abstract | This article studies the domain adaptation problem in person re-identification (re-ID) under a “learning via translation” framework, consisting of two components, 1) translating the labeled images from the source to the target domain in an unsupervised manner, 2) learning a re-ID model using the translated images. The objective is to preserve the underlying human identity information after image translation, so that translated images with labels are effective for feature learning on the target domain. To this end, we propose a similarity preserving generative adversarial network (SPGAN) and its end-to-end trainable version, eSPGAN. Both aiming at similarity preserving, SPGAN enforces this property by heuristic constraints, while eSPGAN does so by optimally facilitating the re-ID model learning. More specifically, SPGAN separately undertakes the two components in the “learning via translation” framework. It first preserves two types of unsupervised similarity, namely, self-similarity of an image before and after translation, and domain-dissimilarity of a translated source image and a target image. It then learns a re-ID model using existing networks. In comparison, eSPGAN seamlessly integrates image translation and re-ID model learning. During the end-to-end training of eSPGAN, re-ID learning guides image translation to preserve the underlying identity information of an image. Meanwhile, image translation improves re-ID learning by providing identity-preserving training samples of the target domain style. In the experiment, we show that identities of the fake images generated by SPGAN and eSPGAN are well preserved. Based on this, we report the new state-of-the-art domain adaptation results on two large-scale person re-ID datasets. |
Tasks | Domain Adaptation, Person Re-Identification |
Published | 2018-11-26 |
URL | https://arxiv.org/abs/1811.10551v2 |
https://arxiv.org/pdf/1811.10551v2.pdf | |
PWC | https://paperswithcode.com/paper/similarity-preserving-image-image-domain |
Repo | |
Framework | |
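
SPGAN's two heuristic constraints can be written down directly given an embedding network: a self-similarity term pulls an image toward its translated version, and a domain-dissimilarity term pushes a translated source image away from target-domain images. The embedding, margin, and random tensors below are illustrative assumptions, not the paper's network or training data.

```python
import torch
import torch.nn.functional as F

# Sketch of the similarity-preserving terms, given some embedding network
# `embed` (a placeholder here). Self-similarity: image vs. its translation.
# Domain dissimilarity: translated source image vs. arbitrary target images.
embed = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 128))

src = torch.rand(8, 3, 64, 64)            # source-domain images
src2tgt = torch.rand(8, 3, 64, 64)        # their translations (generator output)
tgt = torch.rand(8, 3, 64, 64)            # unrelated target-domain images

f_src, f_s2t, f_tgt = (F.normalize(embed(x), dim=1) for x in (src, src2tgt, tgt))

d_pos = (f_src - f_s2t).pow(2).sum(dim=1)            # should stay small
d_neg = (f_s2t - f_tgt).pow(2).sum(dim=1)            # should stay large
margin = 2.0                                         # illustrative margin
sim_preserving_loss = d_pos.mean() + F.relu(margin - d_neg).mean()
print(sim_preserving_loss.item())
```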
Automatic Exploration of Machine Learning Experiments on OpenML
Title | Automatic Exploration of Machine Learning Experiments on OpenML |
Authors | Daniel Kühn, Philipp Probst, Janek Thomas, Bernd Bischl |
Abstract | Understanding the influence of hyperparameters on the performance of a machine learning algorithm is an important scientific topic in itself and can help to improve automatic hyperparameter tuning procedures. Unfortunately, experimental meta data for this purpose is still rare. This paper presents a large, free and open dataset addressing this problem, containing results on 38 OpenML data sets, six different machine learning algorithms and many different hyperparameter configurations. Results were generated by an automated random sampling strategy, termed the OpenML Random Bot. Each algorithm was cross-validated up to 20,000 times per dataset with different hyperparameter settings, resulting in a meta dataset of around 2.5 million experiments overall. |
Tasks | |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.10961v3 |
http://arxiv.org/pdf/1806.10961v3.pdf | |
PWC | https://paperswithcode.com/paper/automatic-exploration-of-machine-learning |
Repo | |
Framework | |
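
The random-sampling strategy behind the bot is easy to sketch: draw a random hyperparameter configuration, cross-validate it, and store the result as one row of experimental meta data. The search space and dataset below are illustrative stand-ins; the actual bot runs on OpenML tasks across six algorithms.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Random-bot-style loop: sample a configuration, cross-validate, log the result.
rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
meta_data = []

for run in range(20):
    config = {
        "n_estimators": int(rng.integers(10, 200)),
        "max_depth": int(rng.integers(2, 12)),
        "min_samples_split": int(rng.integers(2, 20)),
    }
    score = cross_val_score(RandomForestClassifier(**config, random_state=0),
                            X, y, cv=5).mean()
    meta_data.append({**config, "cv_accuracy": score})

print(max(meta_data, key=lambda row: row["cv_accuracy"]))
```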
A Tempt to Unify Heterogeneous Driving Databases using Traffic Primitives
Title | A Tempt to Unify Heterogeneous Driving Databases using Traffic Primitives |
Authors | Jiacheng Zhu, Wenshuo Wang, Ding Zhao |
Abstract | A multitude of publicly available driving datasets and data platforms have been released for autonomous vehicles (AVs). However, the heterogeneity of these databases in size, structure and driving context makes existing datasets practically ineffective due to a lack of uniform frameworks and searchable indexes. In order to overcome these limitations of existing public datasets, this paper proposes a data unification framework based on traffic primitives with the ability to automatically unify and label heterogeneous traffic data. This is achieved in two steps: 1) carefully arranging raw multidimensional time series driving data into a relational database and then 2) automatically extracting labeled and indexed traffic primitives from traffic data through a Bayesian nonparametric learning method. Finally, we evaluate the effectiveness of our developed framework using collected real-world vehicle data. |
Tasks | Autonomous Vehicles, Time Series |
Published | 2018-05-13 |
URL | http://arxiv.org/abs/1805.04925v1 |
http://arxiv.org/pdf/1805.04925v1.pdf | |
PWC | https://paperswithcode.com/paper/a-tempt-to-unify-heterogeneous-driving |
Repo | |
Framework | |
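
As a simplified stand-in for the primitive-extraction step, a Dirichlet-process Gaussian mixture can cluster multidimensional driving frames into "primitives" while pruning unused components. The paper's method is a Bayesian nonparametric sequence model, so the sketch below only illustrates the unsupervised-labeling idea, on synthetic data with invented feature channels.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Cluster synthetic driving frames (speed, yaw rate, gap to lead vehicle) into
# "primitives" with a Dirichlet-process mixture that prunes unused components.
# This is a per-frame stand-in, not the paper's nonparametric sequence model.
rng = np.random.default_rng(0)
cruise = rng.normal([25.0, 0.0, 40.0], 0.5, (300, 3))
braking = rng.normal([10.0, 0.0, 10.0], 0.5, (300, 3))
turning = rng.normal([15.0, 0.3, 30.0], 0.5, (300, 3))
frames = np.vstack([cruise, braking, turning])

dpgmm = BayesianGaussianMixture(
    n_components=10, weight_concentration_prior_type="dirichlet_process",
    random_state=0).fit(frames)
labels = dpgmm.predict(frames)
print("primitives found:", np.unique(labels))
```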