Paper Group ANR 449
Laying Down the Yellow Brick Road: Development of a Wizard-of-Oz Interface for Collecting Human-Robot Dialogue
Title | Laying Down the Yellow Brick Road: Development of a Wizard-of-Oz Interface for Collecting Human-Robot Dialogue |
Authors | Claire Bonial, Matthew Marge, Ron Artstein, Ashley Foots, Felix Gervits, Cory J. Hayes, Cassidy Henry, Susan G. Hill, Anton Leuski, Stephanie M. Lukin, Pooja Moolchandani, Kimberly A. Pollard, David Traum, Clare R. Voss |
Abstract | We describe the adaptation and refinement of a graphical user interface designed to facilitate a Wizard-of-Oz (WoZ) approach to collecting human-robot dialogue data. The data collected will be used to develop a dialogue system for robot navigation. Building on an interface previously used in the development of dialogue systems for virtual agents and video playback, we add templates with open parameters which allow the wizard to quickly produce a wide variety of utterances. Our research demonstrates that this approach to data collection is viable as an intermediate step in developing a dialogue system for physical robots located remotely from their users - a domain in which the human and robot need to regularly verify and update a shared understanding of the physical environment. We show that our WoZ interface and the fixed set of utterances and templates therein provide for a natural pace of dialogue with good coverage of the navigation domain. |
Tasks | Robot Navigation |
Published | 2017-10-17 |
URL | http://arxiv.org/abs/1710.06406v1 |
PDF | http://arxiv.org/pdf/1710.06406v1.pdf |
PWC | https://paperswithcode.com/paper/laying-down-the-yellow-brick-road-development |
Repo | |
Framework | |
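The templates with open parameters mentioned in the abstract lend themselves to a short illustration. Below is a minimal sketch in Python, assuming a simple slot-filling scheme; the template strings and slot names are invented for illustration and are not taken from the paper's interface.

```python
# Hypothetical utterance templates with open parameters; the wizard picks
# a template and fills its slots to produce a response quickly.
TEMPLATES = {
    "move": "Moving {distance} meters {direction}.",
    "confirm": "I see {object}. Should I continue?",
    "clarify": "How far should I {action}?",
}

def wizard_utterance(key: str, **slots) -> str:
    """Fill a template's open parameters with the wizard's choices."""
    return TEMPLATES[key].format(**slots)

print(wizard_utterance("move", distance=3, direction="forward"))
# -> "Moving 3 meters forward."
```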
Integrating Specialized Classifiers Based on Continuous Time Markov Chain
Title | Integrating Specialized Classifiers Based on Continuous Time Markov Chain |
Authors | Zhizhong Li, Dahua Lin |
Abstract | Specialized classifiers, namely those dedicated to a subset of classes, are often adopted in real-world recognition systems. However, integrating such classifiers is nontrivial. Existing methods, e.g. weighted averaging, usually implicitly assume that all constituents of an ensemble cover the same set of classes. Such methods can produce misleading predictions when used to combine specialized classifiers. This work explores a novel approach. Instead of combining predictions from individual classifiers directly, it first decomposes the predictions into sets of pairwise preferences, treats them as transition channels between classes, constructs a continuous-time Markov chain thereon, and uses the equilibrium distribution of this chain as the final prediction. This approach allows us to form a coherent picture over all specialized predictions. On large public datasets, the proposed method obtains considerable improvement compared to mainstream ensemble methods, especially when the classifier coverage is highly unbalanced. |
Tasks | |
Published | 2017-09-07 |
URL | http://arxiv.org/abs/1709.02123v1 |
PDF | http://arxiv.org/pdf/1709.02123v1.pdf |
PWC | https://paperswithcode.com/paper/integrating-specialized-classifiers-based-on |
Repo | |
Framework | |
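The central computation here, taking the equilibrium distribution of a continuous-time Markov chain built from pairwise preferences, can be sketched briefly. A minimal sketch follows, assuming pairwise preferences have already been converted into a valid rate matrix (rows summing to zero); the rate construction is illustrative, not the paper's exact scheme.

```python
import numpy as np

def equilibrium(Q: np.ndarray) -> np.ndarray:
    """Solve pi @ Q = 0 with sum(pi) = 1 for a valid rate matrix Q."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])    # stack the normalization constraint
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Toy example: off-diagonal rates q[i, j] encode preference for class j
# over class i; each row sums to zero as a CTMC generator requires.
Q = np.array([[-3.0, 2.0, 1.0],
              [0.5, -1.5, 1.0],
              [0.5, 0.5, -1.0]])
print(equilibrium(Q))                   # stationary class distribution
```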
Towards dense volumetric pancreas segmentation in CT using 3D fully convolutional networks
Title | Towards dense volumetric pancreas segmentation in CT using 3D fully convolutional networks |
Authors | Holger Roth, Masahiro Oda, Natsuki Shimizu, Hirohisa Oda, Yuichiro Hayashi, Takayuki Kitasaka, Michitaka Fujiwara, Kazunari Misawa, Kensaku Mori |
Abstract | Pancreas segmentation in computed tomography imaging has been historically difficult for automated methods because of the large shape and size variations between patients. In this work, we describe a custom-built 3D fully convolutional network (FCN) that can process a 3D image including the whole pancreas and produce an automatic segmentation. We investigate two variations of the 3D FCN architecture: one with concatenation and one with summation skip connections to the decoder part of the network. We evaluate our methods on a dataset from a clinical trial with gastric cancer patients, comprising 147 contrast-enhanced abdominal CT scans acquired in the portal venous phase. Using the summation architecture, we achieve an average Dice score of 89.7 $\pm$ 3.8 (range [79.8, 94.8]) % in testing, establishing new state-of-the-art performance in pancreas segmentation on this dataset. |
Tasks | Pancreas Segmentation |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06439v2 |
PDF | http://arxiv.org/pdf/1711.06439v2.pdf |
PWC | https://paperswithcode.com/paper/towards-dense-volumetric-pancreas |
Repo | |
Framework | |
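The two skip-connection variants the paper compares, concatenation versus summation into the decoder, are easy to sketch. A minimal PyTorch sketch follows; channel sizes and layer choices are illustrative rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """One decoder stage with either a concatenation or summation skip."""
    def __init__(self, channels: int, mode: str = "sum"):
        super().__init__()
        self.mode = mode
        in_ch = channels * 2 if mode == "concat" else channels
        self.conv = nn.Conv3d(in_ch, channels, kernel_size=3, padding=1)

    def forward(self, decoder_feat, encoder_feat):
        if self.mode == "concat":
            x = torch.cat([decoder_feat, encoder_feat], dim=1)  # doubles C
        else:  # summation skip: element-wise add, channels must match
            x = decoder_feat + encoder_feat
        return torch.relu(self.conv(x))

block = DecoderBlock(channels=16, mode="sum")
d = torch.randn(1, 16, 8, 32, 32)   # (N, C, D, H, W) decoder features
e = torch.randn(1, 16, 8, 32, 32)   # matching encoder features
print(block(d, e).shape)            # torch.Size([1, 16, 8, 32, 32])
```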
Improving Deep Pancreas Segmentation in CT and MRI Images via Recurrent Neural Contextual Learning and Direct Loss Function
Title | Improving Deep Pancreas Segmentation in CT and MRI Images via Recurrent Neural Contextual Learning and Direct Loss Function |
Authors | Jinzheng Cai, Le Lu, Yuanpu Xie, Fuyong Xing, Lin Yang |
Abstract | Deep neural networks have demonstrated very promising performance on accurate segmentation of challenging organs (e.g., pancreas) in abdominal CT and MRI scans. The current deep learning approaches conduct pancreas segmentation by processing sequences of 2D image slices independently through deep, dense per-pixel masking for each image, without explicitly enforcing spatial consistency constraints on the segmentation of successive slices. We propose a new convolutional/recurrent neural network architecture to address the contextual learning and segmentation consistency problem. A deep convolutional sub-network is first designed and pre-trained from scratch. The output layer of this network module is then connected to recurrent layers and can be fine-tuned for contextual learning, in an end-to-end manner. Our recurrent sub-network is a type of long short-term memory (LSTM) network that performs segmentation on an image by integrating its neighboring slice segmentation predictions, as a form of dependent sequence processing. Additionally, a novel segmentation-direct loss function (named Jaccard loss) is proposed, and deep networks are trained to optimize the Jaccard index (JI) directly. Extensive experiments are conducted to validate our proposed deep models, on quantitative pancreas segmentation using both CT and MRI scans. Our method outperforms the state-of-the-art work on CT [11] and MRI pancreas segmentation [1], respectively. |
Tasks | Pancreas Segmentation |
Published | 2017-07-16 |
URL | http://arxiv.org/abs/1707.04912v2 |
PDF | http://arxiv.org/pdf/1707.04912v2.pdf |
PWC | https://paperswithcode.com/paper/improving-deep-pancreas-segmentation-in-ct |
Repo | |
Framework | |
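The Jaccard loss is the one component here with a standard differentiable form. Below is a hedged sketch of a soft Jaccard (IoU) loss in the spirit of the paper's proposal; the smoothing constant `eps` is an assumption.

```python
import torch

def jaccard_loss(pred: torch.Tensor, target: torch.Tensor,
                 eps: float = 1e-6) -> torch.Tensor:
    """Soft Jaccard loss. pred: probabilities in [0, 1]; target: binary mask."""
    pred, target = pred.flatten(), target.flatten()
    intersection = (pred * target).sum()
    union = pred.sum() + target.sum() - intersection
    return 1.0 - (intersection + eps) / (union + eps)

pred = torch.rand(1, 1, 64, 64, requires_grad=True)     # network output
target = (torch.rand(1, 1, 64, 64) > 0.5).float()       # ground-truth mask
loss = jaccard_loss(pred, target)
loss.backward()    # gradients optimize the Jaccard index directly
```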
A model for interpreting social interactions in local image regions
Title | A model for interpreting social interactions in local image regions |
Authors | Guy Ben-Yosef, Alon Yachin, Shimon Ullman |
Abstract | Understanding social interactions (such as ‘hug’ or ‘fight’) is a basic and important capacity of the human visual system, but a challenging and still open problem for modeling. In this work we study visual recognition of social interactions, based on small but recognizable local regions. The approach is based on two novel key components: (i) A given social interaction can be recognized reliably from reduced images (called ‘minimal images’). (ii) The recognition of a social interaction depends on identifying components and relations within the minimal image (termed ‘interpretation’). We show psychophysics data for minimal images and modeling results for their interpretation. We discuss the integration of minimal configurations in recognizing social interactions in a detailed, high-resolution image. |
Tasks | |
Published | 2017-12-26 |
URL | http://arxiv.org/abs/1712.09299v1 |
PDF | http://arxiv.org/pdf/1712.09299v1.pdf |
PWC | https://paperswithcode.com/paper/a-model-for-interpreting-social-interactions |
Repo | |
Framework | |
On Fairness and Calibration
Title | On Fairness and Calibration |
Authors | Geoff Pleiss, Manish Raghavan, Felix Wu, Jon Kleinberg, Kilian Q. Weinberger |
Abstract | The machine learning community has become increasingly concerned with the potential for bias and discrimination in predictive models. This has motivated a growing line of work on what it means for a classification procedure to be “fair.” In this paper, we investigate the tension between minimizing error disparity across different population groups while maintaining calibrated probability estimates. We show that calibration is compatible only with a single error constraint (i.e. equal false-negative rates across groups), and show that any algorithm that satisfies this relaxation is no better than randomizing a percentage of predictions for an existing classifier. These unsettling findings, which extend and generalize existing results, are empirically confirmed on several datasets. |
Tasks | Calibration |
Published | 2017-09-06 |
URL | http://arxiv.org/abs/1709.02012v2 |
PDF | http://arxiv.org/pdf/1709.02012v2.pdf |
PWC | https://paperswithcode.com/paper/on-fairness-and-calibration |
Repo | |
Framework | |
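The paper's negative result, that any calibration-preserving scheme reduces to randomizing a fraction of an existing classifier's predictions, can be illustrated with a toy post-processor. A sketch follows, assuming synthetic calibrated scores for one group; the 30% randomization fraction is arbitrary.

```python
import numpy as np

def fn_rate(scores, labels, thresh=0.5):
    """False-negative rate: positives whose score falls below threshold."""
    return float(np.mean(scores[labels] < thresh))

def randomize_fraction(scores, frac, base_rate, rng):
    """Withhold information for a random fraction of inputs by outputting
    the group base rate, which preserves calibration on average."""
    out = scores.copy()
    out[rng.random(len(scores)) < frac] = base_rate
    return out

rng = np.random.default_rng(0)
scores = rng.random(10_000)              # calibrated scores for one group
labels = rng.random(10_000) < scores     # labels drawn to match the scores
adjusted = randomize_fraction(scores, 0.3, labels.mean(), rng)
print(fn_rate(scores, labels), fn_rate(adjusted, labels))
# The group's false-negative rate shifts toward that of the base-rate
# prediction, which is how the error constraint can be equalized.
```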
SparCE: Sparsity aware General Purpose Core Extensions to Accelerate Deep Neural Networks
Title | SparCE: Sparsity aware General Purpose Core Extensions to Accelerate Deep Neural Networks |
Authors | Sanchari Sen, Shubham Jain, Swagath Venkataramani, Anand Raghunathan |
Abstract | Deep Neural Networks (DNNs) have emerged as the method of choice for solving a wide range of machine learning tasks. The enormous computational demands posed by DNNs have most commonly been addressed through the design of custom accelerators. However, these accelerators are prohibitive in many design scenarios (e.g., wearable devices and IoT sensors), due to stringent area/cost constraints. Accelerating DNNs on these low-power systems, comprising mainly general-purpose processor (GPP) cores, requires new approaches. We improve the performance of DNNs on GPPs by exploiting a key attribute of DNNs, i.e., sparsity. We propose Sparsity aware Core Extensions (SparCE), a set of micro-architectural and ISA extensions that leverage sparsity and are minimally intrusive and low-overhead. We dynamically detect zero operands and skip a set of future instructions that use them. Our design ensures that the instructions to be skipped are prevented from even being fetched, as squashing instructions comes with a penalty. SparCE consists of two key micro-architectural enhancements: a Sparsity Register File (SpRF) that tracks zero registers and a Sparsity aware Skip Address (SASA) table that indicates instructions to be skipped. When an instruction is fetched, SparCE dynamically pre-identifies whether the following instruction(s) can be skipped and appropriately modifies the program counter, thereby skipping the redundant instructions and improving performance. We model SparCE using the gem5 architectural simulator, and evaluate our approach on 6 image-recognition DNNs in the context of both training and inference using the Caffe framework. On a scalar microprocessor, SparCE achieves a 19%-31% reduction in application-level execution time. We also evaluate SparCE on a 4-way SIMD ARMv8 processor using the OpenBLAS library, and demonstrate that SparCE achieves an 8%-15% reduction in application-level execution time. |
Tasks | |
Published | 2017-11-07 |
URL | http://arxiv.org/abs/1711.06315v2 |
PDF | http://arxiv.org/pdf/1711.06315v2.pdf |
PWC | https://paperswithcode.com/paper/sparce-sparsity-aware-general-purpose-core |
Repo | |
Framework | |
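The skip mechanism can be mimicked with a tiny functional (not cycle-accurate) simulation. The sketch below assumes an invented instruction format; the SpRF and SASA table are modeled as a plain set and a per-instruction skip length.

```python
# Functional sketch of the SparCE idea: track which registers hold zero
# (the SpRF) and skip instruction sequences whose result would be
# unchanged, e.g. a multiply-accumulate on a zero operand. The
# instruction tuple format and skip lengths are invented for illustration.

def run(program, regs):
    zero = {r for r, v in regs.items() if v == 0}   # sparsity register file
    pc, skipped = 0, 0
    while pc < len(program):
        op, dst, src1, src2, skip_len = program[pc]
        if op == "mac" and (src1 in zero or src2 in zero):
            pc += skip_len                          # skip the redundant MAC
            skipped += skip_len
            continue
        regs[dst] += regs[src1] * regs[src2]        # the MAC itself
        if regs[dst] == 0:
            zero.add(dst)
        else:
            zero.discard(dst)
        pc += 1
    return regs, skipped

prog = [("mac", "r3", "r1", "r2", 1),   # executes: r3 += r1 * r2
        ("mac", "r3", "r0", "r2", 1)]   # skipped: r0 holds zero
regs = {"r0": 0.0, "r1": 2.0, "r2": 3.0, "r3": 0.0}
print(run(prog, regs))                  # ({..., 'r3': 6.0}, 1)
```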
Inductive Conformal Martingales for Change-Point Detection
Title | Inductive Conformal Martingales for Change-Point Detection |
Authors | Denis Volkhonskiy, Ilia Nouretdinov, Alexander Gammerman, Vladimir Vovk, Evgeny Burnaev |
Abstract | We consider the problem of quickest change-point detection in data streams. Classical change-point detection procedures, such as CUSUM, Shiryaev-Roberts and Posterior Probability statistics, are optimal only if the change-point model is known, which is an unrealistic assumption in typical applied problems. Instead, we propose a new method for change-point detection based on Inductive Conformal Martingales, which requires only the independence and identical distribution of observations. We compare the proposed approach to standard methods, as well as to change-point detection oracles, which model a typical practical situation when we have only imprecise (albeit parametric) information about pre- and post-change data distributions. The results of this comparison provide evidence that change-point detection based on Inductive Conformal Martingales is an efficient tool, capable of working under quite general conditions, unlike traditional approaches. |
Tasks | Change Point Detection |
Published | 2017-06-11 |
URL | http://arxiv.org/abs/1706.03415v1 |
PDF | http://arxiv.org/pdf/1706.03415v1.pdf |
PWC | https://paperswithcode.com/paper/inductive-conformal-martingales-for-change |
Repo | |
Framework | |
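The pipeline is compact enough to sketch: conformal p-values from a nonconformity score are fed into a test martingale, and an alarm is raised when the martingale grows large. In the sketch below, the nonconformity measure (distance to the training mean) and the power-martingale betting function with eps = 0.5 are standard but illustrative choices; a careful ICM would use a calibration split separate from the training set.

```python
import numpy as np

def conformal_pvalue(score, calib_scores, rng):
    """Smoothed conformal p-value of `score` against a fixed
    calibration set of nonconformity scores."""
    greater = np.sum(calib_scores > score)
    equal = np.sum(calib_scores == score) + 1      # include the point itself
    return (greater + rng.random() * equal) / (len(calib_scores) + 1)

rng = np.random.default_rng(0)
train = rng.normal(0, 1, 200)                      # proper training set
calib = np.abs(train - train.mean())               # calibration scores
stream = np.concatenate([rng.normal(0, 1, 100),    # pre-change data
                         rng.normal(3, 1, 100)])   # change at t = 100
eps, log_M = 0.5, 0.0                              # power martingale state
for t, x in enumerate(stream):
    p = conformal_pvalue(abs(x - train.mean()), calib, rng)
    log_M += np.log(eps) + (eps - 1) * np.log(p)   # M *= eps * p**(eps - 1)
    if log_M > np.log(100):                        # alarm: M exceeds 100
        print(f"change-point alarm at t={t}")
        break
```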
Low-rank Label Propagation for Semi-supervised Learning with 100 Millions Samples
Title | Low-rank Label Propagation for Semi-supervised Learning with 100 Millions Samples |
Authors | Raphael Petegrosso, Wei Zhang, Zhuliu Li, Yousef Saad, Rui Kuang |
Abstract | The success of semi-supervised learning crucially relies on the scalability to a huge amount of unlabelled data that are needed to capture the underlying manifold structure for better classification. Since computing the pairwise similarity between the training data is prohibitively expensive in most kinds of input data, currently, there is no general ready-to-use semi-supervised learning method/tool available for learning with tens of millions or more data points. In this paper, we adopted the idea of two low-rank label propagation algorithms, GLNP (Global Linear Neighborhood Propagation) and Kernel Nyström Approximation, and implemented parallelized versions of the two algorithms accelerated with Nesterov’s accelerated projected gradient descent for Big-data Label Propagation (BigLP). The parallel algorithms are tested on five real datasets ranging from 7000 to 10,000,000 in size and a simulation dataset of 100,000,000 samples. In the experiments, the implementation can scale up to datasets with 100,000,000 samples and hundreds of features, and the algorithms also significantly improved the prediction accuracy when only a very small percentage of the data is labeled. The results demonstrate that the BigLP implementation is highly scalable to big data and effective in utilizing the unlabeled data for semi-supervised learning. |
Tasks | |
Published | 2017-02-28 |
URL | http://arxiv.org/abs/1702.08884v1 |
PDF | http://arxiv.org/pdf/1702.08884v1.pdf |
PWC | https://paperswithcode.com/paper/low-rank-label-propagation-for-semi |
Repo | |
Framework | |
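The scalability argument is worth making concrete: with a low-rank factorization of the similarity matrix (e.g. from a Nyström approximation), each propagation step never touches an n × n matrix. A minimal sketch follows; the update rule and alpha are assumptions rather than the exact BigLP iteration.

```python
import numpy as np

def propagate(U, V, Y0, alpha=0.9, iters=50):
    """Label propagation Y <- alpha * W @ Y + (1 - alpha) * Y0 with the
    low-rank factorization W ~= U @ V.T, so W is never formed."""
    Y = Y0.copy()
    for _ in range(iters):
        Y = alpha * (U @ (V.T @ Y)) + (1 - alpha) * Y0   # O(n*k) per step
    return Y

rng = np.random.default_rng(0)
n, k, c = 10_000, 32, 3                  # points, rank, classes
U = rng.random((n, k)) / np.sqrt(n * k)  # low-rank similarity factors
V = rng.random((n, k)) / np.sqrt(n * k)
Y0 = np.zeros((n, c))
labeled = np.arange(30)                  # only 30 labeled points
Y0[labeled, rng.integers(0, c, 30)] = 1.0
labels = propagate(U, V, Y0).argmax(axis=1)   # predictions for all n points
```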
Compiling and Processing Historical and Contemporary Portuguese Corpora
Title | Compiling and Processing Historical and Contemporary Portuguese Corpora |
Authors | Marcos Zampieri |
Abstract | This technical report describes the framework used for processing three large Portuguese corpora. Two corpora contain texts from newspapers, one published in Brazil and the other published in Portugal. The third corpus is Colonia, a historical Portuguese collection containing texts written between the 16th and the early 20th century. The report presents pre-processing methods, segmentation, and annotation of the corpora as well as indexing and querying methods. Finally, it presents published research papers using the corpora. |
Tasks | |
Published | 2017-10-02 |
URL | http://arxiv.org/abs/1710.00803v1 |
PDF | http://arxiv.org/pdf/1710.00803v1.pdf |
PWC | https://paperswithcode.com/paper/compiling-and-processing-historical-and |
Repo | |
Framework | |
Learning with Changing Features
Title | Learning with Changing Features |
Authors | Amit Dhurandhar, Steve Hanneke, Liu Yang |
Abstract | In this paper we study the setting where features are added or change interpretation over time, which has applications in multiple domains such as retail, manufacturing, and finance. In particular, we propose an approach to provably determine the time instant from which the new/changed features start becoming relevant with respect to an output variable in an agnostic (supervised) learning setting. We also suggest an efficient version of our approach which has the same asymptotic performance. Moreover, our theory also applies when we have more than one such change point. Independent post analysis of a change point identified by our method for a large retailer revealed that it corresponded in time with certain unflattering news stories about a brand that resulted in a change in customer behavior. We also applied our method to data from an advanced manufacturing plant, identifying the time instant from which downstream features became relevant. To the best of our knowledge this is the first work that formally studies change point detection in a distribution independent agnostic setting, where the change point is based on the changing relationship between input and output. |
Tasks | Change Point Detection |
Published | 2017-04-29 |
URL | http://arxiv.org/abs/1705.00219v1 |
PDF | http://arxiv.org/pdf/1705.00219v1.pdf |
PWC | https://paperswithcode.com/paper/learning-with-changing-features |
Repo | |
Framework | |
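The core question, from which time instant a new feature becomes relevant to the output, can be illustrated with a windowed comparison. The toy sketch below fits least squares per window with and without the new feature; the estimator, window size, and gain threshold are assumptions, not the paper's provable procedure.

```python
import numpy as np

def window_mse(X, y):
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((X @ coef - y) ** 2)

def first_relevant_window(X_old, X_new, y, window=100, gain=0.1):
    """Return the first window start where adding X_new reduces error."""
    for t in range(0, len(y) - window + 1, window):
        sl = slice(t, t + window)
        both = np.hstack([X_old[sl], X_new[sl]])
        if window_mse(X_old[sl], y[sl]) - window_mse(both, y[sl]) > gain:
            return t
    return None

rng = np.random.default_rng(0)
n = 1000
X_old = rng.normal(size=(n, 2))
X_new = rng.normal(size=(n, 1))
y = X_old @ np.array([1.0, -1.0])
y[500:] += 2.0 * X_new[500:, 0]                 # new feature matters from t=500
print(first_relevant_window(X_old, X_new, y))   # -> 500
```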
Privacy Loss in Apple’s Implementation of Differential Privacy on MacOS 10.12
Title | Privacy Loss in Apple’s Implementation of Differential Privacy on MacOS 10.12 |
Authors | Jun Tang, Aleksandra Korolova, Xiaolong Bai, Xueqiang Wang, Xiaofeng Wang |
Abstract | In June 2016, Apple announced that it would deploy differential privacy for some user data collection in order to ensure privacy of user data, even from Apple. The details of Apple’s approach remained sparse. Although several patents have since appeared hinting at the algorithms that may be used to achieve differential privacy, they did not include a precise explanation of the approach taken to privacy parameter choice. Such choice and the overall approach to privacy budget use and management are key questions for understanding the privacy protections provided by any deployment of differential privacy. In this work, through a combination of experiments, static and dynamic code analysis of the macOS Sierra (Version 10.12) implementation, we shed light on the choices Apple made for privacy budget management. We discover and describe Apple’s set-up for differentially private data processing, including the overall data pipeline, the parameters used for differentially private perturbation of each piece of data, and the frequency with which such data is sent to Apple’s servers. We find that although Apple’s deployment ensures that the (differential) privacy loss for each datum submitted to its servers is $1$ or $2$, the overall privacy loss permitted by the system is significantly higher, as high as $16$ per day for the four initially announced applications of Emojis, New words, Deeplinks and Lookup Hints. Furthermore, Apple renews the privacy budget available every day, which leads to a possible privacy loss of 16 times the number of days since user opt-in to differentially private data collection for those four applications. We advocate that in order to claim the full benefits of differentially private data collection, Apple must give full transparency of its implementation, enable user choice in areas related to privacy loss, and set meaningful defaults on the privacy loss permitted. |
Tasks | |
Published | 2017-09-08 |
URL | http://arxiv.org/abs/1709.02753v2 |
PDF | http://arxiv.org/pdf/1709.02753v2.pdf |
PWC | https://paperswithcode.com/paper/privacy-loss-in-apples-implementation-of |
Repo | |
Framework | |
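The abstract's budget arithmetic is simple enough to state directly. A back-of-the-envelope sketch using only the figures reported above:

```python
# Per-datum epsilon is 1 or 2, the daily budget across the four
# applications is up to 16, and the budget renews each day, so the
# cumulative privacy loss grows linearly with days since opt-in.

EPS_PER_DAY = 16           # upper bound reported for the four applications
days_since_opt_in = 30     # example duration; any value works

total_privacy_loss = EPS_PER_DAY * days_since_opt_in
print(total_privacy_loss)  # 480: cumulative epsilon after 30 days
```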
Revisiting Distributed Synchronous SGD
Title | Revisiting Distributed Synchronous SGD |
Authors | Xinghao Pan, Jianmin Chen, Rajat Monga, Samy Bengio, Rafal Jozefowicz |
Abstract | Distributed training of deep learning models on large-scale training data is typically conducted with asynchronous stochastic optimization to maximize the rate of updates, at the cost of additional noise introduced from asynchrony. In contrast, the synchronous approach is often thought to be impractical due to idle time wasted on waiting for straggling workers. We revisit these conventional beliefs in this paper, and examine the weaknesses of both approaches. We demonstrate that a third approach, synchronous optimization with backup workers, can avoid asynchronous noise while mitigating the effect of the worst stragglers. Our approach is empirically validated and shown to converge faster and to better test accuracies. |
Tasks | Stochastic Optimization |
Published | 2017-02-19 |
URL | http://arxiv.org/abs/1702.05800v2 |
PDF | http://arxiv.org/pdf/1702.05800v2.pdf |
PWC | https://paperswithcode.com/paper/revisiting-distributed-synchronous-sgd-1 |
Repo | |
Framework | |
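The backup-worker mechanism reduces to a simple rule: launch N + b workers per step but aggregate only the first N gradients to arrive, dropping the stragglers. A toy sketch with simulated worker arrival times:

```python
import numpy as np

def sync_step_with_backups(grads_and_times, n_needed):
    """Average only the n_needed fastest gradients.
    grads_and_times: list of (arrival_time, gradient) pairs."""
    fastest = sorted(grads_and_times, key=lambda gt: gt[0])[:n_needed]
    return np.mean([g for _, g in fastest], axis=0)

rng = np.random.default_rng(0)
N, b, dim = 8, 2, 4                               # workers, backups, params
workers = [(rng.exponential() + 1.0,              # simulated arrival time
            rng.normal(size=dim))                 # simulated gradient
           for _ in range(N + b)]
update = sync_step_with_backups(workers, N)       # ignores the 2 slowest
print(update)
```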
To Go or Not To Go? A Near Unsupervised Learning Approach For Robot Navigation
Title | To Go or Not To Go? A Near Unsupervised Learning Approach For Robot Navigation |
Authors | Noriaki Hirose, Amir Sadeghian, Patrick Goebel, Silvio Savarese |
Abstract | It is important for robots to be able to decide whether they can go through a space or not, as they navigate through a dynamic environment. This capability can help them avoid injury or serious damage, e.g., as a result of running into people and obstacles, getting stuck, or falling off an edge. To this end, we propose an unsupervised and a near-unsupervised method based on Generative Adversarial Networks (GAN) to classify scenarios as traversable or not based on visual data. Our method is inspired by the recent success of data-driven approaches on computer vision problems and anomaly detection, and reduces the need for vast amounts of negative examples at training time. Collecting negative data indicating that a robot should not go through a space is typically hard and dangerous because of collisions, whereas collecting positive data can be automated and done safely based on the robot’s own traveling experience. We verify the generality and effectiveness of the proposed approach on a test dataset collected in a previously unseen environment with a mobile robot. Furthermore, we show that our method can be used to build costmaps (which we call “GoNoGo” costmaps) for robot path planning using visual data only. |
Tasks | Anomaly Detection, Robot Navigation |
Published | 2017-09-16 |
URL | http://arxiv.org/abs/1709.05439v1 |
PDF | http://arxiv.org/pdf/1709.05439v1.pdf |
PWC | https://paperswithcode.com/paper/to-go-or-not-to-go-a-near-unsupervised |
Repo | |
Framework | |
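The anomaly-detection framing can be sketched with a stand-in generative model: train only on positive (traversable) frames and flag high reconstruction error as "no-go". An autoencoder substitutes here for the paper's GAN; the architecture and threshold are illustrative.

```python
import torch
import torch.nn as nn

class TinyAE(nn.Module):
    """Toy autoencoder standing in for the paper's generative model."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 32), nn.ReLU())
        self.dec = nn.Sequential(nn.Linear(32, 64 * 64), nn.Sigmoid())

    def forward(self, x):
        return self.dec(self.enc(x)).view_as(x)

def go_no_go(model, image, threshold=0.02):
    """Reconstruction error as traversability score; feeds a costmap."""
    with torch.no_grad():
        err = torch.mean((model(image) - image) ** 2).item()
    return "go" if err < threshold else "no-go"

model = TinyAE()              # assume trained on traversable frames only
frame = torch.rand(1, 64, 64)
print(go_no_go(model, frame))
```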
Detecting Adversarial Attacks on Neural Network Policies with Visual Foresight
Title | Detecting Adversarial Attacks on Neural Network Policies with Visual Foresight |
Authors | Yen-Chen Lin, Ming-Yu Liu, Min Sun, Jia-Bin Huang |
Abstract | Deep reinforcement learning has shown promising results in learning control policies for complex sequential decision-making tasks. However, these neural network-based policies are known to be vulnerable to adversarial examples. This vulnerability poses a potentially serious threat to safety-critical systems such as autonomous vehicles. In this paper, we propose a defense mechanism to defend reinforcement learning agents from adversarial attacks by leveraging an action-conditioned frame prediction module. Our core idea is that adversarial examples targeting a neural network-based policy are not effective for the frame prediction model. By comparing the action distribution produced by a policy from processing the current observed frame to the action distribution produced by the same policy from processing the predicted frame from the action-conditioned frame prediction module, we can detect the presence of adversarial examples. Beyond detecting the presence of adversarial examples, our method allows the agent to continue performing the task using the predicted frame when the agent is under attack. We evaluate the performance of our algorithm on five Atari 2600 games. Our results demonstrate that the proposed defense mechanism achieves favorable performance against baseline algorithms in detecting adversarial examples and in earning rewards when the agents are under attack. |
Tasks | Autonomous Vehicles, Decision Making |
Published | 2017-10-02 |
URL | http://arxiv.org/abs/1710.00814v1 |
PDF | http://arxiv.org/pdf/1710.00814v1.pdf |
PWC | https://paperswithcode.com/paper/detecting-adversarial-attacks-on-neural |
Repo | |
Framework | |
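The detection rule itself is compact: compare the policy's action distribution on the observed frame with its distribution on the action-conditioned predicted frame, and flag a large divergence. The sketch below uses a toy linear policy and KL divergence as the (assumed) distance; the paper's policy, frame predictor, and threshold are stand-ins here.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl(p, q, eps=1e-8):
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def detect_and_act(policy, observed, predicted, threshold=0.5):
    p_obs, p_pred = policy(observed), policy(predicted)
    if kl(p_pred, p_obs) > threshold:
        return int(np.argmax(p_pred)), True    # under attack: act on prediction
    return int(np.argmax(p_obs)), False        # normal operation

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 84 * 84)) / 84.0       # toy linear policy weights
policy = lambda frame: softmax(W @ frame.ravel())
observed = rng.random((84, 84))                # current (possibly attacked) frame
predicted = rng.random((84, 84))               # frame-prediction output
print(detect_and_act(policy, observed, predicted))
```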