Paper Group ANR 685
JUMPER: Learning When to Make Classification Decisions in Reading
Title | JUMPER: Learning When to Make Classification Decisions in Reading |
Authors | Xianggen Liu, Lili Mou, Haotian Cui, Zhengdong Lu, Sen Song |
Abstract | In earlier years, text classification was typically accomplished by feature-based machine learning models; recently, deep neural networks, as powerful learning machines, have made it possible to work with raw input as the text stands. However, existing end-to-end neural networks lack explicit interpretation of the prediction. In this paper, we propose a novel framework, JUMPER, inspired by the cognitive process of text reading, that models text classification as a sequential decision process. Basically, JUMPER is a neural system that scans a piece of text sequentially and makes classification decisions whenever it wishes. Both the classification result and the time of making it are part of the decision process, which is controlled by a policy network and trained with reinforcement learning. Experimental results show that a properly trained JUMPER has the following properties: (1) It can make decisions whenever the evidence is sufficient, thereby reducing total text reading by 30-40% and often finding the key rationale for the prediction. (2) It achieves classification accuracy better than or comparable to state-of-the-art models on several benchmark and industrial datasets. |
Tasks | Text Classification |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02314v1 |
http://arxiv.org/pdf/1807.02314v1.pdf | |
PWC | https://paperswithcode.com/paper/jumper-learning-when-to-make-classification |
Repo | |
Framework | |
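The decide-as-you-read mechanism can be illustrated with a minimal Python sketch; the confidence-threshold stopping rule and `toy_scorer` below are hypothetical stand-ins for the paper's policy network trained with reinforcement learning:

```python
def read_and_decide(sentences, score_fn, threshold=0.9):
    """Scan sentences left to right; stop as soon as the running class
    distribution is confident enough (a stand-in for JUMPER's learned
    jumping policy).  Returns the label and #sentences actually read."""
    state = None
    for i, sent in enumerate(sentences, start=1):
        state = score_fn(sent, state)                 # update belief over classes
        label, conf = max(state.items(), key=lambda kv: kv[1])
        if conf >= threshold:                         # "jump": decide early
            return label, i
    label, _ = max(state.items(), key=lambda kv: kv[1])
    return label, len(sentences)

def toy_scorer(sentence, state):
    """Hypothetical keyword-count belief update (not the paper's network)."""
    state = dict(state or {"pos": 0.5, "neg": 0.5})
    for w in sentence.split():
        if w in ("good", "great"):
            state["pos"] += 0.3
        if w in ("bad", "awful"):
            state["neg"] += 0.3
    total = state["pos"] + state["neg"]
    return {k: v / total for k, v in state.items()}
```

With a low threshold the reader stops after the first decisive sentence, which is exactly the 30-40% reading reduction the abstract reports in aggregate.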
Residual Attention based Network for Hand Bone Age Assessment
Title | Residual Attention based Network for Hand Bone Age Assessment |
Authors | Eric Wu, Bin Kong, Xin Wang, Junjie Bai, Yi Lu, Feng Gao, Shaoting Zhang, Kunlin Cao, Qi Song, Siwei Lyu, Youbing Yin |
Abstract | Computerized automatic methods have been employed to boost the productivity as well as the objectivity of hand bone age assessment. These approaches make predictions from whole X-ray images, which include other objects that may introduce distractions. Instead, our framework is inspired by the clinical workflow (Tanner-Whitehouse) of hand bone age assessment, which focuses on the key components of the hand. The proposed framework is composed of two components: a Mask R-CNN subnet for pixelwise hand segmentation and a residual attention network for hand bone age assessment. The Mask R-CNN subnet segments the hands from X-ray images to avoid the distractions of other objects (e.g., X-ray tags). The hierarchical attention components of the residual attention subnet force our network to focus on the key components of the X-ray images and to generate the final predictions as well as the associated visual supports, similar to the assessment procedure of clinicians. We evaluate the performance of the proposed pipeline on the RSNA pediatric bone age dataset, and the results demonstrate its superiority over previous methods. |
Tasks | Hand Segmentation |
Published | 2018-12-21 |
URL | http://arxiv.org/abs/1901.05876v1 |
http://arxiv.org/pdf/1901.05876v1.pdf | |
PWC | https://paperswithcode.com/paper/residual-attention-based-network-for-hand |
Repo | |
Framework | |
The Responsibility Quantification (ResQu) Model of Human Interaction with Automation
Title | The Responsibility Quantification (ResQu) Model of Human Interaction with Automation |
Authors | Nir Douer, Joachim Meyer |
Abstract | Intelligent systems and advanced automation are involved in information collection and evaluation, in decision-making and in the implementation of chosen actions. In such systems, human responsibility becomes equivocal. Understanding human causal responsibility is particularly important when intelligent autonomous systems can harm people, as with autonomous vehicles or, most notably, with autonomous weapon systems (AWS). Using Information Theory, we develop a responsibility quantification (ResQu) model of human involvement in intelligent automated systems and demonstrate its application to decisions regarding AWS. The analysis reveals that the human's comparative responsibility for outcomes is often low, even when major functions are allocated to the human. Thus, broadly stated policies of keeping humans in the loop and having meaningful human control are misleading and cannot truly direct decisions on how to involve humans in intelligent systems and advanced automation. The current model is an initial step toward the complex goal of creating a comprehensive responsibility model that will enable quantification of human causal responsibility. It assumes stationarity and full knowledge regarding the characteristics of the human and the automation, and it ignores temporal aspects. Despite these limitations, it can aid in the analysis of system design alternatives and policy decisions regarding human responsibility in intelligent systems and advanced automation. |
Tasks | Autonomous Vehicles, Decision Making |
Published | 2018-10-30 |
URL | https://arxiv.org/abs/1810.12644v3 |
https://arxiv.org/pdf/1810.12644v3.pdf | |
PWC | https://paperswithcode.com/paper/the-responsibility-quantification-resqu-model |
Repo | |
Framework | |
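The abstract does not spell out the ResQu formula, but its information-theoretic building blocks are standard; a minimal sketch of Shannon and conditional entropy, the kind of quantities such a responsibility measure would be assembled from (this is a generic helper, not the paper's exact model):

```python
import math

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def conditional_entropy(joint):
    """H(X|Y) in bits from a joint table joint[y][x]:
    H(X|Y) = sum_y p(y) * H(X | Y=y)."""
    h = 0.0
    for row in joint:
        py = sum(row)
        if py > 0:
            h += py * entropy([x / py for x in row])
    return h
```

Intuitively, the more the outcome is already determined by the automation (low conditional entropy given the automation's output), the less unique information the human contributes.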
A method for automatic forensic facial reconstruction based on dense statistics of soft tissue thickness
Title | A method for automatic forensic facial reconstruction based on dense statistics of soft tissue thickness |
Authors | Thomas Gietzen, Robert Brylka, Jascha Achenbach, Katja zum Hebel, Elmar Schömer, Mario Botsch, Ulrich Schwanecke, Ralf Schulze |
Abstract | In this paper, we present a method for automated estimation of a human face from given skull remains. The proposed method is based on three statistical models: a volumetric (tetrahedral) skull model encoding the variations of different skulls, a surface head model encoding head variations, and a dense statistic of facial soft tissue thickness (FSTT). All data are automatically derived from computed tomography (CT) head scans and optical face scans. In order to obtain a proper dense FSTT statistic, we register a skull model to each skull extracted from a CT scan and determine the FSTT value for each vertex of the skull model towards the associated extracted skin surface. The FSTT values at predefined landmarks from our statistic agree well with data from the literature. To recover a face from skull remains, we first fit our skull model to the given skull. Next, at each vertex of the registered skull, we generate a sphere whose radius is the respective FSTT value obtained from our statistic. Finally, we fit a head model to the union of all spheres. The proposed automated method enables probabilistic face estimation that facilitates forensic recovery even from incomplete skull remains. The FSTT statistic allows the generation of plausible head variants, which can be adjusted intuitively using principal component analysis. We validate our face recovery process using an anonymized head CT scan. The estimation generated from the given skull visually compares well with the skin surface extracted from the CT scan itself. |
Tasks | Computed Tomography (CT) |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07334v1 |
http://arxiv.org/pdf/1808.07334v1.pdf | |
PWC | https://paperswithcode.com/paper/a-method-for-automatic-forensic-facial |
Repo | |
Framework | |
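A crude numpy sketch of the tissue-offset idea, assuming per-vertex normals and FSTT values are available; note the paper's actual pipeline fits a statistical head model to the union of FSTT spheres rather than offsetting vertices directly:

```python
import numpy as np

def offset_along_normals(vertices, normals, thickness):
    """Move each skull vertex outward along its unit normal by the
    statistical soft-tissue thickness at that vertex -- a simplified
    stand-in for fitting a head model to the union of FSTT spheres."""
    unit = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    return vertices + thickness[:, None] * unit
```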
Study of Robust Diffusion Recursive Least Squares Algorithms with Side Information for Networked Agents
Title | Study of Robust Diffusion Recursive Least Squares Algorithms with Side Information for Networked Agents |
Authors | Y. Yu, R. C. de Lamare, Y. Zakharov |
Abstract | This work develops a robust diffusion recursive least squares algorithm to mitigate the performance degradation often experienced in networks of agents in the presence of impulsive noise. This algorithm minimizes an exponentially weighted least-squares cost function subject to a time-dependent constraint on the squared norm of the intermediate estimate update at each node. With the help of side information, the constraint is recursively updated in a diffusion strategy. Moreover, a control strategy for resetting the constraint is also proposed to retain good tracking capability when the estimated parameters suddenly change. Simulations show the superiority of the proposed algorithm over previously reported techniques in various impulsive noise scenarios. |
Tasks | |
Published | 2018-12-24 |
URL | http://arxiv.org/abs/1812.09985v1 |
http://arxiv.org/pdf/1812.09985v1.pdf | |
PWC | https://paperswithcode.com/paper/study-of-robust-diffusion-recursive-least |
Repo | |
Framework | |
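A simplified single-node sketch of the core idea: a standard RLS update with the constraint on the squared norm of the parameter increment enforced by clipping. The paper's recursive, side-information-driven constraint update and the diffusion (network combination) step are omitted:

```python
import numpy as np

def robust_rls_step(w, P, x, d, lam=0.99, delta=1.0):
    """One exponentially weighted RLS update with the increment norm
    clipped so that ||dw||^2 <= delta -- a simplified stand-in for the
    paper's time-varying constraint against impulsive noise."""
    Px = P @ x
    k = Px / (lam + x @ Px)          # gain vector
    e = d - w @ x                    # a priori error
    dw = k * e                       # unconstrained increment
    norm = np.linalg.norm(dw)
    if norm ** 2 > delta:            # impulsive sample: cap the update
        dw *= np.sqrt(delta) / norm
    P = (P - np.outer(k, Px)) / lam
    return w + dw, P
```

An impulsive desired sample (e.g. an outlier of magnitude 100) then moves the estimate by at most sqrt(delta) instead of derailing it.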
Dual optimization for convex constrained objectives without the gradient-Lipschitz assumption
Title | Dual optimization for convex constrained objectives without the gradient-Lipschitz assumption |
Authors | Martin Bompaire, Emmanuel Bacry, Stéphane Gaïffas |
Abstract | The minimization of convex objectives coming from linear supervised learning problems, such as penalized generalized linear models, can be formulated as finite sums of convex functions. For such problems, a large set of stochastic first-order solvers based on the idea of variance reduction are available and combine both computational efficiency and sound theoretical guarantees (linear convergence rates). Such rates are obtained under both gradient-Lipschitz and strong convexity assumptions. Motivated by learning problems that do not meet the gradient-Lipschitz assumption, such as linear Poisson regression, we work under another smoothness assumption and obtain a linear convergence rate for a shifted version of Stochastic Dual Coordinate Ascent (SDCA) that improves the current state of the art. Our motivation for considering a solver working on the Fenchel dual problem comes from the fact that such objectives include many linear constraints that are easier to deal with in the dual. Our approach and theoretical findings are validated on several datasets, for Poisson regression and for another objective coming from the negative log-likelihood of the Hawkes process, a family of models that has proved extremely useful for modeling information propagation in social networks and for causality inference. |
Tasks | |
Published | 2018-07-10 |
URL | http://arxiv.org/abs/1807.03545v2 |
http://arxiv.org/pdf/1807.03545v2.pdf | |
PWC | https://paperswithcode.com/paper/dual-optimization-for-convex-constrained |
Repo | |
Framework | |
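For the textbook squared-loss (ridge) case, plain SDCA admits a closed-form coordinate update; this sketch shows only that baseline, not the paper's shifted variant or its analysis under the weaker smoothness assumption:

```python
import numpy as np

def sdca_ridge(X, y, lam, epochs=200, seed=0):
    """Plain SDCA for ridge regression:
    min_w (1/2n)||Xw - y||^2 + (lam/2)||w||^2.
    Maintains the primal-dual link w = (1/(lam*n)) * sum_i alpha_i x_i
    and maximizes the dual exactly in one coordinate per step."""
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        for i in rng.permutation(n):
            # exact coordinate maximizer of the dual in alpha_i
            delta = (y[i] - alpha[i] - X[i] @ w) / (1.0 + X[i] @ X[i] / (lam * n))
            alpha[i] += delta
            w += delta * X[i] / (lam * n)
    return w
```

At the fixed point all coordinate updates vanish, which recovers the ridge normal equations (X^T X + lam*n*I) w = X^T y.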
A Tropical Approach to Neural Networks with Piecewise Linear Activations
Title | A Tropical Approach to Neural Networks with Piecewise Linear Activations |
Authors | Vasileios Charisopoulos, Petros Maragos |
Abstract | We present a new, unifying approach following some recent developments on the complexity of neural networks with piecewise linear activations. We treat neural network layers with piecewise linear activations as tropical polynomials, which generalize polynomials in the so-called $(\max, +)$ or tropical algebra, with possibly real-valued exponents. Motivated by the discussion in (arXiv:1402.1869), this approach enables us to refine their upper bounds on linear regions of layers with ReLU or leaky ReLU activations to $\min\left\{ 2^m, \sum_{j=0}^{n} \binom{m}{j} \right\}$, where $n, m$ are the number of inputs and outputs, respectively. Additionally, we recover their upper bounds on maxout layers. Our work follows a novel path, exclusively under the lens of tropical geometry, which is independent of the improvements reported in (arXiv:1611.01491, arXiv:1711.02114). Finally, we present a geometric approach for effective counting of linear regions using random sampling in order to avoid the computational overhead of exact counting approaches. |
Tasks | |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08749v2 |
http://arxiv.org/pdf/1805.08749v2.pdf | |
PWC | https://paperswithcode.com/paper/a-tropical-approach-to-neural-networks-with |
Repo | |
Framework | |
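The refined bound is easy to evaluate; a direct computation of $\min\{2^m, \sum_{j=0}^{n}\binom{m}{j}\}$ for a layer with $n$ inputs and $m$ ReLU (or leaky ReLU) outputs:

```python
from math import comb

def relu_region_bound(n_inputs, m_outputs):
    """Upper bound min{2^m, sum_{j=0}^{n} C(m, j)} on the number of
    linear regions of a single ReLU or leaky-ReLU layer."""
    return min(2 ** m_outputs,
               sum(comb(m_outputs, j) for j in range(n_inputs + 1)))
```

For few inputs the binomial sum is the binding term; once $n \ge m$ it saturates at $2^m$.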
Early Stratification of Patients at Risk for Postoperative Complications after Elective Colectomy
Title | Early Stratification of Patients at Risk for Postoperative Complications after Elective Colectomy |
Authors | Wen Wang, Rema Padman, Nirav Shah |
Abstract | Stratifying patients at risk for postoperative complications may facilitate timely and accurate workups and reduce the burden of adverse events on patients and the health system. Currently, a widely-used surgical risk calculator created by the American College of Surgeons, NSQIP, uses 21 preoperative covariates to assess the risk of postoperative complications, but lacks dynamic, real-time capabilities to accommodate postoperative information. We propose a new Hidden Markov Model sequence classifier for analyzing patients’ postoperative temperature sequences that incorporates their time-invariant characteristics in both the transition probabilities and the initial state probabilities in order to develop a postoperative “real-time” complication detector. Data from elective colectomy surgery indicate that our method has improved classification performance compared to 8 other machine learning classifiers when using the full temperature sequence associated with the patients’ length of stay. Additionally, within 44 hours after surgery, the performance of the model is close to that obtained with the full-length temperature sequence. |
Tasks | |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12227v1 |
http://arxiv.org/pdf/1811.12227v1.pdf | |
PWC | https://paperswithcode.com/paper/early-stratification-of-patients-at-risk-for |
Repo | |
Framework | |
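The classification principle, scoring a temperature sequence under class-specific HMMs and picking the more likely class, can be sketched with the standard forward algorithm over discretized observations; the paper's conditioning of transition and initial probabilities on time-invariant patient covariates is omitted here:

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM
    (pi: initial probs, A: transitions, B: emissions), computed with
    the forward algorithm and per-step normalization for stability."""
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll

def classify(obs, models):
    """Pick the class whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda c: forward_loglik(obs, *models[c]))
```

A "real-time" detector simply reruns `classify` on the prefix of the sequence observed so far.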
End-to-end Graph-based TAG Parsing with Neural Networks
Title | End-to-end Graph-based TAG Parsing with Neural Networks |
Authors | Jungo Kasai, Robert Frank, Pauli Xu, William Merrill, Owen Rambow |
Abstract | We present a graph-based Tree Adjoining Grammar (TAG) parser that uses BiLSTMs, highway connections, and character-level CNNs. Our best end-to-end parser, which jointly performs supertagging, POS tagging, and parsing, outperforms the previously reported best results by more than 2.2 LAS and UAS points. The graph-based parsing architecture allows for global inference and rich feature representations for TAG parsing, alleviating the fundamental trade-off between transition-based and graph-based parsing systems. We also demonstrate that the proposed parser achieves state-of-the-art performance in the downstream tasks of Parsing Evaluation using Textual Entailments (PETE) and Unbounded Dependency Recovery. This provides further support for the claim that TAG is a viable formalism for problems that require rich structural analysis of sentences. |
Tasks | |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06610v3 |
http://arxiv.org/pdf/1804.06610v3.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-graph-based-tag-parsing-with |
Repo | |
Framework | |
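The core of graph-based parsing is selecting a head for each token from an arc-score matrix; this greedy sketch ignores the tree constraint that the paper's global inference enforces, and says nothing about the BiLSTM/CNN scorers themselves:

```python
import numpy as np

def greedy_heads(scores):
    """Given an arc-score matrix scores[head][dep] (index 0 = root),
    pick the highest-scoring head for each dependent.  Graph-based
    parsers instead run tree-constrained (e.g. MST) inference over
    the same matrix; greedy selection can yield non-trees."""
    n = scores.shape[1]
    return [int(np.argmax(scores[:, d])) for d in range(1, n)]
```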
Multiple Sclerosis Lesion Segmentation from Brain MRI via Fully Convolutional Neural Networks
Title | Multiple Sclerosis Lesion Segmentation from Brain MRI via Fully Convolutional Neural Networks |
Authors | Snehashis Roy, John A. Butman, Daniel S. Reich, Peter A. Calabresi, Dzung L. Pham |
Abstract | Multiple Sclerosis (MS) is an autoimmune disease that leads to lesions in the central nervous system. Magnetic resonance (MR) images provide sufficient imaging contrast to visualize and detect lesions, particularly those in the white matter. Quantitative measures based on various features of lesions have been shown to be useful in clinical trials for evaluating therapies. Therefore, robust and accurate segmentation of white matter lesions from MR images can provide important information about the disease status and progression. In this paper, we propose a fully convolutional neural network (CNN) based method to segment white matter lesions from multi-contrast MR images. The proposed CNN based method contains two convolutional pathways. The first pathway consists of multiple parallel convolutional filter banks catering to multiple MR modalities. In the second pathway, the outputs of the first one are concatenated and another set of convolutional filters is applied. The output of this last pathway produces a membership function for lesions that may be thresholded to obtain a binary segmentation. The proposed method is evaluated on a dataset of 100 MS patients, as well as the ISBI 2015 challenge data consisting of 14 patients. The comparison is performed against four publicly available MS lesion segmentation methods. Significant improvement in segmentation quality over the competing methods is demonstrated on various metrics, such as Dice and false positive ratio. While evaluating on the ISBI 2015 challenge data, our method produces a score of 90.48, where a score of 90 is considered to be comparable to a human rater. |
Tasks | Lesion Segmentation |
Published | 2018-03-24 |
URL | http://arxiv.org/abs/1803.09172v1 |
http://arxiv.org/pdf/1803.09172v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-sclerosis-lesion-segmentation-from |
Repo | |
Framework | |
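Two small pieces of the pipeline described above are easy to make concrete: thresholding the lesion membership map into a binary mask, and the Dice overlap used to score it:

```python
import numpy as np

def segment(membership, threshold=0.5):
    """Threshold a lesion membership map into a binary segmentation."""
    return membership >= threshold

def dice(pred, truth):
    """Dice overlap between two binary masks (1.0 = perfect agreement):
    2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * inter / denom if denom else 1.0
```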
The largest cognitive systems will be optoelectronic
Title | The largest cognitive systems will be optoelectronic |
Authors | Jeffrey M. Shainline |
Abstract | Electrons and photons offer complementary strengths for information processing. Photons are excellent for communication, while electrons are superior for computation and memory. Cognition requires distributed computation to be communicated across the system for information integration. We present reasoning from neuroscience, network theory, and device physics supporting the conjecture that large-scale cognitive systems will benefit from electronic devices performing synaptic, dendritic, and neuronal information processing operating in conjunction with photonic communication. On the chip scale, integrated dielectric waveguides enable fan-out to thousands of connections. On the system scale, fiber and free-space optics can be employed. The largest cognitive systems will be limited by the distance light can travel during the period of a network oscillation. We calculate that optoelectronic networks the area of a large data center ($10^5$\,m$^2$) will be capable of system-wide information integration at $1$\,MHz. At frequencies of cortex-wide integration in the human brain ($4$\,Hz, theta band), optoelectronic systems could integrate information across the surface of the earth. |
Tasks | |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02572v1 |
http://arxiv.org/pdf/1809.02572v1.pdf | |
PWC | https://paperswithcode.com/paper/the-largest-cognitive-systems-will-be |
Repo | |
Framework | |
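The stated size limit follows from a one-line calculation: the distance light travels during one oscillation period.

```python
def light_reach(freq_hz, c=2.998e8):
    """Distance (m) light travels during one period of a network
    oscillation at freq_hz -- the communication limit on system size."""
    return c / freq_hz
```

At 1 MHz this is roughly 300 m, on the order of the ~316 m side of a $10^5$ m$^2$ data center; at 4 Hz it is about 7.5e7 m, exceeding Earth's ~4e7 m circumference.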
l0-norm Based Centers Selection for Training Fault Tolerant RBF Networks and Selecting Centers
Title | l0-norm Based Centers Selection for Training Fault Tolerant RBF Networks and Selecting Centers |
Authors | Hao Wang, Chi-Sing Leung, Hing Cheung So, Ruibin Feng, Zifa Han |
Abstract | The aim of this paper is to train an RBF neural network and select centers under concurrent faults. It is well known that fault tolerance is a very attractive property for neural networks, and center selection is an important procedure during the training process of an RBF neural network. In this paper, we devise two novel algorithms to address these two issues simultaneously. Both of them are based on the ADMM framework. In the first method, the minimax concave penalty (MCP) function is introduced to select centers. In the second method, an l0-norm term is used directly, and the hard threshold (HT) is utilized to address the l0-norm term. Under several mild conditions, we can prove that both methods globally converge to a unique limit point. Simulation results show that, under concurrent faults, the proposed algorithms are superior to many existing methods. |
Tasks | |
Published | 2018-05-30 |
URL | http://arxiv.org/abs/1805.11987v3 |
http://arxiv.org/pdf/1805.11987v3.pdf | |
PWC | https://paperswithcode.com/paper/l0-norm-based-centers-selection-for-training |
Repo | |
Framework | |
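The hard-threshold (HT) operator for the l0-norm term has a simple form; shown here as projection onto at most k nonzeros, one common parameterization (the paper's exact parameterization and its integration into the ADMM iterations may differ):

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w and zero the rest --
    the Euclidean projection onto {x : ||x||_0 <= k}, the standard way
    to handle an l0-norm constraint inside ADMM-style updates."""
    out = np.zeros_like(w)
    if k > 0:
        idx = np.argsort(np.abs(w))[-k:]
        out[idx] = w[idx]
    return out
```

Applied to a vector of candidate-center weights, this zeroes all but the k most important centers in one step.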
Learning A Shared Transform Model for Skull to Digital Face Image Matching
Title | Learning A Shared Transform Model for Skull to Digital Face Image Matching |
Authors | Maneet Singh, Shruti Nagpal, Richa Singh, Mayank Vatsa, Afzel Noore |
Abstract | Human skull identification is an arduous task, traditionally requiring the expertise of forensic artists and anthropologists. This paper is an effort to automate the process of matching skull images to digital face images, thereby establishing an identity of the skeletal remains. In order to achieve this, a novel Shared Transform Model is proposed for learning discriminative representations. The model learns robust features while reducing the intra-class variations between skulls and digital face images. Such a model can assist law enforcement agencies by speeding up the process of skull identification, and reducing the manual load. Experimental evaluation performed on two pre-defined protocols of the publicly available IdentifyMe dataset demonstrates the efficacy of the proposed model. |
Tasks | |
Published | 2018-08-14 |
URL | http://arxiv.org/abs/1808.04571v1 |
http://arxiv.org/pdf/1808.04571v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-a-shared-transform-model-for-skull |
Repo | |
Framework | |
A Tight Runtime Analysis for the $(\mu+\lambda)$ EA
Title | A Tight Runtime Analysis for the $(\mu+\lambda)$ EA |
Authors | Denis Antipov, Benjamin Doerr |
Abstract | Despite significant progress in the theory of evolutionary algorithms, the theoretical understanding of evolutionary algorithms which use non-trivial populations remains challenging and only a few rigorous results exist. Already for the most basic problem, the determination of the asymptotic runtime of the $(\mu+\lambda)$ evolutionary algorithm on the simple OneMax benchmark function, only the special cases $\mu=1$ and $\lambda=1$ have been solved. In this work, we analyze this long-standing problem and show the asymptotically tight result that the runtime $T$, the number of iterations until the optimum is found, satisfies \[E[T] = \Theta\bigg(\frac{n\log n}{\lambda}+\frac{n}{\lambda / \mu} + \frac{n\log^+\log^+ (\lambda/ \mu)}{\log^+ (\lambda / \mu)}\bigg),\] where $\log^+ x := \max\{1, \log x\}$ for all $x > 0$. The same methods allow us to improve the previous-best $O(\frac{n \log n}{\lambda} + n \log \lambda)$ runtime guarantee for the $(\lambda+\lambda)$ EA with fair parent selection to a tight $\Theta(\frac{n \log n}{\lambda} + n)$ runtime result. |
Tasks | |
Published | 2018-12-28 |
URL | https://arxiv.org/abs/1812.11061v2 |
https://arxiv.org/pdf/1812.11061v2.pdf | |
PWC | https://paperswithcode.com/paper/a-tight-runtime-analysis-for-the-ea |
Repo | |
Framework | |
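The analyzed algorithm itself is short; a runnable sketch of the $(\mu+\lambda)$ EA with standard bit mutation on OneMax, returning the number of iterations until the optimum is found:

```python
import random

def one_max(x):
    """OneMax fitness: the number of one-bits."""
    return sum(x)

def mu_plus_lambda_ea(n, mu, lam, seed=0):
    """(mu+lambda) EA on OneMax: each iteration creates lam offspring by
    standard bit mutation (rate 1/n) of uniformly chosen parents, then
    keeps the best mu of parents and offspring.  Returns #iterations."""
    rng = random.Random(seed)
    pop = [[rng.randrange(2) for _ in range(n)] for _ in range(mu)]
    t = 0
    while max(one_max(x) for x in pop) < n:
        offspring = []
        for _ in range(lam):
            child = list(rng.choice(pop))          # uniform parent selection
            for i in range(n):
                if rng.random() < 1.0 / n:         # flip each bit w.p. 1/n
                    child[i] ^= 1
            offspring.append(child)
        pop = sorted(pop + offspring, key=one_max, reverse=True)[:mu]
        t += 1
    return t
```

The quantity $T$ in the theorem above is exactly the iteration count this sketch returns (averaged over runs).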
Characterizing Departures from Linearity in Word Translation
Title | Characterizing Departures from Linearity in Word Translation |
Authors | Ndapa Nakashole, Raphael Flauger |
Abstract | We investigate the behavior of maps learned by machine translation methods. The maps translate words by projecting between word embedding spaces of different languages. We locally approximate these maps using linear maps, and find that they vary across the word embedding space. This demonstrates that the underlying maps are non-linear. Importantly, we show that the locally linear maps vary by an amount that is tightly correlated with the distance between the neighborhoods on which they are trained. Our results can be used to test non-linear methods, and to drive the design of more accurate maps for word translation. |
Tasks | Machine Translation |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.04508v2 |
http://arxiv.org/pdf/1806.04508v2.pdf | |
PWC | https://paperswithcode.com/paper/characterizing-departures-from-linearity-in |
Repo | |
Framework | |
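The locally linear maps the paper studies are ordinary least-squares fits between corresponding word vectors of two embedding spaces; a minimal sketch of fitting such a map on one neighborhood and comparing two of them (the comparison metric here is a plain Frobenius distance, used as a stand-in for the paper's analysis):

```python
import numpy as np

def fit_linear_map(X, Y):
    """Least-squares W minimizing ||X @ W - Y||_F: a locally linear
    approximation to the (generally non-linear) translation map that
    sends source-language vectors X to target-language vectors Y."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def map_distance(W1, W2):
    """Frobenius distance between two locally fitted maps -- maps fitted
    on nearby neighborhoods should be close if the underlying map is
    smooth, and identical if it were globally linear."""
    return np.linalg.norm(W1 - W2)
```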