Paper Group ANR 450
Transport Analysis of Infinitely Deep Neural Network
Title | Transport Analysis of Infinitely Deep Neural Network |
Authors | Sho Sonoda, Noboru Murata |
Abstract | We investigated the feature map inside deep neural networks (DNNs) by tracking the transport map. We are interested in the role of depth (why do DNNs perform better than shallow models?) and the interpretation of DNNs (what do intermediate layers do?). Despite the rapid development in their application, DNNs remain analytically unexplained because the hidden layers are nested and the parameters are not faithful. Inspired by the integral representation of shallow NNs, which is the continuum limit of the width, or the hidden unit number, we developed the flow representation and transport analysis of DNNs. The flow representation is the continuum limit of the depth, or the hidden layer number, and it is specified by an ordinary differential equation with a vector field. We interpret an ordinary DNN as a transport map or an Euler broken line approximation of the flow. Technically speaking, a dynamical system is a natural model for the nested feature maps. In addition, it opens a new way to the coordinate-free treatment of DNNs by avoiding the redundant parametrization of DNNs. Following Wasserstein geometry, we analyze a flow in three aspects: dynamical system, continuity equation, and Wasserstein gradient flow. A key finding is that we specified a series of transport maps of the denoising autoencoder (DAE). Starting from the shallow DAE, this paper develops three topics: the transport map of the deep DAE, the equivalence between the stacked DAE and the composition of DAEs, and the development of the double continuum limit or the integral representation of the flow representation. As partial answers to the research questions, we found that deeper DAEs converge faster and the extracted features are better; in addition, a deep Gaussian DAE transports mass to decrease the Shannon entropy of the data distribution. |
Tasks | Denoising |
Published | 2016-05-10 |
URL | http://arxiv.org/abs/1605.02832v2 |
http://arxiv.org/pdf/1605.02832v2.pdf | |
PWC | https://paperswithcode.com/paper/transport-analysis-of-infinitely-deep-neural |
Repo | |
Framework | |
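The abstract's reading of a DNN as an "Euler broken line approximation of the flow" can be illustrated with a small sketch. Everything here is an assumption made for illustration: the vector field v(x) = -x (the score of a standard Gaussian) is a toy choice and not the paper's DAE field, and `euler_flow`, `toy_field`, `step`, and `n_layers` are hypothetical names. Each Euler step x ↦ x + h·v(x) stands in for one layer.

```python
def euler_flow(x0, vector_field, step, n_layers):
    """Follow dx/dt = vector_field(x) with forward Euler steps.

    Each step x <- x + step * vector_field(x) plays the role of one
    layer; the visited points form the Euler broken line.
    """
    path = [x0]
    x = x0
    for _ in range(n_layers):
        x = x + step * vector_field(x)
        path.append(x)
    return path


def toy_field(x):
    # Assumed illustrative field: grad log p(x) = -x for a standard Gaussian.
    return -x


# Same total "time" (step * n_layers = 1.0), different depths.
shallow = euler_flow(2.0, toy_field, step=1.0, n_layers=1)
deep = euler_flow(2.0, toy_field, step=0.1, n_layers=10)
```

For this toy field the exact flow at time 1 is 2·e⁻¹; the ten-step path ends closer to that value than the one-step path, which is only meant to show how depth refines the broken-line approximation, not to reproduce any result from the paper.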
A Hackathon for Classical Tibetan
Title | A Hackathon for Classical Tibetan |
Authors | Orna Almogi, Lena Dankin, Nachum Dershowitz, Lior Wolf |
Abstract | We describe the course of a hackathon dedicated to the development of linguistic tools for Tibetan Buddhist studies. Over a period of five days, a group of seventeen scholars, scientists, and students developed and compared algorithms for intertextual alignment and text classification, along with some basic language tools, including a stemmer and word segmenter. |
Tasks | Text Classification |
Published | 2016-09-27 |
URL | http://arxiv.org/abs/1609.08389v2 |
http://arxiv.org/pdf/1609.08389v2.pdf | |
PWC | https://paperswithcode.com/paper/a-hackathon-for-classical-tibetan |
Repo | |
Framework | |
Mahalanobis Distance for Class Averaging of Cryo-EM Images
Title | Mahalanobis Distance for Class Averaging of Cryo-EM Images |
Authors | Tejal Bhamre, Zhizhen Zhao, Amit Singer |
Abstract | Single particle reconstruction (SPR) from cryo-electron microscopy (EM) is a technique in which the 3D structure of a molecule is determined from noisy 2D projection images that are affected by the contrast transfer function (CTF) and taken at unknown viewing directions. One of the main challenges in cryo-EM is the typically low signal-to-noise ratio (SNR) of the acquired images. 2D classification of images, followed by class averaging, improves the SNR of the resulting averages, and is used for selecting particles from micrographs and for inspecting the particle images. We introduce a new affinity measure, akin to the Mahalanobis distance, to compare cryo-EM images belonging to different defocus groups. The new similarity measure is employed to detect similar images, thereby leading to an improved algorithm for class averaging. We evaluate the performance of the proposed class averaging procedure on synthetic datasets, obtaining state-of-the-art classification. |
Tasks | |
Published | 2016-11-10 |
URL | http://arxiv.org/abs/1611.03193v4 |
http://arxiv.org/pdf/1611.03193v4.pdf | |
PWC | https://paperswithcode.com/paper/mahalanobis-distance-for-class-averaging-of |
Repo | |
Framework | |
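The abstract describes its affinity as "akin to the Mahalanobis distance". As a reminder of what the standard distance computes, here is a minimal sketch for 2D points. It is the textbook Mahalanobis distance, not the paper's cryo-EM affinity, and the function name `mahalanobis_2d` and the example covariance are assumptions.

```python
import math


def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance between 2D points x and y under covariance cov.

    cov is a 2x2 symmetric positive-definite matrix given as nested lists.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - y[0], x[1] - y[1]]
    # Quadratic form dx^T * inv * dx.
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(q)


# The first axis has variance 4, the second variance 1 (illustrative values).
cov = [[4.0, 0.0], [0.0, 1.0]]
d_noisy = mahalanobis_2d([2.0, 0.0], [0.0, 0.0], cov)
d_clean = mahalanobis_2d([0.0, 2.0], [0.0, 0.0], cov)
```

The same Euclidean offset counts for less along the high-variance axis than along the low-variance one, which is the property that makes such a distance attractive for comparing noisy images.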
Fractional Order Load-Frequency Control of Interconnected Power Systems Using Chaotic Multi-objective Optimization
Title | Fractional Order Load-Frequency Control of Interconnected Power Systems Using Chaotic Multi-objective Optimization |
Authors | Indranil Pan, Saptarshi Das |
Abstract | Fractional order proportional-integral-derivative (FOPID) controllers are designed for load frequency control (LFC) of two interconnected power systems. Conflicting time domain design objectives are considered in a multi-objective optimization (MOO) based design framework to design the gains and the fractional differ-integral orders of the FOPID controllers in the two areas. Here, we explore the effect of augmenting two different chaotic maps along with the uniform random number generator (RNG) in a popular MOO algorithm, the Non-dominated Sorting Genetic Algorithm-II (NSGA-II). Different measures of quality for MOO (e.g., hypervolume indicator, moment of inertia based diversity metric, total Pareto spread, and spacing metric) are adopted to select the best set of controller parameters from multiple runs of all the NSGA-II variants (i.e., nominal and chaotic versions). The chaotic versions of the NSGA-II algorithm are compared with the standard NSGA-II in terms of solution quality and computational time. In addition, the Pareto optimal fronts showing the trade-off between the two conflicting time domain design objectives are compared to show the advantage of using the FOPID controller over a simple PID controller. The nature of fast/slow and high/low noise amplification effects of the FOPID structure or the four-quadrant operation in the two interconnected areas of the power system is also explored. A fuzzy logic based method is then adopted to select the best compromise solution from the best Pareto fronts corresponding to each MOO comparison criterion. The time domain system responses are shown for the fuzzy best compromise solutions under nominal operating conditions. A comparative analysis of the merits and demerits of each controller structure is then reported. A robustness analysis is also performed for the PID and the FOPID controllers. |
Tasks | |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09802v1 |
http://arxiv.org/pdf/1611.09802v1.pdf | |
PWC | https://paperswithcode.com/paper/fractional-order-load-frequency-control-of |
Repo | |
Framework | |
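The Pareto fronts mentioned in the abstract rest on the notion of non-domination, which NSGA-II uses to rank candidate solutions. The sketch below filters a list of two-objective points down to its non-dominated set, assuming both objectives are minimized. The function names and the sample points are hypothetical and unrelated to the controller designs in the paper.

```python
def dominates(p, q):
    """True if p is no worse than q in every objective and strictly
    better in at least one (all objectives minimized)."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative (objective_1, objective_2) pairs, not data from the paper.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 5.0)]
front = pareto_front(candidates)
```

Here `(3.0, 4.0)` is dominated by `(2.0, 3.0)` and `(5.0, 5.0)` by `(1.0, 5.0)`, so three points remain on the front. This quadratic-time filter is only a sketch of the idea; NSGA-II itself sorts the whole population into successive fronts.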
Identifying Candidate Risk Factors for Prescription Drug Side Effects using Causal Contrast Set Mining
Title | Identifying Candidate Risk Factors for Prescription Drug Side Effects using Causal Contrast Set Mining |
Authors | Jenna Reps, Zhaoyang Guo, Haoyue Zhu, Uwe Aickelin |
Abstract | Big longitudinal observational databases present the opportunity to extract new knowledge in a cost-effective manner. Unfortunately, the ability of these databases to be used for causal inference is limited due to the passive way in which the data are collected, resulting in various forms of bias. In this paper we investigate a method that can overcome these limitations and determine causal contrast set rules efficiently from big data. In particular, we present a new methodology for the purpose of identifying risk factors that increase a patient's likelihood of experiencing the known rare side effect of renal failure after ingesting aminosalicylates. The results show that the methodology was able to identify previously researched risk factors such as being prescribed diuretics and highlighted that patients with a higher than average risk of renal failure may be even more susceptible to experiencing it as a side effect after ingesting aminosalicylates. |
Tasks | Causal Inference |
Published | 2016-07-20 |
URL | http://arxiv.org/abs/1607.05845v1 |
http://arxiv.org/pdf/1607.05845v1.pdf | |
PWC | https://paperswithcode.com/paper/identifying-candidate-risk-factors-for |
Repo | |
Framework | |
Comparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions
Title | Comparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions |
Authors | Johanna Carvajal, Arnold Wiliem, Chris McCool, Brian Lovell, Conrad Sanderson |
Abstract | We present a comparative evaluation of various techniques for action recognition while keeping as many variables as possible controlled. We employ two categories of Riemannian manifolds: symmetric positive definite matrices and linear subspaces. For both categories we use their corresponding nearest neighbour classifiers, kernels, and recent kernelised sparse representations. We compare against traditional action recognition techniques based on Gaussian mixture models and Fisher vectors (FVs). We evaluate these action recognition techniques under ideal conditions, as well as their sensitivity in more challenging conditions (variations in scale and translation). Despite recent advancements for handling manifolds, manifold based techniques obtain the lowest performance and their kernel representations are more unstable in the presence of challenging conditions. The FV approach obtains the highest accuracy under ideal conditions. Moreover, FV best deals with moderate scale and translation changes. |
Tasks | Temporal Action Localization |
Published | 2016-02-04 |
URL | http://arxiv.org/abs/1602.01599v3 |
http://arxiv.org/pdf/1602.01599v3.pdf | |
PWC | https://paperswithcode.com/paper/comparative-evaluation-of-action-recognition |
Repo | |
Framework | |
Neural computation from first principles: Using the maximum entropy method to obtain an optimal bits-per-joule neuron
Title | Neural computation from first principles: Using the maximum entropy method to obtain an optimal bits-per-joule neuron |
Authors | William B Levy, Toby Berger, Mustafa Sungkar |
Abstract | Optimization results are one method for understanding neural computation from Nature’s perspective and for defining the physical limits on neuron-like engineering. Earlier work looks at individual properties or performance criteria and occasionally a combination of two, such as energy and information. Here we make use of Jaynes’ maximum entropy method and combine a larger set of constraints, possibly dimensionally distinct, each expressible as an expectation. The method identifies a likelihood-function and a sufficient statistic arising from each such optimization. This likelihood is a first-hitting time distribution in the exponential class. Particular constraint sets are identified that, from an optimal inference perspective, justify earlier neurocomputational models. Interactions between constraints, mediated through the inferred likelihood, restrict constraint-set parameterizations, e.g., the energy-budget limits estimation performance which, in turn, matches an axonal communication constraint. Such linkages are, for biologists, experimental predictions of the method. In addition to the related likelihood, at least one type of constraint set implies marginal distributions, and in this case, a Shannon bits/joule statement arises. |
Tasks | |
Published | 2016-06-06 |
URL | http://arxiv.org/abs/1606.03063v2 |
http://arxiv.org/pdf/1606.03063v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-computation-from-first-principles |
Repo | |
Framework | |
Recurrent neural network training with preconditioned stochastic gradient descent
Title | Recurrent neural network training with preconditioned stochastic gradient descent |
Authors | Xi-Lin Li |
Abstract | This paper studies the performance of a recently proposed preconditioned stochastic gradient descent (PSGD) algorithm on recurrent neural network (RNN) training. PSGD adaptively estimates a preconditioner to accelerate gradient descent, and is designed to be as simple, general and easy to use as stochastic gradient descent (SGD). RNNs, especially the ones requiring extremely long term memories, are difficult to train. We have tested PSGD on a set of synthetic pathological RNN learning problems and the real world MNIST handwritten digit recognition task. Experimental results suggest that PSGD is able to achieve highly competitive performance without using any tricks such as preprocessing, pretraining or parameter tweaking. |
Tasks | Handwritten Digit Recognition |
Published | 2016-06-14 |
URL | http://arxiv.org/abs/1606.04449v2 |
http://arxiv.org/pdf/1606.04449v2.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-neural-network-training-with |
Repo | |
Framework | |
Automated Segmentation of Retinal Layers from Optical Coherent Tomography Images Using Geodesic Distance
Title | Automated Segmentation of Retinal Layers from Optical Coherent Tomography Images Using Geodesic Distance |
Authors | Jinming Duan, Christopher Tench, Irene Gottlob, Frank Proudlock, Li Bai |
Abstract | Optical coherence tomography (OCT) is a non-invasive imaging technique that can produce images of the eye at the microscopic level. OCT image segmentation to localise retinal layer boundaries is a fundamental procedure for diagnosing and monitoring the progression of retinal and optical nerve disorders. In this paper, we introduce a novel and accurate geodesic distance method (GDM) for OCT segmentation of both healthy and pathological images in either two- or three-dimensional spaces. The method uses a weighted geodesic distance by an exponential function, taking into account both horizontal and vertical intensity variations. The weighted geodesic distance is efficiently calculated from an Eikonal equation via the fast sweeping method. The segmentation is then realised by solving an ordinary differential equation with the geodesic distance. The results of the GDM are compared with manually segmented retinal layer boundaries/surfaces. Extensive experiments demonstrate that the proposed GDM is robust to complex retinal structures with large curvatures and irregularities and it outperforms the parametric active contour algorithm as well as the graph theoretic based approaches for delineating the retinal layers in both healthy and pathological images. |
Tasks | Semantic Segmentation |
Published | 2016-09-07 |
URL | http://arxiv.org/abs/1609.02214v1 |
http://arxiv.org/pdf/1609.02214v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-segmentation-of-retinal-layers-from |
Repo | |
Framework | |
Shift-Reduce Constituent Parsing with Neural Lookahead Features
Title | Shift-Reduce Constituent Parsing with Neural Lookahead Features |
Authors | Jiangming Liu, Yue Zhang |
Abstract | Transition-based models can be fast and accurate for constituent parsing. Compared with chart-based models, they leverage richer features by extracting history information from a parser stack, which spans over non-local constituents. On the other hand, during incremental parsing, constituent information on the right hand side of the current word is not utilized, which is a relative weakness of shift-reduce parsing. To address this limitation, we leverage a fast neural model to extract lookahead features. In particular, we build a bidirectional LSTM model, which leverages the full sentence information to predict the hierarchy of constituents that each word starts and ends. The results are then passed to a strong transition-based constituent parser as lookahead features. The resulting parser gives 1.3% absolute improvement in WSJ and 2.3% in CTB compared to the baseline, giving the highest reported accuracies for fully-supervised parsing. |
Tasks | |
Published | 2016-12-02 |
URL | http://arxiv.org/abs/1612.00567v1 |
http://arxiv.org/pdf/1612.00567v1.pdf | |
PWC | https://paperswithcode.com/paper/shift-reduce-constituent-parsing-with-neural |
Repo | |
Framework | |
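To make the shift-reduce setting in the abstract concrete, here is a minimal sketch of how an action sequence builds a constituent tree from a stack and a queue. It assumes a simplified action set (SHIFT plus a binary labelled REDUCE), which is an illustration rather than the transition system used in the paper; `shift_reduce` and the example sentence are hypothetical.

```python
def shift_reduce(words, actions):
    """Apply shift-reduce actions and return the resulting tree.

    "SHIFT" moves the next word from the queue onto the stack.
    ("REDUCE", label) pops the top two stack items and pushes
    the tuple (label, left, right).
    """
    stack, queue = [], list(words)
    for action in actions:
        if action == "SHIFT":
            if not queue:
                raise ValueError("nothing left to shift")
            stack.append(queue.pop(0))
        else:
            _, label = action
            if len(stack) < 2:
                raise ValueError("reduce needs two stack items")
            right = stack.pop()
            left = stack.pop()
            stack.append((label, left, right))
    if queue or len(stack) != 1:
        raise ValueError("action sequence does not yield a single tree")
    return stack[0]


tree = shift_reduce(
    ["the", "cat", "sleeps"],
    ["SHIFT", "SHIFT", ("REDUCE", "NP"), "SHIFT", ("REDUCE", "S")],
)
```

When the `("REDUCE", "NP")` decision is made, "sleeps" is still waiting in the queue and has not contributed any constituent information. That unseen right-hand context is the gap the abstract's lookahead features are meant to fill.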
Scaling Factorial Hidden Markov Models: Stochastic Variational Inference without Messages
Title | Scaling Factorial Hidden Markov Models: Stochastic Variational Inference without Messages |
Authors | Yin Cheng Ng, Pawel Chilinski, Ricardo Silva |
Abstract | Factorial Hidden Markov Models (FHMMs) are powerful models for sequential data but they do not scale well with long sequences. We propose a scalable inference and learning algorithm for FHMMs that draws on ideas from the stochastic variational inference, neural network and copula literatures. Unlike existing approaches, the proposed algorithm requires no message passing procedure among latent variables and can be distributed to a network of computers to speed up learning. Our experiments corroborate that the proposed algorithm does not introduce further approximation bias compared to the proven structured mean-field algorithm, and achieves better performance with long sequences and large FHMMs. |
Tasks | |
Published | 2016-08-12 |
URL | http://arxiv.org/abs/1608.03817v3 |
http://arxiv.org/pdf/1608.03817v3.pdf | |
PWC | https://paperswithcode.com/paper/scaling-factorial-hidden-markov-models |
Repo | |
Framework | |
A Siamese Long Short-Term Memory Architecture for Human Re-Identification
Title | A Siamese Long Short-Term Memory Architecture for Human Re-Identification |
Authors | Rahul Rama Varior, Bing Shuai, Jiwen Lu, Dong Xu, Gang Wang |
Abstract | Matching pedestrians across multiple camera views, known as human re-identification, is a challenging problem in visual surveillance. In the existing works concentrating on feature extraction, representations are formed locally and independent of other regions. We present a novel siamese Long Short-Term Memory (LSTM) architecture that can process image regions sequentially and enhance the discriminative capability of local feature representation by leveraging contextual information. The feedback connections and internal gating mechanism of the LSTM cells enable our model to memorize the spatial dependencies and selectively propagate relevant contextual information through the network. We demonstrate improved performance compared to the baseline algorithm with no LSTM units and promising results compared to state-of-the-art methods on Market-1501, CUHK03 and VIPeR datasets. Visualization of the internal mechanism of LSTM cells shows meaningful patterns can be learned by our method. |
Tasks | Person Re-Identification |
Published | 2016-07-28 |
URL | http://arxiv.org/abs/1607.08381v1 |
http://arxiv.org/pdf/1607.08381v1.pdf | |
PWC | https://paperswithcode.com/paper/a-siamese-long-short-term-memory-architecture |
Repo | |
Framework | |
Recurrent Attention Models for Depth-Based Person Identification
Title | Recurrent Attention Models for Depth-Based Person Identification |
Authors | Albert Haque, Alexandre Alahi, Li Fei-Fei |
Abstract | We present an attention-based model that reasons on human body shape and motion dynamics to identify individuals in the absence of RGB information, hence in the dark. Our approach leverages unique 4D spatio-temporal signatures to address the identification problem across days. Formulated as a reinforcement learning task, our model is based on a combination of convolutional and recurrent neural networks with the goal of identifying small, discriminative regions indicative of human identity. We demonstrate that our model produces state-of-the-art results on several published datasets given only depth images. We further study the robustness of our model towards viewpoint, appearance, and volumetric changes. Finally, we share insights gleaned from interpretable 2D, 3D, and 4D visualizations of our model’s spatio-temporal attention. |
Tasks | Person Identification |
Published | 2016-11-22 |
URL | http://arxiv.org/abs/1611.07212v1 |
http://arxiv.org/pdf/1611.07212v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-attention-models-for-depth-based |
Repo | |
Framework | |
The IBM Speaker Recognition System: Recent Advances and Error Analysis
Title | The IBM Speaker Recognition System: Recent Advances and Error Analysis |
Authors | Seyed Omid Sadjadi, Jason Pelecanos, Sriram Ganapathy |
Abstract | We present the recent advances along with an error analysis of the IBM speaker recognition system for conversational speech. Some of the key advancements that contribute to our system include: a nearest-neighbor discriminant analysis (NDA) approach (as opposed to LDA) for intersession variability compensation in the i-vector space, the application of speaker and channel-adapted features derived from an automatic speech recognition (ASR) system for speaker recognition, and the use of a DNN acoustic model with a very large number of output units (~10k senones) to compute the frame-level soft alignments required in the i-vector estimation process. We evaluate these techniques on the NIST 2010 SRE extended core conditions (C1-C9), as well as the 10sec-10sec condition. To our knowledge, results achieved by our system represent the best performances published to date on these conditions. For example, on the extended tel-tel condition (C5) the system achieves an EER of 0.59%. To garner further understanding of the remaining errors (on C5), we examine the recordings associated with the low scoring target trials, where various issues are identified for the problematic recordings/trials. Interestingly, it is observed that correcting the pathological recordings not only improves the scores for the target trials but also for the nontarget trials. |
Tasks | Speaker Recognition, Speech Recognition |
Published | 2016-05-05 |
URL | http://arxiv.org/abs/1605.01635v1 |
http://arxiv.org/pdf/1605.01635v1.pdf | |
PWC | https://paperswithcode.com/paper/the-ibm-speaker-recognition-system-recent |
Repo | |
Framework | |
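The abstract reports an equal error rate (EER), the operating point at which the false acceptance rate on nontarget trials equals the false rejection rate on target trials. The sketch below approximates it by sweeping thresholds over the observed scores. The function name and the toy scores are assumptions, and the output is unrelated to the NIST SRE figures quoted above.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where false acceptance and false rejection
    rates are closest, as the mean of the two rates.
    """
    best_gap, eer = None, None
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores for illustration only.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.1, 0.2, 0.3, 0.6]
eer = equal_error_rate(targets, nontargets)
```

With these toy scores one target trial scores below one nontarget trial, so at a threshold of 0.6 both error rates are 25%. Evaluation toolkits usually interpolate between thresholds; this sketch simply picks the closest observed one.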
Crossing the Road Without Traffic Lights: An Android-based Safety Device
Title | Crossing the Road Without Traffic Lights: An Android-based Safety Device |
Authors | Adi Perry, Dor Verbin, Nahum Kiryati |
Abstract | In the absence of pedestrian crossing lights, finding a safe moment to cross the road is often hazardous and challenging, especially for people with visual impairments. We present a reliable low-cost solution, an Android device attached to a traffic sign or lighting pole near the crossing, indicating whether it is safe to cross the road. The indication can be by sound, display, vibration, and various communication modalities provided by the Android device. The integral system camera is aimed at approaching traffic. Optical flow is computed from the incoming video stream, and projected onto an influx map, automatically acquired during a brief training period. The crossing safety is determined based on a 1-dimensional temporal signal derived from the projection. We implemented the complete system on a Samsung Galaxy K-Zoom Android smartphone, and obtained real-time operation. The system achieves promising experimental results, providing pedestrians with sufficiently early warning of approaching vehicles. The system can serve as a stand-alone safety device that can be installed where pedestrian crossing lights are ruled out. Requiring no dedicated infrastructure, it can be powered by a solar panel and remotely maintained via the cellular network. |
Tasks | Optical Flow Estimation |
Published | 2016-10-11 |
URL | http://arxiv.org/abs/1610.03393v1 |
http://arxiv.org/pdf/1610.03393v1.pdf | |
PWC | https://paperswithcode.com/paper/crossing-the-road-without-traffic-lights-an |
Repo | |
Framework | |