Paper Group AWR 64
A Domain Guided CNN Architecture for Predicting Age from Structural Brain Images
Title | A Domain Guided CNN Architecture for Predicting Age from Structural Brain Images |
Authors | Pascal Sturmfels, Saige Rutherford, Mike Angstadt, Mark Peterson, Chandra Sripada, Jenna Wiens |
Abstract | Given the wide success of convolutional neural networks (CNNs) applied to natural images, researchers have begun to apply them to neuroimaging data. To date, however, exploration of novel CNN architectures tailored to neuroimaging data has been limited. Several recent works fail to leverage the 3D structure of the brain, instead treating the brain as a set of independent 2D slices. Approaches that do utilize 3D convolutions rely on architectures developed for object recognition tasks in natural 2D images. Such architectures make assumptions about the input that may not hold for neuroimaging. For example, existing architectures assume that patterns in the brain exhibit translation invariance. However, a pattern in the brain may have different meaning depending on where in the brain it is located. There is a need to explore novel architectures that are tailored to brain images. We present two simple modifications to existing CNN architectures based on brain image structure. Applied to the task of brain age prediction, our network achieves a mean absolute error (MAE) of 1.4 years and trains 30% faster than a CNN baseline that achieves an MAE of 1.6 years. Our results suggest that lessons learned from developing models on natural images may not directly transfer to neuroimaging tasks. Instead, there remains a large space of unexplored questions regarding model development in this area, whose answers may differ from conventional wisdom. |
Tasks | Object Recognition |
Published | 2018-08-11 |
URL | http://arxiv.org/abs/1808.04362v1 |
http://arxiv.org/pdf/1808.04362v1.pdf | |
PWC | https://paperswithcode.com/paper/a-domain-guided-cnn-architecture-for |
Repo | https://github.com/saigerutherford/anatomically_defined_CNNs |
Framework | tf |
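The abstract leaves the two domain-guided modifications unspecified, so the following is only a minimal sketch of the kind of plain 3D-CNN age-regression baseline the paper compares against, assuming an illustrative input shape and layer widths rather than the authors' actual configuration:

```python
# Hypothetical 3D-CNN brain-age baseline; all sizes are illustrative.
import tensorflow as tf

def build_baseline(input_shape=(96, 96, 96, 1)):
    """Plain 3D CNN that regresses age from a structural MRI volume."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv3D(8, 3, activation="relu", padding="same"),
        tf.keras.layers.MaxPool3D(2),
        tf.keras.layers.Conv3D(16, 3, activation="relu", padding="same"),
        tf.keras.layers.MaxPool3D(2),
        tf.keras.layers.Conv3D(32, 3, activation="relu", padding="same"),
        tf.keras.layers.GlobalAveragePooling3D(),
        tf.keras.layers.Dense(1),  # scalar age prediction
    ])
    model.compile(optimizer="adam", loss="mae")  # the paper reports MAE in years
    return model
```

One plausible reading of the motivation (the same pattern can mean different things in different brain locations) is to relax weight sharing across regions; the linked repository contains the authors' actual architectures.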
Distributed regression modeling for selecting markers under data protection constraints
Title | Distributed regression modeling for selecting markers under data protection constraints |
Authors | Daniela Zöller, Stefan Lenz, Harald Binder |
Abstract | Data protection constraints frequently require a distributed analysis of data, i.e., individual-level data remains at many different sites, but analysis nevertheless has to be performed jointly. The corresponding aggregated data is often exchanged manually, requiring explicit permission before transfer, i.e., the number of data calls and the amount of data should be limited. Thus, only simple aggregated summary statistics are typically transferred with just a single call. This does not allow for more complex tasks such as variable selection. As an alternative, we propose a multivariable regression approach for identifying important markers by automatic variable selection based on aggregated data from different locations in iterative calls. To minimize the amount of transferred data and the number of calls, we also provide a heuristic variant of the approach. When performing a global data standardization, the proposed method yields the same results as when pooling individual-level data. In a simulation study, the information loss introduced by a local standardization is seen to be minimal. In a typical scenario, the heuristic decreases the number of data calls from more than 10 to 3, rendering manual data releases feasible. To make our approach widely available for application, we provide an implementation on top of the DataSHIELD framework. |
Tasks | |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00422v2 |
http://arxiv.org/pdf/1803.00422v2.pdf | |
PWC | https://paperswithcode.com/paper/distributed-regression-modeling-for-selecting |
Repo | https://github.com/danielazoeller/ds_DistributedBoosting.jl |
Framework | none |
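The published implementation sits on top of DataSHIELD, but the core idea of iterative calls that exchange only aggregates can be sketched as componentwise boosting. This is a hedged illustration, not the authors' algorithm: `site_aggregates`, the assumption of standardized covariates, and the simple residual update are all stand-ins.

```python
# Sketch of distributed componentwise selection: sites exchange only
# aggregated cross-products, never individual-level rows.
import numpy as np

def site_aggregates(X, r):
    # Per-site summaries; X is assumed standardized (globally or locally).
    return X.T @ r, (X ** 2).sum(axis=0)

def boosting_step(sites, residuals, coef, lr=0.1):
    """One 'data call': pool per-site aggregates, update the best marker."""
    s_xy = sum(site_aggregates(X, r)[0] for X, r in zip(sites, residuals))
    s_xx = sum(site_aggregates(X, r)[1] for X, r in zip(sites, residuals))
    beta = s_xy / s_xx                      # pooled univariate estimates
    j = int(np.argmax(np.abs(beta)))        # marker with the largest effect
    coef[j] += lr * beta[j]
    for k, X in enumerate(sites):           # each site updates residuals locally
        residuals[k] = residuals[k] - lr * beta[j] * X[:, j]
    return coef, residuals
```

Iterating this step performs variable selection (only repeatedly chosen markers get nonzero coefficients) at the cost of one aggregate exchange per round, which is what the paper's heuristic then tries to reduce further.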
Probabilistic Verification of Fairness Properties via Concentration
Title | Probabilistic Verification of Fairness Properties via Concentration |
Authors | Osbert Bastani, Xin Zhang, Armando Solar-Lezama |
Abstract | As machine learning systems are increasingly used to make real-world legal and financial decisions, it is of paramount importance that we develop algorithms to verify that these systems do not discriminate against minorities. We design a scalable algorithm for verifying fairness specifications. Our algorithm obtains strong correctness guarantees based on adaptive concentration inequalities; such inequalities enable our algorithm to adaptively take samples until it has enough data to make a decision. We implement our algorithm in a tool called VeriFair, and show that it scales to large machine learning models, including a deep recurrent neural network that is more than five orders of magnitude larger than the largest previously-verified neural network. While our technique only gives probabilistic guarantees due to the use of random samples, we show that we can choose the probability of error to be extremely small. |
Tasks | |
Published | 2018-12-02 |
URL | https://arxiv.org/abs/1812.02573v2 |
https://arxiv.org/pdf/1812.02573v2.pdf | |
PWC | https://paperswithcode.com/paper/verifying-fairness-properties-via |
Repo | https://github.com/obastani/verifair |
Framework | tf |
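VeriFair's adaptive concentration inequalities are sharper than what follows; this sketch substitutes a plain anytime Hoeffding bound (a union bound over checkpoints) just to show the sample-until-decided loop. `model`, `sample_group`, and the ratio-based demographic-parity specification are hypothetical stand-ins.

```python
# Hedged sketch: sample adaptively until a concentration bound certifies
# P(pos | A) / P(pos | B) > threshold, or certifies its violation.
import math

def verify_parity(model, sample_group, threshold=0.8, delta=1e-10, batch=1000):
    stats = {"A": [0, 0], "B": [0, 0]}      # [positives, total] per group
    k = 0
    while True:
        k += 1
        for g in ("A", "B"):
            for _ in range(batch):
                stats[g][0] += model(sample_group(g))   # model returns 0 or 1
                stats[g][1] += 1
        dk = 6 * delta / (math.pi ** 2 * k ** 2)        # union bound over checks
        est, eps = {}, {}
        for g in ("A", "B"):
            n = stats[g][1]
            est[g] = stats[g][0] / n
            eps[g] = math.sqrt(math.log(4 / dk) / (2 * n))
        lo = (est["A"] - eps["A"]) / max(est["B"] + eps["B"], 1e-12)
        hi = (est["A"] + eps["A"]) / max(est["B"] - eps["B"], 1e-12)
        if lo > threshold:
            return True                     # property certified
        if hi < threshold:
            return False                    # violation certified
```

As in the paper, the guarantee is probabilistic: with probability at least 1 - delta the verdict is correct, and a smaller delta simply costs more samples.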
Single-Agent Policy Tree Search With Guarantees
Title | Single-Agent Policy Tree Search With Guarantees |
Authors | Laurent Orseau, Levi H. S. Lelis, Tor Lattimore, Théophane Weber |
Abstract | We introduce two novel tree search algorithms that use a policy to guide search. The first algorithm is a best-first enumeration that uses a cost function that allows us to prove an upper bound on the number of nodes to be expanded before reaching a goal state. We show that this best-first algorithm is particularly well suited for ‘needle-in-a-haystack’ problems. The second algorithm is based on sampling and we prove an upper bound on the expected number of nodes it expands before reaching a set of goal states. We show that this algorithm is better suited for problems where many paths lead to a goal. We validate these tree search algorithms on 1,000 computer-generated levels of Sokoban, where the policy used to guide the search comes from a neural network trained using A3C. Our results show that the policy tree search algorithms we introduce are competitive with a state-of-the-art domain-independent planner that uses heuristic search. |
Tasks | |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.10928v2 |
http://arxiv.org/pdf/1811.10928v2.pdf | |
PWC | https://paperswithcode.com/paper/single-agent-policy-tree-search-with |
Repo | https://github.com/deepmind/boxoban-levels |
Framework | none |
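For the best-first variant, a reasonable guess at its shape, assuming the cost function takes the Levin form d(n)/π(n) (depth of a node divided by the policy probability of the path to it), is the following; `expand`, `policy`, and `is_goal` are caller-supplied stand-ins, and the paper's exact bookkeeping may differ.

```python
# Policy-guided best-first search; nodes are expanded in increasing order
# of cost = depth / path-probability, so high-probability paths go deep first.
import heapq, itertools

def policy_tree_search(root, expand, policy, is_goal, budget=100_000):
    """expand(s) -> successor states; policy(s) -> per-successor probabilities."""
    tie = itertools.count()                       # FIFO tie-breaking for heapq
    frontier = [(0.0, next(tie), root, 1.0, 0)]   # (cost, _, state, prob, depth)
    while frontier and budget > 0:
        _, _, s, prob, depth = heapq.heappop(frontier)
        budget -= 1
        if is_goal(s):
            return s, depth
        probs = policy(s)
        for a, child in enumerate(expand(s)):
            p = prob * probs[a]                   # probability of the whole path
            if p > 0:
                heapq.heappush(frontier, ((depth + 1) / p, next(tie),
                                          child, p, depth + 1))
    return None, None
```

This cost ordering is what makes the expansion bound provable: a goal reachable via a path of depth d and policy probability p is found after on the order of d/p expansions.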
Free as in Free Word Order: An Energy Based Model for Word Segmentation and Morphological Tagging in Sanskrit
Title | Free as in Free Word Order: An Energy Based Model for Word Segmentation and Morphological Tagging in Sanskrit |
Authors | Amrith Krishna, Bishal Santra, Sasi Prasanth Bandaru, Gaurav Sahu, Vishnu Dutt Sharma, Pavankumar Satuluri, Pawan Goyal |
Abstract | The configurational information in sentences of a free word order language such as Sanskrit is of limited use. Thus, the context of the entire sentence will be desirable even for basic processing tasks such as word segmentation. We propose a structured prediction framework that jointly solves the word segmentation and morphological tagging tasks in Sanskrit. We build an energy based model where we adopt approaches generally employed in graph based parsing techniques (McDonald et al., 2005a; Carreras, 2007). Our model outperforms the state of the art with an F-Score of 96.92 (percentage improvement of 7.06%) while using less than one-tenth of the task-specific training data. We find that the use of a graph-based approach instead of a traditional lattice-based sequential labelling approach leads to a percentage gain of 12.6% in F-Score for the segmentation task. |
Tasks | Morphological Tagging, Structured Prediction |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01446v2 |
http://arxiv.org/pdf/1809.01446v2.pdf | |
PWC | https://paperswithcode.com/paper/free-as-in-free-word-order-an-energy-based |
Repo | https://github.com/Demfier/ebm-sanskrit-word-segmentation |
Framework | none |
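The abstract does not detail the inference procedure, so the following is only a generic, hedged sketch of edge-factored energy minimization over candidate (word, tag) nodes; it searches exhaustively, omits coverage constraints, and stands in for the learned energy and search heuristic of the actual system.

```python
# Toy edge-factored energy inference: pick a conflict-free set of candidate
# analyses minimizing the sum of pairwise energies. Exponential-time; for
# illustration on tiny inputs only.
from itertools import combinations

def best_analysis(candidates, conflict, energy):
    """candidates: (span, word, tag) triples proposed for the input string;
    conflict(a, b): True if two candidates overlap in the input;
    energy(a, b): learned pairwise energy, lower = more compatible."""
    best, best_e = None, float("inf")
    for r in range(1, len(candidates) + 1):
        for sub in combinations(candidates, r):
            if any(conflict(a, b) for a, b in combinations(sub, 2)):
                continue
            e = sum(energy(a, b) for a, b in combinations(sub, 2))
            if e < best_e:
                best, best_e = sub, e
    return best
```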
Learning Functions in Large Networks requires Modularity and produces Multi-Agent Dynamics
Title | Learning Functions in Large Networks requires Modularity and produces Multi-Agent Dynamics |
Authors | C. H. Huck Yang, Rise Ooi, Tom Hiscock, Victor Eguiluz, Jesper Tegnér |
Abstract | Networks are abundant in biological systems. Small, over-represented network motifs have been discovered, and it has been suggested that these constitute functional building blocks. We ask whether larger dynamical network motifs exist in biological networks, thus contributing to the higher-order organization of a network. To this end, we introduce a gradient descent machine learning (ML) approach and genetic algorithms to learn larger functional motifs, in contrast to an (infeasible) exhaustive search. We use the French Flag (FF) and Switch functional motifs as case studies motivated by biology. While our algorithm successfully learns large functional motifs, we identify a threshold size of approximately 20 nodes beyond which learning breaks down. Therefore, we investigate the stability of the motifs. We find that the size of the real negative eigenvalues of the Jacobian decreases with increasing system size, thus conferring instability. Finally, without imposing a learned input-output mapping on all components of the network, we observe that the unconstrained middle components still learn the desired function, a form of homogeneous team learning. We conclude that the size limitation of learnability, most likely due to stability constraints, imposes a definite requirement for modularity in networked systems while enabling team learning within unconstrained parts of the module. Thus, the observation that community structures and modularity are abundant in biological networks could be accounted for by a computational compositional network structure. |
Tasks | |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03001v2 |
http://arxiv.org/pdf/1807.03001v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-functions-in-large-networks-requires |
Repo | https://github.com/huckiyang/EvoluGeneNet-Adjacency-Matrix-Visualizer |
Framework | tf |
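The stability analysis in the abstract can be illustrated with a small numerical check. The dynamics dx/dt = -x + W tanh(x) below are an assumed stand-in for the learned networks, chosen only because their Jacobian at the origin is easy to write down.

```python
# Stability check via Jacobian eigenvalues: larger random networks are more
# likely to have an eigenvalue with positive real part, i.e. to be unstable.
import numpy as np

def jacobian_at_origin(W):
    # d/dx [-x + W tanh(x)] evaluated at x = 0 is -I + W, since tanh'(0) = 1.
    return -np.eye(W.shape[0]) + W

def is_stable(W):
    eig = np.linalg.eigvals(jacobian_at_origin(W))
    return eig.real.max() < 0      # all real parts negative => locally stable

for n in (5, 20, 80):
    W = np.random.default_rng(0).normal(scale=0.3, size=(n, n))
    print(n, is_stable(W))         # stability tends to break as n grows
```

This mirrors the paper's observation that instability emerges with system size, which in turn motivates modularity as a size cap on learnable motifs.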
Anomaly Detection With Multiple-Hypotheses Predictions
Title | Anomaly Detection With Multiple-Hypotheses Predictions |
Authors | Duc Tam Nguyen, Zhongyu Lou, Michael Klar, Thomas Brox |
Abstract | In one-class-learning tasks, only the normal case (foreground) can be modeled with data, whereas the variation of all possible anomalies is too erratic to be described by samples. Thus, due to the lack of representative data, the widespread discriminative approaches cannot cover such learning tasks, and instead generative models, which attempt to learn the input density of the foreground, are used. However, generative models suffer from a large input dimensionality (as in images) and are typically inefficient learners. We propose to learn the data distribution of the foreground more efficiently with a multi-hypotheses autoencoder. Moreover, the model is criticized by a discriminator, which prevents artificial data modes not supported by data, and enforces diversity across hypotheses. Our multiple-hypotheses-based anomaly detection framework allows the reliable identification of out-of-distribution samples. For anomaly detection on CIFAR-10, it yields up to 3.9 percentage points of improvement over previously reported results. On a real anomaly detection task, the approach reduces the error of the baseline models from 6.8% to 1.5%. |
Tasks | Anomaly Detection |
Published | 2018-10-31 |
URL | https://arxiv.org/abs/1810.13292v5 |
https://arxiv.org/pdf/1810.13292v5.pdf | |
PWC | https://paperswithcode.com/paper/anomaly-detectionwith-multiple-hypotheses |
Repo | https://github.com/YeongHyeon/ConAD |
Framework | tf |
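The discriminator that criticizes the model is omitted here, but the multiple-hypotheses part can be sketched with the standard winner-takes-all reconstruction loss, in which only the best of H decoder heads is penalized per sample; whether this matches the paper's exact loss is an assumption.

```python
# Winner-takes-all training loss and test-time anomaly score for a
# multi-hypotheses autoencoder with H decoder heads (PyTorch).
import torch

def wta_loss(x, hypotheses):
    """x: (B, D) inputs; hypotheses: (B, H, D) reconstructions, one per head."""
    errors = ((hypotheses - x.unsqueeze(1)) ** 2).mean(dim=2)  # (B, H)
    return errors.min(dim=1).values.mean()  # gradient reaches the best head only

def anomaly_score(x, hypotheses):
    # An input is anomalous when no learned mode reconstructs it well,
    # i.e. even the best hypothesis has a large error.
    return ((hypotheses - x.unsqueeze(1)) ** 2).mean(dim=2).min(dim=1).values
```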
Online Variance Reduction for Stochastic Optimization
Title | Online Variance Reduction for Stochastic Optimization |
Authors | Zalán Borsos, Andreas Krause, Kfir Y. Levy |
Abstract | Modern stochastic optimization methods often rely on uniform sampling which is agnostic to the underlying characteristics of the data. This might degrade the convergence by yielding estimates that suffer from a high variance. A possible remedy is to employ non-uniform importance sampling techniques, which take the structure of the dataset into account. In this work, we investigate a recently proposed setting which poses variance reduction as an online optimization problem with bandit feedback. We devise a novel and efficient algorithm for this setting that finds a sequence of importance sampling distributions competitive with the best fixed distribution in hindsight, the first result of this kind. While we present our method for sampling datapoints, it naturally extends to selecting coordinates or even blocks thereof. Empirical validations underline the benefits of our method in several settings. |
Tasks | Stochastic Optimization |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04715v3 |
http://arxiv.org/pdf/1802.04715v3.pdf | |
PWC | https://paperswithcode.com/paper/online-variance-reduction-for-stochastic |
Repo | https://github.com/zalanborsos/online-variance-reduction |
Framework | none |
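The paper's contribution is a regret-minimizing bandit sampler; the sketch below shows only the surrounding mechanics, namely why non-uniform sampling stays unbiased and what the bandit feedback looks like (only the sampled point's gradient norm is observed). The exponential-smoothing update is a naive stand-in for the actual algorithm.

```python
# Importance sampling for SGD: drawing index i with probability p[i] and
# weighting its gradient by 1/(n * p[i]) keeps the gradient estimate unbiased.
import numpy as np

def sample_index(norm_est, rng):
    p = norm_est / norm_est.sum()            # sampling distribution
    i = rng.choice(len(p), p=p)
    return i, 1.0 / (len(p) * p[i])          # index and importance weight

def update_estimate(norm_est, i, observed_norm, eta=0.1):
    # Bandit feedback: only point i's gradient norm was observed this round.
    norm_est[i] = (1 - eta) * norm_est[i] + eta * observed_norm
    return norm_est
```

Sampling proportionally to (estimated) gradient norms is the classical variance-reducing choice that the online formulation tries to track without ever observing all norms at once.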
Sever: A Robust Meta-Algorithm for Stochastic Optimization
Title | Sever: A Robust Meta-Algorithm for Stochastic Optimization |
Authors | Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Jacob Steinhardt, Alistair Stewart |
Abstract | In high dimensions, most machine learning methods are brittle to even a small fraction of structured outliers. To address this, we introduce a new meta-algorithm that can take in a base learner such as least squares or stochastic gradient descent, and harden the learner to be resistant to outliers. Our method, Sever, possesses strong theoretical guarantees yet is also highly scalable – beyond running the base learner itself, it only requires computing the top singular vector of a certain $n \times d$ matrix. We apply Sever on a drug design dataset and a spam classification dataset, and find that in both cases it has substantially greater robustness than several baselines. On the spam dataset, with $1\%$ corruptions, we achieved $7.4\%$ test error, compared to $13.4\%$-$20.5\%$ for the baselines, and $3\%$ error on the uncorrupted dataset. Similarly, on the drug design dataset, with $10\%$ corruptions, we achieved a test mean-squared error of $1.42$, compared to $1.51$-$2.33$ for the baselines, and $1.23$ on the uncorrupted dataset. |
Tasks | Stochastic Optimization |
Published | 2018-03-07 |
URL | https://arxiv.org/abs/1803.02815v2 |
https://arxiv.org/pdf/1803.02815v2.pdf | |
PWC | https://paperswithcode.com/paper/sever-a-robust-meta-algorithm-for-stochastic |
Repo | https://github.com/hoonose/sever |
Framework | none |
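The abstract is specific enough to sketch one round of the method: run the base learner, form the $n \times d$ matrix of per-point gradients, and use its top singular vector to score and remove outliers. The drop fraction and the choice of base learner are illustrative.

```python
# One Sever round as described in the abstract (numpy sketch).
import numpy as np

def sever_round(X, y, fit, per_point_grads, drop_frac=0.01):
    theta = fit(X, y)                        # base learner, e.g. least squares
    G = per_point_grads(theta, X, y)         # (n, d) gradients at the optimum
    G = G - G.mean(axis=0)                   # center the gradient matrix
    _, _, Vt = np.linalg.svd(G, full_matrices=False)
    scores = (G @ Vt[0]) ** 2                # projection onto top singular vector
    keep = scores <= np.quantile(scores, 1 - drop_frac)
    return X[keep], y[keep], theta
```

Repeating the round until no points are removed yields the hardened estimate; structured outliers show up as a direction of abnormally high variance in gradient space, which is exactly what the top singular vector picks out.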
Learning concise representations for regression by evolving networks of trees
Title | Learning concise representations for regression by evolving networks of trees |
Authors | William La Cava, Tilak Raj Singh, James Taggart, Srinivas Suri, Jason H. Moore |
Abstract | We propose and study a method for learning interpretable representations for the task of regression. Features are represented as networks of multi-type expression trees comprised of activation functions common in neural networks in addition to other elementary functions. Differentiable features are trained via gradient descent, and the performance of features in a linear model is used to weight the rate of change among subcomponents of each representation. The search process maintains an archive of representations with accuracy-complexity trade-offs to assist in generalization and interpretation. We compare several stochastic optimization approaches within this framework. We benchmark these variants on 100 open-source regression problems in comparison to state-of-the-art machine learning approaches. Our main finding is that this approach produces the highest average test scores across problems while producing representations that are orders of magnitude smaller than the next best performing method (gradient boosting). We also report a negative result in which attempts to directly optimize the disentanglement of the representation result in more highly correlated features. |
Tasks | Stochastic Optimization |
Published | 2018-07-03 |
URL | http://arxiv.org/abs/1807.00981v3 |
http://arxiv.org/pdf/1807.00981v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-concise-representations-for |
Repo | https://github.com/by1tTZ4IsQkAO80F/iclr_2019 |
Framework | none |
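The archive of accuracy-complexity trade-offs mentioned in the abstract is a Pareto front; a minimal sketch of maintaining one (the tuple layout is an assumption) looks like this:

```python
# Keep only representations not dominated by one that is both more accurate
# (lower error) and simpler (lower complexity).
def update_archive(archive, candidate):
    """Items are (error, complexity, model) tuples; lower is better on both."""
    e, c, _ = candidate
    if any(ae <= e and ac <= c for ae, ac, _ in archive):
        return archive                        # candidate is dominated, discard
    survivors = [(ae, ac, m) for ae, ac, m in archive
                 if not (e <= ae and c <= ac)]
    return survivors + [candidate]            # candidate displaces dominated items
```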
One-Shot Generation of Near-Optimal Topology through Theory-Driven Machine Learning
Title | One-Shot Generation of Near-Optimal Topology through Theory-Driven Machine Learning |
Authors | Ruijin Cang, Hope Yao, Yi Ren |
Abstract | We introduce a theory-driven mechanism for learning a neural network model that performs generative topology design in one shot given a problem setting, circumventing the conventional iterative process that computational design tasks usually entail. The proposed mechanism can lead to machines that quickly respond to new design requirements based on knowledge accumulated through past experience of design generation. Achieving such a mechanism through supervised learning would require an impractically large amount of problem-solution pairs for training, due to the known limitation of deep neural networks in knowledge generalization. To this end, we introduce an interaction between a student (the neural network) and a teacher (the optimality conditions underlying topology optimization): The student learns from existing data and is tested on unseen problems. Deviation of the student’s solutions from the optimality conditions is quantified, and used for choosing new data points to learn from. We call this learning mechanism “theory-driven”, as it explicitly uses domain-specific theories to guide the learning, thus distinguishing itself from purely data-driven supervised learning. We show through a compliance minimization problem that the proposed learning mechanism leads to topology generation with near-optimal structural compliance, much improved over standard supervised learning under the same computational budget. |
Tasks | |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10787v3 |
http://arxiv.org/pdf/1807.10787v3.pdf | |
PWC | https://paperswithcode.com/paper/one-shot-generation-of-near-optimal-topology |
Repo | https://github.com/DesignInformaticsLab/Theory_Driven_TO |
Framework | none |
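At a high level the student-teacher interaction is an active-learning loop, sketched below under assumed interfaces: `residual` quantifies a design's deviation from the optimality conditions (the teacher) and `label` obtains a reference solution for newly selected problems. The authors' actual procedure may differ in how points are chosen and labeled.

```python
# Theory-driven learning loop: train, score unseen problems by how badly the
# student's designs violate the optimality conditions, learn from the worst.
def theory_driven_learning(student, residual, label, pool, data,
                           rounds=10, k=8):
    for _ in range(rounds):
        student.fit(data)                              # supervised step
        pool.sort(key=lambda p: -residual(student.predict(p), p))
        new, pool = pool[:k], pool[k:]                 # most-violating problems
        data = data + [(p, label(p)) for p in new]     # label, e.g. by an optimizer
    return student
```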
Recognizing Birds from Sound - The 2018 BirdCLEF Baseline System
Title | Recognizing Birds from Sound - The 2018 BirdCLEF Baseline System |
Authors | Stefan Kahl, Thomas Wilhelm-Stein, Holger Klinck, Danny Kowerko, Maximilian Eibl |
Abstract | Reliable identification of bird species in recorded audio files would be a transformative tool for researchers, conservation biologists, and birders. In recent years, artificial neural networks have greatly improved the detection quality of machine learning systems for bird species recognition. We present a baseline system using convolutional neural networks. We publish our code base as reference for participants in the 2018 LifeCLEF bird identification task and discuss our experiments and potential improvements. |
Tasks | |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.07177v1 |
http://arxiv.org/pdf/1804.07177v1.pdf | |
PWC | https://paperswithcode.com/paper/recognizing-birds-from-sound-the-2018 |
Repo | https://github.com/kahst/BirdCLEF-Baseline |
Framework | none |
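A typical front end for such a baseline, sketched with illustrative parameters that are not necessarily the system's actual settings, chops a recording into fixed-length chunks and turns each into a log-mel spectrogram for the CNN:

```python
# Audio -> log-mel spectrogram chunks (assumed preprocessing, not the
# baseline's exact configuration).
import librosa
import numpy as np

def spectrogram_chunks(path, chunk_s=3.0, sr=32000, n_mels=64):
    y, sr = librosa.load(path, sr=sr, mono=True)
    hop = int(chunk_s * sr)
    for start in range(0, max(len(y) - hop + 1, 1), hop):
        S = librosa.feature.melspectrogram(y=y[start:start + hop],
                                           sr=sr, n_mels=n_mels)
        yield librosa.power_to_db(S, ref=np.max)  # log-scaled "image" input
```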
Morpho-MNIST: Quantitative Assessment and Diagnostics for Representation Learning
Title | Morpho-MNIST: Quantitative Assessment and Diagnostics for Representation Learning |
Authors | Daniel C. Castro, Jeremy Tan, Bernhard Kainz, Ender Konukoglu, Ben Glocker |
Abstract | Revealing latent structure in data is an active field of research, having introduced exciting technologies such as variational autoencoders and adversarial networks, and is essential to push machine learning towards unsupervised knowledge discovery. However, a major challenge is the lack of suitable benchmarks for an objective and quantitative evaluation of learned representations. To address this issue we introduce Morpho-MNIST, a framework that aims to answer: “to what extent has my model learned to represent specific factors of variation in the data?” We extend the popular MNIST dataset by adding a morphometric analysis enabling quantitative comparison of trained models, identification of the roles of latent variables, and characterisation of sample diversity. We further propose a set of quantifiable perturbations to assess the performance of unsupervised and supervised methods on challenging tasks such as outlier detection and domain adaptation. Data and code are available at https://github.com/dccastro/Morpho-MNIST. |
Tasks | Domain Adaptation, Outlier Detection, Representation Learning |
Published | 2018-09-27 |
URL | https://arxiv.org/abs/1809.10780v3 |
https://arxiv.org/pdf/1809.10780v3.pdf | |
PWC | https://paperswithcode.com/paper/morpho-mnist-quantitative-assessment-and |
Repo | https://github.com/dccastro/Morpho-MNIST |
Framework | none |
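One morphometric in the spirit of the framework, mean stroke thickness, can be sketched from the skeleton and the Euclidean distance transform; the package's actual measurement code may differ in detail.

```python
# Mean stroke thickness of a binarized digit: twice the average distance
# from the skeleton (centerline) to the background.
import numpy as np
from scipy.ndimage import distance_transform_edt
from skimage.morphology import skeletonize

def stroke_thickness(image, threshold=0.5):
    """image: 2D grayscale digit scaled to [0, 1]; returns pixels."""
    mask = image > threshold
    dist = distance_transform_edt(mask)   # distance of each pixel to background
    skel = skeletonize(mask)              # one-pixel-wide centerline
    return 2.0 * dist[skel].mean()        # diameter = 2 x centerline radius
```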
Microsoft Dialogue Challenge: Building End-to-End Task-Completion Dialogue Systems
Title | Microsoft Dialogue Challenge: Building End-to-End Task-Completion Dialogue Systems |
Authors | Xiujun Li, Yu Wang, Siqi Sun, Sarah Panda, Jingjing Liu, Jianfeng Gao |
Abstract | This proposal introduces a Dialogue Challenge for building end-to-end task-completion dialogue systems, with the goal of encouraging the dialogue research community to collaborate and benchmark on standard datasets and a unified experimental environment. In this special session, we will release human-annotated conversational data in three domains (movie-ticket booking, restaurant reservation, and taxi booking), as well as an experiment platform with built-in simulators in each domain, for training and evaluation purposes. The final submitted systems will be evaluated both in a simulated setting and by human judges. |
Tasks | |
Published | 2018-07-29 |
URL | http://arxiv.org/abs/1807.11125v2 |
http://arxiv.org/pdf/1807.11125v2.pdf | |
PWC | https://paperswithcode.com/paper/microsoft-dialogue-challenge-building-end-to |
Repo | https://github.com/ysglh/Task-Oriented-Dialogue-Dataset-Survey |
Framework | none |
Inference and Sampling of $K_{33}$-free Ising Models
Title | Inference and Sampling of $K_{33}$-free Ising Models |
Authors | Valerii Likhosherstov, Yury Maximov, Michael Chertkov |
Abstract | We call an Ising model tractable when it is possible to compute its partition function value (statistical inference) in polynomial time. Tractability also implies the ability to sample configurations of this model in polynomial time. The notion of tractability extends the basic case of planar zero-field Ising models. Our starting point is to describe algorithms for the basic case that compute the partition function and sample configurations efficiently. To derive the algorithms, we use an equivalent linear-time transformation to perfect-matching counting and sampling on an expanded dual graph. We then extend our tractable inference and sampling algorithms to models whose triconnected components are either planar or graphs of $O(1)$ size. In particular, this yields polynomial-time inference and sampling algorithms for $K_{33}$-minor-free topologies of zero-field Ising models, a generalization of planar graphs with a potentially unbounded genus. |
Tasks | |
Published | 2018-12-22 |
URL | https://arxiv.org/abs/1812.09587v2 |
https://arxiv.org/pdf/1812.09587v2.pdf | |
PWC | https://paperswithcode.com/paper/inference-and-sampling-of-k_33-free-ising |
Repo | https://github.com/ValeryTyumen/planar_ising |
Framework | none |
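For very small graphs the quantity the paper computes in polynomial time can be checked by brute force; this exponential-time reference uses the convention $Z = \sum_{s \in \{-1,1\}^n} \exp\big(\sum_{(i,j)} J_{ij} s_i s_j\big)$ for a zero-field model (sign conventions vary).

```python
# Brute-force partition function of a zero-field Ising model; O(2^n), only
# for sanity-checking a polynomial-time implementation on tiny graphs.
import itertools, math

def partition_function(n, couplings):
    """couplings: dict {(i, j): J_ij} over vertices {0, ..., n-1}."""
    Z = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        energy = sum(J * s[i] * s[j] for (i, j), J in couplings.items())
        Z += math.exp(energy)
    return Z

# Example: a triangle (planar, hence tractable) with uniform couplings.
print(partition_function(3, {(0, 1): 0.5, (1, 2): 0.5, (0, 2): 0.5}))
```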