Paper Group NANR 131
Domain Adaptation Through Label Propagation: Learning Clustered and Aligned Features
Title | Domain Adaptation Through Label Propagation: Learning Clustered and Aligned Features |
Authors | Anonymous |
Abstract | The difficulty of obtaining sufficient labeled data for supervised learning has motivated domain adaptation, in which a classifier is trained in one domain (the source domain) but operates in another (the target domain). Reducing the domain discrepancy has improved performance, but it is hampered when the embedded features do not form clearly separable and aligned clusters. We address this issue by propagating labels using the manifold structure, and by enforcing cycle consistency to align the clusters of features in each domain more closely. Specifically, we prove that cycle consistency keeps the embedded features distant from all but one cluster if the source domain is ideally clustered. We additionally exploit information from an approximated local manifold and pursue local manifold consistency for further improvement. Results for various domain adaptation scenarios show tighter clustering and an improvement in classification accuracy. |
Tasks | Domain Adaptation |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HJgY6R4YPH |
PDF | https://openreview.net/pdf?id=HJgY6R4YPH |
PWC | https://paperswithcode.com/paper/domain-adaptation-through-label-propagation |
Repo | |
Framework | |
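The label-propagation step at the heart of the abstract above is the classic graph-based update Y ← αSY + (1−α)Y₀ over an affinity graph of embedded features. Below is a minimal NumPy sketch of that step, assuming an RBF affinity; the paper's cycle-consistency and local-manifold terms are omitted.

```python
import numpy as np

def propagate_labels(X, Y0, alpha=0.99, sigma=1.0, iters=50):
    """Graph label propagation over embedded features X; Y0 holds
    one-hot labels for labeled (source) rows and zeros for the
    unlabeled (target-domain) rows."""
    # RBF affinity with zeroed diagonal
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}
    d = W.sum(1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    Y = Y0.copy()
    for _ in range(iters):
        Y = alpha * S @ Y + (1 - alpha) * Y0
    return Y.argmax(1)
```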
Deep Innovation Protection
Title | Deep Innovation Protection |
Authors | Anonymous |
Abstract | Evolutionary optimization approaches have recently shown promising results in domains such as Atari and robot locomotion, but less so in solving 3D tasks directly from pixels. This paper presents a method called Deep Innovation Protection (DIP) that allows training complex world models end-to-end for such 3D environments. The main idea behind the approach is to employ multiobjective optimization to temporarily reduce the selection pressure on specific components in a world model, allowing other components to adapt. We investigate the emergent representations of these evolved networks, which learn a model of the world without the need for a specific forward-prediction loss. |
Tasks | Multiobjective Optimization |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SygLu0VtPH |
PDF | https://openreview.net/pdf?id=SygLu0VtPH |
PWC | https://paperswithcode.com/paper/deep-innovation-protection |
Repo | |
Framework | |
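The selection mechanism can be illustrated with a toy two-objective step: individuals whose world-model component was just mutated gain a protection objective, and selection keeps the non-dominated set, temporarily shielding innovations from fitness pressure. This is a simplified sketch of the idea, not the paper's exact implementation; the fitness values and mutation flags are placeholders.

```python
import numpy as np

def non_dominated(objectives):
    """Indices of Pareto-optimal rows (both objectives maximized)."""
    keep = []
    for i, p in enumerate(objectives):
        dominated = any(
            np.all(q >= p) and np.any(q > p)
            for j, q in enumerate(objectives) if j != i
        )
        if not dominated:
            keep.append(i)
    return keep

rng = np.random.default_rng(0)
fitness = rng.normal(size=10)                          # task reward per individual
protected = rng.integers(0, 2, size=10).astype(float)  # 1 if a component was just mutated
# The protection objective shields fresh innovations from immediate culling.
objectives = np.stack([fitness, protected], axis=1)
print("selected parents:", non_dominated(objectives))
```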
Scalable Differentially Private Data Generation via Private Aggregation of Teacher Ensembles
Title | Scalable Differentially Private Data Generation via Private Aggregation of Teacher Ensembles |
Authors | Anonymous |
Abstract | We present a novel approach named G-PATE for training a differentially private data generator. The generator can be used to produce synthetic datasets with strong privacy guarantees while preserving high data utility. Our approach leverages generative adversarial nets to generate data and exploits the PATE (Private Aggregation of Teacher Ensembles) framework to protect data privacy. Compared to existing methods, our approach significantly improves the use of the privacy budget. This is possible because we only need to ensure differential privacy for the generator, which is the part of the model that actually needs to be published for private data generation. To achieve this, we connect a student generator with an ensemble of teacher discriminators and propose a private gradient aggregation mechanism to ensure differential privacy on all the information that flows from the teacher discriminators to the student generator. Theoretically, we prove that our algorithm ensures differential privacy for the generator. Empirically, we provide thorough experiments to demonstrate the superiority of our method over prior work on both image and non-image datasets. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Hkl6i0EFPH |
PDF | https://openreview.net/pdf?id=Hkl6i0EFPH |
PWC | https://paperswithcode.com/paper/scalable-differentially-private-data |
Repo | |
Framework | |
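The private gradient aggregation described above can be sketched as clipping each teacher discriminator's gradient and releasing only a noisy mean to the student generator, in the spirit of the Gaussian mechanism. The actual G-PATE mechanism is more involved (it discretizes and projects gradients before aggregation); this sketch only conveys where the privacy barrier sits.

```python
import numpy as np

def private_gradient_aggregate(teacher_grads, clip=1.0, sigma=1.0, rng=None):
    """Clip each teacher's gradient to L2 norm `clip`, average, and add
    Gaussian noise, so only a privatized signal reaches the student
    generator (a sketch of a Gaussian-mechanism aggregation step)."""
    rng = rng or np.random.default_rng()
    clipped = []
    for g in teacher_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip / max(norm, 1e-12)))
    mean = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, sigma * clip / len(teacher_grads), size=mean.shape)
    return mean + noise

grads = [np.random.randn(64) for _ in range(10)]   # one gradient per teacher
student_signal = private_gradient_aggregate(grads)
```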
Expected Tight Bounds for Robust Deep Neural Network Training
Title | Expected Tight Bounds for Robust Deep Neural Network Training |
Authors | Anonymous |
Abstract | Training Deep Neural Networks (DNNs) that are robust to norm-bounded adversarial attacks remains an elusive problem. While verification-based methods are generally too expensive to robustly train large networks, Gowal et al. demonstrated that bounded input intervals can be inexpensively propagated from layer to layer through deep networks. This interval bound propagation (IBP) approach led to high robustness and was the first to be employed on large networks. However, due to the very loose nature of the IBP bounds, particularly for large/deep networks, the required training procedure is complex and involved. In this paper, we closely examine the bounds of a block of layers composed of an affine layer, followed by a ReLU, followed by another affine layer. To this end, we propose *expected* bounds (true bounds in expectation), which are provably tighter than IBP bounds in expectation. We then extend this result to deeper networks through blockwise propagation and show that we can achieve bounds orders of magnitude tighter than IBP's. Using these tight bounds, we demonstrate that a simple standard training procedure can achieve an impressive robustness-accuracy trade-off across several architectures on both MNIST and CIFAR-10. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Syld53NtvH |
PDF | https://openreview.net/pdf?id=Syld53NtvH |
PWC | https://paperswithcode.com/paper/expected-tight-bounds-for-robust-deep-neural |
Repo | |
Framework | |
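The IBP baseline that the paper tightens is easy to state concretely: represent activations as a center μ and radius r, push μ through each affine map and r through its absolute value, and apply ReLU to the interval endpoints. A NumPy sketch for the affine-ReLU-affine block the paper analyzes:

```python
import numpy as np

def ibp_affine(mu, r, W, b):
    """Propagate the interval [mu - r, mu + r] through x -> Wx + b."""
    return W @ mu + b, np.abs(W) @ r

def ibp_relu(mu, r):
    """Apply ReLU to the interval endpoints and recenter."""
    lo, hi = np.maximum(mu - r, 0.0), np.maximum(mu + r, 0.0)
    return (lo + hi) / 2.0, (hi - lo) / 2.0

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)
mu, r = rng.normal(size=4), 0.1 * np.ones(4)   # eps = 0.1 input interval
mu, r = ibp_affine(mu, r, W1, b1)
mu, r = ibp_relu(mu, r)
mu, r = ibp_affine(mu, r, W2, b2)
print("output lower bounds:", mu - r)
print("output upper bounds:", mu + r)
```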
Unsupervised Spatiotemporal Data Inpainting
Title | Unsupervised Spatiotemporal Data Inpainting |
Authors | Anonymous |
Abstract | We tackle the problem of inpainting occluded areas in spatiotemporal sequences, such as cloud-occluded satellite observations, in an unsupervised manner. We place ourselves in the setting where there is access to neither paired nor unpaired training data. We consider several cases in which the underlying information of the observed sequence is lost in certain areas through an observation operator. In this case, the only available information is provided by the observation of the sequence, the nature of the measurement process, and its associated statistics. We propose an unsupervised learning framework to retrieve the most probable sequence using a generative adversarial network. We demonstrate the strong reconstruction capability of our model on several video datasets, such as satellite sequences and natural videos. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rylqmxBKvH |
PDF | https://openreview.net/pdf?id=rylqmxBKvH |
PWC | https://paperswithcode.com/paper/unsupervised-spatiotemporal-data-inpainting |
Repo | |
Framework | |
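One way to read the setup: because no complete sequences exist, the discriminator can only compare re-occluded generator outputs against actual observations. Below is a schematic PyTorch training step under that reading, with toy networks and a random mask standing in for the observation operator; all names and shapes are hypothetical stand-ins, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the paper's video generator and discriminator.
G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
D = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

mask = (torch.rand(8, 16) > 0.3).float()   # simulated occlusion operator H
obs = torch.randn(8, 16) * mask            # only occluded observations exist

for _ in range(5):
    # G inpaints; a fresh mask re-occludes the output, so D only ever
    # compares observation-like samples (no complete ground truth needed).
    full = G(obs)
    fake_obs = full * (torch.rand(8, 16) > 0.3).float()
    # Discriminator step: real observations vs re-occluded completions.
    d_loss = bce(D(obs), torch.ones(8, 1)) + bce(D(fake_obs.detach()), torch.zeros(8, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator step: fool D while staying consistent with observed pixels.
    g_loss = bce(D(fake_obs), torch.ones(8, 1)) + ((full * mask - obs) ** 2).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```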
Fair Resource Allocation in Federated Learning
Title | Fair Resource Allocation in Federated Learning |
Authors | Anonymous |
Abstract | Federated learning involves training statistical models in massive, heterogeneous networks. Naively minimizing an aggregate loss function in such a network may disproportionately advantage or disadvantage some of the devices. In this work, we propose q-Fair Federated Learning (q-FFL), a novel optimization objective inspired by fair resource allocation in wireless networks that encourages a more fair (i.e., more uniform) accuracy distribution across devices in federated networks. To solve q-FFL, we devise a communication-efficient method, q-FedAvg, that is suited to federated networks. We validate both the effectiveness of q-FFL and the efficiency of q-FedAvg on a suite of federated datasets with both convex and non-convex models, and show that q-FFL (along with q-FedAvg) outperforms existing baselines in terms of the resulting fairness, flexibility, and efficiency. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=ByexElSYDr |
PDF | https://openreview.net/pdf?id=ByexElSYDr |
PWC | https://paperswithcode.com/paper/fair-resource-allocation-in-federated |
Repo | |
Framework | |
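The q-FFL objective itself is compact: min_w Σ_k (p_k/(q+1)) F_k(w)^{q+1}, so each device's gradient is reweighted by p_k F_k^q and larger q pushes the accuracy distribution toward uniformity (q = 0 recovers the standard aggregate objective). A small sketch of the objective and the induced gradient weights:

```python
import numpy as np

def qffl_objective(losses, p, q):
    """q-FFL objective: sum_k p_k * F_k(w)^(q+1) / (q+1)."""
    losses = np.asarray(losses, dtype=float)
    return float(np.sum(p * losses ** (q + 1) / (q + 1)))

def qffl_gradient_weights(losses, p, q):
    """The gradient of the objective scales each device's gradient by
    p_k * F_k^q, so higher-loss devices get proportionally more weight."""
    losses = np.asarray(losses, dtype=float)
    return p * losses ** q

losses = np.array([0.2, 0.5, 1.5])   # per-device empirical losses
p = np.ones(3) / 3                   # device sampling weights
for q in (0.0, 1.0, 5.0):
    print("q =", q, "gradient weights:", qffl_gradient_weights(losses, p, q))
```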
Better Knowledge Retention through Metric Learning
Title | Better Knowledge Retention through Metric Learning |
Authors | Anonymous |
Abstract | In a continual learning setting, new categories may be introduced over time, and an ideal learning system should perform well on both the original categories and the new ones. While deep neural nets have achieved resounding success in the classical setting, they are known to forget knowledge acquired in prior episodes of learning if the examples encountered in the current episode are drastically different from those encountered previously. This makes deep neural nets ill-suited to continual learning. In this paper, we propose a new model that both leverages the expressive power of deep neural nets and is resilient to forgetting when new categories are introduced. We demonstrate an improvement in accuracy on the original classes compared to a vanilla deep neural net. |
Tasks | Continual Learning, Metric Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=r1lEjlHKPH |
PDF | https://openreview.net/pdf?id=r1lEjlHKPH |
PWC | https://paperswithcode.com/paper/better-knowledge-retention-through-metric |
Repo | |
Framework | |
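A plausible instantiation of the abstract's recipe pairs a deep embedding trained with a metric loss and a non-parametric classifier over stored class statistics, so adding a category never overwrites old weights. The triplet loss and nearest-class-mean classifier below are standard choices assumed for illustration; the paper's exact loss may differ.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Pull same-class embeddings together, push different classes apart."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

def nearest_class_mean(query, class_means):
    """Classify by the closest stored class centroid; adding a new
    category only adds a centroid and disturbs no old parameters."""
    d = torch.cdist(query, class_means)   # (n_query, n_classes)
    return d.argmin(dim=1)

emb = torch.randn(10, 64)
means = torch.stack([emb[:5].mean(0), emb[5:].mean(0)])  # two stored classes
print(nearest_class_mean(torch.randn(3, 64), means))
```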
A Training Scheme for the Uncertain Neuromorphic Computing Chips
Title | A Training Scheme for the Uncertain Neuromorphic Computing Chips |
Authors | Qingtian Zhang, Bin Gao, Huaqiang Wu |
Abstract | Uncertainty is an important feature of intelligence and helps the brain become a flexible, creative, and powerful intelligent system. Crossbar-based neuromorphic computing chips, in which computation is performed mainly by analog circuits, exhibit this uncertainty and can be used to imitate the brain. However, most current deep neural networks do not take the uncertainty of the neuromorphic computing chip into consideration. Therefore, their performance on neuromorphic computing chips is not as good as on the original platforms (CPUs/GPUs). In this work, we propose the uncertainty adaptation training scheme (UATS), which exposes the chip's uncertainty to the neural network during training. The experimental results show that neural networks trained this way achieve inference performance on the uncertain neuromorphic computing chip comparable to the results on the original platforms, and much better than without the training scheme. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Byekm0VtwS |
PDF | https://openreview.net/pdf?id=Byekm0VtwS |
PWC | https://paperswithcode.com/paper/a-training-scheme-for-the-uncertain |
Repo | |
Framework | |
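The abstract does not spell out the mechanism, but "exposing the chip's uncertainty to the network during training" is commonly realized by injecting the expected analog-device noise into the weights on every forward pass. A PyTorch sketch under that assumption; the relative-noise model and its scale are placeholders.

```python
import torch
import torch.nn as nn

class NoisyLinear(nn.Module):
    """Linear layer that perturbs its weights with an assumed
    device-level noise model of a crossbar chip during training,
    so the learned solution tolerates the hardware's uncertainty."""
    def __init__(self, n_in, n_out, noise_std=0.05):
        super().__init__()
        self.linear = nn.Linear(n_in, n_out)
        self.noise_std = noise_std

    def forward(self, x):
        w = self.linear.weight
        if self.training:
            # Assumed noise model: zero-mean, proportional to |w|.
            w = w + torch.randn_like(w) * self.noise_std * w.abs()
        return nn.functional.linear(x, w, self.linear.bias)

net = nn.Sequential(NoisyLinear(784, 256), nn.ReLU(), NoisyLinear(256, 10))
```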
Continual Learning using the SHDL Framework with Skewed Replay Distributions
Title | Continual Learning using the SHDL Framework with Skewed Replay Distributions |
Authors | Anonymous |
Abstract | Humans and animals continuously acquire, adapt, and transfer knowledge throughout their lifespan. The ability to learn continuously is crucial for agents interacting with the real world and processing continuous streams of information. Continual learning has been a long-standing challenge for neural networks, as the repeated acquisition of information from non-uniform data distributions generally leads to catastrophic forgetting or interference. This work proposes a modular architecture capable of continuous acquisition of tasks while averting catastrophic forgetting. Specifically, our contributions are: (i) Efficient Architecture: a modular architecture emulating the visual cortex that can learn meaningful representations from limited labelled examples; (ii) Knowledge Retention: retention of learned knowledge via limited replay of past experiences; (iii) Forward Transfer: efficient and relatively faster learning on new tasks; and (iv) Naturally Skewed Distributions: the learning in the above claims is performed on non-uniform data distributions that better represent the natural statistics of our ongoing experience. Several experiments substantiating these claims are demonstrated on the CIFAR-100 dataset. |
Tasks | Continual Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=BkghKgStPH |
PDF | https://openreview.net/pdf?id=BkghKgStPH |
PWC | https://paperswithcode.com/paper/continual-learning-using-the-shdl-framework |
Repo | |
Framework | |
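The limited-replay component can be illustrated with a small buffer that keeps a few exemplars per past task and samples them non-uniformly when mixing into current batches. The recency-skewed weights below are an assumed illustration of "skewed replay distributions", not the paper's exact schedule.

```python
import random
from collections import defaultdict

class SkewedReplayBuffer:
    """Tiny per-task exemplar store with non-uniform replay sampling."""
    def __init__(self, per_task=20):
        self.per_task = per_task
        self.store = defaultdict(list)   # task_id -> exemplars

    def add(self, task_id, example):
        if len(self.store[task_id]) < self.per_task:
            self.store[task_id].append(example)

    def sample(self, n):
        """Sample past exemplars with weights skewed toward recent tasks
        (assumed skew: each newer task twice as likely as the previous)."""
        tasks = sorted(self.store)
        weights = [2.0 ** i for i in range(len(tasks))]
        return [random.choice(self.store[random.choices(tasks, weights=weights)[0]])
                for _ in range(n)]

buf = SkewedReplayBuffer()
for t in range(3):
    for x in range(5):
        buf.add(t, (t, x))
print(buf.sample(4))
```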
Toward Understanding The Effect of Loss Function on The Performance of Knowledge Graph Embedding
Title | Toward Understanding The Effect of Loss Function on The Performance of Knowledge Graph Embedding |
Authors | Anonymous |
Abstract | Knowledge graphs (KGs) represent the world's facts in structured form. KG completion exploits the existing facts in a KG to discover new ones. The translation-based embedding model (TransE) is a prominent formulation for KG completion. Despite its efficiency in memory and time, TransE suffers from several limitations in encoding relation patterns such as symmetry and reflexivity. To resolve this problem, most attempts have circled around revising the score function of TransE, i.e., proposing more complicated score functions such as Trans(A, D, G, H, R, etc.) to mitigate the limitations. In this paper, we tackle the problem from a different perspective. We show that existing theories about the limitations of TransE are inaccurate because they ignore the effect of the loss function. Accordingly, we present theoretical investigations of the main limitations of TransE in light of the loss function. To the best of our knowledge, this has not been comprehensively investigated so far. We show that with a proper selection of the loss function for training the TransE model, its main limitations are mitigated. This is achieved by setting an upper bound on the scores of positive samples, defining a region of truth (i.e., the region in which a triple is considered positive by the model). Our theoretical proofs, together with experimental results, fill the gap between the capability of the translation-based class of embedding models and the loss function. The theory emphasizes the importance of selecting the loss function for training these models, and our experimental evaluations of different loss functions justify our theoretical proofs and confirm their importance for performance. |
Tasks | Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HJxKhyStPH |
PDF | https://openreview.net/pdf?id=HJxKhyStPH |
PWC | https://paperswithcode.com/paper/toward-understanding-the-effect-of-loss |
Repo | |
Framework | |
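The abstract's key move, bounding positive scores from above rather than only separating them from negatives, is easiest to see next to the standard margin ranking loss. The limit-based variant below, with hypothetical margins γ₁ and γ₂, is one loss in the family the analysis points to:

```python
import torch
import torch.nn.functional as F

def transe_score(h, r, t):
    """TransE: score(h, r, t) = ||h + r - t||_2 (lower is better)."""
    return (h + r - t).norm(p=2, dim=-1)

def margin_ranking_loss(pos, neg, gamma=1.0):
    """Standard TransE loss: only the *gap* between positive and
    negative scores matters, so positive scores can drift arbitrarily."""
    return F.relu(gamma + pos - neg).mean()

def limit_based_loss(pos, neg, gamma1=0.5, gamma2=2.0):
    """Loss with an explicit upper bound gamma1 on positive scores,
    carving out a 'region of truth' around h + r = t."""
    return (F.relu(pos - gamma1) + F.relu(gamma2 - neg)).mean()

h, r, t = (torch.randn(4, 50) for _ in range(3))
pos = transe_score(h, r, t)
neg = transe_score(h, r, torch.randn(4, 50))   # corrupted tails
print(margin_ranking_loss(pos, neg), limit_based_loss(pos, neg))
```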
Learning Semantically Meaningful Representations Through Embodiment
Title | Learning Semantically Meaningful Representations Through Embodiment |
Authors | Anonymous |
Abstract | How do humans acquire a meaningful understanding of the world with little to no supervision or semantic labels provided by the environment? Here we investigate embodiment, with a closed loop between action and perception, as one key component of this process. We take a close look at the representations learned by a deep reinforcement learning agent trained with visual and vector observations collected in a 3D environment with sparse rewards. We show that this agent learns semantically meaningful and stable representations of its environment without receiving any semantic labels. Our results show that the agent learns to represent the action-relevant information extracted from pixel input in a wide variety of sparse activation patterns. The quality of the learned representations shows the strength of embodied learning and its advantages over fully supervised approaches with regard to robustness and generalizability. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=r1lEd64YwH |
PDF | https://openreview.net/pdf?id=r1lEd64YwH |
PWC | https://paperswithcode.com/paper/learning-semantically-meaningful |
Repo | |
Framework | |
Growing Action Spaces
Title | Growing Action Spaces |
Authors | Anonymous |
Abstract | In complex tasks, such as those with large combinatorial action spaces, random exploration may be too inefficient to achieve meaningful learning progress. In this work, we use a curriculum of progressively growing action spaces to accelerate learning. We assume the environment is out of our control, but that the agent may set an internal curriculum by initially restricting its action space. Our approach uses off-policy reinforcement learning to estimate optimal value functions for multiple action spaces simultaneously and efficiently transfers data, value estimates, and state representations from restricted action spaces to the full task. We show the efficacy of our approach in proof-of-concept control tasks and on challenging large-scale StarCraft micromanagement tasks with large, multi-agent action spaces. |
Tasks | Starcraft |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Skl4LTEtDS |
PDF | https://openreview.net/pdf?id=Skl4LTEtDS |
PWC | https://paperswithcode.com/paper/growing-action-spaces-1 |
Repo | |
Framework | |
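The curriculum can be sketched as nested action sets with a single Q-table defined over the full space: the agent acts only within the currently unlocked prefix, and when the curriculum advances, the value estimates transfer unchanged. A tabular sketch follows; the action-set schedule is a placeholder, and the paper itself uses function approximation with off-policy updates across spaces.

```python
import numpy as np

n_states, n_full_actions = 10, 8
action_sets = [2, 4, n_full_actions]        # progressively growing action spaces
Q = np.zeros((n_states, n_full_actions))    # values kept for the *full* space
rng = np.random.default_rng(0)

def act(state, level, eps=0.1):
    """Epsilon-greedy, restricted to the first action_sets[level] actions."""
    k = action_sets[level]
    if rng.random() < eps:
        return int(rng.integers(k))
    return int(np.argmax(Q[state, :k]))

def q_update(s, a, reward, s_next, level, alpha=0.1, gamma=0.99):
    """Bootstrap only over the currently unlocked actions."""
    k = action_sets[level]
    Q[s, a] += alpha * (reward + gamma * Q[s_next, :k].max() - Q[s, a])

# When the curriculum advances (level += 1), the Q-values learned on the
# restricted space transfer unchanged to the enlarged one.
```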
On the implicit minimization of alternative loss functions when training deep networks
Title | On the implicit minimization of alternative loss functions when training deep networks |
Authors | Anonymous |
Abstract | Understanding the implicit bias of optimization algorithms is important for improving the generalization of neural networks. One approach to exploiting such understanding would be to make the bias explicit in the loss function. Conversely, an interesting way to gain more insight into the implicit bias is to study how different loss functions are implicitly minimized when training the network. In this work, we concentrate on the inductive bias that arises when minimizing the cross-entropy loss with different batch sizes and learning rates. We investigate how three loss functions are implicitly minimized during training: the hinge loss with different margins, the cross-entropy loss with different temperatures, and a newly introduced Gcdf loss with different standard deviations. The Gcdf loss establishes a connection between a sharpness measure for the 0-1 loss and margin-based loss functions. We find that a common behavior emerges for all the loss functions considered. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=r1lclxBYDS |
PDF | https://openreview.net/pdf?id=r1lclxBYDS |
PWC | https://paperswithcode.com/paper/on-the-implicit-minimization-of-alternative |
Repo | |
Framework | |
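The Gcdf loss is, by its name, a Gaussian-CDF transform of the margin; one consistent reading, with the standard deviation σ as the free parameter, is Φ(−margin/σ), which approaches the 0-1 loss as σ → 0. A sketch of that reading alongside the hinge loss it is compared against; this is our interpretation of the name, not the paper's definition.

```python
import numpy as np
from math import erf, sqrt

def gauss_cdf(x):
    """Standard normal CDF Phi(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def gcdf_loss(margins, sigma=1.0):
    """Assumed Gcdf loss: Phi(-margin / sigma), the probability that a
    Gaussian perturbation of scale sigma flips the example's margin.
    sigma -> 0 recovers the 0-1 loss; larger sigma smooths it."""
    return float(np.mean([gauss_cdf(-m / sigma) for m in margins]))

def hinge_loss(margins, margin=1.0):
    """Hinge loss with a configurable margin, for comparison."""
    return float(np.mean(np.maximum(0.0, margin - np.asarray(margins))))

margins = np.array([-0.5, 0.2, 1.5, 3.0])   # y * f(x) for four examples
for sigma in (0.1, 1.0):
    print("Gcdf, sigma =", sigma, "->", gcdf_loss(margins, sigma))
print("hinge ->", hinge_loss(margins))
```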
Adversarial Training and Provable Defenses: Bridging the Gap
Title | Adversarial Training and Provable Defenses: Bridging the Gap |
Authors | Anonymous |
Abstract | We propose a new method to train neural networks based on a novel combination of adversarial training and provable defenses. The key idea is to model training as a procedure which includes both the verifier and the adversary. In every iteration, the verifier aims to certify the network using convex relaxation while the adversary tries to find inputs inside that convex relaxation which cause verification to fail. We experimentally show that this training method is promising and achieves the best of both worlds: it produces a model with state-of-the-art accuracy (74.8%) and certified robustness (55.9%) on the challenging CIFAR-10 dataset with a 2/255 L-infinity perturbation. This is a significant improvement over the currently best known results of 68.3% accuracy and 53.9% certified robustness, achieved using a network five times larger than ours. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SJxSDxrKDr |
PDF | https://openreview.net/pdf?id=SJxSDxrKDr |
PWC | https://paperswithcode.com/paper/adversarial-training-and-provable-defenses |
Repo | |
Framework | |
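The training loop can be sketched as alternating a verifier, which relaxes part of the network over the input region, with an adversary that searches inside the relaxation for a point that breaks verification. The sketch below uses interval bounds on the first layer as the relaxation and projected gradient ascent in its latent box as the adversary; this is a simplification of the paper's layerwise convex relaxations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

prefix = nn.Linear(4, 8)                              # layer covered by the verifier
suffix = nn.Sequential(nn.ReLU(), nn.Linear(8, 3))    # attacked inside the relaxation
x, y, eps = torch.randn(1, 4), torch.tensor([0]), 0.1

# Verifier: interval relaxation of the prefix over the eps-ball around x.
mu = prefix(x)
r = eps * prefix.weight.abs().sum(dim=1)
lo, hi = (mu - r).detach(), (mu + r).detach()

# Adversary: projected gradient ascent for a worst-case latent point in [lo, hi].
z = mu.detach().clone().requires_grad_(True)
opt = torch.optim.SGD([z], lr=0.05)
for _ in range(10):
    loss = -F.cross_entropy(suffix(z), y)   # maximize classification loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        z.copy_(torch.max(torch.min(z, hi), lo))   # project back into the box

# A training step would now minimize F.cross_entropy(suffix(z.detach()), y),
# coupling the certified region with the strongest attack found inside it.
print(suffix(z).detach())
```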
Evaluating The Search Phase of Neural Architecture Search
Title | Evaluating The Search Phase of Neural Architecture Search |
Authors | Anonymous |
Abstract | Neural Architecture Search (NAS) aims to facilitate the design of deep networks for new tasks. Existing techniques rely on two stages: searching over the architecture space and validating the best architecture. NAS algorithms are currently compared solely based on their results on the downstream task. While intuitive, this fails to explicitly evaluate the effectiveness of their search strategies. In this paper, we propose to evaluate the NAS search phase itself. To this end, we compare the quality of the solutions obtained by NAS search policies with that of random architecture selection. We find that: (i) on average, the state-of-the-art NAS algorithms perform similarly to the random policy; and (ii) the widely used weight-sharing strategy degrades the ranking of the NAS candidates to the point of not reflecting their true performance, thus reducing the effectiveness of the search process. We believe that our evaluation framework will be key to designing NAS strategies that consistently discover architectures superior to random ones. |
Tasks | Neural Architecture Search |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=H1loF2NFwr |
PDF | https://openreview.net/pdf?id=H1loF2NFwr |
PWC | https://paperswithcode.com/paper/evaluating-the-search-phase-of-neural-1 |
Repo | |
Framework | |
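The evaluation protocol reduces to a reproducible baseline: sample as many random architectures as the search policy evaluates, train each under the identical budget, and compare the best validation results. A sketch with `sample_architecture` and `train_and_evaluate` as hypothetical stand-ins for a concrete search space and training pipeline:

```python
import random

def sample_architecture(rng):
    """Hypothetical stand-in for sampling from a cell-based search space."""
    ops = ["conv3x3", "conv5x5", "maxpool", "skip"]
    return tuple(rng.choice(ops) for _ in range(6))

def train_and_evaluate(arch, rng):
    """Hypothetical stand-in: train `arch` under a fixed budget and return
    validation accuracy (a random number here, for illustration only)."""
    return rng.random()

def random_search_baseline(n_samples, seed=0):
    rng = random.Random(seed)
    best_acc, best_arch = -1.0, None
    for _ in range(n_samples):
        arch = sample_architecture(rng)
        acc = train_and_evaluate(arch, rng)
        if acc > best_acc:
            best_acc, best_arch = acc, arch
    return best_acc, best_arch

# A NAS policy is effective only if, under the same evaluation budget,
# it reliably beats this baseline; the paper finds several do not.
print(random_search_baseline(n_samples=100))
```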