Paper Group ANR 790
Gauge theory and twins paradox of disentangled representations
Title | Gauge theory and twins paradox of disentangled representations |
Authors | X. Dong, L. Zhou |
Abstract | Achieving disentangled representations of information is one of the key goals of deep-network-based machine learning systems, and the issue has recently attracted growing discussion. In this paper, by comparing the geometric structure of disentangled representations with the geometry of the evolution of mixed states in quantum mechanics, we give a fibre-bundle-based geometric picture of disentangled representation, which can be regarded as a kind of gauge theory. From this perspective we can build a connection between disentangled representations and the twins paradox in relativity, which helps to clarify some problems concerning disentangled representation. |
Tasks | |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.10545v1 |
https://arxiv.org/pdf/1906.10545v1.pdf | |
PWC | https://paperswithcode.com/paper/gauge-theory-and-twins-paradox-of |
Repo | |
Framework | |
Deep Griffin-Lim Iteration
Title | Deep Griffin-Lim Iteration |
Authors | Yoshiki Masuyama, Kohei Yatabe, Yuma Koizumi, Yasuhiro Oikawa, Noboru Harada |
Abstract | This paper presents a novel method for reconstructing phase from a given amplitude spectrogram alone by combining a signal-processing-based approach and a deep neural network (DNN). To retrieve a time-domain signal from its amplitude spectrogram, the corresponding phase is required. One of the most popular phase reconstruction methods is the Griffin-Lim algorithm (GLA), which is based on the redundancy of the short-time Fourier transform. However, GLA often requires many iterations and produces low-quality signals owing to the lack of prior knowledge of the target signal. To address these issues, we propose an architecture which stacks sub-blocks, each consisting of two GLA-inspired fixed layers and a DNN. The number of stacked sub-blocks is adjustable, so performance can be traded off against computational load according to application requirements. The effectiveness of the proposed method is demonstrated by reconstructing phase from amplitude spectrograms of speech. |
Tasks | |
Published | 2019-03-10 |
URL | http://arxiv.org/abs/1903.03971v1 |
http://arxiv.org/pdf/1903.03971v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-griffin-lim-iteration |
Repo | |
Framework | |
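The GLA-inspired fixed layers in the stacked sub-blocks perform the two projections of the classic Griffin-Lim algorithm: project the spectrogram onto the set of STFT-consistent spectrograms, then onto the set with the target amplitude. As a point of reference, a minimal plain-GLA sketch (no DNN component, which is this paper's addition) using SciPy might look like the following; `nperseg`/`noverlap` are illustrative choices, not the paper's settings:

```python
import numpy as np
from scipy.signal import stft, istft

def griffin_lim(amplitude, n_iter=50, nperseg=256, noverlap=192, seed=0):
    """Classic Griffin-Lim: alternate between (a) the set of spectrograms
    realizable as an STFT of some time-domain signal and (b) the set of
    spectrograms with the target amplitude."""
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(amplitude.shape))
    spec = amplitude * phase                      # random initial phase
    for _ in range(n_iter):
        # (a) Project onto consistent spectrograms: iSTFT then STFT.
        _, x = istft(spec, nperseg=nperseg, noverlap=noverlap)
        _, _, spec = stft(x, nperseg=nperseg, noverlap=noverlap)
        # (b) Enforce the amplitude constraint: keep phase, replace magnitude.
        spec = amplitude * np.exp(1j * np.angle(spec))
    _, x = istft(spec, nperseg=nperseg, noverlap=noverlap)
    return x
```

The paper's architecture replaces some of these fixed iterations with trainable DNN sub-blocks, which is what reduces the iteration count and injects prior knowledge of the target signals.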
Trust-Region Variational Inference with Gaussian Mixture Models
Title | Trust-Region Variational Inference with Gaussian Mixture Models |
Authors | Oleg Arenz, Mingjun Zhong, Gerhard Neumann |
Abstract | Many methods for machine learning rely on approximate inference from intractable probability distributions. Variational inference approximates such distributions by tractable models that can be subsequently used for approximate inference. Learning sufficiently accurate approximations requires a rich model family and careful exploration of the relevant modes of the target distribution. We propose a method for learning accurate Gaussian mixture model (GMM) approximations of intractable probability distributions based on insights from policy search by establishing information-geometric trust regions for principled exploration. For efficient improvement of the GMM approximation, we derive a lower bound on the corresponding optimization objective enabling us to update the components independently. The use of the lower bound ensures convergence to a local optimum of the original objective. The number of components is adapted online by adding new components in promising regions and by deleting components with negligible weight. We demonstrate on several domains that we can learn approximations of complex, multi-modal distributions with a quality that is unmet by previous variational inference methods, and that the GMM approximation can be used for drawing samples that are on par with samples created by state-of-the-art MCMC samplers while requiring up to three orders of magnitude less computational resources. |
Tasks | |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04710v1 |
https://arxiv.org/pdf/1907.04710v1.pdf | |
PWC | https://paperswithcode.com/paper/trust-region-variational-inference-with |
Repo | |
Framework | |
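The optimization objective here is the usual evidence lower bound (ELBO), E_q[log p̃(x) − log q(x)], with q a GMM and p̃ an unnormalized target; the paper's contribution is the trust-region scheme and component-wise lower bound for maximizing it, which is beyond a few lines. A minimal sketch of just the objective for a one-dimensional GMM, Monte Carlo estimated:

```python
import numpy as np

def gmm_logpdf(x, weights, means, stds):
    """Log-density of a one-dimensional Gaussian mixture at points x."""
    comp = (np.log(weights)
            - 0.5 * np.log(2.0 * np.pi * stds ** 2)
            - 0.5 * ((x[:, None] - means) / stds) ** 2)
    return np.logaddexp.reduce(comp, axis=1)  # log sum_k w_k N(x; m_k, s_k^2)

def elbo(log_target, weights, means, stds, n_samples=20000, seed=0):
    """Monte Carlo estimate of E_q[log p~(x) - log q(x)] with q a GMM."""
    rng = np.random.default_rng(seed)
    ks = rng.choice(len(weights), size=n_samples, p=weights)   # pick components
    x = means[ks] + stds[ks] * rng.standard_normal(n_samples)  # sample from them
    return float(np.mean(log_target(x) - gmm_logpdf(x, weights, means, stds)))
```

For a normalized target the ELBO equals −KL(q‖p) plus the (constant) log normalizer, so it is maximized exactly when q matches the target.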
Proceedings of AAAI 2019 Workshop on Network Interpretability for Deep Learning
Title | Proceedings of AAAI 2019 Workshop on Network Interpretability for Deep Learning |
Authors | Quanshi Zhang, Lixin Fan, Bolei Zhou |
Abstract | This is the Proceedings of AAAI 2019 Workshop on Network Interpretability for Deep Learning |
Tasks | |
Published | 2019-01-25 |
URL | http://arxiv.org/abs/1901.08813v2 |
http://arxiv.org/pdf/1901.08813v2.pdf | |
PWC | https://paperswithcode.com/paper/proceedings-of-aaai-2019-workshop-on-network |
Repo | |
Framework | |
Deep Robotic Prediction with hierarchical RGB-D Fusion
Title | Deep Robotic Prediction with hierarchical RGB-D Fusion |
Authors | Yaoxian Song, Jun Wen, Yuejiao Fei, Changbin Yu |
Abstract | Robotic arm grasping is a fundamental operation in robotic control tasks. Most current methods for robotic grasping focus on RGB-D policies in table-surface scenarios or on 3D point cloud analysis and inference in 3D space. In contrast to these methods, we propose a novel real-time multimodal hierarchical encoder-decoder neural network that fuses RGB and depth data to realize humanoid robotic grasping in 3D space from only partial observation. We quantify the uncertainty of the raw depth data and fuse RGB information into the depth estimation. We also develop a general labeling method to produce ground truth for common RGB-D datasets. We evaluate the effectiveness and performance of our method on a physical robot setup, achieving over 90% success rates in both the table-surface and 3D-space scenarios. |
Tasks | Depth Estimation, Robotic Grasping |
Published | 2019-09-14 |
URL | https://arxiv.org/abs/1909.06585v2 |
https://arxiv.org/pdf/1909.06585v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-robotic-prediction-with-hierarchical-rgb |
Repo | |
Framework | |
Supervised Learning of the Next-Best-View for 3D Object Reconstruction
Title | Supervised Learning of the Next-Best-View for 3D Object Reconstruction |
Authors | Miguel Mendoza, J. Irving Vasquez-Gomez, Hind Taud, Luis Enrique Sucar, Carolina Reta |
Abstract | Motivated by the advances in 3D sensing technology and the spread of low-cost robotic platforms, 3D object reconstruction has become a common task in many areas. Nevertheless, selecting the optimal sensor pose that maximizes the reconstructed surface remains an open problem, known in the literature as the next-best-view planning problem. In this paper, we propose a novel next-best-view planning scheme based on supervised deep learning. The scheme contains an algorithm for automatic dataset generation and an original three-dimensional convolutional neural network (3D-CNN) used to learn the next-best-view. Unlike previous work where the problem is addressed as a search, the trained 3D-CNN directly predicts the sensor pose. We compare the proposed network against a similar network and present several experiments on the reconstruction of unknown objects that validate the effectiveness of the proposed scheme. |
Tasks | 3D Object Reconstruction, Object Reconstruction |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05833v1 |
https://arxiv.org/pdf/1905.05833v1.pdf | |
PWC | https://paperswithcode.com/paper/supervised-learning-of-the-next-best-view-for |
Repo | |
Framework | |
Multiplicative Up-Drift
Title | Multiplicative Up-Drift |
Authors | Benjamin Doerr, Timo Kötzing |
Abstract | Drift analysis aims at translating the expected progress of an evolutionary algorithm (or more generally, a random process) into a probabilistic guarantee on its run time (hitting time). So far, drift arguments have been successfully employed in the rigorous analysis of evolutionary algorithms, however, only for the situation that the progress is constant or becomes weaker when approaching the target. Motivated by questions like how fast fit individuals take over a population, we analyze random processes exhibiting a $(1+\delta)$-multiplicative growth in expectation. We prove a drift theorem translating this expected progress into a hitting time. This drift theorem gives a simple and insightful proof of the level-based theorem first proposed by Lehre (2011). Our version of this theorem has, for the first time, the best-possible near-linear dependence on $1/\delta$ (the previous results had an at least near-quadratic dependence), and it only requires a population size near-linear in $\delta$ (this was super-quadratic in previous results). These improvements immediately lead to stronger run time guarantees for a number of applications. We also discuss the case of large $\delta$ and show stronger results for this setting. |
Tasks | |
Published | 2019-04-11 |
URL | https://arxiv.org/abs/1904.05682v3 |
https://arxiv.org/pdf/1904.05682v3.pdf | |
PWC | https://paperswithcode.com/paper/multiplicative-up-drift |
Repo | |
Framework | |
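The setting of the drift theorem can be illustrated with a toy simulation: a process whose next state is Poisson-distributed with mean $(1+\delta)X_t$ grows like $(1+\delta)^t$ in expectation, so the time to first reach a target $n$ from $x_0$ is about $\ln(n/x_0)/\ln(1+\delta) \approx \ln(n/x_0)/\delta$ for small $\delta$. A sketch (illustrative only; the paper's theorem covers far more general processes and gives rigorous tail bounds):

```python
import math
import numpy as np

def hitting_time(delta, x0, target, seed=0):
    """Steps until X_t >= target for X_{t+1} ~ Poisson((1+delta) * X_t)."""
    rng = np.random.default_rng(seed)
    x, t = x0, 0
    while x < target:
        x = rng.poisson((1.0 + delta) * x)
        t += 1
        if x == 0:  # extinction is possible but rare when x0 >> 1/delta
            x = x0
    return t

def predicted_time(delta, x0, target):
    """Expectation-level prediction: solve x0 * (1+delta)^t = target."""
    return math.log(target / x0) / math.log(1.0 + delta)
```

Starting from $x_0$ comfortably above $1/\delta$ (mirroring the population-size condition in the theorem) keeps the relative Poisson noise per step below the $\delta$ drift, so the simulated hitting time concentrates near the prediction.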
Building 3D Object Models during Manipulation by Reconstruction-Aware Trajectory Optimization
Title | Building 3D Object Models during Manipulation by Reconstruction-Aware Trajectory Optimization |
Authors | Kanrun Huang, Tucker Hermans |
Abstract | Object shape provides important information for robotic manipulation; for instance, selecting an effective grasp depends on both the global and local shape of the object of interest, while reaching into clutter requires accurate surface geometry to avoid unintended contact with the environment. Model-based 3D object manipulation is a widely studied problem; however, obtaining accurate 3D models of multiple objects often requires tedious work. In this letter, we exploit Gaussian process implicit surfaces (GPIS) extracted from RGB-D sensor data to grasp an unknown object. We propose a reconstruction-aware trajectory optimization that makes use of the extracted GPIS model to plan a motion that improves the estimate of the object's 3D geometry while performing a pick-and-place action. We present a probabilistic approach for a robot to autonomously learn and track the object while achieving the manipulation task. We use a sampling-based trajectory generation method to explore the unseen parts of the object using the estimated conditional entropy of the GPIS model. We validate our method with physical robot experiments across eleven objects of varying shape from the YCB object dataset. Our experiments show that our reconstruction-aware trajectory optimization provides higher-quality 3D object reconstruction than directly solving the manipulation task or using a heuristic to view unseen portions of the object. |
Tasks | 3D Object Reconstruction, Object Reconstruction |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.03907v1 |
https://arxiv.org/pdf/1905.03907v1.pdf | |
PWC | https://paperswithcode.com/paper/building-3d-object-models-during-manipulation |
Repo | |
Framework | |
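The conditional-entropy criterion rests on a standard Gaussian process property: the predictive distribution at any point is Gaussian, so its entropy, 0.5·log(2πe·σ²(x)), is monotone in the predictive variance, and "most informative place to look next" reduces to "largest predictive variance". A heavily simplified one-dimensional sketch of that idea (a vanilla GP with an RBF kernel, not the paper's GPIS-over-trajectories formulation):

```python
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel matrix between point sets a and b."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def next_best_point(x_obs, candidates, noise=1e-4):
    """Pick the candidate with the highest GP predictive variance,
    i.e. the highest predictive (conditional) entropy."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_star = rbf(candidates, x_obs)                # (n_cand, n_obs)
    solved = np.linalg.solve(K, k_star.T)          # K^{-1} k_*^T
    var = 1.0 - np.sum(k_star * solved.T, axis=1)  # prior RBF variance is 1
    return candidates[int(np.argmax(var))], var
```

With observations clustered in one region, the highest-variance candidate is the one farthest from all observations, which is the exploratory behavior the trajectory optimization rewards.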
Scalable High Performance SDN Switch Architecture on FPGA for Core Networks
Title | Scalable High Performance SDN Switch Architecture on FPGA for Core Networks |
Authors | Sasindu Wijeratne, Ashen Ekanayake, Sandaruwan Jayaweera, Danuka Ravishan, Ajith Pasqual |
Abstract | Owing to increasing heterogeneity in network user requirements, dynamically varying day-to-day network traffic patterns, and delays in network service deployment, there is a huge demand for scalability and flexibility in modern networking infrastructure, which in turn has paved the way for the introduction of Software Defined Networking (SDN) in core networks. In this paper, we present an FPGA-based switch that is fully compliant with OpenFlow, the pioneering protocol for the southbound interface of SDN. The switch architecture is implemented entirely in hardware. The design includes an OpenFlow southbound agent that can process OpenFlow packets at a rate of 10 Gbps. The proposed architecture scales up to 400 Gbps while consuming only 60% of the resources of a Xilinx Virtex-7 XC7VX485T FPGA. |
Tasks | |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.13683v1 |
https://arxiv.org/pdf/1910.13683v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-high-performance-sdn-switch |
Repo | |
Framework | |
A Baseline for Few-Shot Image Classification
Title | A Baseline for Few-Shot Image Classification |
Authors | Guneet S. Dhillon, Pratik Chaudhari, Avinash Ravichandran, Stefano Soatto |
Abstract | Fine-tuning a deep network trained with the standard cross-entropy loss is a strong baseline for few-shot learning. When fine-tuned transductively, this outperforms the current state-of-the-art on standard datasets such as Mini-ImageNet, Tiered-ImageNet, CIFAR-FS and FC-100 with the same hyper-parameters. The simplicity of this approach enables us to demonstrate the first few-shot learning results on the ImageNet-21k dataset. We find that using a large number of meta-training classes results in high few-shot accuracies even for a large number of few-shot classes. We do not advocate our approach as the solution for few-shot learning, but simply use the results to highlight limitations of current benchmarks and few-shot protocols. We perform extensive studies on benchmark datasets to propose a metric that quantifies the “hardness” of a few-shot episode. This metric can be used to report the performance of few-shot algorithms in a more systematic way. |
Tasks | Few-Shot Image Classification, Few-Shot Learning, Image Classification |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.02729v4 |
https://arxiv.org/pdf/1909.02729v4.pdf | |
PWC | https://paperswithcode.com/paper/a-baseline-for-few-shot-image-classification |
Repo | |
Framework | |
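At its core, transductive fine-tuning optimizes the standard cross-entropy on the labeled support set plus an entropy penalty on the unlabeled query predictions. A minimal sketch of that loss on frozen feature vectors with a linear classifier (the paper fine-tunes the full deep network; the hyper-parameters here are illustrative, not the paper's):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def transductive_finetune(f_sup, y_sup, f_qry, n_way,
                          lr=0.5, steps=300, lam=0.1, seed=0):
    """Minimize support cross-entropy + lam * query-prediction entropy
    over a linear classifier W on fixed (frozen) features."""
    rng = np.random.default_rng(seed)
    W = 0.01 * rng.standard_normal((f_sup.shape[1], n_way))
    onehot = np.eye(n_way)[y_sup]
    for _ in range(steps):
        p_sup = softmax(f_sup @ W)
        p_qry = softmax(f_qry @ W)
        log_q = np.log(p_qry + 1e-12)
        ent_row = -(p_qry * log_q).sum(axis=1, keepdims=True)
        # d(per-row entropy)/d(logits) = -p * (log p + H_row)
        g_ent = -p_qry * (log_q + ent_row)
        grad = (f_sup.T @ (p_sup - onehot) / len(f_sup)
                + lam * f_qry.T @ g_ent / len(f_qry))
        W -= lr * grad
    return W
```

The entropy term sharpens the query predictions, which is what makes the episode transductive: the unlabeled query points influence the classifier before they are evaluated.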
Learning about spatial inequalities: Capturing the heterogeneity in the urban environment
Title | Learning about spatial inequalities: Capturing the heterogeneity in the urban environment |
Authors | J. Siqueira-Gay, M. A. Giannotti, M. Sester |
Abstract | Transportation systems can be conceptualized as an instrument for spreading people and resources over a territory, playing an important role in developing sustainable cities. The current rationale of transport provision is based on population demand, disregarding land use and socioeconomic information. To meet the challenge of promoting a more equitable resource distribution, this work aims at identifying and describing patterns of urban service supply, their accessibility, and household income. Using a multidimensional approach, the spatial inequalities of a large city of the global south reveal that the low-income population has low access mainly to hospitals and cultural centers. This group presents an intermediate level of accessibility to public schools and sports centers, evidencing the diverse conditions of citizens in the peripheries. These complex outcomes, generated by the interaction of land use and public transportation, emphasize the importance of comprehensive methodological approaches to support decisions on urban projects, plans, and programs. Reducing spatial inequalities, especially by providing services for deprived groups, is fundamental to promoting the sustainable use of resources and optimizing daily commuting. |
Tasks | |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1908.00625v1 |
https://arxiv.org/pdf/1908.00625v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-about-spatial-inequalities-capturing |
Repo | |
Framework | |
Pix2Vex: Image-to-Geometry Reconstruction using a Smooth Differentiable Renderer
Title | Pix2Vex: Image-to-Geometry Reconstruction using a Smooth Differentiable Renderer |
Authors | Felix Petersen, Amit H. Bermano, Oliver Deussen, Daniel Cohen-Or |
Abstract | The long-coveted task of reconstructing 3D geometry from images remains an open problem. In this paper, we build on the power of neural networks and introduce Pix2Vex, a network trained to convert camera-captured images into 3D geometry. We present a novel differentiable renderer ($DR$) used as a forward-validation step during training. Our key insight is that $DR$s produce images of a particular appearance, different from typical input images. Hence, we propose adding an image-to-image translation component that converts between these rendering styles. This translation closes the training loop while requiring only minimal supervision, without needing any 3D model as ground truth. Unlike state-of-the-art methods, our $DR$ is $C^\infty$ smooth and thus does not display any discontinuities at occlusions or dis-occlusions. Through our novel training scheme, our network can train on different types of images, whereas previous work can typically only train on images similar in appearance to those rendered by a $DR$. |
Tasks | 3D Object Reconstruction, Domain Adaptation, Image-to-Image Translation, Object Reconstruction |
Published | 2019-03-26 |
URL | https://arxiv.org/abs/1903.11149v2 |
https://arxiv.org/pdf/1903.11149v2.pdf | |
PWC | https://paperswithcode.com/paper/pix2vex-image-to-geometry-reconstruction |
Repo | |
Framework | |
On The Evaluation of Machine Translation Systems Trained With Back-Translation
Title | On The Evaluation of Machine Translation Systems Trained With Back-Translation |
Authors | Sergey Edunov, Myle Ott, Marc’Aurelio Ranzato, Michael Auli |
Abstract | Back-translation is a widely used data augmentation technique which leverages target monolingual data. However, its effectiveness has been challenged since automatic metrics such as BLEU only show significant improvements for test examples where the source itself is a translation, or translationese. This is believed to be due to translationese inputs better matching the back-translated training data. In this work, we show that this conjecture is not empirically supported and that back-translation improves translation quality of both naturally occurring text as well as translationese according to professional human translators. We provide empirical evidence to support the view that back-translation is preferred by humans because it produces more fluent outputs. BLEU cannot capture human preferences because references are translationese when source sentences are natural text. We recommend complementing BLEU with a language model score to measure fluency. |
Tasks | Data Augmentation, Language Modelling, Machine Translation |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05204v1 |
https://arxiv.org/pdf/1908.05204v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-evaluation-of-machine-translation |
Repo | |
Framework | |
Learning to Impute: A General Framework for Semi-supervised Learning
Title | Learning to Impute: A General Framework for Semi-supervised Learning |
Authors | Wei-Hong Li, Chuan-Sheng Foo, Hakan Bilen |
Abstract | Recent semi-supervised learning methods have been shown to achieve results comparable to their supervised counterparts while using only a small portion of labels in image classification tasks, thanks to their regularization strategies. In this paper, we take a more direct approach to semi-supervised learning and propose learning to impute the labels of unlabeled samples such that a network achieves better generalization when trained on these labels. We pose the problem in a learning-to-learn formulation which can easily be incorporated into state-of-the-art semi-supervised techniques to boost their performance, especially when labels are limited. We demonstrate that our method is applicable to both classification and regression problems, including image classification and facial landmark detection tasks. |
Tasks | Facial Landmark Detection, Image Classification |
Published | 2019-12-22 |
URL | https://arxiv.org/abs/1912.10364v1 |
https://arxiv.org/pdf/1912.10364v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-impute-a-general-framework-for-1 |
Repo | |
Framework | |
Software Tools for Big Data Resources in Family Names Dictionaries
Title | Software Tools for Big Data Resources in Family Names Dictionaries |
Authors | Adam Rambousek, Harry Parkin, Ales Horak |
Abstract | This paper describes the design and development of the specific software tools used during the creation of the Family Names in Britain and Ireland (FaNBI) research project, started by the University of the West of England in 2010 and finished successfully in 2016. First, an overview of the project and its methodology is provided. The next section describes the dictionary management tools and the software tools used to combine input data resources. |
Tasks | |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.09234v1 |
http://arxiv.org/pdf/1904.09234v1.pdf | |
PWC | https://paperswithcode.com/paper/190409234 |
Repo | |
Framework | |