Paper Group ANR 1113
Make $\ell_1$ Regularization Effective in Training Sparse CNN. Coordinating Disaster Emergency Response with Heuristic Reinforcement Learning. Deep Hierarchical Machine: a Flexible Divide-and-Conquer Architecture. Learning from Informants: Relations between Learning Success Criteria. Estimating Total Search Space Size for Specific Piece Sets in Che …
Make $\ell_1$ Regularization Effective in Training Sparse CNN
Title | Make $\ell_1$ Regularization Effective in Training Sparse CNN |
Authors | Juncai He, Xiaodong Jia, Jinchao Xu, Lian Zhang, Liang Zhao |
Abstract | Compressed sensing using $\ell_1$ regularization is among the most powerful and popular sparsification techniques in many applications, but why has it not been used to obtain sparse deep learning models such as convolutional neural networks (CNNs)? This paper aims to answer this question and to show how to make it work. We first demonstrate that the commonly used stochastic gradient descent (SGD) training algorithm and its variants are not an appropriate match for $\ell_1$ regularization, and then replace them with a different training algorithm based on a regularized dual averaging (RDA) method. RDA was originally designed specifically for convex problems, but with new theoretical insight and algorithmic modifications (proper initialization and adaptivity), we make it an effective match for $\ell_1$ regularization, achieving state-of-the-art sparsity for CNNs compared to other weight pruning methods without compromising accuracy (for example, 95% sparsity for ResNet18 on CIFAR-10). |
Tasks | |
Published | 2018-07-11 |
URL | https://arxiv.org/abs/1807.04222v4 |
https://arxiv.org/pdf/1807.04222v4.pdf | |
PWC | https://paperswithcode.com/paper/modified-regularized-dual-averaging-method |
Repo | |
Framework | |
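The mechanism that lets RDA produce exact zeros can be sketched in a few lines: the update averages all past (sub)gradients and then applies a closed-form soft-thresholding step, so coordinates whose average gradient is small are set exactly to zero, whereas an SGD subgradient step almost never lands exactly on zero. The sketch below is the standard ℓ1-RDA update from the convex setting (Xiao's simple-RDA form), not the authors' modified algorithm; the function name and the `gamma` scaling are illustrative assumptions.

```python
import numpy as np

def rda_l1_step(grad_sum, t, lam, gamma=1.0):
    """One simplified l1-RDA update: average past gradients, then soft-threshold.

    grad_sum : running sum of (sub)gradients g_1 + ... + g_t
    t        : iteration count (t >= 1)
    lam      : l1 regularization strength
    gamma    : scaling of the sqrt(t) proximal term (illustrative)

    Returns the closed-form minimizer of
        <gbar, w> + lam * ||w||_1 + (gamma / (2 * sqrt(t))) * ||w||^2,
    i.e. a soft-thresholded, rescaled negative average gradient.
    Coordinates with |gbar_i| <= lam come out exactly zero.
    """
    gbar = grad_sum / t
    scale = np.sqrt(t) / gamma
    return -scale * np.sign(gbar) * np.maximum(np.abs(gbar) - lam, 0.0)
```

With `lam = 1.0`, a coordinate whose averaged gradient is 0.5 is thresholded to exactly 0, which is the sparsification effect the abstract refers to.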
Coordinating Disaster Emergency Response with Heuristic Reinforcement Learning
Title | Coordinating Disaster Emergency Response with Heuristic Reinforcement Learning |
Authors | Long Nguyen, Zhou Yang, Jiazhen Zhu, Jia Li, Fang Jin |
Abstract | A crucial and time-sensitive task when any disaster occurs is to rescue victims and distribute resources to the right groups and locations. This task is challenging in populated urban areas, due to the huge burst of help requests generated in a very short period. To improve the efficiency of the emergency response in the immediate aftermath of a disaster, we propose a heuristic multi-agent reinforcement learning scheduling algorithm, named ResQ, which can effectively schedule the rapid deployment of volunteers to rescue victims in dynamic settings. The core concept is to quickly identify victims and volunteers from social network data and then schedule rescue parties with an adaptive learning algorithm. This framework performs two key functions: 1) identify trapped victims and rescue volunteers, and 2) optimize the volunteers’ rescue strategy in a complex time-sensitive environment. The proposed ResQ algorithm speeds up training through a heuristic function that reduces the state-action space by prioritizing a particular set of actions over others. Experimental results show that the proposed heuristic multi-agent reinforcement learning based scheduling outperforms several state-of-the-art methods in terms of both reward rate and response time. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2018-11-12 |
URL | http://arxiv.org/abs/1811.05010v1 |
http://arxiv.org/pdf/1811.05010v1.pdf | |
PWC | https://paperswithcode.com/paper/coordinating-disaster-emergency-response-with |
Repo | |
Framework | |
Deep Hierarchical Machine: a Flexible Divide-and-Conquer Architecture
Title | Deep Hierarchical Machine: a Flexible Divide-and-Conquer Architecture |
Authors | Shichao Li, Xin Yang, Tim Cheng |
Abstract | We propose the Deep Hierarchical Machine (DHM), a model inspired by the divide-and-conquer strategy that emphasizes representation learning ability and flexibility. A stochastic routing framework, as used by recent deep neural decision/regression forests, is incorporated, but we remove the need to evaluate unnecessary computation paths by using a different topology and introducing a probabilistic pruning technique. We also present a specialized version of DHM (DSHM) for efficiency, which inherits the sparse feature extraction process of traditional decision trees with pixel-difference features. To achieve sparse feature extraction, we propose using sparse convolution operations in DSHM and show one way of introducing sparse convolution kernels via a local binary convolution layer. DHM can be applied to both classification and regression problems, and we validate it on standard image classification and face alignment tasks to show its advantages over past architectures. |
Tasks | Face Alignment, Image Classification, Representation Learning |
Published | 2018-12-03 |
URL | http://arxiv.org/abs/1812.00647v1 |
http://arxiv.org/pdf/1812.00647v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-hierarchical-machine-a-flexible-divide |
Repo | |
Framework | |
Learning from Informants: Relations between Learning Success Criteria
Title | Learning from Informants: Relations between Learning Success Criteria |
Authors | Martin Aschenbach, Timo Kötzing, Karen Seidel |
Abstract | Learning from positive and negative information, so-called \emph{informants}, one of the models for human and machine learning introduced by Gold, is investigated. In particular, naturally arising questions about this learning setting, originating in results on learning from solely positive information, are answered. By a carefully arranged argument, learners can be assumed to change their hypothesis only when it is inconsistent with the data (such learning behavior is called \emph{conservative}). The deduced main theorem states the relations between the most important delayable learning success criteria, namely those not ruined by a delayed hypothesis output. Additionally, our investigations concerning the non-delayable requirement of consistent learning support the claim that \emph{delayability} is the right structural property for gaining a deeper understanding of the nature of learning success criteria. Moreover, we obtain an anomalous \emph{hierarchy} when allowing an increasing finite number of \emph{anomalies} in the learner's hypothesized language compared with the language to be learned. In contrast to the vacillatory hierarchy for learning from solely positive information, we observe a \emph{duality} depending on whether infinitely many \emph{vacillations} between different (almost) correct hypotheses are still considered successful learning behavior. |
Tasks | |
Published | 2018-01-31 |
URL | https://arxiv.org/abs/1801.10502v4 |
https://arxiv.org/pdf/1801.10502v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-informants-relations-between |
Repo | |
Framework | |
Estimating Total Search Space Size for Specific Piece Sets in Chess
Title | Estimating Total Search Space Size for Specific Piece Sets in Chess |
Authors | Azlan Iqbal |
Abstract | Automatic chess problem or puzzle composition typically involves generating and testing various positions, sometimes using particular piece sets. Once a position has been generated, it is usually tested for positional legality based on the game rules. However, it is useful to be able to estimate, to begin with, the size of the search space for particular piece combinations. So if a desirable chess problem was successfully generated by examining ‘merely’ 100,000 or so positions in a theoretical search space of about 100 billion, this would imply the composing approach used was quite viable and perhaps even impressive. In this article, I explain a method of calculating the size of this search space using a combinatorics and permutations approach. While the mathematics itself may already be established, a precise method and justification for applying it to the chessboard and chess pieces has not been documented, to the best of my knowledge. Additionally, the method could serve as a useful starting point for further estimations of search space size that filter out positions for legality and rotation, depending on how the automatic composer is allowed to place pieces on the board (because this affects its total search space size). |
Tasks | |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1803.00874v1 |
http://arxiv.org/pdf/1803.00874v1.pdf | |
PWC | https://paperswithcode.com/paper/estimating-total-search-space-size-for |
Repo | |
Framework | |
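The counting the article describes can be sketched directly: choose squares for each piece type in turn, using combinations so that identical pieces (e.g. two white rooks) are not double-counted. This is a hypothetical helper illustrating the raw combinatorics only; it deliberately ignores the legality filters (kings in check, pawns on back ranks, rotations) that the article treats as a separate refinement step.

```python
from math import comb

def placements(piece_counts):
    """Upper bound on board arrangements for a multiset of chess pieces.

    piece_counts: dict mapping piece type -> count, e.g. {'K': 1, 'k': 1}.
    Squares are allocated to each piece type in turn; pieces of the same
    type are interchangeable, so combinations (not permutations) are used.
    The result counts raw placements, not legal positions.
    """
    squares, total = 64, 1
    for n in piece_counts.values():
        total *= comb(squares, n)  # choose squares for this piece type
        squares -= n               # those squares are now occupied
    return total
```

For two distinct pieces (say both kings) this gives 64 × 63 = 4,032 placements, while two identical pawns give C(64, 2) = 2,016, half as many, which is exactly the effect of dividing out permutations of like pieces.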
Interpretable Partitioned Embedding for Customized Fashion Outfit Composition
Title | Interpretable Partitioned Embedding for Customized Fashion Outfit Composition |
Authors | Zunlei Feng, Zhenyun Yu, Yezhou Yang, Yongcheng Jing, Junxiao Jiang, Mingli Song |
Abstract | Intelligent fashion outfit composition has become more and more popular in recent years. Some recent deep learning based approaches achieve competitive composition results. However, their unexplainable nature prevents such approaches from meeting the urge of designers, businesses and consumers to comprehend the importance of different attributes in an outfit composition. To realize interpretable and customized fashion outfit composition, we propose a partitioned embedding network that learns interpretable representations from clothing items. The overall network architecture consists of three components: an auto-encoder module, a supervised attributes module and a multi-independent module. The auto-encoder module serves to encode all useful information into the embedding. In the supervised attributes module, multiple attribute labels are adopted to ensure that different parts of the overall embedding correspond to different attributes. In the multi-independent module, adversarial operations are adopted to fulfill the mutual independence constraint. With the interpretable and partitioned embedding, we then construct an outfit composition graph and an attribute matching map. Given a specified attribute description, our model can recommend a ranked list of outfit compositions with interpretable matching scores. Extensive experiments demonstrate that 1) the partitioned embedding has unmingled parts corresponding to different attributes, and 2) outfits recommended by our model are more desirable in comparison with existing methods. |
Tasks | |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.04845v4 |
http://arxiv.org/pdf/1806.04845v4.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-partitioned-embedding-for |
Repo | |
Framework | |
Solar Cell Surface Defect Inspection Based on Multispectral Convolutional Neural Network
Title | Solar Cell Surface Defect Inspection Based on Multispectral Convolutional Neural Network |
Authors | Haiyong Chen, Yue Pang, Qidi Hu, Kun Liu |
Abstract | Detecting similar and indeterminate defects on solar cell surfaces with heterogeneous texture and complex background is a challenge in solar cell manufacturing. The traditional manufacturing process relies on human visual inspection, which requires a large number of workers and does not yield stable, reliable detection. To solve this problem, a visual defect detection method based on a multi-spectral deep convolutional neural network (CNN) is designed in this paper. First, a candidate CNN model is established; by adjusting the depth and width of the model, the influence of model depth and kernel size on the recognition result is evaluated, and the optimal CNN model structure is selected. Second, the light spectrum features of solar cell color images are analyzed. It is found that a variety of defects exhibit different distinguishable characteristics in different spectral bands. Thus, a multi-spectral CNN model is constructed to enhance the model's ability to distinguish between complex textured background features and defect features. Finally, experimental results and K-fold cross validation show that the multi-spectral deep CNN model can effectively detect solar cell surface defects with higher accuracy and greater adaptability. The accuracy of defect recognition reaches 94.30%. Applying such an algorithm can increase the efficiency of solar cell manufacturing and make the manufacturing process smarter. |
Tasks | |
Published | 2018-12-15 |
URL | http://arxiv.org/abs/1812.06220v1 |
http://arxiv.org/pdf/1812.06220v1.pdf | |
PWC | https://paperswithcode.com/paper/solar-cell-surface-defect-inspection-based-on |
Repo | |
Framework | |
A Survey of Conventional and Artificial Intelligence / Learning based Resource Allocation and Interference Mitigation Schemes in D2D Enabled Networks
Title | A Survey of Conventional and Artificial Intelligence / Learning based Resource Allocation and Interference Mitigation Schemes in D2D Enabled Networks |
Authors | Kamran Zia, Nauman Javed, Muhammad Nadeem Sial, Sohail Ahmed, Hifsa Iram, Asad Amir Pirzada |
Abstract | 5th generation networks are envisioned to provide seamless and ubiquitous connectivity to 1000-fold more devices and are believed to provide ultra-low latency and higher data rates, up to tens of Gbps. Different technologies enabling these requirements are being developed, including mmWave communications, massive MIMO and beamforming, Device-to-Device (D2D) communications, and heterogeneous networks. D2D communication is a promising technology for applications requiring high bandwidth, such as online streaming and online gaming. It can also provide the ultra-low latencies required for applications like vehicle-to-vehicle communication for autonomous driving. D2D communication can provide higher data rates with higher energy efficiency and spectral efficiency compared to conventional communication. The performance benefits of D2D communication are best achieved when D2D users reuse the spectrum utilized by conventional cellular users. This spectrum sharing in a multi-tier heterogeneous network introduces complex interference between D2D users and cellular users, which needs to be resolved. Motivated by the limited number of surveys on interference mitigation and resource allocation in D2D enabled heterogeneous networks, we survey different conventional and artificial intelligence based interference mitigation and resource allocation schemes developed in recent years. Our contribution lies in the analysis of conventional interference mitigation techniques and their shortcomings. Finally, the strengths of AI based techniques are determined, and open research challenges deduced from the recent research are presented. |
Tasks | Autonomous Driving |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.08748v1 |
http://arxiv.org/pdf/1809.08748v1.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-of-conventional-and-artificial |
Repo | |
Framework | |
Multi-Label Wireless Interference Identification with Convolutional Neural Networks
Title | Multi-Label Wireless Interference Identification with Convolutional Neural Networks |
Authors | Sergej Grunau, Dimitri Block, Uwe Meier |
Abstract | The steadily growing use of license-free frequency bands requires reliable coexistence management and therefore proper wireless interference identification (WII). In this work, we propose a WII approach based upon a deep convolutional neural network (CNN) which classifies multiple IEEE 802.15.1, IEEE 802.11 b/g and IEEE 802.15.4 interfering signals in the presence of a utilized signal. The generated multi-label dataset contains frequency- and time-limited sensing snapshots with a bandwidth of 10 MHz and a duration of 12.8 $\mu$s. Each snapshot combines one utilized signal with up to several interfering signals. The approach shows promising results for same-technology interference, with a classification accuracy of approximately 100% for IEEE 802.15.1 and IEEE 802.15.4 signals, and at least 90% accuracy for cross-technology interference with IEEE 802.11 b/g signals. |
Tasks | |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04395v1 |
http://arxiv.org/pdf/1804.04395v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-label-wireless-interference |
Repo | |
Framework | |
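The multi-label aspect is the key design point: instead of a softmax, which forces exactly one class, each technology gets an independent sigmoid output, so several simultaneous interferers can be flagged in one snapshot. Below is a minimal sketch of such a decision head; the function name and the 0.5 threshold are illustrative assumptions, not details from the paper.

```python
import numpy as np

def multi_label_predict(logits, threshold=0.5):
    """Multi-label decision: one independent sigmoid per technology.

    Unlike softmax (single-label), each class probability is computed
    independently, so any subset of classes can be active at once.
    logits: (n_classes,) raw scores from the network's final layer.
    Returns a boolean mask of detected technologies.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))  # element-wise sigmoid
    return probs > threshold
```

Training such a head typically uses a per-class binary cross-entropy loss, one term per technology, rather than a single categorical cross-entropy.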
Gradient Layer: Enhancing the Convergence of Adversarial Training for Generative Models
Title | Gradient Layer: Enhancing the Convergence of Adversarial Training for Generative Models |
Authors | Atsushi Nitanda, Taiji Suzuki |
Abstract | We propose a new technique that boosts the convergence of training generative adversarial networks. Generally, the convergence rate of training deep models degrades severely after multiple iterations. A key reason for this phenomenon is that a deep network is expressed using a highly non-convex finite-dimensional model, and thus the parameters get stuck in a local optimum. Because of this, methods often suffer not only from degeneration of the convergence speed but also from limitations in the representational power of the trained network. To overcome this issue, we propose an additional layer called the gradient layer to seek a descent direction in an infinite-dimensional space. Because the layer is constructed in the infinite-dimensional space, we are not restricted by the specific model structure of finite-dimensional models. As a result, we can get out of the local optima of finite-dimensional models and move towards the globally optimal function more directly. In this paper, this phenomenon is explained from the functional gradient method perspective of the gradient layer. Interestingly, the optimization procedure using the gradient layer naturally constructs the deep structure of the network. Moreover, we demonstrate that this procedure can be regarded as a discretization method of the gradient flow that naturally reduces the objective function. Finally, the method is tested in several numerical experiments, which show its fast convergence. |
Tasks | |
Published | 2018-01-07 |
URL | http://arxiv.org/abs/1801.02227v2 |
http://arxiv.org/pdf/1801.02227v2.pdf | |
PWC | https://paperswithcode.com/paper/gradient-layer-enhancing-the-convergence-of |
Repo | |
Framework | |
Spatio-temporal Edge Service Placement: A Bandit Learning Approach
Title | Spatio-temporal Edge Service Placement: A Bandit Learning Approach |
Authors | Lixing Chen, Jie Xu, Shaolei Ren, Pan Zhou |
Abstract | Shared edge computing platforms deployed at the radio access network are expected to significantly improve quality of service delivered by Application Service Providers (ASPs) in a flexible and economic way. However, placing edge service in every possible edge site by an ASP is practically infeasible due to the ASP’s prohibitive budget requirement. In this paper, we investigate the edge service placement problem of an ASP under a limited budget, where the ASP dynamically rents computing/storage resources in edge sites to host its applications in close proximity to end users. Since the benefit of placing edge service in a specific site is usually unknown to the ASP a priori, optimal placement decisions must be made while learning this benefit. We pose this problem as a novel combinatorial contextual bandit learning problem. It is “combinatorial” because only a limited number of edge sites can be rented to provide the edge service given the ASP’s budget. It is “contextual” because we utilize user context information to enable finer-grained learning and decision making. To solve this problem and optimize the edge computing performance, we propose SEEN, a Spatial-temporal Edge sErvice placemeNt algorithm. Furthermore, SEEN is extended to scenarios with overlapping service coverage by incorporating a disjunctively constrained knapsack problem. In both cases, we prove that our algorithm achieves a sublinear regret bound when it is compared to an oracle algorithm that knows the exact benefit information. Simulations are carried out on a real-world dataset, whose results show that SEEN significantly outperforms benchmark solutions. |
Tasks | Decision Making |
Published | 2018-10-07 |
URL | http://arxiv.org/abs/1810.03069v1 |
http://arxiv.org/pdf/1810.03069v1.pdf | |
PWC | https://paperswithcode.com/paper/spatio-temporal-edge-service-placement-a |
Repo | |
Framework | |
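The "combinatorial" part of the bandit formulation can be illustrated with a context-free upper-confidence-bound (UCB) selection rule: each round, rent the k sites whose empirical benefit plus exploration bonus is highest. This is a hypothetical sketch of the generic idea, not the paper's SEEN algorithm, which additionally exploits user context and handles overlapping coverage.

```python
import numpy as np

def ucb_top_k(means, counts, t, k):
    """Select k edge sites by upper-confidence-bound index.

    means  : (n_sites,) empirical mean benefit per site
    counts : (n_sites,) number of times each site has been rented
    t      : current round (1-indexed)
    k      : budget, i.e. how many sites may be rented this round

    Rarely-tried sites get a large exploration bonus, so the rule
    balances exploiting known-good sites with probing uncertain ones.
    """
    bonus = np.sqrt(2.0 * np.log(t) / np.maximum(counts, 1))
    index = means + bonus
    return np.argsort(index)[-k:]  # indices of the k highest UCB scores
```

Note how a site rented only once can outrank a site with a higher empirical mean: its confidence interval is still wide, which is what drives exploration.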
A Multi-Discriminator CycleGAN for Unsupervised Non-Parallel Speech Domain Adaptation
Title | A Multi-Discriminator CycleGAN for Unsupervised Non-Parallel Speech Domain Adaptation |
Authors | Ehsan Hosseini-Asl, Yingbo Zhou, Caiming Xiong, Richard Socher |
Abstract | Domain adaptation plays an important role for speech recognition models, in particular for domains that have low resources. We propose a novel generative model based on the cycle-consistent generative adversarial network (CycleGAN) for unsupervised non-parallel speech domain adaptation. The proposed model employs multiple independent discriminators on the power spectrogram, each in charge of different frequency bands. As a result we have 1) better discriminators that focus on fine-grained details of the frequency features, and 2) a generator that is capable of generating more realistic domain-adapted spectrograms. We demonstrate the effectiveness of our method on speech recognition with gender adaptation, where the model only has access to supervised data from one gender during training, but is evaluated on the other at test time. Our model achieves an average relative performance improvement of 7.41% in phoneme error rate and 11.10% in word error rate compared to the baseline, on the TIMIT and WSJ datasets, respectively. Qualitatively, our model also generates more natural sounding speech when conditioned on data from the other domain. |
Tasks | Domain Adaptation, Speech Recognition |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1804.00522v4 |
http://arxiv.org/pdf/1804.00522v4.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-discriminator-cyclegan-for |
Repo | |
Framework | |
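The core architectural idea, one discriminator per frequency band of the power spectrogram, reduces to partitioning the spectrogram along its frequency axis and feeding each slice to its own discriminator. A minimal sketch of that partitioning step follows; uniform band boundaries are an assumption here, as the paper's exact band split is not given in the abstract.

```python
import numpy as np

def split_frequency_bands(spec, n_bands):
    """Split a power spectrogram into contiguous frequency bands.

    spec    : (freq_bins, time_frames) array.
    n_bands : number of discriminators, one per band.
    Returns a list of n_bands sub-spectrograms along the frequency axis;
    when freq_bins is not divisible by n_bands, earlier bands get the
    extra rows (numpy.array_split semantics).
    """
    return np.array_split(spec, n_bands, axis=0)
```

Each returned slice would then be passed to a separate discriminator, letting every discriminator specialize in the fine-grained structure of its own band.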
Learning Deep Convolutional Networks for Demosaicing
Title | Learning Deep Convolutional Networks for Demosaicing |
Authors | Nai-Sheng Syu, Yu-Sheng Chen, Yung-Yu Chuang |
Abstract | This paper presents a comprehensive study of applying the convolutional neural network (CNN) to solving the demosaicing problem. The paper presents two CNN models that learn end-to-end mappings between the mosaic samples and the original image patches with full information. When the Bayer color filter array (CFA) is used, an evaluation against ten competitive methods on popular benchmarks confirms that the data-driven, automatically learned features of the CNN models are very effective. Experiments show that the proposed CNN models can perform equally well in both the sRGB space and the linear space. It is also demonstrated that the CNN model can perform joint denoising and demosaicing. The CNN model is very flexible and can be easily adapted for demosaicing with any CFA design. We train CNN models for demosaicing with three different CFAs and obtain better results than existing methods. With the great flexibility to be coupled with any CFA, we present the first data-driven joint optimization of the CFA design and the demosaicing method using CNN. Experiments show that the combination of the automatically discovered CFA pattern and the automatically devised demosaicing method significantly outperforms the current best demosaicing results. Visual comparisons confirm that the proposed methods reduce more visual artifacts than existing methods. Finally, we show that the CNN model is also effective for the more general demosaicing problem with spatially varying exposure and color, and can be used for taking images of higher dynamic ranges with a single shot. The proposed models and the thorough experiments together demonstrate that the CNN is an effective and versatile tool for solving the demosaicing problem. |
Tasks | Demosaicking, Denoising |
Published | 2018-02-11 |
URL | http://arxiv.org/abs/1802.03769v1 |
http://arxiv.org/pdf/1802.03769v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-deep-convolutional-networks-for |
Repo | |
Framework | |
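The forward process that a demosaicing network inverts is CFA sampling: each sensor pixel records only one of the three color channels according to the filter pattern. The sketch below simulates an RGGB Bayer mosaic, which is one common layout; the function name and the assumption of an RGGB (rather than GRBG etc.) arrangement are illustrative, not taken from the paper.

```python
import numpy as np

def bayer_mosaic(rgb):
    """Sample an RGB image through an RGGB Bayer color filter array.

    rgb: (H, W, 3) array with channels (R, G, B).
    Returns an (H, W) single-channel mosaic where each pixel keeps only
    the color its CFA cell transmits; a demosaicing model's job is to
    reconstruct the two missing channels at every pixel.
    """
    h, w, _ = rgb.shape
    mosaic = np.empty((h, w), dtype=rgb.dtype)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # R at even rows, even cols
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # G at even rows, odd cols
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # G at odd rows, even cols
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # B at odd rows, odd cols
    return mosaic
```

Training pairs for an end-to-end model can then be built by applying this sampling to full-color patches and asking the network to recover the original; swapping this function for a different pattern is what makes the approach applicable to arbitrary CFA designs.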
Exploring RNN-Transducer for Chinese Speech Recognition
Title | Exploring RNN-Transducer for Chinese Speech Recognition |
Authors | Senmao Wang, Pan Zhou, Wei Chen, Jia Jia, Lei Xie |
Abstract | End-to-end approaches have drawn much attention recently for significantly simplifying the construction of an automatic speech recognition (ASR) system. The RNN transducer (RNN-T) is one of the popular end-to-end methods. Previous studies have shown that RNN-T is difficult to train and that a very complex training process is needed for reasonable performance. In this paper, we explore RNN-T for a Chinese large vocabulary continuous speech recognition (LVCSR) task and aim to simplify the training process while maintaining performance. First, a new learning rate decay strategy is proposed to accelerate model convergence. Second, we find that adding convolutional layers at the beginning of the network and using ordered data make it possible to discard the pre-training of the encoder without loss of performance. Besides, we design experiments to find a balance among GPU memory usage, training cycle time, and model performance. Finally, we achieve a 16.9% character error rate (CER) on our test set, a 2% absolute improvement over a strong BLSTM CE system with a language model trained on the same text corpus. |
Tasks | Language Modelling, Large Vocabulary Continuous Speech Recognition, Speech Recognition |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05097v2 |
http://arxiv.org/pdf/1811.05097v2.pdf | |
PWC | https://paperswithcode.com/paper/exploring-rnn-transducer-for-chinese-speech |
Repo | |
Framework | |
Adversarial Text Generation Without Reinforcement Learning
Title | Adversarial Text Generation Without Reinforcement Learning |
Authors | David Donahue, Anna Rumshisky |
Abstract | Generative Adversarial Networks (GANs) have experienced a recent surge in popularity, performing competitively in a variety of tasks, especially in computer vision. However, GAN training has shown limited success in natural language processing. This is largely because sequences of text are discrete, and thus gradients cannot propagate from the discriminator to the generator. Recent solutions use reinforcement learning to propagate approximate gradients to the generator, but this is inefficient to train. We propose to utilize an autoencoder to learn a low-dimensional representation of sentences. A GAN is then trained to generate its own vectors in this space, which decode to realistic utterances. We report both random and interpolated samples from the generator. Visualization of sentence vectors indicates that our model correctly learns the latent space of the autoencoder. Both human ratings and BLEU scores show that our model generates realistic text compared with competitive baselines. |
Tasks | Adversarial Text, Text Generation |
Published | 2018-10-11 |
URL | http://arxiv.org/abs/1810.06640v2 |
http://arxiv.org/pdf/1810.06640v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-text-generation-without |
Repo | |
Framework | |