Learned Representations to understand Neural Network Training Dynamics

February 12, 2021 by systems

Coming back to our old friend Canonical Correlation Analysis (CCA), let us see how researchers used it to understand how the training dynamics of generalizing and memorizing CNNs differ.

CCA and its use for comparing the layers of neural networks have already been discussed here. In this article, we focus instead on comparing different networks with different training dynamics. Researchers at DeepMind and Google Brain built upon SVCCA to develop projection-weighted CCA for comparisons among CNNs: rather than averaging the canonical correlations uniformly, they took a weighted mean, where each canonical correlation's weight reflects how strongly its CCA vector relates to the underlying representation. A CCA vector that accounts for more of the representation therefore receives a higher weight.
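The weighting scheme above can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' exact implementation: the function name `pwcca_similarity`, the whitening-by-QR step, and the choice to weight by the absolute projections of the CCA vectors onto the neurons of `X` are assumptions made for the sake of a self-contained example.

```python
import numpy as np

def pwcca_similarity(X, Y):
    """Projection-weighted CCA similarity between two representations.

    X, Y: activation matrices of shape (n_datapoints, n_neurons).
    Returns the weighted mean of the canonical correlations, where each
    correlation is weighted by how much its CCA vector accounts for X.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)

    # Whiten each representation via (reduced) QR decomposition.
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)

    # The canonical correlations are the singular values of Qx^T Qy.
    U, rho, _ = np.linalg.svd(Qx.T @ Qy, full_matrices=False)

    # CCA vectors on X's side: unit-norm datapoint profiles.
    cca_vectors = Qx @ U

    # Weight each correlation by how strongly its CCA vector projects
    # onto the neurons (columns) of X, then normalize the weights.
    alpha = np.abs(cca_vectors.T @ X).sum(axis=1)
    alpha /= alpha.sum()

    return float(alpha @ rho)
```

With identical inputs the similarity is exactly 1; for two independent random representations it is much lower, since the canonical correlations themselves are small.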

It must also be remembered that training data in the real world contains noise. Since training dynamics are shaped by the training data, how do the dynamics differ between the ‘original signal’ and the accompanying ‘noise’? To answer this, the CCA similarity was computed between layer L at each time step t throughout training and the same layer L at the final time step T. The sorted CCA coefficients ρ were found to continue changing well after the network’s performance had converged, which suggests that the un-converged coefficients and their corresponding vectors represent the ‘noise’.
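This comparison of a layer against its own final state can be sketched as follows. The checkpoints here are synthetic stand-ins (scaled copies of the final representation plus fresh Gaussian noise), used only to illustrate the measurement; the function `cca_coefficients` is an assumed helper, not code from the paper.

```python
import numpy as np

def cca_coefficients(A, B):
    """Sorted canonical correlations between two representations.

    A, B: activation matrices of shape (n_datapoints, n_neurons).
    """
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb, returned in descending order.
    return np.linalg.svd(Qa.T @ Qb, compute_uv=False)

# Hypothetical checkpoints of layer L's activations; the last one plays
# the role of the final time step T.
rng = np.random.default_rng(0)
final = rng.normal(size=(1000, 64))
checkpoints = [
    0.2 * final + rng.normal(size=final.shape),        # early in training
    0.8 * final + 0.3 * rng.normal(size=final.shape),  # loss already converged
    final,                                             # final step T
]

for t, acts in enumerate(checkpoints):
    rho = cca_coefficients(acts, final)
    print(f"checkpoint {t}: top rho = {rho[0]:.2f}, bottom rho = {rho[-1]:.2f}")
```

The coefficients against the final step rise as training progresses, and even the second checkpoint (whose loss has "converged" in this toy setup) still differs from T, mirroring the observation in the article.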

The next question was whether the CCA vectors that stabilized early in training remained stable. To test this, the CCA vectors were computed between layer L at an early time step tₑₐᵣₗᵧ and at time step T/2. The top 100 vectors, which had stabilized early, remained similar to the representation at all other training steps; the bottom 100 vectors, which had not stabilized, continued to vary and therefore likely represented noise. These results suggest that task-critical representations are learned by midway through training, while the noise only approaches its final value towards the end.
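The stability check can be illustrated with a toy model: a layer whose representation contains a ‘signal’ subspace that is shared across all checkpoints, plus ‘noise’ dimensions that are redrawn at every step. Everything below (the `checkpoint` generator, the use of squared projections onto a whitened subspace as the stability measure, and the top-10/bottom-10 split instead of the paper's 100) is an assumed simplification for illustration.

```python
import numpy as np

def cca_directions(A, B):
    """CCA vectors of representation A against B, most to least correlated."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    U, _, _ = np.linalg.svd(Qa.T @ Qb)
    return Qa @ U  # columns are unit-norm datapoint profiles

def projection_strength(vectors, X):
    """Mean squared projection of unit vectors onto the subspace spanned by X."""
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    return float((np.linalg.norm(Qx.T @ vectors, axis=0) ** 2).mean())

# Toy layer L: a stable 'signal' subspace shared by every checkpoint,
# plus 'noise' dimensions redrawn at each training step.
rng = np.random.default_rng(1)
n, d, k = 2000, 100, 10
signal = rng.normal(size=(n, d // 2))

def checkpoint():
    return np.hstack([signal, rng.normal(size=(n, d // 2))])

t_early, t_half, t_late = checkpoint(), checkpoint(), checkpoint()
vecs = cca_directions(t_early, t_half)   # computed between t_early and T/2
top, bottom = vecs[:, :k], vecs[:, -k:]  # most / least stabilized vectors

# Top vectors keep tracking a later representation; bottom ones do not.
print(projection_strength(top, t_late), projection_strength(bottom, t_late))
```

In this toy setup the top vectors land almost entirely inside the later representation's subspace, while the bottom vectors barely project onto it, matching the signal-versus-noise interpretation above.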

