
Hey, welcome back! Another week goes by and the NLP domain continues to fly beyond escape velocity… but don’t worry, there’s an awesome intuition pump on how Transformers work:
If you continue to enjoy this read, please share it with your friends, and don’t forget to give it a 👏👏 …. 😎
Cornell Tech released a huge Twitter dataset of 7.6M tweets and 25.6M retweets from 2.6M users discussing voter fraud between October 23rd and December 16th. The analysis digs deep into who promoted or denied “voter fraud” claims, visualizes the networks involved, and tracks whom Twitter banned (individual tweet content was not shared directly, for privacy). The results are fascinating, and the dataset is available.
GitHub:
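If you want to poke at the data yourself, here’s a minimal sketch with pandas. The file name and column names below are hypothetical; check the repo’s README for the actual layout.

```python
# A quick first look at the dataset (sketch only).
# "voter_fraud_tweets.csv", "user_id", and "retweet_count" are hypothetical
# names; substitute the actual files/columns from the released data.
import pandas as pd

tweets = pd.read_csv("voter_fraud_tweets.csv")

# e.g., who was most retweeted in the voter-fraud discussion
top = (tweets.groupby("user_id")["retweet_count"]
             .sum()
             .nlargest(10))
print(top)
```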
Want to teach your encoder-decoder models how to generate questions from answers? Take a look at the fan-created Jeopardy! archive. It has clues and answers plus other metadata. A great data resource, if only it could be harvested somewhere….
Here it is! ✌✌
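For flavor, here’s a minimal sketch of what answer-to-question fine-tuning looks like with a T5-style encoder-decoder in 🤗 Transformers. The example pair and the “generate question:” prefix are illustrative, not part of the archive itself.

```python
# Sketch: fine-tuning T5 to generate Jeopardy!-style questions from answers.
# Assumes you've harvested (answer, question) pairs; the pair below is made up.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

answer = "Abraham Lincoln"                            # hypothetical clue answer
question = "Who delivered the Gettysburg Address?"    # hypothetical target

inputs = tokenizer("generate question: " + answer, return_tensors="pt")
labels = tokenizer(question, return_tensors="pt").input_ids

# One forward/backward pass; in practice you'd loop over the whole archive
# with an optimizer (or just hand the dataset to Trainer).
loss = model(**inputs, labels=labels).loss
loss.backward()
```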
Sebastian Ruder’s 2020 recap is a blog post you can’t miss. He discusses the top 10 trends (with links to papers) in NLP/machine learning that caught his eye over the past year:
- Scaling up — and down
- Retrieval augmentation
- Few-shot learning
- Contrastive learning
- Evaluation beyond accuracy
- Practical concerns of large LMs
- Multilinguality
- Image Transformers
- ML for science
- Reinforcement learning
Full Blog Post
A refreshing recap of where graph neural network applications are headed in 2021, covering recommender systems, combinatorial optimization, computer vision, and physics/life-sciences applications.
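If GNNs are new to you, the core idea behind all of those applications is message passing. A toy sketch in plain PyTorch (not from the recap):

```python
# Toy sketch of one message-passing layer: each node averages its
# neighbors' features, then applies a shared transform + nonlinearity.
import torch

num_nodes, dim = 4, 8
x = torch.randn(num_nodes, dim)                  # node features
adj = torch.tensor([[0, 1, 1, 0],                # adjacency matrix of a toy graph
                    [1, 0, 0, 1],
                    [1, 0, 0, 1],
                    [0, 1, 1, 0]], dtype=torch.float)

w = torch.nn.Linear(dim, dim)

deg = adj.sum(dim=1, keepdim=True)               # node degrees, for mean-aggregation
h = torch.relu(w((adj @ x) / deg))               # aggregate neighbors, transform
print(h.shape)                                   # (4, 8): updated node embeddings
```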
Remember the Zero Redundancy Optimizer (ZeRO)? Microsoft’s optimizer for very-large-parameter models returns in an engaging Hugging Face blog post. FYI: Hugging Face’s Trainer class supports DeepSpeed’s and FairScale’s ZeRO features as of version 4.2. With the DeepSpeed library, they were able to train a 3-billion-parameter T5 with a batch size of 20 on a single 24GB RTX 3090. 👀👀
Blog:
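For the curious, here’s roughly how the Trainer integration is wired up (a minimal sketch, assuming transformers >= 4.2; the config values are illustrative, not the post’s exact setup):

```python
# Sketch: enabling DeepSpeed ZeRO through Hugging Face's Trainer.
# The config below is illustrative; see the blog post for tuned settings.
import json

ds_config = {
    "zero_optimization": {
        "stage": 2,               # ZeRO stage 2: partition optimizer state + gradients
        "cpu_offload": True,      # offload to CPU RAM to fit a single 24GB card
    },
    "fp16": {"enabled": True},
    "train_micro_batch_size_per_gpu": 20,
}
with open("ds_config.json", "w") as f:
    json.dump(ds_config, f)

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=20,
    fp16=True,
    deepspeed="ds_config.json",   # hands memory management over to DeepSpeed
)
# trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
# trainer.train()                 # launch the script with the `deepspeed` CLI
```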
If you’re into computer science educational videos 👇
The Black Vault enjoys its FOIA (Freedom of Information Act) requests so much that it decided to request all of the YouTube videos listed as private or unlisted across several federal agencies!! 😁
- “The time required to deploy a model is 31% lower for organizations that buy a third-party solution.”
- “Organizations with more models spend more of their data scientists’ time on deployment, not less.”
- “The time required to deploy a model is increasing year-on-year.”
Download a free copy here: