What is GPT-3?
- GPT-3, Generative Pre-trained Transformer 3, is the 3rd-generation transformer model (after GPT-1 and GPT-2) with 175 billion parameters, making it more than 100 times larger than its predecessor, GPT-2 (1.5 billion parameters).
- GPT-3 was trained on a combination of five large datasets: Common Crawl, WebText2, Books1, Books2, and an English Wikipedia corpus. Together they contain hundreds of billions of words (large portions of web pages from the internet, a huge collection of books, and all of English Wikipedia), used to train GPT-3 to generate text.
- So basically, GPT-3 has seen more text than any human will ever see in their lifetime.
Why should you know about it, and where can you use it?
- Being a generative model, it is designed for text-generation tasks such as question answering, text summarization, language translation, named-entity recognition, and text classification.
- Apart from these tasks, GPT-3 can be used to create anything that has a language structure to it. That means it can write essays, create code, generate blog topics, create humor, draft ad copy and emails, and much more, if you are creative enough.
- In fact, start-ups are already using GPT-3 for copywriting, code generation, and similar tasks.
Why is it better than previous state-of-the-art models?
Let's look at some history:
- Before GPT-1, most state-of-the-art NLP models were trained using supervised learning on specific tasks like sentiment analysis, language translation, etc. Supervised models have two major limitations:
- they need a large amount of labeled data, which is not always available.
- they perform poorly on tasks they were not trained for.
- GPT-1 proposed pre-training the language model on an unlabeled dataset and then fine-tuning it on specific tasks (supervised training). This fine-tuning required just a few epochs, which showed that the model had already learned a lot during pre-training.
- GPT-1 performed better than specifically trained supervised state-of-the-art models in 9 out of 12 tasks the models were compared on.
- GPT-1 proved that with pre-training, the model was able to generalize well.
- The major development with GPT-2 was using a larger dataset and adding more parameters to the model to get a stronger language model.
- Also, GPT-2 aimed at learning multiple tasks with the same unsupervised model. For that, the learning objective was changed from modeling P(output | input), the probability of the output given only the input, to P(output | input, task), the probability of the output given both the input and the task. So the model was expected to give different outputs for the same input depending on the task.
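In practice, conditioning on the task just means the task description becomes part of the input text the model reads. A minimal sketch of that idea in plain Python (no model calls; the prompt templates below are illustrative, not taken from the GPT-2 paper):

```python
def build_prompt(task: str, text: str) -> str:
    """Encode P(output | input, task): the task is prepended to the input,
    so the same text yields a different prompt (and output) per task."""
    templates = {
        "translate_en_fr": "Translate English to French:\n{t}\n=>",
        "summarize": "{t}\nTL;DR:",
        "sentiment": "Review: {t}\nSentiment:",
    }
    return templates[task].format(t=text)

text = "The movie was surprisingly good."
for task in ("translate_en_fr", "summarize", "sentiment"):
    # Same input, three different conditioning contexts
    print(build_prompt(task, text))
    print("---")
```

The only thing that changes between tasks is the surrounding text, which is exactly why one unsupervised model can cover many tasks.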
- GPT-2 showed that more parameters and a larger training dataset improved the model's capability and advanced the state of the art on many tasks in zero-shot settings.
- With GPT-3, OpenAI built a very powerful language model that needs no fine-tuning and only a few examples to understand and perform a task. Thanks to these capabilities, GPT-3 has been shown to write articles that are hard to distinguish from ones written by humans.
- GPT-3 beat the state of the art on a few language-modeling datasets; on others, it improved the zero-shot state of the art.
- GPT-3 also performed reasonably well on NLP tasks like closed-book question answering, translation, etc., often beating state-of-the-art models. On most tasks it performed better in the few-shot setting than in the one-shot and zero-shot settings. This is why GPT-3 is being used for such a variety of language-generation tasks.
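The zero-, one-, and few-shot settings differ only in how many worked examples are placed in the prompt before the new input; no model weights are updated. A small sketch of assembling such prompts (the "Input:/Output:" format is an illustrative convention, not a fixed API):

```python
def make_prompt(instruction: str, examples: list, query: str) -> str:
    """Build a k-shot prompt: k worked examples, then the new query.
    k = 0 -> zero-shot, k = 1 -> one-shot, k >= 2 -> few-shot."""
    parts = [instruction]
    for inp, out in examples:  # in-context examples, no gradient updates
        parts.append(f"Input: {inp}\nOutput: {out}")
    parts.append(f"Input: {query}\nOutput:")  # model completes from here
    return "\n\n".join(parts)

shots = [("cheese", "fromage"), ("house", "maison")]
print(make_prompt("Translate English to French.", shots, "cat"))  # few-shot
print(make_prompt("Translate English to French.", [], "cat"))     # zero-shot
```

The model's job in every case is the same next-word prediction; the examples in the prompt simply make the intended task easier to infer, which is why few-shot usually outperforms zero-shot.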
Some examples
- Text to website: All you need to do is provide a sample URL and a short description, and it will create a website. Check out the demo.
- Text to regex: Describe the regex you want, along with an example string it should match, and GPT-3 generates the regex.
- Text to SQL query: Just describe your query in English and it will provide you with the SQL code.
- Text generated by GPT-3: On providing just the title, author name, and the first word "It", the model generated the following text.