
29 April 2022

Why Should You Know About GPT-3?

What is GPT-3?

  • GPT-3, Generative Pre-trained Transformer 3, is the 3rd generation transformer model (after GPT-1 and GPT-2) with 175 billion parameters, making it more than 100 times larger than its predecessor, GPT-2 (1.5 billion parameters).
  • GPT-3 was trained on a combination of five large datasets: Common Crawl, WebText2, Books1, Books2, and the Wikipedia corpus. The combined dataset amounted to hundreds of billions of words (large portions of web pages from the internet, a huge collection of books, and all of Wikipedia) on which GPT-3 learned to generate text.
  • So basically, GPT-3 has seen more text than any human will ever see in their lifetime.

 
Why should you know about it, and where can you use it?

  • Being a generative model, it is designed for text generation tasks such as question answering, text summarization, language translation, named-entity recognition, and text classification.
  • Apart from these tasks, GPT-3 can be used to create anything that has a language structure in it: it can write essays, create code, suggest blog topics, produce humor, and draft content for ads, emails, and much more if you are creative enough (a minimal API sketch follows this list).
  • In fact, start-ups are already using GPT-3 for copywriting, code generation, and similar products.
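
As a rough sketch of how such a task can be invoked programmatically (assuming the openai Python package and an engine name like "text-davinci-002", which this post does not specify), a summarization call might look like this:

import os
import openai

# Minimal sketch: ask GPT-3 to summarize a passage via OpenAI's Completion API.
# The model name below is an assumption; any GPT-3 engine is called the same way.
openai.api_key = os.environ["OPENAI_API_KEY"]

article = (
    "GPT-3 is a 175-billion-parameter language model trained on large portions "
    "of the web, a huge collection of books, and all of Wikipedia."
)

response = openai.Completion.create(
    model="text-davinci-002",      # assumed engine name
    prompt="Summarize the following text in one sentence:\n\n" + article + "\n\nSummary:",
    max_tokens=60,
    temperature=0.3,               # lower temperature keeps the summary focused
)

print(response["choices"][0]["text"].strip())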


Why is it better than previous state-of-the-art models?

Let's look at some history:

  • Before GPT-1, most state-of-the-art NLP models were trained using supervised learning on specific tasks like sentiment analysis, language translation, etc. The major limitations of supervised models are:
  1. They need a large amount of labeled data, which is not always available.
  2. They perform poorly on tasks for which they are not trained.
  • GPT-1 proposed pre-training the language model on an unlabeled dataset and then fine-tuning it on specific tasks (supervised training). This fine-tuning required just a few epochs, which showed that the model had already learned a lot during pre-training.
  • GPT-1 performed better than specifically trained supervised state-of-the-art models in 9 out of 12 tasks the models were compared on.
  • GPT-1 proved that with pre-training, the model was able to generalize well.
  • The major development with GPT-2 was using a larger dataset and adding more parameters to the model to get a stronger language model.
  • Also, GPT-2 aimed at learning multiple tasks with the same unsupervised model. For that, the learning objective was changed from the probability of the output given only the input, P(output | input), to the probability of the output given both the input and the task, P(output | input, task). The model was therefore expected to give different outputs for the same input depending on the task.
  • GPT-2 showed that having more parameters and training on a larger dataset improved the capability of the model and also surpassed the state of the art on many tasks in zero-shot settings.
  • With GPT-3, OpenAI built a very powerful language model that needs no fine-tuning and only a few examples to understand a task and perform it. Thanks to these capabilities, GPT-3 has been shown to write articles that are hard to distinguish from ones written by humans.
  • GPT-3 beat the state of the art on a few language modeling datasets, and on several others it improved the zero-shot state-of-the-art performance.
  • GPT-3 also performed reasonably well on NLP tasks like closed-book question answering, translation, etc., often beating state-of-the-art models. For most tasks it performed better in few-shot learning than in one-shot or zero-shot learning, which is why GPT-3 is now used for such a wide variety of language generation tasks. A sketch of zero-shot and few-shot prompts follows below.
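
To make the zero-shot / few-shot distinction concrete, here is a sketch of what such prompts look like. The translation prompts follow the format popularized by the GPT-3 paper; the sentiment prompt is an invented illustration of how the task itself becomes part of the input, i.e. P(output | input, task).

# Zero-shot: the task is described, but no examples are given.
zero_shot_prompt = (
    "Translate English to French:\n"
    "cheese =>"
)

# Few-shot: a handful of demonstrations precede the query; no weights are updated.
few_shot_prompt = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "cheese =>"
)

# The same model performs a different task when the prompt changes,
# which is the P(output | input, task) idea in practice.
sentiment_prompt = (
    "Decide whether the sentiment of the review is positive or negative.\n"
    "Review: The service was quick and the staff were friendly.\n"
    "Sentiment:"
)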

Some examples

  • Text to website: All you need to do is provide a sample URL and a short description, and it will create a website. Check out the demo:
https://twitter.com/jsngr/status/1287026808429383680
  • Text to regex: Describe the regex you want, along with an example string that should match, and it will generate the pattern:
https://twitter.com/parthi_logan/status/1286818567631982593
  • Text to SQL query: Just describe your SQL query in English and it will provide the SQL code (a hypothetical prompt of this kind is sketched after this list):
https://twitter.com/FaraazNishtar/status/1285934622891667457
  • Text generated by GPT-3: Given just the title, the author's name, and the first word "It", the model generated the text shown in the tweet below.

https://twitter.com/quasimondo/status/1284509525500989445
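
To give a feel for the text-to-SQL example above, here is a hypothetical prompt of that kind; the table schema, wording, and expected output are invented for illustration and are not taken from the linked demo.

# Hypothetical text-to-SQL prompt; schema and wording are made up for illustration.
sql_prompt = (
    "Table users: id, name, country, signup_date\n"
    "Write an SQL query that counts users per country who signed up in 2021.\n"
    "SQL:"
)

# Sent through the Completion API as in the earlier sketch, a completion along
# these lines would be expected:
#   SELECT country, COUNT(*)
#   FROM users
#   WHERE signup_date BETWEEN '2021-01-01' AND '2021-12-31'
#   GROUP BY country;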