How GPT-3 Works (in Simple Language) and Its Impact on Startups

Haydon Luo
4 min read · Jan 12, 2021

--

GPT-3 was released in mid-2020 and has attracted a lot of attention since then. In this article, I will first explain what GPT-3 is and how it works in simple words, then discuss its advantages and limitations, and finally discuss its impact on startups.

What is GPT-3?

Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model that uses deep learning to produce human-like text. It is the third-generation language prediction model in the GPT-n series created by OpenAI.

The quality of the text generated by GPT-3 is so high that it is difficult to distinguish from that written by a human.

How does GPT-3 work, in simple words?

GPT-3 is a pre-trained language model that generates text, optionally conditioned on a text input (the prompt). It was trained on a huge amount of text (equivalent to 300 billion words, or about 300 thousand books). The training process is very expensive.
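To get a feel for why training is so expensive, we can apply the common rule of thumb that training a transformer takes roughly 6 floating-point operations per parameter per training token. The 175-billion-parameter count and the rule of thumb itself are assumptions based on published descriptions of GPT-3, so treat this as a back-of-the-envelope sketch:

```python
# Back-of-the-envelope training-compute estimate for a GPT-3-scale model.
# Rule of thumb (assumption): ~6 FLOPs per parameter per training token.
params = 175e9        # GPT-3's commonly reported parameter count (assumption)
tokens = 300e9        # training tokens, per the figure in the article

flops = 6 * params * tokens
print(f"{flops:.2e} FLOPs")           # ~3.15e+23

# At a sustained 1e15 FLOP/s (one petaFLOP/s), that much compute takes:
days = flops / 1e15 / 86400
print(f"{days:.0f} days")             # ~3646 days of single-accelerator time
```

Even with heavy parallelism across thousands of accelerators, numbers of this size translate into millions of dollars of compute, which is why the training process is out of reach for most organizations.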

At each step, the GPT-3 model predicts the next word from the preceding text (a context of roughly 2,000 words); during training, the prediction error is used to correct the model. This unsupervised cycle repeats continuously. In each cycle, the following steps take place:

  • Convert the preceding words into vectors, together with positional information about where each word sits in the sentence.
  • Make predictions with a “black box”, which contains 96 decoder layers, each with ~1.8B parameters.
  • The predictions come out as vectors, which are converted back into words.
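The three steps above can be sketched in a few lines of toy code. Everything here is a stand-in: the tiny vocabulary, the random weights, and the single layer playing the role of GPT-3's 96 decoder layers are all made up for illustration, so this shows only the shape of the computation, not GPT-3 itself:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat", "."]   # toy vocabulary (assumption)
d_model = 8                                        # toy embedding size

# Step 1: words -> vectors, plus positional information.
embed = rng.normal(size=(len(vocab), d_model))     # one vector per word
pos = rng.normal(size=(32, d_model))               # one vector per position

def encode(tokens):
    ids = [vocab.index(t) for t in tokens]
    return np.array([embed[i] + pos[p] for p, i in enumerate(ids)])

# Step 2: the "black box" -- here one random layer standing in for
# GPT-3's stack of 96 decoder layers.
W = rng.normal(size=(d_model, d_model))

def black_box(x):
    return np.tanh(x @ W)

# Step 3: vectors -> a probability over the vocabulary -> the next word.
def predict_next(tokens):
    h = black_box(encode(tokens))[-1]              # last position's vector
    logits = h @ embed.T                           # score every vocab word
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax
    return vocab[int(np.argmax(probs))]

print(predict_next(["the", "cat", "sat"]))         # some word from the toy vocab
```

In real training, the predicted distribution is compared against the actual next word and the error is backpropagated to update the weights; here the weights stay random.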

What can GPT-3 do?

According to multiple sources, the following are examples of what GPT-3 can do:

  • Smartly format text and even create HTML code [1]
  • Turn a plain-English description of a mathematical formula into standard LaTeX code [2]
  • Generate code for a machine learning model from a description of the dataset and the required output [3]
  • Generate a poem about a specific person from just a prompt [4]
  • Work in a spreadsheet to fill blank cells intelligently [5]
  • Plot charts from a text description of what the chart should contain [6]
  • Reply to an email from only a few bullet points [7]


And there is ongoing curation of GPT-3 samples here.

Advantages and Limitations of GPT-3

1. In addition to the many things GPT-3 can do, what are some of its other advantages?

  • It exhibits meta-learning: it can pick up a new task from a few examples given in the prompt, and its performance continues to improve as the number of parameters grows.
  • It uses a simple, uniform architecture that is applied fairly homogeneously, similar in that respect to the human brain.
  • It achieves all of this despite being trained on noisy, low-quality internet data.
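Meta-learning here means the task is specified entirely in the prompt: the model infers what to do from a few examples, with no gradient updates or fine-tuning. A sketch of how such a few-shot prompt is assembled (the translation task and example pairs are made up for illustration):

```python
# Assemble a few-shot prompt: the model is expected to infer the task
# (English -> French translation) from the example pairs alone.
examples = [
    ("cheese", "fromage"),
    ("cat", "chat"),
]
query = "dog"

lines = ["Translate English to French."]
for en, fr in examples:
    lines.append(f"{en} => {fr}")
lines.append(f"{query} =>")        # the model would complete this final line

prompt = "\n".join(lines)
print(prompt)
```

The same pattern covers most of the capabilities listed earlier: formatting, LaTeX generation, email replies, and so on are all cast as "continue this text".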

2. Well, GPT-3 can do a lot of things. So is it Artificial general intelligence (AGI)?

GPT-3 is getting closer to passing the Turing Test. However, it is still not AGI, because it has no semantic understanding, no causal reasoning, and poor generalisation beyond its training set. Its other limitations include a long and costly training process.

3. Nevertheless, GPT-3 provides solid evidence for the scaling hypothesis:

Once we find a scalable architecture, which like the brain can be applied fairly uniformly, we can train ever larger NNs, and ever more sophisticated behavior will emerge.

4. Although GPT-3 is not AGI, if we continue to improve along its direction, we MAY realize AGI.

The human brain has 100 trillion synapses, which is three orders of magnitude larger than GPT-3, the largest artificial intelligence model. It is possible that future versions of GPT will pass the Turing Test.
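The "three orders of magnitude" comparison checks out arithmetically, taking the commonly reported 175 billion parameters for GPT-3 as the comparison point (synapses and parameters are of course not equivalent units, so this is only a rough scale comparison):

```python
import math

synapses = 100e12      # human brain synapses, as stated above
gpt3_params = 175e9    # GPT-3 parameters (commonly reported figure)

ratio = synapses / gpt3_params
print(round(ratio))                   # ~571x larger
print(round(math.log10(ratio)))       # ~3 orders of magnitude
```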

The Impact of GPT-3 on Startups (and Venture Capitalists!)

GPT-3 can be commercialized in semantic search, chatbots, enhancement tools, text generation, content understanding, machine translation, and more. In my opinion, however, it may not be a good idea for a NEW venture to commercialize GPT-3, because:

  • OpenAI opened the GPT-3 API for research and commercial use, so the barriers to entry are low. The low moat will lead to fierce competition.
  • It can be very hard for startups using GPT-3 to differentiate themselves from other similar startups.

Established startups can incorporate GPT-3 into their business to improve their competitive advantages, but their moat usually lies elsewhere, such as in economies of scale, network effects, etc.

Although it is still early to see the startup ecosystem’s reaction to GPT-3, I have some early suggestions for venture capitalists:

  • Watch out for startups whose competitive edge is built solely on GPT-3 (reason listed above).
  • Also be cautious with startups whose competitive edge is built solely on other NLP technologies, because their edge can be erased by GPT-3, GPT-4, or similar competing models from Google/Amazon/Facebook.
  • The emergence of GPT-3 will further deepen the trend of technology giants monopolizing AI technology. But because NLP remains a strategically important cornerstone of AI, VCs still need to look for investment opportunities in the Technology Layer as discussed in the last section.

Bottom Line

GPT-3 is exciting to machine learning researchers and industry experts, because it appears to be very smart and can perform many tasks without specific tuning or adjustments. It also marks a step towards AGI. For startup founders, GPT-3 can improve competitive advantages if integrated, but it may not be a good idea to start a venture directly commercializing GPT-3.

Originally published at https://www.haydonluo.com on January 12, 2021.


Haydon Luo

tech and fitness enthusiast, engineer, lifehacker, and lifelong learner. interested in AI. now actively exploring Web3. www.haydonluo.com