The internet is ablaze with influential people from a multitude of industries and disciplines, all talking about AI.
Bill Gates believes the age of AI has begun, stating that advances like ChatGPT and DALL-E are as revolutionary as the smartphone and the internet itself.
So what's going on? Why AI? And why now?
These emerging technologies are known as generative models. Through continual advances in scientific research, engineering efficiency and operational scale, these models have begun to exhibit perceivably intelligent behaviour, reaching human-level benchmarks across many capabilities.
Fast-moving companies like OpenAI are also iterating rapidly, constantly releasing new, user-friendly and widely accessible ways to interface with these models.
The combination of this innovation and rapid product development has led to an AI boom, with ChatGPT amassing the fastest-growing user base of any application in history.
ChatGPT is an example of a large language model, or LLM. At their core, LLMs are conceptually quite straightforward. Put simply, the model is a function that takes in a series of words and predicts the next word. There is plenty more nuance here - and an enormous amount of computation to perform - but this is functionally how it generates text.
As such, these models require an initial set of words (known as the prompt) in order to write a sentence, message or essay. The model repeatedly loops this function, adding each newly generated word to the collection of input words, until it finishes.
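To make this concrete, here is a minimal sketch of that loop in Python. The predict_next_word function below is a hypothetical stand-in for the real model, which in practice is a large neural network rather than a lookup table.

```python
# A minimal sketch of the generation loop described above.
# `predict_next_word` is a hypothetical stand-in for the real model.

def predict_next_word(words: list[str]) -> str:
    """Toy stand-in: a real LLM scores every word in its vocabulary
    and picks one, conditioned on all of the input words so far."""
    toy_model = {
        ("The", "age", "of"): "AI",
        ("The", "age", "of", "AI"): "has",
        ("The", "age", "of", "AI", "has"): "begun",
        ("The", "age", "of", "AI", "has", "begun"): "<end>",
    }
    return toy_model.get(tuple(words), "<end>")

def generate(prompt: str) -> str:
    words = prompt.split()
    while True:
        next_word = predict_next_word(words)  # predict one word...
        if next_word == "<end>":              # ...until the model signals it is done
            break
        words.append(next_word)               # feed the output back in as input
    return " ".join(words)

print(generate("The age of"))  # -> "The age of AI has begun"
```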
A key innovation that has proven critical to the development of LLMs like ChatGPT is the concept of attention. To predict the most likely next word, the model also takes into account (pays attention to) the way that the input words are ordered and structured. This not only enables the model to interpret the overall context of what is being written, but also to understand how the construction of a sentence can change the meaning of individual words in subtle ways.
Consider this pair of sentences: “The forecast says it will rain today, assuming it is accurate.” and “Assuming it is accurate, the forecast says it will rain today.” In the first sentence, it is clear (to us) that the first “it” refers to today's weather. But following a slight restructure in the second sentence, the first “it” changes to refer to the forecast itself, while the second “it” now refers to the weather. While ChatGPT doesn't know explicitly what a verb or a pronoun actually is - the attention mechanism lets it learn and discover the nuance of natural language.
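For the technically curious, the heart of this mechanism can be sketched in a few lines. The following is a simplified illustration of scaled dot-product attention, the building block used in models like ChatGPT; the random vectors here are stand-ins for the learned word representations a real model would use.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each word's query is compared against every word's key; the
    resulting weights decide how much of each word's value flows
    into that word's new, context-aware representation."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how relevant is each word to each other word?
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V               # blend value vectors by relevance

# Four words, each represented by an 8-dimensional vector.
# In a real model these vectors (and separate Q/K/V projections) are learned.
rng = np.random.default_rng(0)
words = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(words, words, words)
print(out.shape)  # (4, 8): a context-aware vector for every word
```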
In order to predict what the next word should be, LLMs leverage a very large and complex neural network that has been pre-trained on a massive amount of text data. For a sense of scale, the training regime for GPT-4 ingested more words than the average human speaks in 1,000 lifetimes.
For the technically inclined, this article goes into great depth.
One important concept to note is that of alignment. The base LLMs, such as GPT-4, are not very useful to users on their own. They can generate untruthful, toxic or harmful content, and are not trained to perform tasks provided by a user. In order to better align an LLM, a technique called reinforcement learning from human feedback (RLHF) is used.
This process leverages human trainers to generate their own ideal outputs and/or rank different machine-generated outputs given the same prompt. ChatGPT has been fine-tuned using RLHF to produce outputs that align with our concept of what a useful chatbot would provide.
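The reward-modelling step at the heart of RLHF can be sketched as follows. This is a heavily simplified illustration in PyTorch, not OpenAI's actual pipeline: the embeddings are random stand-ins for real response representations, and in practice the reward model is itself a large language model whose scores then drive reinforcement-learning fine-tuning (e.g. PPO).

```python
import torch
import torch.nn as nn

EMBED_DIM = 16  # illustrative stand-in for a real response representation

# A small network that assigns a scalar "reward" to a response embedding.
reward_model = nn.Sequential(nn.Linear(EMBED_DIM, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Each training pair: the embedding of the response a human trainer
# preferred, and the embedding of the response they rejected.
# (Random tensors here; real data comes from human rankings.)
preferred = torch.randn(64, EMBED_DIM)
rejected = torch.randn(64, EMBED_DIM)

for step in range(100):
    r_pref = reward_model(preferred)  # scalar reward per preferred response
    r_rej = reward_model(rejected)    # scalar reward per rejected response
    # Bradley-Terry style loss: push the preferred response's reward
    # above the rejected one's.
    loss = -torch.log(torch.sigmoid(r_pref - r_rej)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The trained reward model then scores candidate outputs, and the LLM is
# fine-tuned (via reinforcement learning) to produce high-reward answers.
```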
LLMs (and generative models more broadly, such as DALL-E) are already predicted to have a significant impact on how we create, live and work - with several disciplines projected to see major use, if not already.
Already we're seeing a paradigm shift begin across many disciplines and applications.
While there is exciting emerging evidence of the current and future benefits of these tools, the tech world has seen, time and time again, the consequences of over-eagerly pursuing new technologies without properly understanding and managing the risks (social media, the metaverse and earlier AI developments, for instance).
Some problems that have already been observed include:
If the training data used to develop these models is biased - that is, if it contains certain information or perspectives while not including enough of others - those biases will be reflected in the model. Such consequences have plagued previous applications of AI, in some cases to disastrous effect, and the same is possible in these models. Without being able to see the training data, we might be completely unaware of the biases present in the AI's outputs. Bias can also be introduced during the RLHF phase, where a relatively small number of humans fine-tune a model towards their own notion of an ideal output.
An issue with being trained on an internet's worth of data is that the internet contains a plethora of entirely incorrect information. The way these models “mix and match” information can lead them to make entirely false claims, yet state them with apparent absolute certainty. This is known as hallucinating, and for individuals who are unaware, or who do not fact-check their information, it introduces a new way of spreading misinformation.
Unlike hallucination, where the model “innocently” produces incorrect information, these models can easily be used by bad actors to create convincing misinformation at scale. Bad actors can even include the very builders and providers of these tools, as they often have access to all information provided by users.
While some outputs may have a “feel” of being AI-generated, continued improvements will only make it more difficult to identify what is real and what is AI-generated. Different outputs (text, images, etc.) can start to appear too similar, especially when produced by the same model. In addition, if users rely on models like ChatGPT for research and education, our collective societal knowledge base could become narrower and less varied as users trust model outputs to inform themselves.
Models that are trained on such large amounts of data need to source it from as many places as possible. This means that most of these models use text and images that are available on the internet, but without the creators' permission or knowledge. This has led to outputs that heavily resemble other people's work, especially in the design and digital art space, raising many questions about authorship and ownership. While the generated text and images may be entirely new, the nature of their creation makes it incredibly difficult to determine who (or what) should be credited as the author or artist.
So how can you make the most of this new AI landscape, while managing and navigating the risks posed?
Smash Delta is a strategic technology group. We co-create strategic programs and powerful assets in the worlds of Data, AI and Digital.
We strongly believe in the fair and ethical use of data, and champion individuals and organisations to maintain ownership of their information.