GPT & Me

Large language models and what they mean for us

The internet is ablaze with influential people from a multitude of industries and disciplines, all talking about AI.

Bill Gates believes the age of AI has begun, stating that advances like ChatGPT and DALL-E are as revolutionary as the smartphone and the internet itself.

So what's going on? Why AI? And why now?

These emerging technologies are known as generative AI. Through constantly improving scientific research, engineering efficiency and operational scale, perceivably intelligent behaviour has emerged, progressing to human-level performance on many benchmarks.

There is also rapid development within fast-moving companies like OpenAI, which are constantly building new, user-friendly and widely accessible ways to interact with these models.

The combination of this innovation and rapid product development has led to an AI boom, with ChatGPT amassing the fastest-growing user base of all time.

What is ChatGPT really doing?

ChatGPT is an example of a large language model, or LLM. At their core, LLMs are conceptually quite straightforward. Put simply, the model is a function that takes in a series of words and sequentially predicts the most likely next word. There is plenty more nuance here - and an enormous amount of computation to perform - but this is functionally how it generates text.

As such, these models require an initial set of words (known as the prompt) in order to write a sentence, message or essay. The model repeatedly loops this function, adding the newly generated word into the collection of input words, until it finishes.
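To make that loop concrete, here is a minimal Python sketch of the generate-append-repeat cycle described above. The predict_next_word function is a hypothetical stand-in for the model itself; real systems work on sub-word tokens and sample from a probability distribution rather than always taking the top word, but the loop is the same.

```python
def generate(predict_next_word, prompt, max_words=100, stop_token="<end>"):
    """Autoregressive generation: predict one word, append it to the
    input, and repeat until the model signals that it is finished."""
    words = prompt.split()
    for _ in range(max_words):
        # The (hypothetical) model considers everything written so far
        # and returns its single most likely next word.
        next_word = predict_next_word(words)
        if next_word == stop_token:
            break
        words.append(next_word)  # the new word becomes part of the input
    return " ".join(words)
```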

A key innovation that has proven critical to the development of LLMs like ChatGPT is the concept of attention. To predict the most likely next word, the model also takes into account (pays attention to) the way the input words are ordered and structured. This not only enables the model to interpret the overall context of what is being written, but also to understand how the construction of a sentence can change the meaning of individual words in subtle ways.

Consider two sentences about today's weather and its forecast, each using the pronoun "it". In the first, it is clear (to us) that the first "it" refers to today's weather. But following a slight restructure in the second sentence, the first "it" changes to refer to the forecast itself, while the second "it" now refers to the weather. While ChatGPT doesn't explicitly know what a verb or a pronoun is, the attention mechanism lets it learn and apply these structural relationships on its own.
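The mechanics behind attention are surprisingly compact. Below is a minimal NumPy sketch of scaled dot-product attention, the core operation of the Transformer architecture that models like ChatGPT are built on. In a real model the queries, keys and values are learned projections of the input words; this sketch simply takes them as given.

```python
import numpy as np

def attention(queries, keys, values):
    """Scaled dot-product attention: each word's 'query' is compared
    against every word's 'key', and the resulting weights decide how
    much of each word's 'value' flows into the output."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)          # pairwise similarity
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ values                         # weighted blend per word
```

Because every word attends to every other word, the model can learn that an "it" early in a sentence should weight the "forecast" or the "weather" found elsewhere in it - exactly the kind of subtlety described above.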

How are they trained?

In order to predict what the next word should be, LLMs leverage a very large and complex neural network which has been pre-trained on a massive amount of text data. For a sense of scale, these models' training regimes ingest more words than the average human speaks in 1,000 lifetimes.
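Part of what makes this pre-training so scalable is that the "labels" come for free from the text itself: every position in the corpus is a prediction exercise. Here is a minimal sketch of how raw text becomes training pairs - using simple whitespace splitting where real models use sub-word tokenisation:

```python
def training_examples(text, context_size=3):
    """Pre-training needs no human labelling: every position in the
    corpus yields one example, with the true next word as the target."""
    words = text.split()
    for i in range(context_size, len(words)):
        yield words[i - context_size:i], words[i]

# Each pair asks the model: given this context, which word came next?
pairs = list(training_examples("the cat sat on the mat and purred"))
print(pairs[0])  # (['the', 'cat', 'sat'], 'on')
```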

For the technically inclined, this article goes into great depth.

One important concept to note is that of alignment. Base LLMs, such as GPT-4, are not very useful to users on their own. They can generate untruthful, toxic or harmful content, and are not trained to perform tasks provided by a user. In order to better align an LLM, a technique called reinforcement learning from human feedback (RLHF) is used.

(Diagram of the RLHF training process. Credit: OpenAI)

This process leverages human trainers to generate their own ideal outputs and/or rank different machine-generated outputs for the same prompt. ChatGPT has been fine-tuned using RLHF to produce outputs that align with our concept of what a useful chatbot would provide.
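A common way to turn those human rankings into a training signal - the pairwise formulation used in OpenAI's InstructGPT work, which ChatGPT builds on - is to train a separate reward model so that it scores the preferred output above the rejected one. A minimal sketch, assuming the reward model's scalar scores have already been computed:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reward_ranking_loss(score_preferred, score_rejected):
    """One human-ranked pair becomes one training term: the loss shrinks
    as the reward model rates the preferred output above the rejected
    one (negative log-sigmoid of the score difference)."""
    return -np.log(sigmoid(score_preferred - score_rejected))

# A reward model that already agrees with the human ranking is barely
# penalised; one that disagrees is penalised heavily.
print(reward_ranking_loss(2.0, -1.0))  # ~0.05
print(reward_ranking_loss(-1.0, 2.0))  # ~3.05
```

The LLM is then fine-tuned (via reinforcement learning) to produce outputs that this reward model scores highly.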

What's changing?

LLMs (and generative models more broadly, such as DALL-E) are predicted to have a significant impact on how we create, live and work - with several disciplines projected to see major use, if they aren't seeing it already.

Already we're seeing a paradigm shift across many disciplines and applications, such as:

Task Automation
Leveraging these models' capabilities to automatically and rapidly perform repetitive tasks that were previously carried out manually
Design & Creative Assets
Facilitating rapid development of high-quality and professional visual assets
Accessibility
Enabling human-like assistance in understanding and interacting with one's environment
Coding & Data Analysis
Reducing barriers by assisting developers and analysts with automatic code generation and bug finding, defining data structures, and suggesting what information/insights are most relevant
Research & Active Learning
Summarising large, dense and 'jargon-heavy' articles into content that can be readily understood, and empowering active learning by engaging students in back-and-forth conversations

What are the risks?

While there is exciting emerging evidence of the current and future benefits of these tools, the tech world has seen time and time again the consequences of over-eagerly pursuing new technologies without properly understanding and managing the risks - social media, the metaverse and earlier AI developments, for instance.

Some already observed problems that have arisen include:

Biases

If the training data used to develop these models is biased - that is, if it over-represents certain information or perspectives while not including enough of others - those biases will be reflected in the model. Such consequences have plagued previous applications of AI, in some cases to disastrous effect, and the same is possible in these models. Without being able to see the training data, we might be completely unaware of biases reproduced in the AI's outputs. Bias can also be introduced during the RLHF phase, where a relatively small number of humans fine-tune a model towards their notion of ideal.

Hallucination

An issue with being trained on an internet's worth of data is that the internet contains a plethora of entirely incorrect information. The way these models "mix and match" information leads to them making entirely false claims, yet stating them with apparent absolute certainty. This is known as hallucination, and for individuals who are unaware, or who do not fact-check, it introduces a new way of spreading misinformation.

Bad Actors

Unlike hallucination, where the model "innocently" produces incorrect information, these models can easily be used by bad actors to create mass amounts of convincing misinformation. Bad actors can even include the very builders and providers of these tools, as they often have access to all information provided by users.

Homogenisation

While some outputs may have a "feel" of being AI-generated, continued improvements will only make it more difficult to distinguish real content from AI-generated content. Different outputs (text, images, etc.) can start to appear too similar, especially when produced by the same model. In addition, if users rely on models like ChatGPT for research and education, our collective societal knowledge base could become narrower and less varied as people trust model outputs to become informed.

Intellectual Property

Models trained on such large amounts of data need to source it from as many places as possible. This means that most of these models use text and images that are available on the internet, but without the creators' permission or knowledge. This has led to outputs that heavily resemble other people's work, especially in the design and digital art space, raising many questions about authorship and ownership. While the text and images generated may be entirely new, the nature of their creation makes it incredibly difficult to determine who (or what) should be credited as the author or artist.

What does this mean for me?

So how can you make the most of this new AI landscape, while managing and navigating the risks posed?

Government

Get hands-on with the technology and engage with it in a meaningful way. Understand its strengths, limitations and its potential for misuse.

Move fast: Innovation is progressing at a rapid pace - there is a lot of opportunity and incentive for the private sector to accelerate development into products. They will be moving fast - policy and oversight need to move in tandem.

Collaborate with academia and industry both domestically and internationally to get ahead of the curve. Use this opportunity to shape policy and guide the ethical/safe use of AI. Avoid the policy lag that failed to prevent harms caused by social media.

Understand the scale and nature of the data being used to train LLMs. Think about cybersecurity, the sovereignty of data sources and models, and the protection of user prompt data.

Business

Move carefully: This technology will change the way we work, and you should likely be using it - but proceed with care. Moving too quickly without understanding the maturity and limitations of LLMs may lead to a lot of long-term pain.

Navigate the marketplace: There is a lot of hype, with an influx of new products in a new market. Navigate it carefully and be cautious about investing heavily in a generative AI tool that may not be useful.

Re-evaluate everything: How are you preparing for change in what people can do and what new skills/jobs will be created? How are you protecting yourself from the limitations and misuse? How is your data strategy holding up?

Take an ethics-first approach: Organisations have a corporate responsibility to ensure the safe and ethical use of these models. Be part of the wider conversation around these models but more importantly, lead by example.

Engage with strategic partners to develop a strategy. Test out this new capability by running pilot programs to see how LLMs can lead to improved efficiencies or new capabilities.

Individuals

Play: Experiment with the technology and have fun (it’s exciting!) - but be sceptical of what it generates.

Be aware: Be increasingly cautious of malicious actors and scammers, as they will also be empowered by this technology. Users on social media may be using LLMs to influence conversation. Phishing attempts will appear more convincing and can be executed more efficiently.

Parental guidance: Think about how your kids are engaging with this technology. ChatGPT can be an amazing teaching assistant or companion - but is currently prone to misinformation and hidden biases which children may not be aware of.

Support educators: Education will be affected in a major way. For instance, ChatGPT can be a great tool for understanding new concepts through active teaching, but can also just as easily be used by students to generate assignments. Support educators to adapt to the technology (including their plagiarism policies).

All emerging technology is a mixture of opportunity and risk.

Navigating both will be critical in using LLMs for our collective benefit.

Smash Delta is a strategic technology group. We co-create strategic programs and powerful assets in the worlds of Data, AI and Digital.

We strongly believe in the fair and ethical use of data, and champion individuals and organisations to maintain ownership of their information.

Explore what this landscape means for you with Smash Delta.