
What are Generative AI models?

December 24, 2023
Posted by: MainInstructor
Category: Business Strategy

Video Title: What are Generative AI models?

Over the past couple of months, large language models, or LLMs, such as ChatGPT, have taken the world by storm. Whether it’s writing poetry or helping plan your upcoming vacation, we are seeing a step change in the performance of AI and its potential to drive enterprise value.

My name is Kate Soule. I’m a senior manager of business strategy at IBM Research, and today I’m going to give a brief overview of this new field of AI that’s emerging and how it can be used in a business setting to drive value.

Now, large language models are actually part of a broader class of models called foundation models. The term “foundation models” was first coined by a team from Stanford when they saw that the field of AI was converging to a new paradigm. Before, AI applications were being built by training maybe a library of different AI models, where each AI model was trained on very task-specific data to perform a very specific task.

They predicted that we were going to start moving to a new paradigm, where we would have a foundational capability, or a foundation model, that would drive all of these same use cases and applications. So the same model could drive the exact applications that we were envisioning before with conventional AI, as well as any number of additional applications.

The point is that this model could be transferred to any number of tasks. What gives this model the superpower to transfer to multiple different tasks and perform multiple different functions is that it’s been trained, in an unsupervised manner, on a huge amount of unstructured data.

And what that means, in the language domain, is basically I’ll feed a bunch of sentences– and I’m talking terabytes of data here –to train this model. And the start of my sentence might be “no use crying over spilled” and the end of my sentence might be “milk”.

And I’m trying to get my model to predict the last word of the sentence based off of the words that it saw before. And it’s this generative capability of the model, predicting and generating the next word based on the previous words that it has seen, that is why foundation models are actually part of the field of AI called generative AI: we’re generating something new, in this case the next word in a sentence.
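
The next-word idea described here can be sketched with a toy example. This is only an illustration, not how a real foundation model is built: it counts which word follows which in three made-up sentences, whereas a real model uses a neural network trained on terabytes of text. The function names and the tiny corpus are assumptions introduced for this sketch. Note that no labels are supplied; the "answer" for each word is simply the word that follows it in the text.

```python
from collections import Counter, defaultdict


def train_next_word_model(sentences):
    """Count, for each word, which words follow it in the training text."""
    follow_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Each (word, following word) pair is a training example taken
        # straight from the raw text, with no human labeling involved.
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next_word(follow_counts, prompt):
    """Return the most frequent follower of the prompt's last word, or None."""
    last_word = prompt.lower().split()[-1]
    followers = follow_counts.get(last_word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical three-sentence corpus standing in for "terabytes of data".
corpus = [
    "no use crying over spilled milk",
    "the cat drank the spilled milk",
    "she wiped up the spilled water",
]
model = train_next_word_model(corpus)
prediction = predict_next_word(model, "no use crying over spilled")
print(prediction)  # milk
```

In this toy corpus "milk" follows "spilled" twice and "water" once, so the sketch completes the sentence with "milk".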

And even though these models are trained to perform, at their core, a generation task, predicting the next word in the sentence, we can take these models and, by introducing a small amount of labeled data, tune them to perform traditional NLP tasks, things like classification or named-entity recognition, that you don’t normally associate with a generative model or capability. This process is called tuning: by introducing a small amount of data, you update the parameters of your foundation model so that it now performs a very specific natural language task.
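
As a rough sketch of what "updating the parameters with a small amount of labeled data" can mean, the toy example below starts from a set of existing weights and nudges them with gradient descent on four labeled examples. Everything here is an assumption made for illustration: the two-number "pretrained" weights, the feature vectors, the learning rate, and the use of plain logistic regression. A real foundation model has billions of parameters and is tuned with dedicated tooling.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def average_loss(weights, examples):
    """Average log loss of a logistic classifier over labeled examples."""
    total = 0.0
    for features, label in examples:
        p = sigmoid(sum(w * x for w, x in zip(weights, features)))
        total += -math.log(p) if label == 1 else -math.log(1.0 - p)
    return total / len(examples)


def tune(weights, examples, learning_rate=0.5, steps=20):
    """Start from existing weights and update them on a few labeled examples."""
    weights = list(weights)  # leave the caller's starting weights untouched
    for _ in range(steps):
        gradient = [0.0] * len(weights)
        for features, label in examples:
            p = sigmoid(sum(w * x for w, x in zip(weights, features)))
            for i, x in enumerate(features):
                gradient[i] += (p - label) * x / len(examples)
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


# Hypothetical starting point standing in for a pretrained model's parameters.
pretrained_weights = [0.1, 0.1]
# A small labeled dataset: (features, label), with label 1 or 0.
labeled_examples = [
    ([1.0, 0.0], 1),
    ([0.0, 1.0], 0),
    ([1.0, 0.2], 1),
    ([0.1, 1.0], 0),
]

loss_before = average_loss(pretrained_weights, labeled_examples)
tuned_weights = tune(pretrained_weights, labeled_examples)
loss_after = average_loss(tuned_weights, labeled_examples)
print(loss_before, loss_after)
```

The point of the sketch is only the shape of the process: the starting weights are reused rather than learned from scratch, and a handful of labeled examples is enough to move them toward the specific task.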

If you don’t have data, or have only very few data points, you can still take these foundation models and they actually work very well in low-labeled data domains. And in a process called prompting or prompt engineering, you can apply these models for some of those same exact tasks.

So an example of prompting a model to perform a classification task might be to give a model a sentence and then ask it a question: Does this sentence have a positive sentiment or negative sentiment?

The model’s going to try and finish generating words in that sentence, and the next natural word would be the answer to your classification problem: either positive or negative, depending on where it estimated the sentiment of the sentence to be.
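
The prompting pattern described here might look roughly like the sketch below. The prompt layout, the function names, and especially `fake_generate` are assumptions for illustration: `fake_generate` is a keyword rule standing in for a real model so that the example runs on its own, and any real text-generation function that takes a prompt and returns text could be passed in instead.

```python
def build_sentiment_prompt(sentence):
    """Frame classification as text for the model to continue."""
    return (
        f"Sentence: {sentence}\n"
        "Question: Does this sentence have a positive sentiment "
        "or negative sentiment?\n"
        "Answer:"
    )


def classify_with_prompt(sentence, generate):
    """Read the label off the first word the model generates."""
    completion = generate(build_sentiment_prompt(sentence))
    words = completion.strip().lower().split()
    first_word = words[0].strip(".,!") if words else ""
    if first_word in ("positive", "negative"):
        return first_word
    return "unknown"  # the model answered with something else


def fake_generate(prompt):
    """Hypothetical stand-in for a model: a keyword rule, not a real LLM."""
    sentence_line = prompt.splitlines()[0].lower()
    return " positive" if "love" in sentence_line else " negative"


label = classify_with_prompt("I love this product", fake_generate)
print(label)  # positive
```

No parameters are updated here. The task is expressed entirely in the text handed to the model, which is why this approach can be used with little or no labeled data.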

And these models work surprisingly well when applied to these new settings and domains. Now, this is a lot of where the advantages of foundation models come into play.

So if we talk about the advantages, the chief advantage is the performance. These models have seen so much data, data with a capital D, terabytes of data, that by the time they’re applied to small tasks, they can drastically outperform a model that was only trained on just a few data points.

The second advantage of these models is the productivity gains. So just like I said earlier, through prompting or tuning, you need far less labeled data to get to a task-specific model than if you had to start from scratch, because your model is taking advantage of all the unlabeled data that it saw in its pre-training, when we created this generative task.

With these advantages, there are also some disadvantages that are important to keep in mind. And the first of those is the compute cost. The penalty for having these models see so much data is that they’re very expensive to train, making it difficult for smaller enterprises to train a foundation model on their own. They’re also very expensive to run inference on by the time they get to a huge size, a couple billion parameters.
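
To get a feel for why size drives hosting cost, here is a back-of-the-envelope sketch of the memory needed just to hold a model's weights. The assumptions are stated loudly: weights stored as 16-bit numbers (2 bytes each), a hypothetical 16 GB GPU, and no allowance for activations or any other runtime memory, so real requirements would be higher. The parameter counts are illustrative and do not describe any specific model.

```python
import math


def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


def gpus_needed(num_parameters, gpu_memory_gb, bytes_per_parameter=2):
    """Lower bound on GPUs required to hold the weights."""
    needed_gb = weight_memory_gb(num_parameters, bytes_per_parameter)
    return math.ceil(needed_gb / gpu_memory_gb)


# "A couple billion parameters" at 2 bytes per parameter.
print(weight_memory_gb(2_000_000_000))  # 4.0

# A hypothetical 20-billion-parameter model on hypothetical 16 GB GPUs.
print(gpus_needed(20_000_000_000, gpu_memory_gb=16))  # 3
```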

You might require multiple GPUs at a time just to host these models and run inference, making them a more costly method than traditional approaches. The second disadvantage of these models is on the trustworthiness side.

So just like data is a huge advantage for these models, since they’ve seen so much unstructured data, it also comes at a cost, especially in a domain like language. A lot of these models are trained basically off of language data that’s been scraped from the Internet.

And there’s so much data that these models have been trained on. Even if you had a whole team of human annotators, you wouldn’t be able to go through and actually vet every single data point to make sure that it wasn’t biased and didn’t contain hate speech or other toxic information.

And that’s just assuming you actually know what the data is. Often we don’t even know, for a lot of these open source models that have been posted, what the exact datasets are that these models have been trained on, leading to trustworthiness issues.

So IBM recognizes the huge potential of these technologies. But my partners in IBM Research are working on multiple different innovations to try and improve the efficiency of these models, as well as their trustworthiness and reliability, to make them more relevant in a business setting.

All of these examples that I’ve talked through so far have just been on the language side. But the reality is, there are a lot of other domains that foundation models can be applied towards.

Famously, we’ve seen foundation models for vision, looking at models such as DALL-E 2, which takes text data that’s then used to generate a custom image. We’ve seen models for code with products like Copilot that can help complete code as it’s being authored. And IBM’s innovating across all of these domains.

So whether it’s language models that we’re building into products like Watson Assistant and Watson Discovery, vision models that we’re building into products like Maximo Visual Inspection, or Ansible code models that we’re building with our partners at Red Hat under Project Wisdom, we’re innovating across all of these domains and more.

We’re working on chemistry. So, for example, we just published and released MoLFormer, which is a foundation model to promote molecule discovery or different targeted therapeutics. And we’re working on models for climate change, building earth science foundation models using geospatial data to improve climate research.

I hope you found this video both informative and helpful. If you’re interested in learning more, particularly how IBM is working to improve some of these disadvantages, making foundation models more trustworthy and more efficient, please take a look at the links below. Thank you.

47 Comments

@timapple9580

December 24, 2023 at 9:39 am

Thank you! I also commend your ability to write backwards so legibly
@selocan469

December 24, 2023 at 9:39 am

Yes, really informative. Thank you.
@Ram-re5em

December 24, 2023 at 9:39 am

Hey, I’ll probably be the end of mankind as we know it. I plan on retiring in Montana with Little more than a off grid, solar system and very little electronics to completely stay away from this gigantic and absurd mess. They call AI.
@NK-iw6rq

December 24, 2023 at 9:39 am

Excellent explanation and breakdown by Kate, brilliant woman !
@user-lq9oi5jq3n

December 24, 2023 at 9:39 am

Awesome.
@crazybastard82

December 24, 2023 at 9:39 am

I’m more impressed that they mirrored the video so that her handwriting was flipped around for us.
@MrHolifeld

December 24, 2023 at 9:39 am

The important question is, is she really left handed and able to write backwards?
@antoniovictor6080

December 24, 2023 at 9:39 am

The emergence of LLMs that are trained unsupervised with internet data was like giving computers a Pandora's box to open
@chanchalsinghjamwal

December 24, 2023 at 9:39 am

Kate, this is really highly informative and one of the best videos I came across for gen ai.
@phatfil77

December 24, 2023 at 9:39 am

How is she writing like this so easily?? What kind of witchcraft am I watching??
@UCs6ktlulE5BEeb3vBBOu6DQ

December 24, 2023 at 9:39 am

she's writing in reversed cursives !?
@vanshkumar3445

December 24, 2023 at 9:39 am

Artificial intelligence is awesome technology
@mygic183

December 24, 2023 at 9:39 am

The poetry is shitty, it takes a soul to do that, but everything else is pretty impressive.. product descriptions, etc. so gross to me
@AdamGee8

December 24, 2023 at 9:39 am

Oh now o get it 😕
@Multiplexization

December 24, 2023 at 9:39 am

Forget about AI. What I really want to know is how she is able to write from right to left.
@aaronleejohnson007

December 24, 2023 at 9:39 am

I've been studying and getting certifications in Prompt Engineering, Mathematics, Coding, Data Science, Open Artificial Intelligence, Machine Learning, Deep Learning, and Neural Networks for a few years now. I can't find a job anywhere. When I'm in a interview and talk about the cost saving benefits and increase in productivity using Artificial Intelligence and automation, they usually end the interview right away and send a Dear John letter that they went with another candidate.
@lsnyder

December 24, 2023 at 9:39 am

very nice presentation thank you
@TuxedoPanther

December 24, 2023 at 9:39 am

LLMs are a primordial form of AI, they are NOT intelligent. Good luck with making them safe and truthful, when you train them on everything you can get on the internet LOL 😂😂😂
@lufiporndre7800

December 24, 2023 at 9:39 am

She just example the whole AI bubble , so awesomely, Kate great job, the best video I have watched so far on the internet. 👏👏👏
@chuckpiot

December 24, 2023 at 9:39 am

Do preexisting organization taxonomies have a place in AI? – Or does AI eliminate the need for an org taxonomy?
@Danutzz2010

December 24, 2023 at 9:39 am

Forget about GenAI, how are you writing backwards?
@slepynewbie

December 24, 2023 at 9:39 am

Outstanding explanation, I'm even more impressed for your hability to invert your writing effortlessly… mindblowing!

congratulations!
@4jjutube

December 24, 2023 at 9:39 am

Very well explained
@100traceme

December 24, 2023 at 9:39 am

Awesome
@ferosha99

December 24, 2023 at 9:39 am

Is she writing backwards?
@eugiblisscast

December 24, 2023 at 9:39 am

using this to study for university, thank you!
@bx3556

December 24, 2023 at 9:39 am

Stop trying to control language with "toxic / hate speech", etc., That's like an unsolvable problem. You can't control human language.
@DarkSkay

December 24, 2023 at 9:39 am

This "blackboard" is so good 🙂
@user-km4vf9uw2x

December 24, 2023 at 9:39 am

Excellent presentation Kate ! Thank you !
@amparoconsuelo9451

December 24, 2023 at 9:39 am

Is there an assembly LLM kit sold in Amazon that I could assemble and understand?
@loveutube04

December 24, 2023 at 9:39 am

Is she writing backward? I mean how does that thing work if she is writing on a normal glass?
@DA-ou7hv

December 24, 2023 at 9:39 am

I think Kate must be right handed.
@jasonkemp

December 24, 2023 at 9:39 am

I think the real question here, is did she write backwards or is there a mirror in the camera? Everything else was very straightforward =p
@jordan_ong

December 24, 2023 at 9:39 am

Forget AI, look how well she's drawing everything backwards.
@philippeko-IBM

December 24, 2023 at 9:39 am

Are the 2 latest foundation models you mentioned, molformer and Earth Science for climat change, available as demos?
@romshes77

December 24, 2023 at 9:39 am

I know someone else who wrote inverted..he also painted well. impressive
@davelamothe2953

December 24, 2023 at 9:39 am

Up until you mentioned climate I felt good about what IBM is doing, but as we all know, there is no climate crisis
@vishalmishra3046

December 24, 2023 at 9:39 am

In future, large language models can generate high quality training data for a small language model and replicate their capabilities in a small model. Many children grow up and become smarter, wiser, richer/wealthier than their parents by growing up and learning in a relatively newer and better world. Similarly, LLMs will reproduce advanced capabilities into smaller models which will grow and eventually reproduce, leading to significant break-throughs.
@youngsci

December 24, 2023 at 9:39 am

Thank you
@mouradtaqui6881

December 24, 2023 at 9:39 am

Great presentation. Make such a complex topic seems affordable, means that there a lot work behind! Thanks
@semitope

December 24, 2023 at 9:39 am

Why don't you call them Machine Learning models? wasn't that what it was called before chatgpt?
@thetjt

December 24, 2023 at 9:39 am

IBM is concerned if AI gives information that is "biased, contained hate speech or toxic information" – shouldn't you rather be concerned whether the information is CORRECT instead of politically correct.
Around half of answers AI gives me are incorrect. The information it gives simply can't be trusted.
…I suspect same problems will arise with climate models, which AI should be well suited for… wrong conclusions may not be tolerated and code will be tweaked to match that. Point being; one should let AI reach whatever conclusion it will, not what we want it reach according to our own biases.
Waste of resources and focus. I see somewhat limited applications for this fad since its answers can not be trusted, it's still a dumb mahcine emulating to be an intelligent one.
@rickpower88

December 24, 2023 at 9:39 am

Kate, this was awesome. It is so refreshing to find presenters who can take complicated material and explain it, in just a few minutes, in a fashion that makes it so reachable.
@spawnofdawnacle

December 24, 2023 at 9:39 am

i feel … diminished.
@abdelrahmane657

December 24, 2023 at 9:39 am

Excellent. Thanks
@CasioArtist

December 24, 2023 at 9:39 am

Very Well Explained !
@professoradenisevargas1573

December 24, 2023 at 9:39 am

This way of writing in class is really cool. What method did they use? Is there a glass between the camcorder and the presenter?