How to use Generative AI with IBM WatsonX

Hello everyone, today we are going to use IBM watsonx.

In this blog post we are going to explain how to use Generative AI to summarize, classify, extract, generate, and answer questions about a text.

Introduction

AI refers to the ability of computer systems to attempt to mimic the problem-solving and decision-making capabilities of the human mind. It spans several fields, including:

  • Computer vision
  • Data science
  • Natural language processing
  • Robotics

AI models have evolved significantly over the past decade:

  • Advanced analytics - Step-by-step logic and instructions coded by human developers; very deterministic. e.g. anomaly detection.
  • Machine Learning - Human-crafted features with supervised learning to analyze data for a specific task. e.g. price prediction, optimization.
  • Deep Learning - Learning in which the AI is fed data and outcomes so it can derive its own rules and algorithms. e.g. image recognition, autonomous driving.
  • Foundation models - Self-supervised AI that ingests massive amounts of data and then generates net-new, human-like text, art, images, video, etc. e.g. DALL·E, ChatGPT, BERT, T5, LaMDA.

When to use Traditional AI Capabilities

  • Predictive - Structured data analysis, predictions, forecasting, etc.
  • Directed Conversational AI - Deterministic dialog flows for API driven conversational AI
  • Computer Vision AI - Machine Vision for object and anomaly detection
  • Process Automation - Robotic Process Automation, Process reengineering and optimization

When to use Generative AI capabilities

  • Summarization, e.g. documents such as user manuals, asset notes, financial reports, etc.
  • Conversational search, e.g. SOPs, troubleshooting instructions, etc.
  • Content creation, e.g. personas, user stories, synthetic data, generated images, personalized UI, marketing copy, email/social responses, etc.
  • Code creation, e.g. code co-pilot, code conversion, technical documentation, test cases, etc.

What you can do with Generative AI

  • Question answering - The model responds to questions in natural language.
  • Generation - The model generates content in natural language.
  • Extraction - The model extracts entities, facts, and information from text.
  • Summarization - The model creates summaries of natural language text.
  • Classification - The model classifies text, e.g. by sentiment, group, project, etc.

Generative AI

A class of Machine Learning techniques whose purpose is to generate content or data of many kinds, such as audio, code, images, text, simulations, 3D objects, and videos.

Foundation models

A foundation model is an AI model that can be adapted to a wide range of downstream tasks. Foundation models are typically large-scale (e.g. billions of parameters) generative models trained on unlabeled data using self-supervision.

In this blog post, we are going to test watsonx.ai.

Step 1. Create an account

First, we need to create an IBM Cloud account here:

https://www.ibm.com/products/watsonx-ai

You can start your free trial.


Step 2. Open the Prompt Lab

Once you are in, you can open the Prompt Lab.


With the Foundation Model Libraries, you have easy access to IBM proprietary and open-source foundation models.

In the Prompt Lab you can experiment with zero-shot and few-shot learning for enterprise tasks, and with the Tuning Studio you can tailor pre-trained foundation models for complex downstream tasks on enterprise data.
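To make the zero-shot vs. few-shot distinction concrete, here is a minimal Python sketch of how a few-shot prompt can be assembled from a handful of labeled examples before it is sent to a model. The reviews and labels are made up purely for illustration; in the Prompt Lab you supply this kind of context through the UI instead of code.

```python
# Minimal sketch: building a few-shot classification prompt from labeled
# examples. The reviews and labels are made-up placeholders.

few_shot_examples = [
    ("The product arrived broken and support never replied.", "negative"),
    ("Setup took two minutes and it works perfectly.", "positive"),
]

def build_prompt(new_review: str) -> str:
    """Assemble a few-shot sentiment-classification prompt."""
    lines = ["Classify each review as positive or negative.", ""]
    for review, label in few_shot_examples:
        lines += [f"Review: {review}", f"Sentiment: {label}", ""]
    # The model is expected to complete the final "Sentiment:" line.
    lines += [f"Review: {new_review}", "Sentiment:"]
    return "\n".join(lines)

print(build_prompt("Battery life is poor but the screen is great."))
```

With an empty few_shot_examples list, the same function produces a zero-shot prompt: just the instruction and the new input.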


Select a foundation model

We can select the model that best fits our needs. All models support English text.


Among available models we have:

  • The Flan-UL2 model is an encoder-decoder model based on the T5 architecture. It has 20 billion parameters. It was fine-tuned using the “Flan” prompt tuning and dataset collection. The original UL2 model was only trained with a receptive field of 512, which made it non-ideal for N-shot prompting where N is large. The Flan-UL2 checkpoint uses a receptive field of 2048, which makes it more usable for few-shot in-context learning. For more details, please see section 3.1.2 of the paper.

  • BLOOMZ & mT0 are a family of models capable of following human instructions in dozens of languages zero-shot. They are BLOOM and mT5 pretrained multilingual language models fine-tuned on a crosslingual task mixture (xP3), and the resulting models are capable of crosslingual generalization to unseen tasks and languages.

  • GPT-NeoX-20B is a 20 billion parameter autoregressive language model trained on the Pile using the GPT-NeoX library. Its architecture intentionally resembles that of GPT-3 and is almost identical to that of GPT-J-6B. Its training dataset contains a multitude of English-language texts, reflecting the general-purpose nature of the model. For more details about the model architecture (including how it differs from GPT-3), the training procedure, and additional evaluations, see this reference.

  • FLAN-T5: compared to T5, FLAN-T5 is better across the board. For the same number of parameters, these models have been fine-tuned on more than 1,000 additional tasks, covering more languages.

  • MPT-7B-Instruct2 is a retrained version of the original MPT-7B-Instruct model, a decoder-only model for short-form instruction following.

    See model cards for MPT-7B and MPT-7B-Instruct for more information.

In this post we are going to choose the Flan-UL2 model. For other supported languages, check each model's documentation.
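Besides the Prompt Lab UI, the selected model can also be called programmatically. The sketch below assumes the ibm-watson-machine-learning Python SDK available around the time of writing; the class names, parameter names, and endpoint URL are assumptions, so check the current watsonx.ai documentation, and replace the API key and project ID placeholders with your own values.

```python
# Rough sketch, not verified against the latest SDK: calling Flan-UL2 on
# watsonx.ai from Python. Class and parameter names are assumptions based on
# the ibm-watson-machine-learning package; check the official docs.
from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams

model = Model(
    model_id="google/flan-ul2",
    credentials={
        "url": "https://us-south.ml.cloud.ibm.com",  # region endpoint (assumption)
        "apikey": "<YOUR_IBM_CLOUD_API_KEY>",        # placeholder
    },
    project_id="<YOUR_PROJECT_ID>",                  # placeholder
    params={
        GenParams.DECODING_METHOD: "sample",
        GenParams.TEMPERATURE: 0.7,
        GenParams.TOP_P: 0.9,
        GenParams.TOP_K: 50,
        GenParams.MAX_NEW_TOKENS: 200,
    },
)

print(model.generate_text(prompt="Summarize the following text: ..."))
```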

Model Parameters

Nucleus sampling is a technique used in large language models to control the randomness and diversity of generated text. It works by sampling from only the most likely tokens in the model’s predicted distribution.

The key parameters are:

  • Temperature: Controls randomness, higher values increase diversity.
  • Top-p (nucleus): The cumulative probability cutoff for token selection. Lower values mean sampling from a smaller, more top-weighted nucleus.
  • Top-k: Sample from the k most likely next tokens at each step. Lower k focuses on higher probability tokens.


In general:

  • Higher temperature will make outputs more random and diverse.
  • Lower top-p values reduce diversity and focus on more probable tokens.
  • Lower top-k also concentrates sampling on the highest probability tokens for each step.

So temperature increases variety, while top-p and top-k reduce variety and focus samples on the model’s top predictions. You have to balance diversity and relevance when tuning these parameters for different applications.
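To make these knobs concrete, here is a small self-contained Python sketch of one decoding step. It is only illustrative of how temperature, top-k, and top-p interact; it is not how watsonx implements decoding internally, and the toy logits are invented.

```python
import numpy as np

def sample_next_token(logits, temperature=0.7, top_k=50, top_p=0.9, rng=None):
    """Illustrative single decoding step with temperature, top-k, and
    top-p (nucleus) filtering. Not an implementation of any specific model."""
    rng = rng or np.random.default_rng()

    # Temperature: rescale logits before softmax; <1 sharpens, >1 flattens.
    scaled = logits / max(temperature, 1e-8)
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()

    # Top-k: keep only the k most likely tokens.
    order = np.argsort(probs)[::-1][:top_k]

    # Top-p: within those, keep the smallest prefix whose cumulative
    # probability reaches top_p (the "nucleus").
    cutoff = int(np.searchsorted(np.cumsum(probs[order]), top_p)) + 1
    nucleus = order[:cutoff]

    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))

# Toy 5-token vocabulary: lower temperature / top_p concentrate the choice on
# token 0, higher values spread it across more tokens.
print(sample_next_token(np.array([4.0, 3.5, 2.0, 0.5, 0.1])))
```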

Step 3. Choose the sample prompt you want to use.

Summarization

  • Meeting transcript summary - Summarize the discussions from a meeting transcript (see the example prompt below).

  • Earnings call summary - Summarize financial highlights from an earnings call.
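An illustrative summarization prompt (not the exact text of the built-in sample) follows the usual instruction + input + output-cue pattern:

```
Summarize the following meeting transcript in two or three sentences.

Transcript:
<paste the meeting transcript here>

Summary:
```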

Classification

  • Scenario classification - Classify a scenario based on project categories (see the example prompt below).

  • Sentiment classification - Classify reviews as positive or negative.
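An illustrative scenario-classification prompt; the category names here are invented, and the built-in sample uses its own wording:

```
Classify the scenario below into exactly one of these project categories:
infrastructure, security, user experience, data migration.

Scenario:
<paste the scenario description here>

Category:
```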

Generation

  • Marketing email generation - Generate an email for a marketing campaign (see the example prompt below).

  • Thank-you note generation - Generate a thank-you note for workshop attendees.
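An illustrative generation prompt; the campaign details are placeholders, not the built-in sample:

```
Write a short marketing email announcing <product or offer>.
Audience: <target audience>
Tone: friendly and professional
Include a clear call to action and keep it under 150 words.

Email:
```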

Extraction

  • Named entity extraction - Find and classify entities in unstructured text (see the example prompt below).

  • Fact extraction - Extract facts and information from sentences.
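An illustrative entity-extraction prompt; asking for a fixed output format makes the result easier to parse:

```
From the text below, list every person, organization, and location that is
mentioned, one entity per line, in the format "<entity>: <type>".

Text:
<paste the unstructured text here>

Entities:
```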

Question answering

  • Questions about an article - Answer questions about a body of text (see the example prompt below).
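An illustrative question-answering prompt; instructing the model to answer only from the passage helps reduce made-up answers:

```
Answer the question using only the article below. If the answer is not in
the article, reply "I don't know".

Article:
<paste the article here>

Question: <your question>
Answer:
```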

Congratulations! We have used Generative AI with IBM watsonx.
