Multimodal Integration with OpenAI

Nov 14 2024 · Python 3.12, OpenAI 1.52, JupyterLab, Visual Studio Code

Lesson 05: Building a Multimodal AI App

An Introductory Demo of Gradio

You’ll start by building several simple Gradio apps to prepare for the multimodal AI app you’ll build later. The first one takes a name and a time of day as inputs and returns a greeting message.

# Install the required libraries
!pip install openai requests python-dotenv matplotlib librosa ipyaudioworklet gradio Pillow

# Load the OpenAI library
from openai import OpenAI

# Set up relevant environment variables
# Make sure OPENAI_API_KEY=... exists in .env
from dotenv import load_dotenv

load_dotenv()

# Create the OpenAI connection object
client = OpenAI()

# Import the Gradio library
import gradio as gr

# Define a simple function that takes a name and a time of day as inputs
def greet(name, greeting_time):
    return "Good " + greeting_time + ", " + name + "!"

# Create a Gradio interface for the function
demo = gr.Interface(
    fn=greet,  # The function to wrap a UI around
    inputs=[ # Define input components
        gr.Text(), # Input field for name
        # Dropdown for time of day
        gr.Dropdown(["morning", "evening", "night"])
    ],
    outputs=[ # Define output components
        gr.Text() # Text output for the greeting
    ],
)

# Launch the Gradio app
demo.launch()

You’ll be presented with an app that has some inputs and an output. You can also open the app in a dedicated page by clicking the link shown when it launches. The fn argument specifies the function that processes the inputs. The inputs argument defines the input fields, and the outputs argument defines the output field. The Gradio library provides many components, such as gr.Text(), gr.Dropdown(), and so on. The number of parameters of the greet function must match the number of elements in the list passed to the inputs argument.
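As a quick check outside the UI, the sketch below calls greet the way gr.Interface does, passing one value per input component in list order. It uses only the standard library. The names input_labels and check_arity are hypothetical, introduced here for illustration, and are not part of Gradio.

```python
import inspect

def greet(name, greeting_time):
    """Same function as in the first app above."""
    return "Good " + greeting_time + ", " + name + "!"

# Hypothetical stand-ins for the two input components, in list order:
# gr.Text() for the name and gr.Dropdown(...) for the time of day.
input_labels = ["name", "time of day"]

def check_arity(fn, labels):
    """Return True if fn takes exactly one parameter per input component."""
    return len(inspect.signature(fn).parameters) == len(labels)

arity_ok = check_arity(greet, input_labels)

# Gradio passes the component values positionally, so this call mirrors
# a user typing "Ada" and picking "morning" from the dropdown.
message = greet("Ada", "morning")
print(arity_ok, message)  # True Good morning, Ada!
```

Running a check like this before launching the app can surface a mismatch between the function signature and the inputs list early, without opening the browser.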

Next, you’ll modify the greet function to return both a greeting message and an image URL. You’ll also update the outputs argument to define a list consisting of the text element and an additional image element.

# Define a function that returns a greeting message and a
# hard-coded image URL
def greet(name, greeting_time):
    greeting = "Good " + greeting_time + ", " + name + "!"
    image_url = (
        "https://upload.wikimedia.org/wikipedia/commons/d/d6"
        "/An_Oberoi_Hotel_employee_doing_Namaste%2C_New_Delhi.jpg"
    )
    return (greeting, image_url)

# Create a Gradio interface for the function
demo = gr.Interface(
    fn=greet,
    inputs=[ # Define input components
        gr.Text(), # Input field for name
        # Dropdown for time of day
        gr.Dropdown(["morning", "evening", "night"])
    ],
    outputs=[
        gr.Text(), # Define text output
        gr.Image() # Define image output
    ],
)

# Launch the Gradio app
demo.launch()

As you can see, you now have multiple output fields. You define them in the outputs argument of gr.Interface. Make sure the greet function returns a tuple consisting of the output elements, in the same order as the outputs list. To create the image field, you use the gr.Image() component.
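To make the positional mapping concrete, here is a minimal standard-library sketch that pairs each element of the returned tuple with the output component that would display it. The name output_labels is a hypothetical stand-in for the outputs list, and the image URL is a placeholder rather than the one used in the lesson.

```python
def greet(name, greeting_time):
    greeting = f"Good {greeting_time}, {name}!"
    # Placeholder URL for illustration only; substitute any reachable image.
    image_url = "https://example.com/namaste.jpg"
    return greeting, image_url

# Hypothetical stand-ins for outputs=[gr.Text(), gr.Image()].
output_labels = ["text", "image"]

result = greet("Ada", "evening")

# One returned value per output component, matched by position.
assert len(result) == len(output_labels)
routed = dict(zip(output_labels, result))
for label, value in routed.items():
    print(f"{label}: {value}")
```

If the function returned the values in a different order, the greeting would be routed to the image component and the URL to the text box, so the order of the tuple matters as much as its length.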

# Define a function that returns a greeting message,
# an image URL, and an audio file path
def greet(name, greeting_time, audio_path):
    greeting = "Good " + greeting_time + ", " + name + "!"
    image_url = (
        "https://upload.wikimedia.org/wikipedia/commons/d/d6"
        "/An_Oberoi_Hotel_employee_doing_Namaste%2C_New_Delhi.jpg"
    )
    return (greeting, image_url, audio_path)

# Create a Gradio interface for the function
demo = gr.Interface(
    fn=greet,
    inputs=[ # Define input components
        gr.Text(), # Input field for name
        # Dropdown for time of day
        gr.Dropdown(["morning", "evening", "night"]),
        # Audio input field
        gr.Audio(sources=["microphone"], type="filepath")
    ],
    outputs=[
        gr.Text(), # Define text output
        gr.Image(), # Define image output
        gr.Audio(type="filepath") # Define audio output
    ],
)

# Launch the Gradio app
demo.launch()

In this example, the app is further extended to include audio input and output components. The gr.Audio() component lets users provide audio input through a microphone, and the function returns a greeting message, an image URL, and an audio file path. The gr.Audio() output field doesn’t need the sources argument because you only play the audio in the output field.
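With type="filepath", the audio value reaches the function as a path string, and Gradio typically passes None when nothing was recorded. The sketch below uses only the standard library to show one way to guard against that case and to produce a file whose path could be returned to an audio output. The helpers write_silence and safe_audio_path are hypothetical and are not Gradio functions.

```python
import os
import tempfile
import wave

def write_silence(path, seconds=1.0, sample_rate=16000):
    """Write a mono, 16-bit WAV file containing only silence."""
    n_frames = int(seconds * sample_rate)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * n_frames)
    return path

def safe_audio_path(audio_path, fallback_path):
    """Return audio_path if it points to a file, otherwise the fallback."""
    if audio_path and os.path.isfile(audio_path):
        return audio_path
    return fallback_path

fallback = write_silence(os.path.join(tempfile.mkdtemp(), "silence.wav"))

# Simulate a user who submitted the form without recording anything.
chosen = safe_audio_path(None, fallback)

with wave.open(chosen, "rb") as wav_file:
    duration = wav_file.getnframes() / wav_file.getframerate()
print(chosen, duration)
```

Inside greet, a call such as safe_audio_path(audio_path, fallback) could replace the bare audio_path in the return statement, so the audio output always receives a valid file.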

You can also make your app more informative using the title and description arguments of gr.Interface.

# Define a function that returns a greeting message, an image URL,
# and an audio file path
def greet(name, greeting_time, audio_path):
    greeting = "Good " + greeting_time + ", " + name + "!"
    image_url = (
        "https://upload.wikimedia.org/wikipedia/commons/d/d6"
        "/An_Oberoi_Hotel_employee_doing_Namaste%2C_New_Delhi.jpg"
    )
    return (greeting, image_url, audio_path)

# Create a Gradio interface for the function with a title and description
demo = gr.Interface(
    fn=greet,
    inputs=[ # Define input components
        gr.Text(), # Input field for name
        # Dropdown for time of day
        gr.Dropdown(["morning", "evening", "night"]),
        # Audio input field
        gr.Audio(sources=["microphone"], type="filepath")
    ],
    outputs=[
        gr.Text(), # Define text output
        gr.Image(), # Define image output
        gr.Audio(type="filepath") # Define audio output
    ],
    title="Greeting App",
    description="This is a billion-dollar greeting app."
)

# Launch the Gradio app
demo.launch()
