Chat Application
WorqHat Multimodal Chat App: Documentation
This document describes the WorqHat Multimodal Chat App, a Streamlit-based application that lets users interact with an AI through text and file uploads. The app communicates with the WorqHat AI API, so text, audio, video, and images can all be analyzed within a single conversation.
Purpose:
The Multimodal Chat App leverages the WorqHat AI API to respond to user queries with contextually relevant answers. The tool supports both text and file uploads, allowing users to submit various media formats (e.g., audio, video, images) for analysis and receive informative, friendly responses from the AI.
Key Features:
- Text Input: Users can submit text-based questions or prompts directly into the app to receive AI-generated responses.
- Multimodal Support: Users can upload multiple file types, including audio (mp3, wav, ogg), video (mp4, avi, mov), and images (jpg, png, gif), which are analyzed alongside the text.
- File Preview: Images uploaded by the user are previewed directly in the chat window, while other file types are acknowledged in the conversation.
- AI-Powered Response: The WorqHat AI processes both text and files, offering insights and answers based on the input provided.
- Conversation Context: The app supports ongoing conversations by maintaining a conversation ID, allowing the AI to keep track of the context throughout multiple interactions.
- User-Friendly Interface: The Streamlit interface allows users to easily upload files, input text, and view AI responses in a structured chat format.
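The file-type handling described above can be sketched as a small helper that maps a filename to one of the supported media categories and decides whether the file should be previewed. This is a minimal illustration, not the app's actual code: the names `SUPPORTED_EXTENSIONS`, `classify_upload`, and `should_preview` are hypothetical, and only the extensions listed above are included.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the feature list above; the mapping itself is illustrative.
SUPPORTED_EXTENSIONS = {
    "audio": {"mp3", "wav", "ogg"},
    "video": {"mp4", "avi", "mov"},
    "image": {"jpg", "png", "gif"},
}


def classify_upload(filename: str) -> Optional[str]:
    """Return "audio", "video", or "image" for a supported file, else None."""
    extension = Path(filename).suffix.lower().lstrip(".")
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if extension in extensions:
            return category
    return None


def should_preview(filename: str) -> bool:
    """Only images are previewed in the chat; other files are just acknowledged."""
    return classify_upload(filename) == "image"
```

A file with an unsupported or missing extension yields `None`, which the app could use to reject the upload before calling the API.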
Benefits:
- Enhanced User Experience: The multimodal capability allows for richer conversations where users can upload and analyze various media formats in addition to text.
- Customizable Responses: The AI tailors responses based on user inputs, ensuring friendly and informative answers in line with the provided context.
- Real-Time Interactions: Users receive prompt feedback from the AI, which makes the app well suited to dynamic content generation, discussions, and analysis.
- API Flexibility: Because the API handles several data types, it supports diverse use cases, including content review, media analysis, and interactive storytelling.
How it Works:
- Text and File Uploads: Users can submit a text-based prompt or question, and optionally upload an audio, video, or image file.
- API Call: The app packages the user inputs and sends them to the WorqHat AI API for analysis. The AI model processes the text and file data in the context of the ongoing conversation.
- Response Generation: The API returns a JSON response containing the analyzed result, which is displayed in the chat window.
- Conversation Management: The app retains the conversation history, allowing the AI to maintain context and provide coherent responses across multiple inputs.
- File Handling: Uploaded files are previewed in the chat when applicable (e.g., images) or acknowledged with file details (e.g., for audio and video files).
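To make the packaging and conversation-management steps more concrete, here is a rough sketch of how the inputs might be prepared before the API call. It assumes the `requests` convention of passing an upload as a `(filename, bytes, content_type)` tuple. The helper names `build_file_data` and `record_message`, and the form field name `"files"`, are assumptions for illustration and are not confirmed by the WorqHat documentation.

```python
import mimetypes
from typing import Dict, List, Optional, Tuple

FileData = Dict[str, Tuple[str, bytes, str]]


def build_file_data(file_name: Optional[str], file_bytes: Optional[bytes]) -> Optional[FileData]:
    """Wrap an uploaded file in the tuple shape accepted by requests' `files=` argument."""
    if file_name is None or file_bytes is None:
        return None  # text-only message, nothing to attach
    # Guess the MIME type from the extension; fall back to a generic binary type.
    content_type = mimetypes.guess_type(file_name)[0] or "application/octet-stream"
    # "files" is a hypothetical field name; use whatever the API actually expects.
    return {"files": (file_name, file_bytes, content_type)}


def record_message(history: List[dict], role: str, content: str) -> None:
    """Append one chat turn so the conversation can be redrawn in order."""
    history.append({"role": role, "content": content})
```

In a Streamlit app the `history` list would most likely be kept in session state so that it survives reruns; a plain list is used here only to keep the sketch self-contained.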
Procedure to Follow:
- API Key: Obtain a WorqHat API Key by signing up at the WorqHat API Signup Page, then enter it in the app’s sidebar.
- Conversation ID: Enter a unique conversation ID to maintain context across interactions.
- Text Input and File Upload: Provide a message or question in the input field, and optionally upload an audio, video, or image file for analysis.
- Generate Response: Click the “Send” button to submit your input to the API. The app will display the AI’s response in real time.
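If you do not already have an identifier in mind, one simple way to produce a unique conversation ID is a random UUID. The sketch below also checks the three inputs from the procedure before anything is sent. Both `new_conversation_id` and `validate_inputs` are hypothetical helpers, and the `conv-` prefix and UUID format are assumptions: the source does not say that the API requires any particular ID format.

```python
import uuid
from typing import List


def new_conversation_id() -> str:
    """Return a random identifier that is very unlikely to collide with another one."""
    return f"conv-{uuid.uuid4().hex}"


def validate_inputs(api_key: str, conversation_id: str, message: str) -> List[str]:
    """Return a list of problems to show the user; an empty list means ready to send."""
    problems = []
    if not api_key.strip():
        problems.append("Enter your WorqHat API key in the sidebar.")
    if not conversation_id.strip():
        problems.append("Enter a conversation ID.")
    if not message.strip():
        problems.append("Type a message or question.")
    return problems
```

Reusing the same ID for later messages is what lets the AI keep the context of the conversation, so generate a new one only when starting a fresh conversation.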
Example Code Snippet:
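import json

import requests
import streamlit as st


def call_api(text_input, api_key, file_data=None, conversation_id=None):
    """Send the prompt and optional files to the WorqHat AI API; return parsed JSON or None."""
    url = "https://api.worqhat.com/api/ai/content/v4"
    headers = {
        "Authorization": f"Bearer {api_key}",
    }
    payload = {
        "question": f"Please analyze the provided text and files. Respond to this: {text_input}",
        "model": "aicon-v4-nano-160824",
        "response_type": "json",
        "conversation_id": conversation_id,
    }
    try:
        if file_data:
            # With files, requests builds a multipart/form-data body and sets Content-Type itself.
            response = requests.post(url, headers=headers, data=payload, files=file_data)
        else:
            # Without files, send the payload as a JSON body.
            headers["Content-Type"] = "application/json"
            response = requests.post(url, headers=headers, data=json.dumps(payload))
        # Turn 4xx/5xx responses into exceptions so they are handled below.
        response.raise_for_status()
        return response.json()
    except requests.exceptions.RequestException as e:
        # Show the failure in the Streamlit UI and signal it to the caller.
        st.error(f"API request failed: {e}")
        return None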
This function sends the text and any uploaded files to the WorqHat AI API and returns the parsed JSON response. If the request fails, it shows an error message in the app and returns None.
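Because `call_api` can return either a parsed JSON body or `None`, the caller has to handle both cases before displaying anything in the chat. A hedged sketch of that step follows. It assumes the reply text is stored under a `content` key, which the source does not confirm, so adjust the key to match the actual response. `extract_reply` is a hypothetical helper.

```python
from typing import Optional


def extract_reply(result: Optional[dict], key: str = "content") -> str:
    """Turn the value returned by call_api into text that is safe to display."""
    if result is None:
        # call_api already reported the error; show a short fallback in the chat.
        return "Sorry, the request failed. Please try again."
    reply = result.get(key)
    if reply is None:
        # The response parsed correctly but did not contain the assumed key.
        return "The API response did not include a reply."
    return str(reply)
```

Keeping this logic outside `call_api` means the display code never needs to know whether a failure came from the network or from an unexpected response shape.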
Conclusion:
The WorqHat Multimodal Chat App is a versatile tool that enables rich conversations using text and multiple file formats. Its user-friendly interface and real-time API interactions make it a powerful application for AI-driven content generation, media analysis, and enhanced user engagement.
EndNote:
Try out the app and explore the possibilities of multimodal interactions with WorqHat AI.
Check out the code on our GitHub: GitHub Link