@Richard-Weiss
Contributor

@Richard-Weiss Richard-Weiss commented Aug 13, 2023

Add groundwork for implementing the image recognition feature.
The only missing component is setting the parameters in the message dynamically.
I would appreciate any assistance.
--Edit--
I've added receiving the necessary values from the body of the /conversation request.
The imageURL or imageBase64 is added dynamically by the client, for example PandoraAI.
Tested with image URL and direct base64 string.
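
As a rough illustration of the step described above, the server side could read the optional image fields from the parsed /conversation body along these lines. Only the field names imageURL and imageBase64 come from this PR; the helper name extractImageOptions, the validation rules, and the example values are assumptions made for the sketch, not the PR's actual code.

```javascript
// Hypothetical helper, not the PR's actual code: pulls the optional image
// fields out of an already parsed /conversation request body.
function extractImageOptions(body = {}) {
    const options = {};
    // Only accept non-empty strings; anything else is treated as "no image".
    if (typeof body.imageURL === 'string' && body.imageURL.length > 0) {
        options.imageURL = body.imageURL;
    }
    if (typeof body.imageBase64 === 'string' && body.imageBase64.length > 0) {
        options.imageBase64 = body.imageBase64;
    }
    return options;
}

// Example body as a client might send it (values are made up).
const exampleBody = {
    message: 'What is in this picture?',
    imageURL: 'https://example.com/picture.jpg',
};
console.log(extractImageOptions(exampleBody));
```

A body without either field yields an empty object, so the rest of the request handling can stay unchanged for text-only messages.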

@Richard-Weiss Richard-Weiss marked this pull request as ready for review August 21, 2023 14:33
@matibm

matibm commented Aug 30, 2023

I've tried your code by copying it directly to the library I downloaded using npm i @waylaidwanderer/chatgpt-api, and pasting the code you added where it belongs. Initially, the uploadImage didn't work because the base64 was a promise, so I got it to work by adding await in the body

image
and in

...imageUploadResult && { originalImageUrl: imageBaseURL + imageUploadResult.processBlobId } 

I see that the variable should actually be imageUploadResult.processedBlobId.
image

console.log(imageUploadResult);

image

I made those changes and, apparently, the image upload works correctly. However, I notice a problem with the context when I send a prompt: the chat doesn't recognize that it has received an image. Could you provide an example of how to use this function?

image
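
The two fixes reported in this comment can be sketched as follows. This is a minimal illustration, not the PR's code: readImageAsBase64 and uploadImage are hypothetical stand-ins for the real helpers, the returned IDs are made up, and the value of imageBaseURL is a placeholder. Only the names imageBaseURL, imageUploadResult, originalImageUrl, and processedBlobId come from the thread.

```javascript
// Placeholder value; the real base URL is not shown in this thread.
const imageBaseURL = 'https://example.invalid/blob?id=';

// Hypothetical stand-in: resolves to a base64 string asynchronously.
async function readImageAsBase64() {
    return 'aGVsbG8=';
}

// Hypothetical stand-in for the upload call; rejects anything that is not a
// string, which is what happens when a pending Promise is passed by mistake.
async function uploadImage(imageBase64) {
    if (typeof imageBase64 !== 'string') {
        throw new TypeError('uploadImage expects a base64 string');
    }
    return { blobId: 'blob-1', processedBlobId: 'blob-2' };
}

async function buildImageFields() {
    // Fix 1: await the base64 value instead of passing the Promise along.
    const imageBase64 = await readImageAsBase64();
    const imageUploadResult = await uploadImage(imageBase64);
    return {
        ...(imageUploadResult && {
            // Fix 2: the property is processedBlobId, not processBlobId.
            originalImageUrl: imageBaseURL + imageUploadResult.processedBlobId,
        }),
    };
}

buildImageFields().then((fields) => console.log(fields));
```

With the misspelled property, the string concatenation would silently append "undefined" to the URL, which is why the upload can appear to succeed while the image is never found.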

@Richard-Weiss
Contributor Author

Hey @matibm,
thank you for your keen observations.
Some mistakes snuck in, because I manually had to move the code from my main branch to the new branch based on this repo.
I didn't actually test the functionality in this specific branch, so that's on me. 😅
I'll look into these issues and mention you again, when I've created the new commit.
Maybe I can even add some code in the demo folder, but I'll have to look into how to actually use the class detached from the current implementation.
Thank you for doing such detailed work in finding and documenting these issues. :)

@Richard-Weiss
Contributor Author

Hey @matibm,
me again.
I'm having difficulty testing it on this branch because I keep getting the Captcha error, and I'm not sure why. However, I tested it on my current main branch by setting the imageURL in PandoraAI like this:

const imageURL = 'https://a.cdn-hotels.com/gdcs/production164/d1916/a3c64a1f-b238-4947-896f-27d7a1f51c89.jpg?impolicy=fcrop=w800&h=533&q=medium';
const imageBase64 = undefined;
const data = {
    ...conversationData.value,
    message: input,
    stream,
    clientOptions,
    ...imageBase64 && { imageBase64 },
    ...imageURL && { imageURL },
};

This goes in the sendMessage function in the Chat.vue file. I made no other modifications to the API server code, except for what is already on my main branch.
I built the Pandora server, refreshed the website and cleared the cache and when I just wrote "Hello", I got this message:

Hello, $User. I'm $AI, an AI assistant created by you. I'm happy to chat with you and help you with anything you need. 😊 I see you sent me a beautiful image of Manhattan. Is that where you live or where you want to visit? 

I'll have to do some more debugging, but a fix will probably come later this week.
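
For readers unfamiliar with the conditional spread used in the snippet above, here is a small standalone sketch of how it behaves. The helper name withOptionalImage and the sample values are assumptions for illustration; only the `...value && { value }` pattern and the field names are taken from the thread.

```javascript
// Hypothetical helper illustrating the conditional-spread pattern.
function withOptionalImage(base, imageURL, imageBase64) {
    return {
        ...base,
        // `undefined && {...}` evaluates to undefined, and spreading undefined
        // into an object literal adds no keys, so the field is simply omitted.
        ...(imageBase64 && { imageBase64 }),
        ...(imageURL && { imageURL }),
    };
}

const data = withOptionalImage(
    { message: 'Hello', stream: true },
    'https://example.com/picture.jpg',
    undefined,
);
console.log(Object.keys(data)); // keys: message, stream, imageURL
```

The benefit of this pattern is that the request body never contains an explicit `imageBase64: undefined` entry, so the server only sees the fields that were actually set.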

@vaibhavard

Hi!
I also noticed that when jailbreak is on, the image recognition does not work.

@Richard-Weiss
Contributor Author

Hi @vaibhavard,
currently neither mode works.
I'll consider testing both variants once I get the branch ready for debugging.

@vaibhavard

vaibhavard commented Sep 1, 2023

No, without jailbreak it works great for me.
I cloned your fork, and when I use imageURL to upload an image it works. I have not tested base64.

@Richard-Weiss
Contributor Author

Richard-Weiss commented Sep 2, 2023

Hey @matibm and @vaibhavard ,
I got this branch ready for debugging and added the missing code from my main branch to get this PR working.
I've tested it with vanilla Bing and with the jailbreak mode and it worked in both modes.
Let me know if you have any more questions or issues. :)

@vaibhavard

Tried and Tested.✅.

Image recognition works perfectly 😃

@vaibhavard

Edit: I am now facing a minor bug with image recognition / searching.
The API output is:
Analyzing the image: Faces may be blurred to protect privacy. is a beautiful image of blue butterflies. Do you like butterflies? I think they are fascinating creatures. They symbolize transformation, freedom and joy.

The first word after the status message gets cut off; here, the word "This" is missing. The same issue occurs with "Searching web for..." (cut off).

@Richard-Weiss
Contributor Author

Hey @vaibhavard,
this is something related to #451.
For now you can take the code from line 481 onward from commit 9e9d814, so the internal messages won't be displayed and won't overwrite the actual message.
Please let me know if that works for you. :)
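
The idea of hiding internal messages so they do not overwrite the actual reply could look roughly like the sketch below. It is not the code from the referenced commit. It assumes, purely for illustration, that internal status messages carry a messageType field while the actual reply does not; the function name pickReplyText and the sample message objects are hypothetical.

```javascript
// Assumption: internal status messages have a `messageType` field and the
// actual reply does not. This shape is hypothetical, not taken from the commit.
function pickReplyText(messages) {
    const replies = messages.filter((m) => m.author === 'bot' && !m.messageType);
    // Use the most recent real reply, or an empty string if none has arrived.
    return replies.length > 0 ? replies[replies.length - 1].text : '';
}

// Made-up sample data mirroring the output reported above.
const sampleMessages = [
    {
        author: 'bot',
        messageType: 'InternalLoaderMessage',
        text: 'Analyzing the image: Faces may be blurred to protect privacy.',
    },
    { author: 'bot', text: 'This is a beautiful image of blue butterflies.' },
];
console.log(pickReplyText(sampleMessages));
```

Filtering before concatenating any text keeps the status line from being merged into the reply, which is the symptom described in the previous comment.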

@matibm

matibm commented Sep 5, 2023

Hi @Richard-Weiss
I have tested it, and it works correctly up to a certain number of attempts.

If I send several messages with images, I get this error at approximately message 10 or 15. If I don't send an image, I don't get the error. (The messages are not consecutive.)

 Analyzing the image: Faces may be blurred to protect privacy.{ value: 'InternalError', message: 'Unhandled Exception', error: 'Unhandled Exception\n' + " ---> XapImageRetriever's federation response failed with status: Failed\n" + 'errorMessage: 400 Bad Request\n' + 'Xap requestId: 789d04a80d2345f0b8e66d7b3011efed', exception: 'Microsoft.TuringBot.Common.ServiceInternalError: Unhandled Exception\r\n' + " ---> System.Exception: XapImageRetriever's federation response failed with status: Failed\n" + 'errorMessage: 400 Bad Request\n' + 'Xap requestId: 789d04a80d2345f0b8e66d7b3011efed\r\n' + ' at BotClientLibrary.ServiceClients.ContentProviders.XapImageRetriever.Run(Conversation conversation, Message message, CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\ServiceClients\\ContentProviders\\ImageRetriever.cs:line 207\r\n' + ' at BotClientLibrary.ServiceClients.ContentProviders.GptvClientDecorator.GetHttpContent(Conversation conversation, Message message, CancellationToken cancellationToken, MetricsCollection metrics, BatchRequest batchRequest) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\ServiceClients\\ContentProviders\\GptvClient.cs:line 131\r\n' + ' at BotClientLibrary.ServiceClients.ServiceClient.Run(Conversation conversation, Message message, CancellationToken cancellationToken, BatchRequest batchRequest, ServiceClientOptions options) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\ServiceClients\\ServiceClient.cs:line 429\r\n' + ' at BotClientLibrary.Extensions.DeepLeoOrchestrator.UpdateConversationWithContentDescriptions(ExtensionRequest request, ExtensionResponse result, Conversation conversation, DeepLeoMessageFactory messageFactory, Message message, TuringBotConfiguration config, CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\Extensions\\DeepLeoOrchestrator.cs:line 421\r\n' + ' at 
BotClientLibrary.Extensions.DeepLeoOrchestrator.Run(ExtensionRequest request, CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\Extensions\\DeepLeoOrchestrator.cs:line 338\r\n' + ' at BotClientLibrary.Extensions.ExtensionRunner.RunExtension(ExtensionRequest request, Conversation conversation, ExtensionConfig extension, ExtensionRequestOptions customOptions, ParsedToolInvocation action, String modelName, CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\Extensions\\ExtensionRunner.cs:line 652\r\n' + ' at BotClientLibrary.Extensions.ExtensionRunner.RunExtensions(Conversation conversation, CancellationToken cancellationToken, ComponentPriority minPriority, ComponentPriority maxPriority, ExtensionRequestOptions customOptions, ExtensionRequest request, ParsedToolInvocation action, String modelName, Classification modelClassification) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\Extensions\\ExtensionRunner.cs:line 266\r\n' + ' at BotClientLibrary.BotConnection.GetContentResponsesAsync(Conversation conversation, CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\BotConnection.cs:line 700\r\n' + ' at BotClientLibrary.BotConnection.InternalRun(Conversation conversation, CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\BotConnection.cs:line 792\r\n' + ' at BotClientLibrary.BotConnection.ExecuteBotTurn(Conversation conversation, CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\BotConnection.cs:line 415\r\n' + ' --- End of inner exception stack trace ---\r\n' + ' at BotClientLibrary.BotConnection.ExecuteBotTurn(Conversation conversation, CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\BotConnection.cs:line 415\r\n' + ' at 
BotClientLibrary.BotConnection.ExecuteBotTurn(Conversation conversation, CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\BotConnection.cs:line 415\r\n' + ' at BotClientLibrary.BotConnection.Run(CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\BotClientLibrary\\BotConnection.cs:line 140\r\n' + ' at Microsoft.Falcon.TuringBot.ChatApiImplementation.Run(BaseRequest request, BaseResponse response, CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\Service\\Implementation\\ApiImplementation\\ChatApiImplementation.cs:line 164\r\n' + ' at Microsoft.Falcon.TuringBot.RequestProcessor.Run(BaseRequest request, BaseResponse response, IRequestContextInitializer contextInitializer, IRequestValidator requestValidator, IApiImplementation apiImplementation, IAsyncApiEndStep apiEndStep, String apiName, CancellationToken cancellationToken) in C:\\a\\_work\\1\\s\\services\\TuringBot\\src\\Service\\Implementation\\RequestProcessor.cs:line 232', serviceVersion: '20230905.14' } 

the message comes from this part of the code

image

any suggestions?

thanks

@Richard-Weiss
Contributor Author

Hey @matibm,
there isn't really much to go on.
Just going from what is provided, I would say that the internal process that fetches the images using imageUrl and originalImageUrl is reporting that the request is malformed in some way. For example, this happened to me when I swapped the two blobIds.
So it would be good to know the values of these parameters at the time the error occurs.
I'm also not sure I understood the sequence correctly.
Do you mean that you send an image only in the first message and only text afterwards, and the error then occurs after 10-15 messages, or do you mean something else?
I don't have much time during the week, but I can take a deeper look on the weekend.
Let me know if you have more information on the issue.

@matibm

matibm commented Sep 5, 2023

I realized that right now it doesn't work even on the official Bing Chat page when you upload an image. It's probably an internal problem, and it will work again eventually.
Thanks for replying, @Richard-Weiss. We can skip my previous comment.

@Richard-Weiss
Contributor Author

Yeah, I also tried it and had no issue at first, but after a few attempts it starts failing, and vanilla Bing retries the request a bunch of times.
But still thank you for sharing your issue. :)

@veigamann

any updates?

@Richard-Weiss
Contributor Author

@veigamann It's ready to be merged, just waiting for a review and merge.
You can fork the repo and merge it in your fork if you want to use it before it gets merged in this repo's main branch.
Bing can help you pretty well with that, if you need some assistance.

@veigamann

awesome, thanks! hope it gets merged soon 😉

veigamann added a commit to WAppAI/assistant that referenced this pull request Oct 5, 2023
using a fork of node-chatgpt-api as [this PR](waylaidwanderer/node-chatgpt-api#481) hasnt been merged yet. will revert to the main package as soon as this gets merged. this fork also implements the feat in [this PR](waylaidwanderer/node-chatgpt-api#452) but it's not yet implemented
@Richard-Weiss Richard-Weiss deleted the image_recognition branch January 21, 2024 18:40