Azure Computer Vision project test

At the all-day presentations on MS Azure AI they gave out a voucher for 7 days of trying out the services.

I am interested to see if I can get something up and running, so I'm going to start with Computer Vision. Of course, this is not as easy as the demos make it look.

Project Idea

I have recently been playing with GlideApps and found that Computer Vision boils down to a call to an API, so I thought I could build a simple app where you take a picture on your phone camera, upload it, and get some information back from the photo. So that is what I'm setting out to do.

The first step is to spin up an Azure instance with all the bits required to run the Computer Vision process, then figure out the API call so that I can add it to the app.

Step 1- Setting up Azure Computer Vision to run

I'm not going to go through the "create environment" and "get endpoint and keys" steps; there are a few videos out there that cover those, e.g.:

Step 2- Testing operation- remote image from URL

After setting up an Azure account and selecting the service "Computer Vision API – v2.0" with the "Describe Image" operation, you can give it the URL of an image and it will give you a response:
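Under the hood, "Describe Image" is just an HTTP POST. Below is a rough sketch of what the call and a (trimmed) response look like; the endpoint region, the key, and the sample response values are placeholders for illustration, not my actual resource:

```python
import json
import urllib.request

# Placeholder values -- substitute your own resource's endpoint and key.
ENDPOINT = "https://westeurope.api.cognitive.microsoft.com"
KEY = "YOUR_SUBSCRIPTION_KEY"

def describe_image(image_url):
    """POST an image URL to the Describe Image operation (v2.0)."""
    req = urllib.request.Request(
        ENDPOINT + "/vision/v2.0/describe?maxCandidates=1",
        data=json.dumps({"url": image_url}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": KEY,
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# A trimmed example of the shape of the response (invented values):
sample = {
    "description": {
        "tags": ["outdoor", "building"],
        "captions": [{"text": "a large building", "confidence": 0.87}],
    }
}
caption = sample["description"]["captions"][0]
print(caption["text"], caption["confidence"])
```

The important bit that the Azure test console hides from you is the two headers: the key goes in `Ocp-Apim-Subscription-Key`, and the body must be JSON with a `url` field.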

Step 3- Trying the API call in Postman (no success)

So that works in the Azure environment. I then tried to take those settings across to Postman to test, with no luck at all.

Step 4- Trying to use the web page image code from the Azure quickstart

After playing around with that for a while with no success, I went back to YouTube and found this video:

This links to this quickstart page with HTML code:

I tried using it but had a few errors and couldn't get it to work, so that was some more time wasted. If you look in the comments section you will see one from me describing the issues I had; on top of those, it also needed the character set defined in the header: <meta charset="UTF-8">

See below for how far I got, but it still errors.

Text Recognition Sample

Read text from image: enter the URL to an image of text, then click the Read image button. (The embedded demo page has "Image to read:" and "Source image:" fields, but the call errors out.)

Step 5- Tried the OCR web reader, with some success (Computer Vision API – v2.0 – OCR)

I found this text on the web (here) and tried running it through the OCR reader. It tried to translate the image into words rather than characters, so I got a result, but not a particularly good one. It also doesn't give probabilities of accuracy (or maybe it does, but I haven't found the parameters for them yet).
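For reference, the OCR operation is the same style of POST, just to a different path with a couple of query parameters. This is only a sketch of how the request is assembled (the endpoint region, key, and image URL are placeholders); nothing is actually sent here:

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://westeurope.api.cognitive.microsoft.com"  # placeholder region
KEY = "YOUR_SUBSCRIPTION_KEY"  # placeholder

# Optional query parameters: "unk" lets the service auto-detect the language.
params = urllib.parse.urlencode({"language": "unk", "detectOrientation": "true"})

req = urllib.request.Request(
    ENDPOINT + "/vision/v2.0/ocr?" + params,
    data=json.dumps({"url": "https://example.com/sign.jpg"}).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": KEY,
    },
    method="POST",
)
print(req.full_url)
```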

So a partial success.

Back to Postman, I'm not sure why!

I went back to Postman and finally got a request working, though I'm not quite sure why I need Postman, since this is only a test; I still need to bind the call into something that works somewhere. It lets you import a request, but I can't find an export.

So maybe I can rewrite the POST request as an AJAX or PHP call, but after perusing a few videos it does not look like I can get the Google Sheets API to do it for me.

So, back to what you can do with the Computer Vision API: there are a few operations, but nothing that leaps out as useful. The only standout amusing one I saw demonstrated was it trying to work out the age of a person based on their face. You can also find out if they are smiling, sad, etc., but I'm not sure what you'd use that for.

It does extract text from an image, but only as a JSON file, so each word comes back separately and you have to string all the words together line by line, in an iteration loop, since you do not know in advance how many words or lines there will be. It's easier to use the online OCR that I use already, or Google Keep. It does apparently handle handwriting, but I have no great demand for that.
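The iteration loop is straightforward once you know the nesting: the v2.0 OCR JSON goes regions → lines → words. A small sketch (the sample JSON values below are invented for illustration):

```python
# Sample of the nested shape the OCR operation returns (invented values):
ocr_result = {
    "language": "en",
    "regions": [
        {
            "lines": [
                {"words": [{"text": "Computer"}, {"text": "Vision"}]},
                {"words": [{"text": "test"}]},
            ]
        }
    ],
}

def ocr_to_text(result):
    """Join words into lines, and lines into a block of text."""
    lines = []
    for region in result.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(word["text"] for word in line["words"]))
    return "\n".join(lines)

print(ocr_to_text(ocr_result))
```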

End comment

Nice to have a play, but to get what I want I would need to make an HTTP POST request, get the information back as JSON, strip out the text, and somehow put it into Google Sheets. Each step would be a challenge.

As mentioned in the other post on Azure, if you are in the MS ecosystem then you can join the dots more easily, but it does not extend too readily to outside services. I have been doing a lot of research on HTTP POST requests and how to set them up: a useful bit of learning for interrogating APIs, but a bit tiresome nevertheless. Postman was handy for that.

For a simple play with Cognitive Services, I don't think I got very far. I can see a use case for my daughter's project, but I would have to spec and build it on the platform and pay for it. For simple playing, the services do things, but I cannot get the information to where I want it. I don't think it's as easy as they portray.

So although I have a few more free bits to play with, thank you but no thank you.