Knime 3. API access and workflows


In Knime you can go to the EXAMPLES folder on the Knime Server in the KNIME EXPLORER tab and open, or save to your local workspace, some example workflows to play with. You may need to obtain API keys for some of the examples.

I tested the Translate using Google API example. I needed a Google Translate API key; this article explains how to obtain one. Note that it is apparently a paid API, so beware of using it too much.

Anyway, the workflow works and you get the output. That is fine, but it's not the data I want. So, how do you set up something for your own data?

In my case I look for an example with a process similar to what I want. In the Examples, the “Data API using the new REST nodes” workflow looks like what I want (screenshot below).

Just a note on those nodes: the GET requests work, and they call:

http://apps.who.int/gho/athena/data/GHO/MORT_400.xml?profile=simple&filter=AGEGROUP:*;MGHEREG:WBDCP_HIINCOME;GHECAUSES:*;SEX:*

but when you paste this into a browser it converts the http URL to https

https://apps.who.int/gho/athena/data/GHO/MORT_400.xml?profile=simple&filter=AGEGROUP:*;MGHEREG:WBDCP_HIINCOME;GHECAUSES:*;SEX:*

and gives you a 404 Not Found error. But it works in Knime. Weird!
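If you want to poke at this outside Knime, a quick sanity check is a small script that hits both the http and https versions of the URL and prints the status codes and any redirects. This is just a rough Python sketch (using the requests package), not part of the Knime workflow:

```python
# Rough check: compare the http and https responses for the WHO GHO URL.
# Requires the third-party "requests" package (pip install requests).
import requests

URL = ("apps.who.int/gho/athena/data/GHO/MORT_400.xml"
       "?profile=simple&filter=AGEGROUP:*;MGHEREG:WBDCP_HIINCOME;GHECAUSES:*;SEX:*")

for scheme in ("http", "https"):
    resp = requests.get(f"{scheme}://{URL}", timeout=30)
    # resp.history lists any redirects that were followed (e.g. http -> https)
    print(scheme, resp.status_code, [r.status_code for r in resp.history])
```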

The top process (1) calls one URL and the bottom example (2) calls a number of URLs from a table.

I am interested in calling from a number of URLs. I will look at using the Guardian API, which will let me create multiple queries to test. I would also like to put the URLs in an Excel file and have that read into the GET request.

For the single instance I will use the Stats NZ data. At this point I am not bothered about the graphics at the end, but about:

1/ Getting the data via a GET request

2/ Restructuring the data so that it is tabulated and structured, ready to export to a CSV or Excel file.

Process

First, I will copy the 2 workflows in the above example.

Then I will modify the single GET node to use the Stats NZ information. I will take the test information I used in Postman and copy it across to the Knime node: both the URL (with parameters)

and the Request Header API key.

Then I will right-click the node and Execute it (F7) to test it.

Then I will right-click the node again to view the results and see what the output is. In this example it is one cell of JSON output.

So, good. We have the expected output.   
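As a side note, outside Knime the same single request can be sketched in a few lines of Python. The endpoint, parameters and header name below are placeholders rather than the real Stats NZ details; use whatever you tested in Postman:

```python
# Rough sketch of what the single GET Request node does.
# URL, parameters and header name are placeholders -- copy your Postman values.
import requests

URL = "https://api.example.govt.nz/data"                  # hypothetical endpoint
PARAMS = {"resource": "example-dataset"}                  # hypothetical parameters
HEADERS = {"Ocp-Apim-Subscription-Key": "YOUR_API_KEY"}   # header name may differ

resp = requests.get(URL, params=PARAMS, headers=HEADERS, timeout=30)
resp.raise_for_status()
print(resp.json())   # one cell of JSON output, as in the Knime table
```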

With the list in Excel

I have tested the Guardian GET requests in Postman and then cut/pasted the URLs into an Excel sheet.

I then connected the Excel sheet to the GET Request node (note: the first time it didn't connect, so I added a Table Writer node and stored the output to a file, then used a Table Reader node to read that output back in; a bit convoluted, but it worked).

The second time around it connected (I think I may not have executed the Excel Reader node the first time, so the GET Request node could not find the URLs column).

So in the Excel Reader setup, after choosing the file, tick the option to use the table's first row as column headers (item 4) and refresh the preview (item 5) to see what the table will look like. Then execute the Excel Reader node and connect it to the GET Request node.
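For comparison, the same "URLs from a spreadsheet" loop can be sketched in Python with pandas and requests. The file name, the URL column name and the header name are assumptions; match them to your own sheet and API key setup:

```python
# Rough sketch of the Excel Reader -> GET Request pattern.
# File name, "URL" column and header name are assumptions.
import pandas as pd
import requests

urls = pd.read_excel("guardian_queries.xlsx")        # first row used as column headers
headers = {"api-key": "YOUR_GUARDIAN_API_KEY"}       # or pass the key as a query parameter

results = []
for url in urls["URL"]:
    resp = requests.get(url, headers=headers, timeout=30)
    results.append({"url": url, "status": resp.status_code, "body": resp.text})

print(pd.DataFrame(results)[["url", "status"]])
```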

As you can see, after running the GET Request node there were errors. The errors are shown in the console (items 10, 11 & 12 in the image below). It says illegal character at index 130. I hate counting up characters to find index 130, so after doing this a few times I found this site for counting string lengths. It identified the problem as a whitespace, so you need to replace the whitespace with %20 and then the string works fine in Knime.
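If you want to do the %20 replacement programmatically rather than by hand, Python's urllib.parse.quote does the encoding; the example query below is made up:

```python
# Percent-encode a URL so stray whitespace becomes %20
# instead of triggering "illegal character at index ..." errors.
from urllib.parse import quote

raw = "https://content.guardianapis.com/search?q=new zealand"  # made-up query
safe = quote(raw, safe=":/?&=")   # keep the URL structure, encode everything else
print(safe)                       # ...search?q=new%20zealand
```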

The GET Request node is configured to use the URLs column; also go to the Request Headers tab and put in your API key.

The GET Request output table shows a JSON column with the data returned for each row of URL requests.

Configuring Output from GET Requests

So we now have the request data in Knime. The next part is to arrange the data in a useful manner.

For the next part I will jump straight to converting the JSON to XML. The reason is that I have played around with the JSON export previously but could not get the JSON reshaped into tabular data that I could put into a CSV or Excel table. If you can, well done.

So I take the JSON data, send it to a JSON to XML node, and then use an XPath node. You can configure the XPath node to selectively take data from the XML.

In the JSON to XML node, although you can append an XML column, I find it easier just to replace the JSON column.

With the XPath node you need to configure what information you want to extract from the XML. You use Add XPath and configure it to what you want to extract. If you have multiple groups of data (in this case the <item> data) you can iterate through them using [*]. If you only want, say, an attribute such as IDCATEGORY_Name from the first <item> group, then you'd use [1] (XPath positions start at 1, unlike JavaScript's 0-based indexing).

So you do the above for all the data that you want to pull out of the GET request in a structured manner, writing an XPath query for each column of data that you want.
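To make the [*] versus [1] idea concrete, here is a small Python sketch using lxml on a made-up XML snippet shaped like the <item> groups above (the element names are assumptions, not the real Stats NZ structure):

```python
# XPath on a made-up <item> structure; element names are assumptions.
from lxml import etree

xml = b"""
<root>
  <item><IDCATEGORY_Name>Age</IDCATEGORY_Name><Value>10</Value></item>
  <item><IDCATEGORY_Name>Sex</IDCATEGORY_Name><Value>20</Value></item>
</root>
"""
doc = etree.fromstring(xml)

# Every <item> group (the "[*]" style iteration):
print(doc.xpath("//item/IDCATEGORY_Name/text()"))     # ['Age', 'Sex']

# Only the first <item> group -- XPath positions start at 1, not 0:
print(doc.xpath("//item[1]/IDCATEGORY_Name/text()"))  # ['Age']
```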

So the process for the Stats NZ data is as follows:

Output to Excel file:

Using a pivot table to display the data (similar to the Stats NZ site). I have noticed that there is a Pivoting node and a database pivot node too, so the pivoting could probably have been done in Knime.
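For reference, the same pivot could also be done in pandas instead of Excel or the Knime Pivoting node. The file and column names below are assumptions; substitute the columns your XPath queries produced:

```python
# Pivot the exported table; file and column names are assumptions.
import pandas as pd

df = pd.read_excel("statsnz_output.xlsx")       # the Excel file written above (assumed name)
pivot = pd.pivot_table(df,
                       index="IDCATEGORY_Name", # rows
                       columns="Period",        # columns
                       values="Value",          # cell values
                       aggfunc="sum")
pivot.to_excel("statsnz_pivot.xlsx")
print(pivot)
```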

Also, instead of pushing to Excel, I could have pushed the data to a database, or looked at some sort of graphical display such as a bar chart.

End thoughts

When I started playing with APIs, this is where I began: playing with Knime and trying to process the data. Unfortunately I did not get too far at the time.

This time around, with the help of the Postman app, I found it a lot easier to set up and use.

I do like Knime; once you have the workflow set up you just need to hook your source data to it and it usually runs pretty smoothly.

The next part will be to look at some graphical output next time around, and also to explore some other APIs to see what data is out there to gather and play with.
