I use Knime for re-structuring data. What is Knime? From their website:
At KNIME, we build software for fast, easy and intuitive access to advanced data science, helping organizations drive innovation.
Our KNIME Analytics Platform is the leading open solution for data-driven innovation, designed for discovering the potential hidden in data, mining for fresh insights, or predicting new futures.
It is a Free, open source analytics programme. I use its simplest processes and use it as a blunt instrument. For some methods I use it as an alternative to Excel, both in terms of basic worksheets and also VBA macros.
All of the methods have their place. I generally find Excel worksheets and formulas a bit delicate. An unintentional cut/paste can throw calculations out quite dramatically, such that you need to continually do a QA (quality assurance) check on previous steps. I never quite trust them, as I do not use Excel everyday and am poor at leaving comments on what I’m doing, so revisiting the sheets makes me nervous of their robustness.
Knime allows you to check output at different nodes, so that its quite easy, after you have a basic process operating, to do QA checks by checking data output at each node. So when you change input datasets you can verify that output is what you expect, and if its not easily track back to a specific node condition that may not have filtered sufficiently.
For test data I usually use text, csv or excel files as input data, this mainly because I have had great difficulty connecting to Databases. But that is a pain as most of my data resides in databases. So for initial steps I have had to export the tables to csv, then process it.
Now my data resides in different databases:
- For Data Capture with RTV Reporter Pro, this is linked to MS SQL Express 2012 this sits on my PC.
- For OpenMaint, this is linked to Postgres. This is on my PC an also my VPS and Amazon AWS ECS instance
- For WordPress, this is linked to MySQL, this is on a hosted webserver and my VPS.
Now, taking data from one SQL database and transferring it to another is not easy. I think the easiest way is to pay for a migration mapper (I have not found any free ones to date). So best if you can read the data by connecting to the original data source.
Within each environment you can create a backup and restore it to a different SQL instance somewhere else. So, for example, I can take a backup of the postgres instance on the amazon vps and restore it on the instance of my PC.
So linking database to Knime you need Database drivers that are appropriate for the Database type.
Some of these drivers are preloaded and in the knime setup. To find where they reside go to:
If you go to this page on their website you will find links to where to go to find the different drivers:
When you are looking to get drivers be aware , if you are in the windows environment whether you are running a 32 or 64 bit version of the database. I am running a 64 bit version of everything if I am able to (although I’m not sure about MS Office). There are different drivers for the 32/64 bit versions. Make sure you pick the correct one otherwise there is a world of pain ahead of you.
Download the drivers of your chosen Database system and install them in Knime.
Once installed,. you can then go to a new Knime workflow and add a Database Reader Node.
In the above example I have chosen to set up the MS SQL database connection.
The major issue with this is the Windows authentication. I have had massive problems getting this to work. I spent 2 days trying to get it to work with code on the Database URL such as:
So where the MS SQL needs:
& loading about 10 different drivers for MS SQL based on the Java SDK etc but I just cannot get it to work. It just does not accept the integrated security of the database which is based on Windows Authentication.
MS SQL Express
When I originally setup the MS SQL Express 2012 instance I has windows authentication only. When you do the preliminary setup there is a password, although you do not use it later. The original password is required if you want to alter anything in your setup later. So if you want to have SQL Server Authentication (user login & password ) you need the original password to allow this to happen.
NOW! In MS SQL Express you can have “windows authentication” or “windows authentication & SQL Server Authentication” (ie both). But if you want both, you need the original password. So I created a new Database instance with dual login. A learning experience (GRRRRR!!!).
Also once I’d made the new instance, I copied the specific database across from one instance to the other, and then had to give permissions for accessing that database
So that when you connect via “SQL Server Authentication” that that login has authority to access that particular database. Test in Server Management to see it works before testing it in KNIME.
So with my situation, with JRE 1.8 (Java 64 bit) and Microsoft JDBC Driver 6.2 for SQL Server the following Database URL works:
with login and password.
So, once I’d changed from “integrated security” I found that there was no issue with connecting to the database.
This one , I’d just use the driver that came with KNIME. It worked, out of the box for my instance of Postgres on my PC.
Two Items I’d like to comment on:
1. URL: jdbc:postgresql://localhost:5432/om (/database works)
2. In SQL statement, table needs to be in “” quotes.
Not got this to work yet. The only instance I have running is on my VPS.
I will have to see if I can get this to run somehow, or else make a MySQL instance on my PC and transfer the database across to see if that works.