Category: Data Analysis

How to edit MetaData in Images for Building Survey Information

        I had a thought of embedding metadata into images after a building survey. This would capture the surveyor's comments on the survey and store them in the actual image files, so the next time a photo is inspected in the office, the information on that image is easy to find. Rather than looking for 2

Weka 2. Weka Machine Learning “Explorer” alternative interfaces “Experimenter”, “Knowledge Flow” & “Command Line”.

        Following on from the first Weka post, which was based on information gleaned from the Data Mining with Weka course that I followed, this post is based on the More Data Mining with Weka videos. Some of the screenshots below are from the videos, which have been developed and are presented by Ian Witten of

Free WEKA machine learning algorithms for data mining tool

          In exploring data analytics tools (KNIME, RapidMiner, FME, Orange...) there have been references to WEKA. Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression,

Regular Expressions/Regex for data cleaning

        A regular expression, regex or regexp is, in theoretical computer science and formal language theory, a sequence of characters that defines a search pattern. Usually this pattern is then used by string searching algorithms for “find” or “find and replace” operations on strings, or for input validation. From Wikipedia.
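As a rough illustration of the three uses mentioned above (“find”, “find and replace”, and input validation), here is a minimal sketch using Python's standard `re` module. The function names and the sample survey-style strings are hypothetical and chosen only for illustration; they are not taken from the original post.

```python
import re


def find_numbers(text: str) -> list[str]:
    """'Find': return every integer or decimal number in the text, as strings."""
    return re.findall(r"\d+(?:\.\d+)?", text)


def normalise_whitespace(text: str) -> str:
    """'Find and replace': collapse runs of whitespace to one space, then trim."""
    return re.sub(r"\s+", " ", text).strip()


def is_iso_date(text: str) -> bool:
    """Input validation: True only if the whole string looks like YYYY-MM-DD.

    This checks the shape of the string only, not whether the date exists.
    """
    return re.fullmatch(r"\d{4}-\d{2}-\d{2}", text) is not None


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(find_numbers("Wall 3.5 m high, 12 windows"))
    print(normalise_whitespace("  Room   101 \t north  "))
    print(is_iso_date("2019-03-07"), is_iso_date("07/03/2019"))
```

In a data cleaning workflow, small functions like these would typically be applied column by column before the data is loaded into an analysis tool.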

RapidMiner Studio free Data Science Tool

        It provides a wealth of functionality to speed & optimize data exploration, blending & cleansing tasks, reducing the time spent importing and wrangling your data. RapidMiner provides an integrated environment for data preparation, machine learning, deep learning, text mining, and predictive analytics. RapidMiner Studio (for some information, see item 18 of the list).  This programme keeps

Orange 3. Text Mining basic exploration

        A few words of jargon in the Text Mining area. Corpus. In linguistics, a corpus or text corpus is a large and structured set of texts. Corpora are used to do statistical analysis and hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory. Token. Tokenization is the process of demarcating and

Orange free Data Mining Tool

         I was looking through the 101-useful-websites article and came across AlternativeTo.net, which I used to look up alternatives to, say, “Revit” and “AutoCAD” and other tools I use. I then typed in KNIME, which I use for data mining and data analysis, and it came up with Orange as a free alternative.  So I looked at