Web page capture and DMS

In the last article, Revit links- Relative & URL, I looked at linking to data with URLs, where the target is either an external file or a web page. In one video the presenter showed Revit links to manufacturers’ web pages, which is useful in the construction phase of a project. If you embed the links in the data at the design stage, the construction phase starts shortly after, so the links should all still be valid during the building phase.

But what about the Facility Management/operation phase? If the building lasts 50 years, those links will most probably become invalid, yet for the operational and maintenance life of the building that information can be invaluable.

Link rot and the average lifespan of a website are a couple of ways of looking at how long websites will exist. It was brought home to me when I was looking up a sliding door by Fletcher Aluminium, who then rebranded to Altus (I think). Their trade literature may still link to the old URLs, with those links forwarded to the new site. This may work through a couple of iterations, but somewhere in the process a link will break.
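One way to keep an eye on link rot is to re-check stored links from time to time. The following is a minimal Python sketch, not a finished tool. The function name `check_link` and the status strings are made up for illustration, and it assumes the server answers HEAD requests, which not every site does.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def check_link(url, opener=urlopen):
    """Return a short status string for one stored link.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    function can be exercised without touching the network.
    """
    request = Request(url, method="HEAD")  # headers only, no page body
    try:
        with opener(request, timeout=15) as response:
            final_url = response.geturl()  # where redirects ended up
            if final_url != url:
                return "redirected to " + final_url
            return "ok"
    except HTTPError as err:
        # The server answered, but with an error such as 404 or 410.
        return f"broken (HTTP {err.code})"
    except URLError as err:
        # No answer at all: the domain is gone, DNS failed, and so on.
        return f"unreachable ({err.reason})"
```

A "redirected" result is the early warning: the link still works today, but only because the forwarding is still in place.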

So how do you get valid technical data on building elements that you can keep? In the last article I was trying to use a relative reference from a PDF file: a folder holds the PDF, and the PDF links to a file in a sub-directory of that same folder. I was having difficulties getting this to work.

What got me thinking about that was this article by the U.S. General Services Administration. They have a specific file structure for project standards: you put the model in the top directory and then link to all the sub-folders for information, as shown below. A lot of the sub-directories have multiple folders in them as well.

That is great for the Revit file and cross referencing, but what if you have PDFs? It also says you should use relative rather than absolute paths, so it is easy to copy the directory to other computers and have the links still work. Another aspect of the file structure is that it has to be replicated for the rest of the design/construction team, otherwise the links won’t work, so the other team members have to put the data in their specific areas of the folder tree.
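To show why relative paths survive a copy to another computer, here is a small Python sketch. The folder and file names in the usage note are hypothetical and are not taken from the GSA structure; the two helper functions are just for illustration.

```python
import os
from pathlib import Path


def to_relative_link(document: Path, target: Path) -> str:
    """Express `target` relative to the folder that holds `document`."""
    relative = os.path.relpath(target, start=document.parent)
    return Path(relative).as_posix()  # forward slashes on every platform


def resolve_link(document: Path, relative_link: str) -> Path:
    """Turn a stored relative link back into a full path on this machine."""
    return (document.parent / relative_link).resolve()
```

For a model at `project/model/building.rvt` and a data sheet at `project/docs/door.pdf`, the stored link is `../docs/door.pdf`. Copy the whole `project` folder somewhere else and the same stored link resolves to the copied data sheet, which an absolute path would not do.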

The implication is that the data is in a file, so you are not linking to an absolute URL on the web. So if you are linking to a product manufacturer’s web page, you need to download that page and store it in the folder structure as files.
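As a rough idea of what "download that page and store it" could look like when scripted, here is a minimal Python sketch. The function name `save_page` and its parameters are assumptions for illustration. It only fetches the HTML itself, not the images, style sheets or scripts the page refers to, so the saved copy may look quite bare.

```python
from pathlib import Path
from urllib.request import urlopen


def save_page(url: str, dest_folder: Path, filename: str) -> Path:
    """Fetch one URL and write the raw response into the project folder tree."""
    dest_folder.mkdir(parents=True, exist_ok=True)  # create sub-folders if needed
    destination = dest_folder / filename
    with urlopen(url, timeout=30) as response:
        destination.write_bytes(response.read())  # HTML only, no linked assets
    return destination
```

A link in the model can then point at the saved file with a relative path instead of at the manufacturer's live URL.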

An interesting article on different methods of saving web pages is The Best Tools for Saving Web Pages, Forever, which has a nice diagram.

image taken from this article: https://www.labnol.org/internet/archive-web-pages/20192/

How do you save web pages?

The simplest method is when you have a browser open with a tab on the page you want: choose File>SavePageAs. This saves the HTML file to your computer, along with a folder of the same name holding all the associated files, such as .js and .css files.

The top directory has the web page file, but it also creates a sub-directory with a lot more files. For that one web page there are 86 files in the sub-directory.

So, you can quickly get a lot of bloat if you download web pages using the File>SavePageAs method.
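If you want to put a number on that bloat, a few lines of Python will count the files and add up their sizes. This is only a sketch, and `folder_stats` is a name chosen for illustration.

```python
from pathlib import Path
from typing import Tuple


def folder_stats(folder: Path) -> Tuple[int, int]:
    """Return (number of files, total size in bytes) for a folder and its sub-folders."""
    files = [path for path in folder.rglob("*") if path.is_file()]
    total_bytes = sum(path.stat().st_size for path in files)
    return len(files), total_bytes
```

Pointing it at the sub-directory the browser creates next to the saved page gives the file count and the disk space that one page costs.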

Another method is to save it as a PDF using the print function in your browser. This is the web page:

This is the normal print file for that tab. There are 4 pages and this is the first page:

If you use the simplify button, it displays the content a bit more rationally:

The one below is from a Firefox PDF print add-in. It looks a lot more like the actual web page, but is split into 6 pages. To do formatting you have to buy the plugin so you can set up the page and other features. There are other plugins; I haven’t tested them yet.

None of these methods are very pretty. I will need to do a bit more research on this.

How do you manage files and web pages?

I went to a Wossat talk where a university PhD student was using Zotero to capture information for his thesis. It is a free tool that shows links to sites and files for reference in bibliographies. I downloaded it and had a wee play with it. It captures links and you can add notes, but it is really only a referencing tool.

When I was looking at OpenMaint, it had a document management server plugged into it called Alfresco, which has a free community version. You have to set it up as a server, which I did on my VPS, and I found it quite a useful tool: powerful, but with a learning curve attached. I was focused on using it with OpenMaint, and within it you can store your actual documents. So it is a good repository for the information you’d gather over a project.

End comment

I wanted to do a quick post on collecting data for a Facility Management project. Data that gets created at construction stage should be accessible throughout the life of the building, so methods to capture transitory information are worth looking at.

I tried using OneNote Clipper and the Chrome Save to Drive; both create images, which are not that useful. The one method I thought was quite good was the archive.org/web method, which saves a copy of that URL to its server so that it can be accessed later. I couldn’t access the alternative site in the article, but it would have been handy for downloading zip files to store.
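The archive.org route can also be scripted. The sketch below builds the two Wayback Machine addresses I am aware of: the "save" address, which asks the archive to capture a page, and the availability lookup, which reports the closest stored snapshot as JSON. The function names are made up for illustration, and the service's behaviour and rate limits are outside your control and may change.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def save_request_url(page_url: str) -> str:
    """Address that asks the Wayback Machine to capture `page_url`."""
    return "https://web.archive.org/save/" + page_url


def availability_url(page_url: str) -> str:
    """Address of the availability lookup for `page_url`."""
    return "https://archive.org/wayback/available?" + urlencode({"url": page_url})


def closest_snapshot(api_response: dict) -> Optional[str]:
    """Pull the snapshot URL out of an availability response, if there is one."""
    closest = api_response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(page_url: str) -> Optional[str]:
    """Ask the archive (over the network) for the closest stored copy."""
    with urlopen(availability_url(page_url), timeout=30) as response:
        return closest_snapshot(json.load(response))
```

Storing the snapshot URL alongside the original link gives the facility management team a fallback when the manufacturer's page eventually disappears.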
