The following are changes to the VOSON System (since version 0.5.13.0) that might affect users.
version 0.6.0.1 - 28Jul2011
- the Bing Search API is now used for finding inbound hyperlinks and also for keyword searching for seed sites
version 0.6.0.0 - 22Jul2011
- improved dialog for coding node attributes
version 0.5.17.10 - 17Jul2011
- crawler can be set to find inbound hyperlinks to internal pages discovered during the crawl (not just the seed URL)
- there is now a basic accounting system for use of the crawler. Users are assigned a certain number of "web crawl unit" (WCU) credits per month, and these get used up during crawls. You can see your current WCU credit at the Uberlink website and also by selecting Info->User in the VOSON System.
- on clicking the "ready to crawl" checkbox, an alert box showing the estimated WCU for the crawl and your available WCU credit is displayed. If you do not have enough WCU credit for the crawl, you will not be able to check the "ready to crawl" checkbox, and your crawl will not run. You can change the parameters of the crawl (e.g. number of seed sites, depth of crawl) to fit within your WCU credit.
version 0.5.17.7 - 05Jul2011
- bug fix relating to pagegrouping: now if you create a pagegroup such as "green.org", it will not include sites like "livinggreen.org".
- now you can change webminer parameters in the "add seed sites" dialog (previously you could only change them at the time when you create a new voson database).
- there is a "ready to crawl" checkbox in the "Add seed sites" dialog - check this once you have finished entering the seed sites and you are ready for the crawler to run (it will run after a delay which depends on how many other jobs are in the queue).
- every time the crawler runs, three databases are automatically created. If your voson database is called "test", on completion of the crawler the following will be created: testAN (voson-analysis database [nodes are pagegroups rather than pages]), testSeedsAN (subnetwork of just the seed sites), testSeedsImpAN (subnetwork of the seed sites plus "important" sites [degree of 2 or greater]).
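The pagegrouping fix listed under version 0.5.17.7 can be illustrated with a small sketch. The release notes do not say how VOSON matches a hostname to a pagegroup, so the rule below (match the domain itself or a subdomain at a "." boundary) and the function name in_pagegroup are assumptions made for illustration only.

```python
def in_pagegroup(hostname: str, pagegroup: str) -> bool:
    """Return True if hostname is the pagegroup domain or a subdomain of it.

    Hypothetical helper: not VOSON's actual code. Matching on a "." boundary
    is what keeps "livinggreen.org" out of the "green.org" pagegroup.
    """
    host = hostname.lower().rstrip(".")
    group = pagegroup.lower().rstrip(".")
    return host == group or host.endswith("." + group)


if __name__ == "__main__":
    for name in ("green.org", "www.green.org", "livinggreen.org"):
        print(name, in_pagegroup(name, "green.org"))
```

A plain substring or suffix test (for example, hostname.endswith("green.org")) would wrongly accept "livinggreen.org", which is the kind of behaviour the fix describes.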
version 0.5.17.5 - 01Jul2011
- you can now control the depth of the crawler, i.e. the level it crawls to within a given seed site. The seed URL is assumed to be level 1. If you set the "depth of crawl (level)" parameter to 1 (the default) then only the seed URL (typically, the homepage) will be crawled, while if you set this parameter to 2 then the seed URL and the internal pages that it links to will all be crawled (assuming other webminer parameters don't stop the crawler beforehand). There is another webminer parameter, "depth of crawl (pages)", which has been available in VOSON for a long time and sets the total number of pages to be crawled within a given site.
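The way the two depth parameters interact can be sketched as a breadth-first crawl over an in-memory link map. This is only an illustration under stated assumptions: the function crawl_site, its arguments max_level and max_pages, and the toy link map are hypothetical and do not reflect how the VOSON webminer is implemented.

```python
from collections import deque


def crawl_site(seed, links, max_level=1, max_pages=None):
    """Return the pages visited, in breadth-first order.

    seed      -- the seed URL, treated as level 1
    links     -- dict mapping a page to the internal pages it links to
    max_level -- stands in for "depth of crawl (level)"
    max_pages -- stands in for "depth of crawl (pages)"; None means no limit
    """
    level_of = {seed: 1}
    queue = deque([seed])
    crawled = []
    while queue:
        if max_pages is not None and len(crawled) >= max_pages:
            break  # page budget for this site is used up
        page = queue.popleft()
        crawled.append(page)
        if level_of[page] >= max_level:
            continue  # do not follow links beyond the level limit
        for target in links.get(page, []):
            if target not in level_of:
                level_of[target] = level_of[page] + 1
                queue.append(target)
    return crawled


if __name__ == "__main__":
    site = {"home": ["a", "b"], "a": ["c"], "b": [], "c": []}
    print(crawl_site("home", site, max_level=1))
    print(crawl_site("home", site, max_level=2))
    print(crawl_site("home", site, max_level=3, max_pages=2))
```

With max_level=1 only the seed page is visited; with max_level=2 the seed and the pages it links to are visited; the page limit can stop the crawl before the level limit is reached.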
version 0.5.17.2 - 14Jun2011
- non-ascii text (e.g. characters from Français and 日本語) should now render properly in the Show Databases dialog, DataBrowser and Text Crosstabs.
version 0.5.17.0 - 13Apr2011
- all menu items are now visible but are enabled/disabled depending on whether a voson or voson-analysis database is loaded.
- you can now download GraphML network files.
- non-ascii characters in the database comment field (e.g. characters from Français and 日本語) will now render properly.
- database comment shows in "Current Database" dialog.
version 0.5.13.6 - 09Dec2009
- you can choose whether you want the text content collected from seed sites to be parsed (using the perl module Lingua::LinkParser), i.e. the extraction of words and word-pairs (colocated words). If you are not going to do any text analysis, it is good to uncheck the text parser option checkbox (in the "Create voson database" window), since parsing the text makes your crawl take longer to complete.
- when you create a voson database, a "default" voson-analysis database called xAN (where x is the name of the voson database) is automatically created. Every time the crawler runs, this "default" voson-analysis database is automatically re-created. This saves you from having to create the voson-analysis database every time you run the crawler (i.e. if you keep adding new seed sites).
- there is now a single window for adding seed sites (previously there were two windows, and two menu items). So from the "Add seed sites to voson database" window, you can either add seed sites you have typed in yourself or you can do a keyword search (via Yahoo) for seed sites.
- a bug that was preventing text content (body words, and word-pairs) from being extracted from the crawled webpages was fixed. If you find that body words and word-pairs are missing in your database, even though you set the webminer to parse the text content using Lingua::LinkParser, and you need words/word-pairs for your research, contact us and we will attempt to repair the database.
version 0.5.13.4 - 01Dec2009
- you previously had to ensure there was a trailing "/" on the URL if the URL was for a directory rather than a specific page (i.e. http://voson.anu.edu.au/ was legitimate, but http://voson.anu.edu.au was not). This is no longer the case. Now if the URL is a hostname only (e.g. http://voson.anu.edu.au), then the trailing "/" is automatically added, but otherwise it is not.
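The trailing "/" rule can be sketched with Python's standard urllib.parse module. This is a minimal illustration, not the VOSON System's own code: the function name normalize_seed_url is hypothetical, and treating a URL with an empty path, no query and no fragment as "hostname only" is an assumption about what the rule means.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_seed_url(url: str) -> str:
    """Add a trailing "/" only when the URL consists of a hostname alone."""
    parts = urlsplit(url)
    hostname_only = (
        parts.netloc != ""
        and parts.path == ""
        and parts.query == ""
        and parts.fragment == ""
    )
    if hostname_only:
        return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))
    return url  # any other URL is left exactly as entered


if __name__ == "__main__":
    for seed in (
        "http://voson.anu.edu.au",
        "http://voson.anu.edu.au/",
        "http://voson.anu.edu.au/papers",
    ):
        print(seed, "->", normalize_seed_url(seed))
```

Only the first of the three example URLs is changed; the other two already have a path and are returned unchanged.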
version 0.5.13.3 - 29Nov2009
- when entering seed sites, you no longer need to include the "http://" at the start of the URL - the VOSON System will add this automatically
- you now have two ways of entering seed sites: "manually input seeds" (this is where you already know the URLs) and "search for seeds". The latter does a Yahoo search on a particular search term, returning the first 100 URLs matching the query. So, if you use the search query "climate change" you will get the first 100 pages returned by Yahoo. You can then input these as seeds (you can edit or delete any irrelevant URLs first).
version 0.5.13.2 - 24Nov2009
- now you can control the webminer parameters e.g. choose whether to collect inbound links and/or outbound links, set the depth of the crawl etc. You set these parameters when you create a new voson database.
version 0.5.13.1 - 21Nov2009
- ugly navigation arrows (for panning, zooming etc.) have been removed from the maps
- now you can enter multiple seed sites at the one time (in the "Add seed site" window, place each seed on a separate line)