Wednesday, 18 March 2009
Meeting With Mike
Today I received some fairly major news regarding the project. The IP rights issues are progressing, but aspects of the project have changed. The indexing part has been pretty much scrapped, but the data mining part is still there. The idea now is to routinely crawl through the help file document set and identify which bits have changed since the last crawl; the changes are then output as an RSS feed. This means the direction of the project has changed: various differencing applications will need to be investigated, filters for input and output data will need to be looked at, and versioning, along with how to measure the amount a document has changed, will need to be thought about (maybe some weighted tolerances in the config file). Mike has provided me with a junk XML creator. The first stage will be to get this generating junk XML and then deleting random elements and attributes in that XML.
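To make the "weighted tolerances" idea a bit more concrete, here is a rough sketch of how the amount of change between two crawls of the same XML document might be scored. Nothing here is decided yet: the weights, the threshold, and the function names (`flatten`, `change_score`, `has_changed`) are all my own placeholders, and the values would presumably come from the config file rather than being hard-coded. It uses only Python's standard `difflib` and `xml.etree.ElementTree` modules, not any of the differencing applications still to be investigated.

```python
import difflib
import xml.etree.ElementTree as ET

# Hypothetical weights and tolerance, standing in for config file values.
# The weights sum to 1 so the final score stays between 0 and 1.
WEIGHTS = {"tag": 0.5, "attrib": 0.3, "text": 0.2}
THRESHOLD = 0.05


def flatten(xml_text):
    """Split a document into tag, attribute and text lists in document order."""
    root = ET.fromstring(xml_text)
    tags, attribs, texts = [], [], []
    for elem in root.iter():
        tags.append(elem.tag)
        for key, value in sorted(elem.attrib.items()):
            attribs.append(f"{elem.tag}@{key}={value}")
        text = (elem.text or "").strip()
        if text:
            texts.append(f"{elem.tag}:{text}")
    return {"tag": tags, "attrib": attribs, "text": texts}


def change_score(old_xml, new_xml, weights=WEIGHTS):
    """Weighted dissimilarity: 0.0 means identical, 1.0 means nothing in common."""
    old, new = flatten(old_xml), flatten(new_xml)
    score = 0.0
    for kind, weight in weights.items():
        matcher = difflib.SequenceMatcher(None, old[kind], new[kind], autojunk=False)
        score += weight * (1.0 - matcher.ratio())
    return score


def has_changed(old_xml, new_xml, threshold=THRESHOLD):
    """Report a change only when the score exceeds the tolerance."""
    return change_score(old_xml, new_xml) > threshold


if __name__ == "__main__":
    previous = '<help><topic id="1">Intro</topic><topic id="2">Setup</topic></help>'
    current = '<help><topic id="1">Intro</topic><topic>Setup</topic></help>'
    print(change_score(previous, current), has_changed(previous, current))
```

Scoring tags, attributes and text separately means a deleted attribute can be weighted differently from a reworded paragraph, which seems to fit the plan of deleting random elements and attributes from the junk XML to test the detection. Documents whose score passes the threshold would be the ones that end up as items in the RSS feed.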