ECON RELEASE NOTES
==================

Release date: 2008-03-18.


AUTHORSHIP
----------

Econ is written by Gisle Hannemyr, University of Oslo.

Copyright (c) 2008 Gisle Hannemyr

This is free software released under GPL ver. 3.0.
License: http://www.gnu.org/licenses/gpl.html

The latest version will at all times be kept at:
http://hannemyr.com/enjoy/


VERSION HISTORY
---------------

Ver. 0.00  2008 Mar 18  [gh]  Wrote it.


BACKGROUND
----------

Econ consists of a set of files (see MANIFEST.txt for the list of
names). Its purpose is to automatically check the documents resulting
from a conversion between the OOXML and ODF formats and give a rough
estimate of how similar they are.

Econ is not a very sophisticated tool. It is fewer than 200 lines of
code (excluding comments), and only took me about an hour to write.
However, it seems to work well enough. So far there have been no false
negatives (i.e. files reported OK by econ despite information loss; if
you come across an example, please let me know), and only 18% false
positives (based upon the test set of 44 documents listed in the file
dl_list.txt).

As for the name of the tool, econ is of course a recursive acronym for
"Econ Counts Officeconversion Neglects".

For avoidance of doubt: The name "econ" has nothing whatsoever to do
with the Norwegian research company "Econ" (who in a 2007 report
claimed that it would take about 10 minutes to verify a single
conversion between the OOXML and ODF formats), nor does it interfere
with Econ's trademark, as "econ" (the tool) deals in other realms
(informatics, as opposed to economics, and facts, as opposed to
fiction) than "Econ" (the company).


UNPACKING AND SETTING UP
------------------------

The tool has been written and tested on a fairly standard Red Hat
GNU/Linux installation and assumes that fairly recent versions of the
following software are present in your environment:

 - bash
 - make
 - java
 - perl (LWP::Simple)
 - gcc
 - saxon (http://saxon.sourceforge.net/)
 - gunzip (only for unpacking)
 - tar (only for unpacking)

My apologies for not providing a Microsoft Windows version. I don't do
Windows. The tool runs fine in my environment (Red Hat GNU/Linux), but
I haven't tested its portability. YMMV.

To install and use the tool, unpack the archive into a directory. You
do this by typing (the "$" is the bash prompt; you don't type that):

  $ tar -xvf econ00.tar.gz

This will create a directory named econ0.0 that should hold all the
files in the distribution. Enter the directory by typing:

  $ cd econ0.0

The tool relies on a small C program and a subdirectory named Unpack
that need to be created before you can use the tool. You do this by
typing:

  $ make

One of the applications you need to run econ is not part of a standard
GNU/Linux setup (I think). This is an XSLT processor. The tool is set
up to use saxon, which is free, but I think any old XSLT processor
will do. The path to saxon is set by the variable XSLT in econ.sh.

After setting up saxon (or equivalent), you're done. You do all the
work in the current directory, so make sure that it is in your path.

To test your installation, type:

  $ make test

You should see a response like this:

  econ.sh hello.odt hello.docx
  econ.sh ver. 0.0
  Original format: odt.
  creating ooxml.txt
  creating odf.txt
  Neglect count: 0
  The documents seem to be similar.

If the test worked OK, your setup is done.


USAGE
-----

It is up to you where you find the documents you test. However, the
package contains the stand-alone Perl script getdocs.pl for
batch-downloading 44 sample documents from the web.

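With a batch of documents like that, checking the conversions one by
one gets tedious. The following is only a rough sketch and is not part
of the distribution. It loops over every .odt original in the current
directory, runs econ.sh (described below) on it, and prints the
neglect count for each file. It assumes that each converted .docx file
already sits next to its original with the same basename, and that
econ.sh writes a line of the form "Neglect count: N" to standard
output. The names batch_check and ECON are made up for this
illustration.

```shell
#!/bin/bash
# Sketch only: run econ.sh on every .odt file in the current directory
# and print "<file><TAB><neglect count>" for each one.
#
# Assumptions (not taken from the econ distribution):
#  - each converted file has the same basename as its original, so
#    econ.sh can be called with the original's name alone;
#  - econ.sh prints "Neglect count: N" on standard output.

ECON=${ECON:-econ.sh}    # hypothetical variable: the command to run

batch_check() {
    local orig out count
    for orig in *.odt; do
        [ -e "$orig" ] || continue    # the glob matched nothing
        out=$("$ECON" "$orig")
        # Pull the number out of the "Neglect count: N" line.
        count=$(printf '%s\n' "$out" |
            sed -n 's/.*Neglect count: *\([0-9][0-9]*\).*/\1/p' |
            head -n 1)
        printf '%s\t%s\n' "$orig" "${count:-unknown}"
    done
}

batch_check
```

A file that econ.sh considers fine should show up with a count of 0.
Any file with a higher count, or with "unknown" (meaning no neglect
count could be found in the output), is worth a closer look.
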
The package doesn't contain any conversion tools, but you may want to
download some from these sites:

  http://www.sun.com/software/star/odf_plugin/index.jsp
  http://sourceforge.net/projects/odf-converter

Provided that you have two files, one in the ODF format and one in the
OOXML format, econ.sh will compare those two files and give you a
number that indicates how likely it is that the conversion went well.
Zero means it is very probable that all is well; a positive integer
indicates a potential problem with the conversion.

For example, say you have a document in the ODF format named foo.odt
and you've converted it to the OOXML format. The converted document is
named foo.docx. To compare them, type:

  $ econ.sh foo.odt foo.docx

Yes, the order matters. Since foo.odt is the original and foo.docx is
the conversion, we list foo.odt first.

You may, however, omit the name of the converted file if both files
share the same basename, so a shorter version of the command above is:

  $ econ.sh foo.odt

If econ sees a potential problem, the output typically looks like
this:

  econ.sh ver. 0.0
  Original format: odt.
  creating ooxml.txt
  creating odf.txt
  Neglect count: 103
  Diffs: 101
  FORMCHECKBOX: 2
  Diffs are in: foo_d.txt.

A "neglect count" greater than zero tells you that there may be
problems with the conversion. Here, it is made up of two components
(101 + 2 = 103). The first (Diffs) is a general count of differences
found in the PCDATA in the two files. The second (FORMCHECKBOX) tells
you that there are some instances of an object type (i.e.
"FORMCHECKBOX") that I happen to know does not convert well.

You may want to look in the file foo_d.txt, which may provide clues
about the cause of the problem. (At least after you gain some
experience.)


BUGS
----

No manpage. Sorry about that, my troff skills are getting rusty.

The heuristics embedded in the tool reflect the current state (March
2008) of the converters available. They will need to be revised as the
converters improve.

There will be false positives, and there may be false negatives.
Testing has not been extensive. Always do a visual inspection as well.


DISCLAIMER
----------

Econ is just a collection of simple scripts I created to help myself
when testing out various converters. It is a quick hack created to get
a particular job done. It is not elegant, nor effective, and it might
not even be error-free.

THIS FREE TOOL IS PROVIDED AS-IS AND THE AUTHOR MAKES NO WARRANTIES OF
ANY KIND, EXPRESS, IMPLIED, STATUTORY OR OTHERWISE, INCLUDING, WITHOUT
LIMITATION, WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE, OR THE ABSENCE OF ERRORS OR OTHER DEFECTS, WHETHER
OR NOT DISCOVERABLE. IN NO EVENT WILL THE AUTHOR BE LIABLE TO YOU FOR
ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL, PUNITIVE OR EXEMPLARY DAMAGES
ARISING OUT OF THE USE OF THIS FREE TOOL.

------------------------------------------------------------------------
# EOF
Local IspellDict: british
Local IspellPersDict: ~/.emacs_ispell/british.dict
LocalWords: Econ gh txt Officeconversion Econs YYMV xvf sh odt ooxml odf pl
LocalWords: getdocs foo basename Diffs FORMCHECKBOX diffs manpage
LocalWords: DISCOVERABLE