Converting Documents into Wiki Pages

In addition to an overall organizational structure, a wiki should also have some organization within its individual pages. The use of headings, subheadings, bold, bullet points, and the like makes an individual article or page easier to scan, navigate, and read. Most of the internal page organization can be easily accomplished through formatting and Wiki Markup. However, a problem that I’ve noticed is that a lot of users do not write their content directly in the UBC Wiki; instead, they most likely use a word processor to draft and format their text and then copy and paste that text into the wiki. Unfortunately, word processor formatting often does not transfer well into MediaWiki, so the user has to take additional time to reformat their content on the wiki. Often, though, they just leave their content as is, which can look pretty bad and be hard to parse.

I wanted to quickly point out a couple of tools that make getting content from a document into a wiki page a bit smoother. The first is an extension for Open Office, the free, open-source word processor that is a pretty great alternative to MS Word. The Sun Wiki Publisher extension allows a person to type up a document in Open Office just as they normally would and then save it in a MediaWiki format. All formatting, such as links, bullet points, and headers, is automatically converted to Wiki Markup. In my basic testing, this extension works really well and can handle even moderately complex tables.

I have yet to find anything that works as well for Microsoft Word. The easiest strategy seems to be to save the document as an HTML file, copy and paste the HTML into an online Wiki Syntax converter, and then copy and paste that output into the wiki. Novak recently recommended this HTML to Wiki converter and it works well, especially if you are using HTML directly from a website (which is how this UBC Wiki page was created). Unfortunately, when converting from Word to HTML to Wiki Markup, the process is not quite as smooth, and some reformatting or tweaking of the Wiki Markup seems to be necessary. Still, it’s better than having to create a table from scratch or insert a ton of links into a list.
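To give a rough idea of what an HTML to Wiki converter does under the hood, here is a minimal Python sketch. It is not the code behind any of the converters mentioned in this post; the class name `WikiMarkupSketch` and the function `html_to_wiki` are made up for illustration, and it only handles a handful of tags (headings, bold, italics, list items, paragraphs, and links). Anything it doesn’t recognize, including tables, is silently ignored.

```python
from html.parser import HTMLParser


class WikiMarkupSketch(HTMLParser):
    """Hypothetical helper: maps a small subset of HTML to MediaWiki markup."""

    HEADINGS = ("h1", "h2", "h3")

    def __init__(self):
        super().__init__()
        self.out = []     # pieces of wiki markup, joined at the end
        self.href = None  # target of the link currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.out.append("=" * int(tag[1]) + " ")
        elif tag in ("b", "strong"):
            self.out.append("'''")
        elif tag in ("i", "em"):
            self.out.append("''")
        elif tag == "li":
            self.out.append("* ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            if self.href:
                self.out.append("[" + self.href + " ")
        # Any other tag is ignored; only its text content is kept.

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            self.out.append(" " + "=" * int(tag[1]) + "\n")
        elif tag in ("b", "strong"):
            self.out.append("'''")
        elif tag in ("i", "em"):
            self.out.append("''")
        elif tag in ("li", "p"):
            self.out.append("\n")
        elif tag == "a" and self.href:
            self.out.append("]")
            self.href = None

    def handle_data(self, data):
        # Skip whitespace-only runs that contain a newline
        # (typically indentation between block-level tags).
        if data.strip() or "\n" not in data:
            self.out.append(data)


def html_to_wiki(html):
    """Convert an HTML fragment to wiki markup (illustrative only)."""
    parser = WikiMarkupSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()


sample = (
    "<h2>Tools</h2>"
    "<ul><li><b>Open Office</b> export</li>"
    '<li><a href="http://wiki.ubc.ca">UBC Wiki</a></li></ul>'
)
print(html_to_wiki(sample))
```

Clean HTML like the sample maps onto Wiki Markup fairly directly, which fits my experience that HTML taken from a website converts well. HTML saved from Word carries a lot of extra markup on top of the basic tags, so a simple tag-by-tag mapping like this one has much more to trip over.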

I’ve created a couple of Help pages on these topics: Converting Documents to Wiki Syntax and Converting HTML to Wiki Syntax.

Update: Brian points out in the comments that saving a Word document as HTML and then converting it doesn’t work all that well. I agree, and I’ll keep looking for a better solution.

3 thoughts on “Converting Documents into Wiki Pages”

  1. Yes, the process of transferring from Word is clunky at best. Neither of the tools linked off the “Converting HTML to Wiki Syntax” page worked for Word HTML (and the “i love wiki” one stuck Google Adsense code in for good measure).

    I also tried http://toolserver.org/~diberri/cgi-bin/html2wiki/index.cgi which resulted in this rather avant garde display: http://wiki.ubc.ca/OER_Lit_Review

    The Open Office export approach works near-perfectly!

    http://wiki.ubc.ca/OER_Lit_Review_Open_Office_Export

    Another great post, many thanks!

  2. Will says:

    Hi Brian – thanks for the feedback. I obviously didn’t test the Word conversion thoroughly enough (I don’t normally use Word). You’re right – the “saving as HTML then converting” method doesn’t work for any document that has structures as complex as a paragraph. Word sticks in a lot of code that the HTML converters can’t parse.

    I did some additional testing this morning – you can see my results here:
    http://wiki.ubc.ca/Sandbox:Testing_MS_Word_to_Wiki_Conversion

    The best results I had were:
    - Saving the Word file as an RTF file, then saving the RTF file as an HTML file, then converting the HTML to Wiki Markup
    - Using the Word2MediaWiki add-in from here: http://word2mediawikidotnet.codeplex.com/. It works reasonably well with basic documents in Word 2007 (the other add-ins I found weren’t supported past Word 2003), but, unfortunately, doesn’t convert tables as well.

    I’m going to take down the MS Word to Wiki Markup Help article I had because the Word to HTML method is worse than copying and pasting. Right now, I don’t see any elegant solutions that are easy for a user to grok and can work with reasonably complex formatting like tables (well, except for asking people to use Open Office).

  3. Will says:

    One other quick thing – I haven’t been able to replicate the “i love wiki” converter adding Google Adsense to any of the code I converted. Is it possible that the HTML you were converting already contained the Adsense code?