
My previous site ran on GetSimpleCMS, which was effective but used its own markup syntax. I quickly wrote the following script to migrate pages to the new CMS (Grav).

The Procedure

This procedure is a bit manual, but for a site with only a handful of pages it is sufficient.

  1. Open the HTML source of the existing page as rendered by GetSimpleCMS
  2. Copy the content section into an HTML-to-Markdown converter (https://www.browserling.com/tools/html-to-markdown)
  3. Save the Markdown output to C:\Temp\gravscript\scriptin.md
  4. Run the script below (Python 3); it writes C:\Temp\gravscript\scriptout.md
  5. Migrate the output Markdown into Grav CMS
  6. Verify the output and make adjustments as needed

import os
import re

def TreeCrawler(FolderLoc):
    FileList = []
    try:
        os.chdir(FolderLoc)
        for root,dirs,files in os.walk(FolderLoc):
            for filename in files:
                if filename.endswith('.md'):
                    FileList.append(filename)
    except IOError:
        print("Issue opening folder")
    return FileList

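def main():
    InFile = 'C:\\Temp\\gravscript\\scriptin.md'
    OutFile = 'C:\\Temp\\gravscript\\scriptout.md'

    # Matches dates written as YYYY.MM.DD; compiled once, outside the loop
    re_date = re.compile(r"\d{4}\.\d{2}\.\d{2}")

    # The with-block closes both files even if an error occurs
    with open(InFile, 'r') as fin, open(OutFile, 'w') as fout:
        for line in fin:
            # Linked thumbnails: keep only the last folder and file name of the URL
            if "[![" in line:
                lineedit = line.replace("/",")").split(")")
                line = "![](" + lineedit[-3] + "_" + lineedit[-2] + "?lightbox&cropResize=600,600) {.center}\n"
                # Alternative: take the image path from the link text instead
                # lineedit = line.replace("]","[").split("[")
                # line = "![](" + lineedit[2] + "?lightbox&cropResize=600,600) {.center}\n"

            # Dates at the start of a line: dots become dashes (YYYY.MM.DD -> YYYY-MM-DD)
            found_date = re_date.match(line)
            if found_date:
                date = found_date.group().replace(".","-")
                line = line.replace(found_date.group(),date)

            fout.write(line)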

if __name__ == "__main__":
    main()
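
To make the two rewrites in the loop easier to follow, here is a minimal sketch that pulls them out into standalone functions and runs them on sample lines. The function names `rewrite_image_line` and `rewrite_date`, the `example.com` URLs, and the sample text are all hypothetical. The sketch also assumes the converter emits linked thumbnails whose URL ends in `folder/filename)`, which is what the index arithmetic in the script relies on.

```python
import re

# Same pattern as the script: a YYYY.MM.DD date
DATE_PATTERN = re.compile(r"\d{4}\.\d{2}\.\d{2}")


def rewrite_image_line(line):
    """Turn a linked thumbnail into a Grav image tag named <folder>_<file>."""
    if "[![" not in line:
        return line
    # Treat "/" and ")" as the same separator, then split on it.
    # The last piece is whatever follows the final ")", so the file name
    # is at index -2 and its parent folder at index -3.
    parts = line.replace("/", ")").split(")")
    return "![](" + parts[-3] + "_" + parts[-2] + "?lightbox&cropResize=600,600) {.center}\n"


def rewrite_date(line):
    """Replace dots with dashes in a date found at the start of the line."""
    found = DATE_PATTERN.match(line)  # match() only looks at the line start
    if found:
        return line.replace(found.group(), found.group().replace(".", "-"))
    return line


# Hypothetical sample input, shaped like the converter output assumed above
sample_image = (
    "[![](http://example.com/thumbs/trip/beach.jpg)]"
    "(http://example.com/uploads/trip/beach.jpg)\n"
)

print(rewrite_image_line(sample_image), end="")
# ![](trip_beach.jpg?lightbox&cropResize=600,600) {.center}

print(rewrite_date("2019.05.12 Trip report\n"), end="")
# 2019-05-12 Trip report

print(rewrite_date("Posted 2019.05.12\n"), end="")
# Posted 2019.05.12  (unchanged: the date is not at the start of the line)
```

The last call shows a limitation worth knowing about when verifying the output: because the script uses `match`, a date in the middle of a line is left untouched.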
