Has anybody considered using LLMs to reverse engineer legacy (mainframe) code to generate documentation? GitHub Copilot is quite good at providing low-level documentation, but we would love to take a large volume of mainframe code and generate the requirements and technical designs directly from the code, i.e., what business processes and designs are embedded in the code? My limited experimentation and research tell me I am being overly optimistic, but understanding legacy code is, alas, not an unusual requirement, so I would love to get insights from my peers.
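As a rough illustration of how the experiment described above might start, the sketch below splits a fixed-format COBOL program into paragraphs and builds one prompt per paragraph, so that each request to a model covers a small, nameable unit of business logic. The function names `split_paragraphs` and `build_prompt`, the prompt wording, and the Area A heuristic are all assumptions made for illustration; no LLM is called, and real mainframe code (copybooks, sections, free-format source) would need more than this.

```python
import re

# Assumption: fixed-format COBOL. Columns 1-6 hold sequence numbers, column 7
# is the indicator area ('*' or '/' marks a comment), and paragraph names
# begin in Area A (columns 8-11) and end with a period.
PARAGRAPH_HEADER = re.compile(r"^([A-Z0-9][A-Z0-9-]*)\.$")


def split_paragraphs(source: str) -> dict[str, list[str]]:
    """Group PROCEDURE DIVISION statements under their paragraph names."""
    paragraphs: dict[str, list[str]] = {}
    current = None
    in_procedure = False
    for line in source.splitlines():
        if len(line) < 8 or line[6] in "*/":
            continue  # too short to hold code, or a comment line
        body = line[7:72].rstrip()  # columns 8-72 hold the program text
        text = body.strip()
        if text.upper().startswith("PROCEDURE DIVISION"):
            in_procedure = True
            continue
        if not in_procedure or not text:
            continue
        header = PARAGRAPH_HEADER.match(text.upper())
        if header and body[:4].strip():  # header text starts inside Area A
            current = header.group(1)
            paragraphs[current] = []
        elif current is not None:
            paragraphs[current].append(text)
    return paragraphs


def build_prompt(name: str, statements: list[str]) -> str:
    """Build a plain-text prompt asking for the business rule in one paragraph."""
    code = "\n".join(statements)
    return (
        f"Describe the business rule implemented by COBOL paragraph {name}.\n"
        f"Cite the statements that support each claim.\n\n{code}"
    )
```

Asking the model to cite supporting statements is one possible way to make its answers checkable against the code; whether that yields usable requirements is exactly the open question in this thread.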

Head of IT Architecture in Insurance (except health), 4 months ago

We are searching for something similar. We found some vendors that have tried to build tools that generate documentation with AI/LLM solutions, and also a system integrator that built a valuable tool for its own migration projects and is now extending it into a standalone product. So far, though, we are still evaluating solutions with which to run a PoC.

IT Analyst, 8 months ago

Any updates on this? Have you tried anything yet? I am doing similar research but haven't gotten very useful results yet.

Director of IT in Services (non-Government), a year ago

IBM watsonx Code Assistant for Z has the capability to generate documentation and/or transform COBOL to Java on Z. https://www.ibm.com/products/watsonx-code-assistant-z

IT Manager in Education, a year ago

We worked with IBM last year to document a legacy PHP application using a combination of an LLM (I think we settled on Llama in the end, after testing quite a few) and traditional software documentation tools. The output was an HTML-based wiki, which was very helpful: we want to rearchitect, but so much of the business logic was buried in thousands of lines of code. At that time GitHub Copilot was still in its infancy and wasn't the right tool for our task, but if your code is in GitHub it's the first option to consider. One of the challenges with this approach is that it can take a lot of effort (and a good data scientist) to set it up for a point-in-time snapshot of your code base, whereas ideally you would want the documentation to be updated every time you push a code release. I'd be interested to see if GitHub Copilot addresses the lifecycle maintenance challenges of documenting legacy code too.
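One possible way to approach the point-in-time problem described above, sketched under stated assumptions: keep a manifest of content hashes from the last documentation run and regenerate pages only for files whose hash has changed at release time. The names `file_digest` and `stale_sources`, the manifest layout (relative path mapped to SHA-256 digest), and the `.php` default are hypothetical; this is not how the IBM engagement worked, only an illustration of the idea.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def stale_sources(source_dir: Path, manifest: dict[str, str],
                  suffixes=(".php",)) -> list[str]:
    """List source files that are new or changed since the last docs run.

    manifest maps a path relative to source_dir onto the digest that was
    recorded when that file's documentation was last generated.
    """
    stale = []
    for path in sorted(source_dir.rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            rel = path.relative_to(source_dir).as_posix()
            if manifest.get(rel) != file_digest(path):
                stale.append(rel)
    return stale
```

A release pipeline could call `stale_sources` after each push, send only the listed files through the documentation step, and then write the new digests back to the manifest. Files deleted from the code base are not reported by this sketch and would need separate handling.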

Finance & HCM Solutions & Data Architecture Senior Director, a year ago

I believe it is possible, but off-the-shelf LLMs may not perform as expected. I would recommend fine-tuning your own model, starting from a small specialized LLM (see StarCoder: A State-of-the-Art LLM for Code (huggingface.co)) or from Mixtral (huggingface.co). If you do not have the capacity for that, and because this is mainframe code, consider Watson from IBM; they should have solutions for your needs.
