Enhanced Information Extraction from Legacy Technical Drawings

D. Mooney, S. Farrell, G. Brown
Illumination Works, LLC, Ohio, United States

Keywords: 2D Engineering Drawing, Knowledge Extraction, Vision Language Model, Computer Vision, Optical Character Recognition

The Department of Defense (DoD) has millions of scanned legacy 2D technical drawings that capture important information needed for part manufacture and weapon system sustainment. Unfortunately, scanned drawing images do not hold that information in a machine-understandable form that can support digital sustainment activities. At present, engineers manually review and extract data from the drawings; however, large-scale manual extraction is infeasible due to the sheer volume of technical drawings to review. Illumination Works (ILW) created a product, Linnea, that extracts text and geometry from 2D technical drawings to determine the suitability of a part for additive manufacturing. Linnea applies cutting-edge computer vision segmentation models to locate tables and text for high-accuracy extraction, along with geometry for part dimensions. Tabular data on a drawing includes the title block and parts lists; non-tabular text includes information such as materials, part dimensions, notes, and processing and finishing specifications. Linnea uses Large Language Models (LLMs) and Vision Language Models (VLMs) to identify specific entities (e.g., drawing number, CAGE code, materials, finishing specifications) from the extracted text and tables. Linnea then writes the results to a database to speed information retrieval and enable data linkages (e.g., between part and assembly drawings).
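
The sketch below illustrates the general shape of such a pipeline (locate regions, OCR them, pull entities, persist to a database); it is not Linnea's implementation. The fixed crop boxes stand in for the segmentation models, the regular expressions stand in for the LLM/VLM entity step, and pytesseract, Pillow, and sqlite3 are assumed dependencies chosen for illustration only.

```python
# Minimal sketch of a drawing-extraction pipeline of the kind described above.
# Placeholders: locate_regions() mimics a segmentation model with fixed crops;
# extract_entities() mimics the LLM/VLM entity step with simple patterns.
import re
import sqlite3

import pytesseract
from PIL import Image


def locate_regions(drawing: Image.Image) -> dict[str, Image.Image]:
    """Placeholder for a segmentation model: crop the lower-right corner,
    where a title block conventionally sits, and the remaining sheet."""
    w, h = drawing.size
    return {
        "title_block": drawing.crop((int(w * 0.6), int(h * 0.8), w, h)),
        "body": drawing.crop((0, 0, w, int(h * 0.8))),
    }


def extract_entities(text: str) -> dict[str, str | None]:
    """Placeholder for the LLM/VLM entity step: pull a drawing number and
    CAGE code with regexes (a real system would prompt a model instead)."""
    drawing_no = re.search(r"DWG\s*NO[.:]?\s*([A-Z0-9-]+)", text, re.I)
    cage_code = re.search(r"CAGE\s*(?:CODE)?[.:]?\s*([A-Z0-9]{5})", text, re.I)
    return {
        "drawing_number": drawing_no.group(1) if drawing_no else None,
        "cage_code": cage_code.group(1) if cage_code else None,
    }


def process_drawing(path: str, db_path: str = "drawings.db") -> None:
    """OCR each region, extract entities, and persist them for retrieval."""
    drawing = Image.open(path)
    regions = locate_regions(drawing)
    text = "\n".join(pytesseract.image_to_string(img) for img in regions.values())
    entities = extract_entities(text)

    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS drawings "
        "(path TEXT, drawing_number TEXT, cage_code TEXT, raw_text TEXT)"
    )
    con.execute(
        "INSERT INTO drawings VALUES (?, ?, ?, ?)",
        (path, entities["drawing_number"], entities["cage_code"], text),
    )
    con.commit()
    con.close()
```

Storing each drawing as a database row is what makes downstream linkages possible: once drawing numbers and CAGE codes are columns rather than pixels, part drawings can be joined to the assembly drawings that reference them.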