Unveiling the inventive process from patents by extracting problems, solutions and advantages with natural language processing

Patents are the main means for disclosing an invention. These documents encompass many steps of the inventive process, starting with the definition of the problem to be solved and ending with the identification of a solution. In this study we focus on three fundamental concepts of the inventive process: (A) technical problems; (B) solutions; and (C) advantageous effects of the invention, which, based on the WIPO guidelines, any patent should include. We propose a system based on a Natural Language Processing (NLP) pipeline that uses transformer language models to identify technical problems, solutions and advantageous effects in patents. We use a training dataset composed of 480,000 patent sentences contained in sections manually labelled by inventors or attorneys. Our model reaches an F1 score of 90%. The model is also evaluated on a random set of patents to assess its deployability in a real-world scenario. The proposed model can be used as a novel tool for prior art mapping, novel idea generation and the identification of technological evolution, and can help to disclose valuable information hidden in patent documents.

Giordano, Vito;Puccetti, Giovanni;Chiarello, Filippo;Pavanello, Tommaso;Fantoni, Gualtiero

2023

Year of publication: 2023
Scientific Disciplinary Sector (valid until 24/06/2024): INF/01 - Informatica; ING-INF/05 - Sistemi di Elaborazione delle Informazioni
Journal title: Expert Systems with Applications
DOI: https://dx.doi.org/10.1016/j.eswa.2023.120499
Keywords: Information Retrieval; Inventive Process; Language Model; Natural Language Processing; Patent Analysis
Appears in types: 1.1 Journal article

Files in this record:

File	Size	Format
1-s2.0-S0957417423010011-main.pdf (Published version; closed access; license: all rights reserved)	1.28 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11384/131224
