I need a software application that extracts text from files with the .xps extension.
Please find attached a sample of a typical file.
Each article has to be saved as a distinct text (.htm is one possibility; another format is fine as long as each article can be identified).
Each article in the document starts with the expression "SOCIETATEA COMERCIALA" and ends with a number in brackets, e.g. (1/[url removed, login to view]).
Each file contains at least 50 articles.
The background stamp "autentic monitor" should be removed from the extracted text.
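
To make the requirements above concrete, here is a rough Python sketch of one possible approach; it is not a finished tool. It relies on the fact that an .xps file is a ZIP package whose pages are .fpage XML parts, with text stored in the UnicodeString attribute of Glyphs elements. Everything else is an assumption made for illustration: the function names are hypothetical, the closing number is assumed to look like "(digits/something)", and the stamp is assumed to be stored as its own text run reading "autentic monitor". If the stamp is an image, it never appears in the extracted text and the filter simply does nothing.

```python
import html
import re
import xml.etree.ElementTree as ET
import zipfile

WATERMARK = "autentic monitor"  # assumed wording of the background stamp

# Start marker from the brief, then the shortest run of text up to the first
# closing "(<digits>/<something>)". The exact end format is an assumption.
ARTICLE_RE = re.compile(
    r"SOCIETATEA\s+COMERCIALA.*?\(\d+/[^()\s]+\)", re.DOTALL
)


def natural_key(name):
    # Compare digit runs as integers so "2.fpage" sorts before "10.fpage".
    return [int(p) if p.isdigit() else p.lower()
            for p in re.split(r"(\d+)", name)]


def read_xps_lines(path):
    """Collect the UnicodeString of every Glyphs element, page by page."""
    lines = []
    with zipfile.ZipFile(path) as xps:
        pages = sorted(
            (n for n in xps.namelist() if n.lower().endswith(".fpage")),
            key=natural_key,
        )
        for page in pages:
            root = ET.fromstring(xps.read(page))
            for elem in root.iter():
                # Ignore the XML namespace and match on the local tag name.
                if elem.tag.rsplit("}", 1)[-1] == "Glyphs":
                    text = elem.get("UnicodeString", "").strip()
                    if text:
                        lines.append(text)
    return lines


def drop_watermark(lines):
    """Remove text runs that consist only of the stamp text."""
    def norm(s):
        return " ".join(s.lower().split())
    return [line for line in lines if norm(line) != WATERMARK]


def split_articles(text):
    """Return each start-marker-to-end-marker span as one article."""
    return [m.group(0).strip() for m in ARTICLE_RE.finditer(text)]


def articles_to_html(articles):
    """Wrap each article in its own numbered <div> so it stays identifiable."""
    body = "\n".join(
        f'<div class="article" id="article-{i}"><pre>{html.escape(a)}</pre></div>'
        for i, a in enumerate(articles, 1)
    )
    return f"<html><body>\n{body}\n</body></html>"
```

A typical run would chain the steps: read_xps_lines("sample.xps"), then drop_watermark, then split_articles on the lines joined with newlines, then articles_to_html. Two caveats: text runs are taken in file order, which may differ from reading order on some pages, and a file with fewer than 50 detected articles would suggest that the assumed end pattern needs adjusting against the real sample.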