Monthly Archives for August 2013

Entity Extraction – URL

For this entity extraction task, my goal is to write a simple regex rule that identifies the most common URLs in text documents. As mentioned earlier, I first took time to understand the structure a URL is composed of. Every URL consists of the following units: the scheme name (commonly called […]
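
As a rough illustration of what a simple regex rule for common URLs could look like, here is a minimal Python sketch. It is not the rule from the full post: the pattern, the name `URL_PATTERN`, and the helper `extract_urls` are assumptions made for this example. It covers only `http`, `https`, and `ftp` URLs with an optional port and path.

```python
import re

# Hypothetical pattern for illustration only; it targets the most common
# URL shapes and is not a complete URL grammar.
URL_PATTERN = re.compile(
    r"""
    \b(?:https?|ftp)://      # scheme name followed by ://
    [A-Za-z0-9.-]+           # host: letters, digits, dots, hyphens
    (?::\d+)?                # optional port number
    (?:/[^\s<>"']*)?         # optional path, query string, and fragment
    """,
    re.VERBOSE,
)


def extract_urls(text):
    """Return URL-like substrings, trimming trailing sentence punctuation."""
    return [m.group(0).rstrip(".,;:!?)") for m in URL_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = "See http://example.com/path?q=1, or ftp://files.example.org:21/pub."
    for url in extract_urls(sample):
        print(url)
```

Trimming trailing punctuation is a deliberate trade-off: it avoids swallowing the comma or full stop that ends a sentence, at the cost of cutting a URL that really does end in one of those characters.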